Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep flattened columns as extra columns #12199

Merged
merged 25 commits into from Dec 8, 2018
Merged

Keep flattened columns as extra columns #12199

merged 25 commits into from Dec 8, 2018

Conversation

sgiehl
Copy link
Member

@sgiehl sgiehl commented Oct 16, 2017

Flattened columns will be added with their dimension id. If dimension id is the same for subtables (like in pages reports) the values will be concatenated.
If a table has no dimension a numerated label will be used instead. e.g. label1

fixes #12163

@sgiehl sgiehl added the Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. label Oct 16, 2017
@mattab mattab added this to the 3.2.1 milestone Oct 16, 2017
mattab
mattab previously requested changes Oct 16, 2017
Copy link
Member

@mattab mattab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • For the processed report, we may need to add a new system test in case they are not tested yet

@@ -34,6 +38,8 @@
<sum_visit_length>1</sum_visit_length>
<bounce_count>0</bounce_count>
<nb_visits_converted>0</nb_visits_converted>
<Referrers.Website>www.referrer2.com</Referrers.Website>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid XML

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it's not invalid as . is valid for xml tags as long as it doesn't begin with .. But I'll replace that with _

@sgiehl sgiehl force-pushed the flattenexport branch 4 times, most recently from 412f707 to d5e877e Compare October 17, 2017 18:47
@sgiehl
Copy link
Member Author

sgiehl commented Oct 17, 2017

@mattab how should processed reports be handled?
Currently those additional columns aren't included as they are not defined in the reports metadata.
Tweaking the reports metadata by default is no option, as those columns aren't included in non-flat reports. So we would need to tweak the report results when the report is being processed, but I'm currently a bit lost how to do that proper.
Flat processed reports already have an incorrect label, as the dimension defined in metadata is used. Instead the label is combined with the one of the subtables. For example https://demo.piwik.org/index.php?module=API&method=API.getProcessedReport&apiModule=Referrers&apiAction=getSearchEngines&format=XML&idSite=3&period=month&date=2017-10-14&flat=1&filter_limit=25 uses Search Engine as label. Instead it would need to be Search Engine - Keyword. But actually there is no information that the keyword was merged in the report.

@sgiehl
Copy link
Member Author

sgiehl commented Oct 25, 2017

@mattab @tsteur anyone able to give me a smart hint how to proceed?

@tsteur
Copy link
Member

tsteur commented Oct 25, 2017

True, we would need to make sure to manipulate the label as well when flatten is used. Might need to post an event like API.getProcessedReport.end in ProcessedReport::getProcessedReport or handle it directly in there (would need to forward flatten parameter). In Report::buildReportMetadata() we cannot really do it as we don't know there whether a report is flattened or not (and we cannot rely on $_GET/$_POST).

I totally see what you mean...

I wonder if we can in Report::buildReportMetadata() always return all labels of all possible dimensions like

        $report['dimensions']           = [$this->getDimension()->getId() => $this->getDimension()->getName(), $this->getSubtableDimension()->getId() => $this->getSubtableDimension()->getName()]; // note subtable dimensions might be not defined, and some reports have multiple dimensions so we would need a method like `$this->getDimensions()` or so.
        $report['metrics']              = $this->getMetrics();
        $report['metricsDocumentation'] = $this->getMetricsDocumentation();
        $report['processedMetrics']     = $this->getProcessedMetrics();

So we basically add this always to the report metadata output. And maybe then in ProcessedReport::getProcessedReport() we can do like an implode(' - ', $report['dimensions']) when flatten is used for the label? The -might be needed to be adjusted as a different string can be configured in the report class or so.

We would also need to add the information to columns => $columns in the processed report class.

I just checked Piwik Mobile roughly and it will likely not break it, but needs to be double checked. Actually, I just checked again and there is quite a chance it might break Piwik Mobile as it will think that any column specified in columns is a metric.

Likely this issue can break various implementations because. Adding something to columns that is not a metric can lead to breaking behaviour, and adding new row columns that are not metrics can break behaviour as well as so far it is assumed that numbers are returned in all fields except label which will be no longer the case. Wondering if better to do in Piwik 4?

@mattab
Copy link
Member

mattab commented Oct 26, 2017

It would be good to keep BC.

An idea:

  • maybe the new columns can be added in a new <columnsLabel> eg.
<columnsLabel>
  <browser>Browser name</browser>
  <pageTitle>Page Title</pageTitle>
  <dimension3>My Custom Dimension</dimension3>
</columnsLabel>
  • and the value for these columnsLabel for each row could be added in the <reportMetadata> eg.
<reportMetadata>
<row>
  <browser>Firefox</browser>
  <pageTitle>Homepage</pageTitle>
  <dimension3>valueXYZ</dimension3>
  <url>http://piwik.org/faq/general/#faq_144</url>
  <logo>plugins/Morpheus/icons/dist/searchEngines/google.com.png</logo>
</row>
<row>
  <browser>Chome</browser>
  <pageTitle>Homepage</pageTitle>
  <dimension3>hello world</dimension3>
  <url>http://piwik.org/faq/general/#faq_144</url>
  <logo>plugins/Morpheus/icons/dist/searchEngines/bing.com.png</logo>
</row>
</reportMetadata>

@mattab
Copy link
Member

mattab commented Oct 26, 2017

Wondering if better to do in Piwik 4?

Would be great to do it before in Processed Report output 👍

@tsteur
Copy link
Member

tsteur commented Oct 27, 2017

In report metadata could work to not break API. In Piwik 4 we would then maybe actually put it into the row element (or maybe not, to be discussed).

@mattab
Copy link
Member

mattab commented Nov 19, 2017

@sgiehl Any update on this one? it is an important one for Custom Reports users

@mattab mattab modified the milestones: 3.2.1, 3.2.2 Nov 19, 2017
@sgiehl sgiehl force-pushed the flattenexport branch 4 times, most recently from ccc1cf6 to cf8e567 Compare November 24, 2017 12:44
@sgiehl
Copy link
Member Author

sgiehl commented Nov 24, 2017

I've updated the PR.
Everything is now added to the metadata.
If the report has more than one dimension a new metadata entry dimensions will be added, listing all dimensions. (Are report is considered as having more dimension if getDimensions returns an array with more than one entries, by default the array contains the reports dimension and if available the dimension of the subtable)
For flattened reports those dimensions will be additionally added to each rows metadata. This also applies for processed reports as row metadata is taken from all sub reports.

@mattab
Copy link
Member

mattab commented Dec 4, 2017

Looks good to me and should be safe and keep BC 👍

Feedback:

  • for the dimension ID in the API output, could we use the Segment string eg. <downloadUrl>... instead of <Actions_DownloadUrl>..., eg. <dimension3> instead of <CustomDimensions_dimension3>
    • (to indicate the column is part of the label/main dimension, we could maybe even name label_downloadUrl or label_dimension3? not too sure if it's better)
  • as expected there are some tests failing in CustomDimensions and MarketingCampaignReporting plugins

@sgiehl
Copy link
Member Author

sgiehl commented Dec 4, 2017

for the dimension ID in the API output, could we use the Segment string eg. ... instead of <Actions_DownloadUrl>..., eg. instead of <CustomDimensions_dimension3>
(to indicate the column is part of the label/main dimension, we could maybe even name label_downloadUrl or label_dimension3? not too sure if it's better)

Using the segment name can only work if a segment is defined. As this is a generic solution there might be plugins where no segment is defined for a subtable. In this case we would need to have a fallback

@mattab
Copy link
Member

mattab commented Dec 4, 2017

Good point. When there is no segment defined, maybe we could fallback to $Plugin_$Dimension? Btw, was this 'dimension key' already used somewhere else in Piwik?

@sgiehl sgiehl added Needs Review PRs that need a code review and removed Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. labels Dec 5, 2017
@sgiehl sgiehl force-pushed the flattenexport branch 2 times, most recently from 8e84778 to 18dc3b8 Compare December 9, 2017 19:46
@sgiehl
Copy link
Member Author

sgiehl commented Dec 10, 2017

It currently uses the dimension id, so probably something that might be used somewhere else in the future. So I wouldn't rename it to anything else. Also this label name needs to be used to add the translations...

sgiehl and others added 22 commits December 6, 2018 22:11
…ested (#13796)

Could fix an edge case where user is logged in, but hasn't confirmed the auth code (so the user is not actually logged in), and then an update appears.
* added fallback method for Alexa, fixes issue #13427

* do not use short array syntax for consistency with other methods

* use mini link for Alexa, use DomXPath to filter out the global ranking instead of regex
* use db sessions by default, deprecate file session handler

* trying to fix tests
* update submodule

* Update screenshots and try to get test to pass.
fixes #13272

Haven't actually tested it but should fix the issue. If tests pass, the logic would be still the same. I don't have a PHP 7.2 running here otherwise at the moment
* send bulk requests in chunks

* send requests correctly
It won't remember any hash as the hash won't be visible in the referrer etc but it would work for most other pages.

To make it work for hash it would get likely way more complicated like we would need to persist it through JS, temporarily store it somewhere and redirect accordingly. It fixes the case mentioned in the issue.

fix #13328
* Add option to opt in to use send beacon

* Fix JS tracker test.
@sgiehl
Copy link
Member Author

sgiehl commented Dec 6, 2018

slightly updated the PR to fix the url fields

@diosmosis diosmosis merged commit 33f074e into 3.x-dev Dec 8, 2018
@diosmosis diosmosis deleted the flattenexport branch December 8, 2018 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When exporting a flattened report, keep each flattened dimension as a separate column
5 participants