When exporting a flattened report, keep each flattened dimension as a separate column #12163

Closed
mattab opened this issue Oct 10, 2017 · 7 comments · Fixed by #12199

@mattab
Member

mattab commented Oct 10, 2017

Currently when we export a flattened report the CSV output (and others) looks like this:

Label | Visits | Total events | Events with a value | Total value | Minimum value | Maximum value | Unique visitors (daily sum) | The average of all values for this event
-- | -- | -- | -- | -- | -- | -- | -- | --
PluginTabs - Preview | 78 | 97 | 0 | 0 | 0 | 0 | 76 | 0
PluginTabs - Description | 35 | 41 | 0 | 0 | 0 | 0 | 34 | 0
PluginTabs - Faq | 19 | 20 | 0 | 0 | 0 | 0 | 19 | 0
PluginTabs - Documentation | 18 | 24 | 0 | 0 | 0 | 0 | 18 | 0

Instead, we would like each individual column listed as below, with its correct name and value, for all reports that combine several levels and are exported flattened with flat=1:

Event name | Event category | Label | Visits | Total events | Events with a value | Total value | Minimum value | Maximum value | Unique visitors (daily sum) | The average of all values for this event
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
PluginTabs | Preview | PluginTabs - Preview | 78 | 97 | 0 | 0 | 0 | 0 | 76 | 0
PluginTabs | Description | PluginTabs - Description | 35 | 41 | 0 | 0 | 0 | 0 | 34 | 0
PluginTabs | Faq | PluginTabs - Faq | 19 | 20 | 0 | 0 | 0 | 0 | 19 | 0
PluginTabs | Documentation | PluginTabs - Documentation | 18 | 24 | 0 | 0 | 0 | 0 | 18 | 0

Notes

  • It's important that we keep the Label column for backward compatibility.
  • The new columns should be in both the reporting API and the processed report/metadata API.

Why is this important? Having the components of the Label column clearly separated (un-flattened) makes data analysis easy in a spreadsheet or in other data analysis systems. This will also be very helpful for Custom Reports export, where the exported report should have all dimensions clearly listed out.
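
To illustrate the requested output shape, here is a minimal Python sketch that adds one column per flattened dimension while keeping the original Label column. It is not how Matomo implements this: the function name is hypothetical, and splitting on the " - " separator is an assumption made only for the example (a real implementation would presumably take the values from the individual subtable levels, since a label may itself contain the separator). The column names come from the example table above.

```python
def add_dimension_columns(row, dimension_names, separator=" - "):
    """Return a copy of `row` with one extra column per flattened dimension.

    Assumption for illustration only: the flattened label joins the
    dimension values with `separator`.
    """
    # Split at most len(dimension_names) - 1 times, so any additional
    # separators stay inside the last part.
    parts = row["Label"].split(separator, len(dimension_names) - 1)
    # Pad with empty strings when a label has fewer levels than expected.
    parts += [""] * (len(dimension_names) - len(parts))
    # Dimension columns come first, then every original column,
    # including Label, which is kept for backward compatibility.
    expanded = dict(zip(dimension_names, parts))
    expanded.update(row)
    return expanded


row = {"Label": "PluginTabs - Preview", "Visits": 78, "Total events": 97}
expanded = add_dimension_columns(row, ["Event name", "Event category"])
print(expanded)
```

The Label column is left untouched, so existing consumers that only read Label would see the same value as before.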

@mattab mattab added the Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. label Oct 10, 2017
@mattab mattab added this to the 3.2.1 milestone Oct 10, 2017
@tsteur
Member

tsteur commented Oct 10, 2017

BTW: If you do it like this and it is not controlled by a new API parameter, you will likely be breaking the API, and that needs to be announced in advance. Luckily the mobile app is not using flatten, otherwise this could break it.

@tsteur
Member

tsteur commented Oct 10, 2017

And for processed reports you might also need to add new elements that state which columns are the label columns, and in which order, similar to <metrics> and <processedMetrics>. Otherwise this API output is hard for a computer to process.
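
A rough Python sketch of why such an element would help a consumer. The `labelColumns` key and the dimension identifiers are hypothetical placeholders and not part of any existing Matomo API output; only the idea of listing the label columns in order, next to the metrics, comes from the comment above.

```python
# Hypothetical metadata shape: "labelColumns" and its values are invented
# for this example; "metrics" mirrors the idea of the <metrics> element.
report_metadata = {
    "labelColumns": ["dimension_a", "dimension_b"],
    "metrics": {"nb_visits": "Visits", "nb_events": "Total events"},
}


def split_row(row, metadata):
    """Separate a row into ordered label values and a dict of metric values."""
    labels = [row[column] for column in metadata["labelColumns"]]
    metrics = {name: row[name] for name in metadata["metrics"] if name in row}
    return labels, metrics


row = {
    "dimension_a": "PluginTabs",
    "dimension_b": "Preview",
    "label": "PluginTabs - Preview",
    "nb_visits": 78,
    "nb_events": 97,
}
labels, metrics = split_row(row, report_metadata)
print(labels, metrics)
```

Without an ordered list like this, a program would have to guess which columns are dimensions and which are metrics.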

@sgiehl
Member

sgiehl commented Oct 12, 2017

While trying to implement this I encountered one problem, which I'm not sure how to solve.

How should those "metrics" be named when the metrics are not returned translated?
It is possible to get the translated name of the label column using the report classes and their associated dimensions, but I don't see any useful name to use when it should be untranslated.
Using label1, label2, ... instead does not make much sense, as no one could see what they stand for.

My current implementation always uses the translated name, but that might be hard to process automatically, as the name might change when a translation changes.

@mattab @tsteur Any opinion how we should solve this?

@tsteur
Member

tsteur commented Oct 12, 2017

Not quite sure what you mean when metrics are not returned translated? From this I am reading there is a URL param to do that? I presume when metrics are not translated, we return the metric name?

For dimensions we would then return the dimensionId? Also could use label1, label2 but not sure which is best. I reckon CSV export etc is mostly for human processing and if a computer wants to process it, they almost have to use API.getProcessedReport where all the metadata is known. That's why I mentioned we need to put the label columns into the metadata output of a processed report.

Can you send a link to an example when metrics are returned untranslated?

@sgiehl
Member

sgiehl commented Oct 12, 2017

Translated column names would be https://demo.piwik.org/index.php?module=API&method=Events.getName&idSite=3&period=year&date=yesterday&format=HTML&translateColumnNames=1

or untranslated: https://demo.piwik.org/index.php?module=API&method=Events.getName&idSite=3&period=year&date=yesterday&format=HTML

The main problem is that in the datatables there is only a label column. In most cases this label column doesn't represent any metric, so there is only a translated name for it. But using the dimension id might be a good solution.
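
For reference, the two request variants above differ only in the translateColumnNames parameter. A small Python sketch that builds both URLs (the helper name is made up for this example; the parameters are the ones from the links above, plus flat=1 from the issue description):

```python
from urllib.parse import urlencode

BASE_URL = "https://demo.piwik.org/index.php"


def events_get_name_url(translate_column_names=False, flat=False):
    """Build the Events.getName request URL used in the examples above."""
    params = {
        "module": "API",
        "method": "Events.getName",
        "idSite": 3,
        "period": "year",
        "date": "yesterday",
        "format": "HTML",
    }
    if translate_column_names:
        params["translateColumnNames"] = 1
    if flat:
        # flat=1 requests the flattened report discussed in this issue.
        params["flat"] = 1
    return f"{BASE_URL}?{urlencode(params)}"


untranslated_url = events_get_name_url()
translated_url = events_get_name_url(translate_column_names=True)
print(untranslated_url)
print(translated_url)
```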

@tsteur
Member

tsteur commented Oct 12, 2017

I reckon dimension id would be consistent, and if it is not available for some reason, fall back to label1 and label2 (we could still show e.g. dimensionid_foo,label2 when we have the dimension id for only one dimension? I reckon...)

Apart from this the output of those reports is not very good for computer processing anyway and cannot really be used, even when requesting JSON format etc. If someone wants to process data automatically, API.getProcessedReport is the way to go. So it should be totally fine to use dimension id
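
The fallback idea could look roughly like this in Python. This is only a sketch of the naming rule discussed here, not the actual implementation; the function name and the `dimensionid_foo` value are placeholders.

```python
def label_column_names(dimension_ids):
    """Name each label column by its dimension id, else labelN by position.

    `dimension_ids` has one entry per flattened level; an entry is None
    (or empty) when no dimension id is known for that level.
    """
    names = []
    for position, dimension_id in enumerate(dimension_ids, start=1):
        names.append(dimension_id if dimension_id else f"label{position}")
    return names


names = label_column_names(["dimensionid_foo", None])
print(names)
```

Numbering the fallback by position keeps the names stable even when only some levels have a known dimension id.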

@mattab
Member Author

mattab commented Oct 12, 2017

Good point, the extra columns would also be needed in the processed report API output. (edited issue)

@sgiehl sgiehl self-assigned this Oct 19, 2017
@mattab mattab modified the milestones: 3.2.1, 3.2.2 Nov 19, 2017
@mattab mattab added the Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. label Jun 26, 2018
@mattab mattab modified the milestones: 3.6.0, 3.7.0 Aug 6, 2018
@mattab mattab modified the milestones: 3.7.0, 3.6.1 Sep 1, 2018
@mattab mattab modified the milestones: 3.6.1, 3.7.0 Oct 8, 2018