Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster flatten for some reports #7336

Merged
merged 4 commits into from Mar 9, 2015
Merged

Faster flatten for some reports #7336

merged 4 commits into from Mar 9, 2015

Conversation

tsteur
Copy link
Member

@tsteur tsteur commented Mar 3, 2015

refs #5098

  • Makes flatten of Actions.*, Events.getCategory, Events.getAction, Events.getName, Referrers.getWebsites much faster and if it failed before it should not anymore
  • Many flatten operations did not work correctly before and either not flatten or did it wrong
  • Makes all other kind of normal reports faster as well
  • Many bugfixes for flatten and for other reports (eg Reports Total Calculation)

I will rebase and squash before merging, please let me know!

@tsteur tsteur added Needs Review PRs that need a code review Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. and removed Needs Review PRs that need a code review labels Mar 3, 2015
* @return DataTable|DataTable\Map
*/
public function flatten($dataTable)
public function flatten($dataTable, $recursiveLabelSeparator)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$recursiveLabelSeparator is no longer hard coded / hacked in the Flattener, it can be defined by each report now.

@mattab mattab modified the milestone: Piwik 2.12.0 Mar 3, 2015
@mattab mattab added the Needs Review PRs that need a code review label Mar 3, 2015
@tsteur tsteur removed the Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. label Mar 3, 2015
@mnapoli
Copy link
Contributor

mnapoli commented Mar 3, 2015

Is it a collection of small fixes and improvements or one big refactoring? If it's one big change, it would be interesting to share an overview of what you did because reading the diff is not really easy. Also do you have any numbers (even rough ones) to share regarding the perf improvements?

@tsteur
Copy link
Member Author

tsteur commented Mar 4, 2015

It is a collection of small fixes and improvements see comments. No big refactoring. I only made many things faster like sorting etc see comments.

One big performance improvement was to no longer request each subtable separately but load the expanded datatable at once and then flatten the expanded table.

It is hard to show any numbers since in most cases flatten crashed after X seconds/minutes and it was often not even possible to flatten. Not even on demo etc which now loads in < 1 second. Also it depends on the structure of the datatable and of the report. Eg whether there are many first level rows, how many subtables the tables has etc. Basically one can say a flattened table with 25k rows in the UI takes less than 5 seconds, in an API call maybe in even 2 or 3 seconds. On a fast CPU maybe even < 1 or 2 seconds. The more rows a datatable has the better the benefit will be.

}
});

// TODO can we remove this one again?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this TODO should still be there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be there. We need to discuss this end of week

@mattab
Copy link
Member

mattab commented Mar 9, 2015

I'll merge it so we can test on the demo. Lots of small CPU friendly changes 👍

@mattab
Copy link
Member

mattab commented Mar 9, 2015

@tsteur
Copy link
Member Author

tsteur commented Mar 9, 2015

The UI tests fail because of the other pull request as mentioned there. It is not a bug

@tsteur tsteur deleted the fast_flatten_2 branch March 9, 2015 20:42
@mattab
Copy link
Member

mattab commented Mar 10, 2015

the secondary column sort request is followed up in #7401

@mattab mattab added the c: Performance For when we could improve the performance / speed of Matomo. label Mar 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Performance For when we could improve the performance / speed of Matomo. Needs Review PRs that need a code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants