@mattab opened this Issue on August 13th 2019 Member

This is a long known issue: Transition report can be very slow to run. Often even times out as it doesn't complete in 30 or 60 seconds. As far as I know, transitions loading slowly is the only part of Matomo that is consistently and predictably slow (on any medium or large site. does not have to be a huge website to experience this slowdown or malfunction).

We document the workaround in FAQ: https://matomo.org/faq/how-to/faq_161/

Now that we improved so much of Matomo and everything else should load really fast, it feels unacceptable whenever Transitions takes 30 or 60 or 120 seconds to load. How many hundreds/thousands of people get frustrated when using transitions? Knowing there is a solution, it feels un-necessary. The solution involves adding several INDEX which will make tracking slightly slower. But I reckon the extra INDEXes are worth the benefits of having an always-fast Matomo UI.

--> My suggestion is to add the Transitions index into core for everyone while upgrading to Matomo 4.0.0:

ALTER TABLE  `matomo_log_link_visit_action` ADD INDEX `transitions_url` (  `idaction_url` ,  `idsite` ,  `server_time` );
ALTER TABLE  `matomo_log_link_visit_action` ADD INDEX `transitions_url_ref` (  `idaction_url_ref` ,  `idsite` ,  `server_time` );
ALTER TABLE  `matomo_log_visit` ADD INDEX `transitions` (  `visit_entry_idaction_url` ,  `idsite` ,  `visit_last_action_time` );

(We would also want to create the index for all cloud customers, as even on our own instance (not large traffic) Transitions takes 60+ seconds to load for a yearly report.)

@tsteur commented on August 13th 2019 Member

I would definitely not want to see these indexes there as it makes things so much slower in the critical tracker part. There must be other ways to solve this. Be good to investigate how this could be made faster without needing new index.

Also in general in case we did as part of the hot/cold table issue https://github.com/matomo-org/matomo/issues/14119
have one log table with recent values where we track into, then we wouldn't need the indexes on that table as it would be likely fast there and would only need the index on the table that holds all other entries. It would at least not slow tracking down this way etc.

Nonetheless there might be other ways to improve this. Eg can we find an efficient way to archive this for say the top 1K page urls etc

Powered by GitHub Issue Mirror