@tsteur opened this Issue on November 9th 2021 Member

I noticed that two-thirds of sites use neither the goals nor the ecommerce feature.

This means the goals archiver wouldn't need to archive anything when there is no goal and no ecommerce configured for a site.

Skipping the archiving in that case would be especially useful because the Goals archiver processes two dependent archives, one for new visitors and one for returning visitors. This can be quite slow because for every segment (including the All Visits segment) we need to archive two additional segments. That means creating two additional temporary tables per segment to get this data, and resolving these segment queries can be quite slow. Not doing this could greatly improve archiving performance overall.
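To make the cost concrete, here is a minimal sketch of the proposed check and of how many segment archives it would save. It is not Matomo code: `SiteConfig`, `needs_goals_archiving` and `segment_archives_needed` are hypothetical names, and the count assumes exactly two dependent archives (new and returning visitors) per segment, as described above.

```python
from dataclasses import dataclass


@dataclass
class SiteConfig:
    """Hypothetical stand-in for the per-site settings the archiver would check."""
    goal_count: int
    ecommerce_enabled: bool


def needs_goals_archiving(site: SiteConfig) -> bool:
    # Archive goal data only if the site has at least one goal or ecommerce is on.
    return site.goal_count > 0 or site.ecommerce_enabled


def segment_archives_needed(site: SiteConfig, segments: list[str]) -> int:
    # Assumption: each segment normally triggers two dependent archives
    # (new visitors, returning visitors) on top of its own archive.
    dependent_per_segment = 2 if needs_goals_archiving(site) else 0
    return len(segments) * (1 + dependent_per_segment)
```

Under these assumptions, a site with no goals and no ecommerce would archive one segment per requested segment instead of three.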

I've prepared a rough fix in https://github.com/matomo-org/matomo/compare/improvegoalarchive?expand=1 but noticed that many tests are now failing because they don't have the ecommerce feature enabled but were tracking ecommerce data nonetheless.

The goal of this issue would be to check whether the logic implemented so far makes sense and to make the tests pass. If the tests aren't easily fixable or if there is any other issue, we could instead skip only the processDependentArchive calls in the Goals\Archiver when there is no goal and ecommerce is not enabled.
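The fallback could look roughly like the sketch below: the regular goal aggregation still runs, and only the two dependent archives are skipped. The function name and step labels are hypothetical and only illustrate the control flow, not the actual Goals\Archiver code.

```python
def plan_goals_archiving(goal_count: int, ecommerce_enabled: bool) -> list[str]:
    """Return the (hypothetical) archiving steps for one site and period."""
    # The regular goal metrics are always aggregated in this fallback variant.
    steps = ["aggregate_goal_metrics"]

    # Only the expensive dependent archives are skipped when nothing is configured.
    if goal_count > 0 or ecommerce_enabled:
        steps.append("process_dependent_archive:new_visitors")
        steps.append("process_dependent_archive:returning_visitors")

    return steps
```

This keeps existing test expectations for the regular records intact while still avoiding the two extra segment archives.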

E.g. in the tests, some fixtures would probably need to be updated so that they no longer return data. In some other tests we'd want the ecommerce feature enabled for a site. This way we can test some sites where the archiver skips it and some sites where it's enabled.

The only downside is that once you create a goal or enable ecommerce, the archiver will try to archive the data for the entire year for "processDependentArchive" when the yearly archive is requested. That should be fine though, I suppose. Any thoughts on this?

@sgiehl commented on November 9th 2021 Member

@tsteur sounds good. Alternatively, we could adjust the archiver to not fire any queries if there aren't any goals and simply insert null records. That way there might not be any rearchiving if a goal is created later.
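A minimal sketch of that idea, under assumptions: `archive_goal_records`, `record_names` and `run_query` are hypothetical, and "null record" is modelled as `None`. The ecommerce check is carried over from the issue description, the comment itself only mentions goals.

```python
from typing import Any, Callable, Optional


def archive_goal_records(
    goal_ids: list[int],
    ecommerce_enabled: bool,
    record_names: list[str],
    run_query: Callable[[str], Any],
) -> dict[str, Optional[Any]]:
    """Build the records to store for one archive (hypothetical sketch)."""
    if not goal_ids and not ecommerce_enabled:
        # No queries are fired; every record is stored as null so the
        # archive still exists for this period.
        return {name: None for name in record_names}

    # Otherwise run one query per record as usual.
    return {name: run_query(name) for name in record_names}
```

Because the records exist (as nulls), the archive for the period is present even though no query ran, which is what might avoid rearchiving later.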

@tsteur commented on November 9th 2021 Member

Inserting null records could work 👍
