The goal of this issue is to make archiving faster for Date Range requests and Custom Segment requests. This follows up on #4768.
Currently, when a date range API query is issued, it pre-processes all reports of the plugin being requested. For example, when requesting the "Page URLs" report for a custom date range or segment, it pre-processes all Actions plugin reports, including Page URLs, Page Titles, Downloads and Outlinks.
In this issue we would like to improve the algorithm so that only the requested report (i.e. a Blob Record) is archived.
This may need to be done at the same time as #7470.
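To make the difference concrete, here is a minimal, language-neutral sketch in Python. It is not Matomo's PHP code: the record names (`page_urls`, `page_titles`, ...) and the helpers `make_record_builders` and `archive_plugin` are hypothetical, and each builder only stands in for an expensive aggregation query.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical record names for an Actions-like plugin (illustration only).
ACTIONS_RECORDS = ["page_urls", "page_titles", "downloads", "outlinks"]


def make_record_builders(names: Iterable[str], log: List[str]) -> Dict[str, Callable[[], str]]:
    """Return one builder per record; each builder logs that it ran."""
    def make(name: str) -> Callable[[], str]:
        def build() -> str:
            log.append(name)          # stands in for an expensive query
            return f"blob({name})"    # stands in for the stored blob record
        return build
    return {name: make(name) for name in names}


def archive_plugin(builders: Dict[str, Callable[[], str]],
                   requested: Optional[Iterable[str]] = None) -> Dict[str, str]:
    """Build every record (current behaviour) or only the requested ones."""
    if requested is None:
        selected = list(builders)
    else:
        wanted = set(requested)
        selected = [name for name in builders if name in wanted]
    return {name: builders[name]() for name in selected}


log: List[str] = []
builders = make_record_builders(ACTIONS_RECORDS, log)

archive_plugin(builders)                           # current: all four records are built
archive_plugin(builders, requested=["page_urls"])  # proposed: only one record is built
```

In this model the first call runs four builders and the second runs one, which is the saving the issue is after.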
@mattab @diosmosis I think that, now that we have the ability to archive only requested reports (and support partial archives), we could maybe use this for ranges as well (which are usually aggregated during browser archiving).
Say you request a custom report and you have 50 custom reports configured: the ranges will be aggregated for all 50 custom reports. If some of them need to process unique metrics, this is done for all of them too, even though you might never look at that data.
The same applies to goals, forms, funnels, ...
Each plugin's archivers could maybe support `isRequestedReport()`, and we'd need to set `setArchiveOnlyReport()` in the archive parameters.
It might not be the best idea to reuse this logic, though. We may instead need a different approach, like in https://github.com/matomo-org/matomo/blob/4.3.0-b3/core/Archive.php#L538: forward the record names (or similar) to the archiver so that it only archives the requested record names. The benefit would be that plugin developers won't need to change anything when requesting a report; they would only need to check in the archiver whether a specific record name was requested.
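The record-name idea could look roughly like the following Python sketch. Everything here is an assumption made for illustration and not Matomo's actual API: `ArchiveParams`, `is_record_requested`, `CustomReportsArchiver` and the `customreport_<id>` record naming are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ArchiveParams:
    # None means "no restriction": archive every record, as happens today.
    requested_records: Optional[FrozenSet[str]] = None

    def is_record_requested(self, record_name: str) -> bool:
        return self.requested_records is None or record_name in self.requested_records


class CustomReportsArchiver:
    """Hypothetical archiver that owns one record per configured custom report."""

    def __init__(self, params: ArchiveParams, report_ids: List[int]) -> None:
        self.params = params
        self.report_ids = report_ids

    def aggregate_range(self) -> List[str]:
        archived = []
        for report_id in self.report_ids:
            record = f"customreport_{report_id}"  # hypothetical naming scheme
            # The only change a plugin needs: skip records nobody asked for.
            if not self.params.is_record_requested(record):
                continue
            archived.append(record)  # stands in for the expensive aggregation
        return archived


report_ids = list(range(1, 51))  # 50 configured custom reports

everything = CustomReportsArchiver(ArchiveParams(), report_ids).aggregate_range()
only_one = CustomReportsArchiver(
    ArchiveParams(requested_records=frozenset({"customreport_7"})), report_ids
).aggregate_range()
```

With no restriction all 50 records are aggregated; with one forwarded record name only that record is. The code that requests the report stays unchanged in this model, because the mapping from report to record names would happen before the archiver is called.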
@mattab you should maybe look at this eventually. The background is that it would not only make ranges faster, but could eventually also make browser archiving faster in general (for any period). Even if we can't fix it, this may be more of a major issue.
@tsteur this would be a high value improvement for UI performance and reducing DB usage, significantly in some cases. Sounds good to schedule it in a 4.x.0 :rocket:
I meant only labelling it as a major issue without scheduling it yet, but I've added it to the 4.5 milestone for now. It would be good to have, though I'm not 100% sure of its impact yet (in some cases it could be quite a big impact). It remains to be seen how easy it is to develop.