Segments containing log_action.name (for example page_url) contains are very slow #19581
Labels
c: Performance
For when we could improve the performance / speed of Matomo.
Milestone
refs #8850 and #8867
When someone creates a segment like
page_url contains '%...%'
orpage_title contains '%...%'
orevent_name contains '%...%'
then these kind of queries are very slow as it requires a full table scan onlog_action
. They result in a where clause for example like this( ( log_link_visit_action.idaction_url IN (SELECT idaction FROM log_action WHERE ( name LIKE CONCAT('%', 'njw', '%') AND type = 10 )) ) )
. With a lot of data inlog_action
table these queries can be extremely slow, cause a lot of CPU usage etc.As it's one of Matomo's biggest bottleneck, we have investigated a few times in the past whether we can optimise this query but couldn't find any solution.
In #8867 there was a config
enable_segments_subquery_cache
created but this doesn't help all that much as the cache might become outdated (it caches the data for 1 hour) and the cache re-usage would be limited during that hour as we already usetemporary tables
when archiving to only needing to execute this query once.I'm now creating this issue again in case there are other ways to make this faster. Like maybe we can maintain MySQL tables for each such segment definition so that we not only write
log_action
entries inlog_action
but also eg in a tablelog_action_pageurl_contains_foo
and we reuse the same log_action IDs etc in that table. This way we wouldn't need to run thecontains
query at all but could change it to join that table instead. This of course might cause more writes during tracking and not sure this is beneficial or worth it. There might be also other ideas.The text was updated successfully, but these errors were encountered: