@weixuezhang opened this Issue on August 31st 2021

When no columns specified in parameter, Visits Over Time uses all available columns which causes report generate slowly. In this situation, it should use columns from option table record optionname of viewDataTableParameters***_VisitsSummary.getEvolutionGraph.

Expected Behavior

When no columns specified in parameter, Visits Over Time should use columns from option table record optionname of viewDataTableParameters***_VisitsSummary.getEvolutionGraph.

Current Behavior

Visits Over Time uses all available columns so that generate many metrics not needed and lost time.

Possible Solution

Steps to Reproduce (for Bugs)

In global.ini.php, set "log_level = DEBUG" and "log_sql_queries = 1", we can see query for segment 'done90a5a511e1974bca37613b6daec137ba' which is in Frequency plugin, whose metrics are not shown in Visits Over Time.
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchAll() executing SQL: SELECT option_name, option_value FROM matomo_option WHERE option_name LIKE ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchAll() executing SQL: SELECT arc1.idarchive, arc1.value, arc1.name, arc1.ts_archived, arc1.date1 as startDate, arc2.value as nb_visits, arc3.value as nb_visits_converted
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] FROM matomo_archive_numeric_2021_08 arc1
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LEFT JOIN matomo_archive_numeric_2021_08 arc2 on arc2.idarchive = arc1.idarchive and (arc2.name = 'nb_visits')
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LEFT JOIN matomo_archive_numeric_2021_08 arc3 on arc3.idarchive = arc1.idarchive and (arc3.name = 'nb_visits_converted')
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] WHERE arc1.idsite = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.date1 = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.date2 = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.period = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND (((arc1.name IN ('done90a5a511e1974bca37613b6daec137ba','done90a5a511e1974bca37613b6daec137ba.VisitsSummary'))))
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2]
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.ts_archived IS NOT NULL
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] ORDER BY arc1.ts_archived DESC, arc1.idarchive DESC
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchOne() executing SQL: SELECT 1
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] FROM matomo_log_visit
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] WHERE idsite = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND visit_last_action_time >= ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND visit_last_action_time <= ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LIMIT 1

1.
2.

Context

Your Environment

  • Matomo Version:
  • PHP Version:
  • Server Operating System:
  • Additionally installed plugins:
  • Browser:
  • Operating System:
@tsteur commented on August 31st 2021 Member

@weixuezhang the solution would be to set up auto archiving see https://matomo.org/docs/setup-auto-archiving/ . When you then fetch a report it won't need to generate any reports. Fetching archived data is quite fast and there we aren't planning to change anything.

For range dates there is an issue already: https://github.com/matomo-org/matomo/issues/7573 . We will look into this more as part of #7573 and therefore marking this as a duplicate @weixuezhang

@weixuezhang commented on September 1st 2021

I have set up auto archiving with below settings in config.ini.php
browser_archiving_disabled_enforce = 1
enable_browser_archiving_triggering = 0
archiving_range_force_on_browser_request = 1

I add below code at head of method "public function query($sql, $bind = array())" inside file "/matomo-4.4.1/core/Db/Adapter/Pdo/Mysql.php"
Log::info("sql start time: " . microtime(true));
Log::info("sql: $sql");
Log::info("bind: " . var_export($bind, true));

From matomo.log, I can see below infomation, looks metrics are fetched from several plugins, but actually only ["nb_visits","nb_uniq_visitors"] for key viewDataTableParameters_admin_VisitsSummary.getEvolutionGraph inside option table, should only VisitsSummary plugin be related.

INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] sql: SELECT idsite, date1, date2,
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] FROM matomo_archive_numeric_2021_08
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] WHERE idsite IN (1)
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] AND ((name IN ('done','done.VisitsSummary')) AND (value IN (1,3,4,5)))
......
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Referrers')) AND (value IN (1,3,4,5)))
......
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.PagePerformance')) AND (value IN (1,3,4,5)))
......
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Goals','done.VisitsSummary')) AND (value IN (1,3,4,5)))
......
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Actions')) AND (value IN (1,3,4,5)))

@tsteur commented on September 1st 2021 Member

@weixuezhang fyi if you have disabled the archiving in the config then you also need to set either enable_general_settings_admin = 0 or disable the browser archiving in the UI.

Can you maybe log more information like what the exact URL is for that request? and show how long it took to fetch the reports? Is the page slow for you?

@weixuezhang commented on September 1st 2021

@tsteur , I set enable_general_settings_admin = 0, but log result is same as previous. URL of the request is like blow. It is the URL of report "Visits Over Time"
http://localhost/matomo-4.4.1/index.php?forceView=1&viewDataTable=graphEvolution&module=VisitsSummary&action=getEvolutionGraph&idSite=1&period=range&date=2021-08-30,2021-08-31&segment=&showtitle=1&random=9682

After adding "columns=nb_visits" to the URL as below, I can see only VisitsSummary metrics is fetched, and the page shows faster several times than before. So, looks the problem is the report fetching all metrics (in stead of only that from option cache) when no columns set in URL parameters.
http://localhost.com/matomo-4.4.1/index.php?columns=nb_visits&forceView=1&viewDataTable=graphEvolution&module=VisitsSummary&action=getEvolutionGraph&idSite=1&period=range&date=2021-08-30,2021-08-31&segment=&showtitle=1&random=9682

@tsteur commented on September 1st 2021 Member

Thanks @weixuezhang . I'll reopen the issue again as this is bit different as the other referenced issue. I'm not sure we will work on it too soon though as usually fetching these reports is quite fast.

Do you have a performance problem there or how came you looked into it @weixuezhang ? be great to know to understand implications better.

@weixuezhang commented on September 1st 2021

@tsteur , when I use comparison features on long period of date, I noticed the performance problem. I also found the time consumed by SQL query of obtaining archiveId for metrics on each plugin is the major portion (about 80%) of the overall query time, and also a major portion of the whole report time. So, if we can get rid of the unneeded metrics queries, the report will be several times more faster.

@tsteur commented on September 2nd 2021 Member

BTW @weixuezhang you might want to test this change if it improves the situation maybe a little bit: https://github.com/matomo-org/matomo/pull/17943/files

@weixuezhang commented on September 2nd 2021

@tsteur , I tested your change, but matomo.log still shows the unneeded sql queries. I also tested the cases of URL parameter period equaling day, week and month, problem is the same. I tried a temp change as https://github.com/matomo-org/matomo/pull/17954/files, the unneeded queries can disappear, but I'm not sure it is the right fix, and if it causes other problems.

@tsteur commented on September 7th 2021 Member

Maybe slightly related. Noticed

        $multiSites = Request::processRequest('MultiSites.getAll', array(
            'period' => 'range',
            'date' => $from . ',' . $to,
            'showColumns' => 'nb_actions,nb_pageviews',
            'filter_limit' => -1
        ));

Can take a very long time. And in my case it created 1000 archive entries, 400 unique archive IDs (one for each site plus 3 or 4 segment archives per site).

I don't know if this can be avoided or improved. And not 100% sure it's related
image

@tsteur commented on September 7th 2021 Member

Above suggested PR would not fix that one

Powered by GitHub Issue Mirror