Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visits Over Time uses all available columns which causes the report generate slowly. #17941

Closed
weixuezhang opened this issue Aug 31, 2021 · 17 comments
Labels
c: Performance For when we could improve the performance / speed of Matomo. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.

Comments

@weixuezhang
Copy link

When no columns specified in parameter, Visits Over Time uses all available columns which causes report generate slowly. In this situation, it should use columns from option table record option_name of viewDataTableParameters_***_VisitsSummary.getEvolutionGraph.

Expected Behavior

When no columns specified in parameter, Visits Over Time should use columns from option table record option_name of viewDataTableParameters_***_VisitsSummary.getEvolutionGraph.

Current Behavior

Visits Over Time uses all available columns so that generate many metrics not needed and lost time.

Possible Solution

Steps to Reproduce (for Bugs)

In global.ini.php, set "log_level = DEBUG" and "log_sql_queries = 1", we can see query for segment 'done90a5a511e1974bca37613b6daec137ba' which is in Frequency plugin, whose metrics are not shown in Visits Over Time.
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchAll() executing SQL: SELECT option_name, option_value FROM matomo_option WHERE option_name LIKE ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchAll() executing SQL: SELECT arc1.idarchive, arc1.value, arc1.name, arc1.ts_archived, arc1.date1 as startDate, arc2.value as nb_visits, arc3.value as nb_visits_converted
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] FROM matomo_archive_numeric_2021_08 arc1
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LEFT JOIN matomo_archive_numeric_2021_08 arc2 on arc2.idarchive = arc1.idarchive and (arc2.name = 'nb_visits')
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LEFT JOIN matomo_archive_numeric_2021_08 arc3 on arc3.idarchive = arc1.idarchive and (arc3.name = 'nb_visits_converted')
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] WHERE arc1.idsite = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.date1 = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.date2 = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.period = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND (((arc1.name IN ('done90a5a511e1974bca37613b6daec137ba','done90a5a511e1974bca37613b6daec137ba.VisitsSummary'))))
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2]
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND arc1.ts_archived IS NOT NULL
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] ORDER BY arc1.ts_archived DESC, arc1.idarchive DESC
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] Db::fetchOne() executing SQL: SELECT 1
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] FROM matomo_log_visit
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] WHERE idsite = ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND visit_last_action_time >= ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] AND visit_last_action_time <= ?
DEBUG CoreAdminHome[2021-08-31 03:04:55 UTC] [a58b2] LIMIT 1

Context

Your Environment

  • Matomo Version:
  • PHP Version:
  • Server Operating System:
  • Additionally installed plugins:
  • Browser:
  • Operating System:
@weixuezhang weixuezhang added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Aug 31, 2021
@tsteur
Copy link
Member

tsteur commented Aug 31, 2021

@weixuezhang the solution would be to set up auto archiving see https://matomo.org/docs/setup-auto-archiving/ . When you then fetch a report it won't need to generate any reports. Fetching archived data is quite fast and there we aren't planning to change anything.

For range dates there is an issue already: #7573 . We will look into this more as part of #7573 and therefore marking this as a duplicate @weixuezhang

@tsteur tsteur closed this as completed Aug 31, 2021
@tsteur tsteur added duplicate For issues that already existed in our issue tracker and were reported previously. and removed Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Aug 31, 2021
@weixuezhang
Copy link
Author

I have set up auto archiving with below settings in config.ini.php
browser_archiving_disabled_enforce = 1
enable_browser_archiving_triggering = 0
archiving_range_force_on_browser_request = 1

I add below code at head of method "public function query($sql, $bind = array())" inside file "/matomo-4.4.1/core/Db/Adapter/Pdo/Mysql.php"
Log::info("sql start time: " . microtime(true));
Log::info("sql: $sql");
Log::info("bind: " . var_export($bind, true));

From matomo.log, I can see below infomation, looks metrics are fetched from several plugins, but actually only ["nb_visits","nb_uniq_visitors"] for key viewDataTableParameters_admin_VisitsSummary.getEvolutionGraph inside option table, should only VisitsSummary plugin be related.

INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] sql: SELECT idsite, date1, date2,
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] FROM matomo_archive_numeric_2021_08
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] WHERE idsite IN (1)
INFO VisitsSummary[2021-09-01 11:58:27 UTC] [cc6bb] AND ((name IN ('done','done.VisitsSummary')) AND (value IN (1,3,4,5)))
......
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Referrers[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Referrers')) AND (value IN (1,3,4,5)))
......
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO PagePerformance[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.PagePerformance')) AND (value IN (1,3,4,5)))
......
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] sql: SELECT idsite, date1, date2,
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Goals[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Goals','done.VisitsSummary')) AND (value IN (1,3,4,5)))
......
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] GROUP_CONCAT(CONCAT(idarchive,'|',name,'|',value) ORDER BY idarchive DESC SEPARATOR ',') AS archives
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] FROM matomo_archive_numeric_2021_08
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] WHERE idsite IN (1)
INFO Actions[2021-09-01 12:12:44 UTC] [3ff11] AND ((name IN ('done','done.Actions')) AND (value IN (1,3,4,5)))

@tsteur
Copy link
Member

tsteur commented Sep 1, 2021

@weixuezhang fyi if you have disabled the archiving in the config then you also need to set either enable_general_settings_admin = 0 or disable the browser archiving in the UI.

Can you maybe log more information like what the exact URL is for that request? and show how long it took to fetch the reports? Is the page slow for you?

@weixuezhang
Copy link
Author

@tsteur , I set enable_general_settings_admin = 0, but log result is same as previous. URL of the request is like blow. It is the URL of report "Visits Over Time"
http://localhost/matomo-4.4.1/index.php?forceView=1&viewDataTable=graphEvolution&module=VisitsSummary&action=getEvolutionGraph&idSite=1&period=range&date=2021-08-30,2021-08-31&segment=&showtitle=1&random=9682

After adding "columns=nb_visits" to the URL as below, I can see only VisitsSummary metrics is fetched, and the page shows faster several times than before. So, looks the problem is the report fetching all metrics (in stead of only that from option cache) when no columns set in URL parameters.
http://localhost.com/matomo-4.4.1/index.php?columns=nb_visits&forceView=1&viewDataTable=graphEvolution&module=VisitsSummary&action=getEvolutionGraph&idSite=1&period=range&date=2021-08-30,2021-08-31&segment=&showtitle=1&random=9682

@tsteur tsteur added c: Performance For when we could improve the performance / speed of Matomo. and removed duplicate For issues that already existed in our issue tracker and were reported previously. labels Sep 1, 2021
@tsteur
Copy link
Member

tsteur commented Sep 1, 2021

Thanks @weixuezhang . I'll reopen the issue again as this is bit different as the other referenced issue. I'm not sure we will work on it too soon though as usually fetching these reports is quite fast.

Do you have a performance problem there or how came you looked into it @weixuezhang ? be great to know to understand implications better.

@weixuezhang
Copy link
Author

@tsteur , when I use comparison features on long period of date, I noticed the performance problem. I also found the time consumed by SQL query of obtaining archiveId for metrics on each plugin is the major portion (about 80%) of the overall query time, and also a major portion of the whole report time. So, if we can get rid of the unneeded metrics queries, the report will be several times more faster.

@tsteur tsteur reopened this Sep 2, 2021
@tsteur
Copy link
Member

tsteur commented Sep 2, 2021

BTW @weixuezhang you might want to test this change if it improves the situation maybe a little bit: https://github.com/matomo-org/matomo/pull/17943/files

@weixuezhang
Copy link
Author

@tsteur , I tested your change, but matomo.log still shows the unneeded sql queries. I also tested the cases of URL parameter period equaling day, week and month, problem is the same. I tried a temp change as https://github.com/matomo-org/matomo/pull/17954/files, the unneeded queries can disappear, but I'm not sure it is the right fix, and if it causes other problems.

@tsteur tsteur added the Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. label Sep 7, 2021
@tsteur tsteur added this to the 4.7.0 milestone Sep 7, 2021
@tsteur
Copy link
Member

tsteur commented Sep 7, 2021

Maybe slightly related. Noticed

        $multiSites = Request::processRequest('MultiSites.getAll', array(
            'period' => 'range',
            'date' => $from . ',' . $to,
            'showColumns' => 'nb_actions,nb_pageviews',
            'filter_limit' => -1
        ));

Can take a very long time. And in my case it created 1000 archive entries, 400 unique archive IDs (one for each site plus 3 or 4 segment archives per site).

I don't know if this can be avoided or improved. And not 100% sure it's related
image

@tsteur
Copy link
Member

tsteur commented Sep 7, 2021

Above suggested PR would not fix that one

@utrautmann
Copy link

utrautmann commented Oct 8, 2021

I would like to add that I also have problems with the display of the widget "Visits over time" in all of my clients' installations. Whenever a user has selected any segment, this widget takes a very long time (usually the web server timeout stops).
It is always the following example POST request that takes the long time https: //myserver/index.php?segment=deviceType%3D% 3Ddesktop&date=today&forceView=1&viewDataTable=graphEvolution&module=VisitsSummary&action=getEvolutionGraph&disableLink=0&widget=1&idSite=1&period=month

Auto archiving is always active and browser archiving is inactive.

I'm not sure that is important to this problem

@tsteur
Copy link
Member

tsteur commented Oct 10, 2021

@utrautmann you might want to put this additional configuration into config/config.ini.php as part of the [General] section: browser_archiving_disabled_enforce = 1. Otherwise segments may still be archived while being requested from the browser. It's partially mentioned in https://matomo.org/docs/setup-auto-archiving/#more-information-about-matomo-archiving

@tsteur
Copy link
Member

tsteur commented Oct 10, 2021

@utrautmann do you know if the segment is configured to be "pre-processed" or to be processed in "Real time"? You can find out when editing that segment.

@utrautmann
Copy link

utrautmann commented Oct 11, 2021

@tsteur All segments in the affected client installations are configured to be "pre-processed". But browser_archiving_disabled_enforce is actually set to 0. I will make a change.

@tsteur
Copy link
Member

tsteur commented Oct 12, 2021

Thanks for this @utrautmann I'm expecting this will help. I've also created a PR in #18123 so that it won't be needed to set this config in that case.

@utrautmann
Copy link

Thanks for your answer @tsteur . The "Visits Over Time" widget now renders faster.

@mattab
Copy link
Member

mattab commented Dec 11, 2023

Thanks for contributing to this issue. As it has been a few months since the last activity and we believe this is likely not an issue anymore, we will now close this. If that's not the case, please do feel free to either reopen this issue or open a new one. We will gladly take a look again!

@mattab mattab closed this as not planned Won't fix, can't repro, duplicate, stale Dec 11, 2023
@sgiehl sgiehl added the not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. label Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Performance For when we could improve the performance / speed of Matomo. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

No branches or pull requests

5 participants