Big memory usage during archive #18460

Open
avkarenow opened this issue Dec 7, 2021 · 17 comments
Labels
  • c: Performance: For when we could improve the performance / speed of Matomo.
  • Potential Bug: Something that might be a bug, but needs validation and confirmation it can be reproduced.
  • Stability: For issues that make Matomo more stable and reliable to run for sys admins.

Comments

@avkarenow
Contributor

The archiving process sometimes uses too much memory:

Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 25695760 bytes) in /usr/local/www/matomo/libs/Zend/Db/Statement/Pdo.php on line 233
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 16777224 bytes) in /usr/local/www/matomo/core/DataTable/Manager.php on line 98
'
INFO [2021-12-06 04:00:25] 99708 Error: Error unserializing the following response from ?module=API&method=CoreAdminHome.archiveReports&idSite=23&period=year&date=2021-01-01&format=json&trigger=archivephp: '
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 25695760 bytes) in /usr/local/www/matomo/libs/Zend/Db/Statement/Pdo.php on line 233
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 16777224 bytes) in /usr/local/www/matomo/core/DataTable/Manager.php on line 98
'
INFO [2021-12-06 04:00:25] 99708 Error: Got invalid response from API request: ?module=API&method=CoreAdminHome.archiveReports&idSite=25&period=year&date=2021-01-01&format=json&trigger=archivephp. Response was '
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 323584 bytes) in /usr/local/www/matomo/libs/Zend/Db/Statement/Pdo.php on line 233
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 8388616 bytes) in /usr/local/www/matomo/core/DataTable/Manager.php on line 98
'
INFO [2021-12-06 04:00:25] 99708 Error: Error unserializing the following response from ?module=API&method=CoreAdminHome.archiveReports&idSite=25&period=year&date=2021-01-01&format=json&trigger=archivephp: '
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 323584 bytes) in /usr/local/www/matomo/libs/Zend/Db/Statement/Pdo.php on line 233
Fatal error: Allowed memory size of 8589934592 bytes exhausted (tried to allocate 8388616 bytes) in /usr/local/www/matomo/core/DataTable/Manager.php on line 98
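
For anyone triaging a similar log, it can help to see which site and period each failing request belongs to. The following is only a sketch: `failed_archives` is a made-up helper name, and it assumes the archiver output was saved to a file and that error lines look like the ones above.

```shell
# List the distinct idSite/period pairs that appear on archiver error lines.
# Reads the log on stdin and prints lines such as "idSite=23 period=year".
failed_archives() {
    grep 'Error:' \
        | grep -o 'idSite=[0-9]*&period=[a-z]*' \
        | tr '&' ' ' \
        | sort -u
}

# Usage (the log path is a placeholder, not a real Matomo default):
#   failed_archives < /path/to/archive.log
```

For the excerpt above this would report sites 23 and 25, both for the yearly period.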

The PHP memory limit is 8G, but the errors also appear (less often) with 12G.
The archiving command I'm running:
php72 console core:archive --force-all-websites --php-cli-options='-dmemory_limit=8G'
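
The byte count in the fatal errors corresponds to the `memory_limit` passed on the command line. As a quick sanity check, a small conversion helper can be sketched like this (`bytes_to_gib` is a hypothetical name, and the arithmetic assumes a shell with 64-bit integers):

```shell
# Convert a byte count from a PHP "Allowed memory size" message to whole GiB.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

bytes_to_gib 8589934592    # prints 8
```

This confirms that the 8G limit was actually in effect for the failing requests.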

Expected Behavior

Less RAM usage ;)

Your Environment

  • Matomo Version: 4.5.0
  • PHP Version: 7.2.34
  • Server Operating System: FreeBSD 12/13
  • Additionally installed plugins:
  • Browser:
  • Operating System:
@avkarenow added the Potential Bug label Dec 7, 2021
@bx80
Contributor

bx80 commented Dec 7, 2021

Hi @avkarenow, thanks for reporting this issue, we're always looking to reduce memory usage. Are you archiving a large number of websites at once?

Matomo 4.7.0 will include #18326, a community submission which provides extra options for the archive command to split up the archiving task by a number of websites or reports. This can be used to reduce the memory usage that can occur with one long running process.

@justinvelluppillai
Contributor

It would also be good to know whether you see the same issue in the latest version, currently 4.6.1.

@avkarenow
Contributor Author

Sorry for my late reply.
Unfortunately, the high RAM usage still occurs.
I'm now using Matomo 4.7.1.

@bx80 I'm archiving 80-400 websites at once. I run the process once per day.

@peterbo
Contributor

peterbo commented Apr 8, 2022

I'm experiencing a memory leak while archiving segments. I suspect a (premium) plugin is responsible, because I have other, much bigger instances that are not affected. However, I cannot tell which one is responsible.
Creating the yearly archive for 2022 peaks at more than 40G, while the whole Matomo instance has just 1.5M actions for 2022. Memory usage seems to be very high for every segment; some might need more than others, but I haven't found a pattern yet.

Matomo is Version 4.6.2
PHP-CLI: 8.0
Matomo instance traffic: 1.5M Actions this year

Active plugins: API, AbTesting 4.1.6, Actions, ActivityLog 4.0.5, Annotations, BulkTracking, Cohorts 4.0.5, Contents, CoreAdminHome, CoreConsole, CoreHome, CorePluginsAdmin, CoreUpdater, CoreVisualizations, CoreVue, CustomAlerts 4.0.4, CustomDimensions, CustomJsTracker, CustomReports 4.0.12, CustomVariables 4.0.1, DBStats, Dashboard, DeviceDetectorCache 4.2.3, DevicePlugins, DevicesDetection, Diagnostics, Ecommerce, Events, FormAnalytics 4.0.8, Funnels 4.0.10, GKVTrackingLocalPrevention 1.0, GeoIp2, Goals, Heartbeat, HeatmapSessionRecording 4.4.2, ImageGraph, Insights, Installation, Intl, IntranetMeasurable, InvalidateReports 4.0.1, LanguagesManager, Live, LogViewer 4.0.1, Login, MarketingCampaignsReporting 4.1.1, Marketplace, MediaAnalytics 4.0.17, MobileAppMeasurable, MobileMessaging, Monolog, Morpheus, MultiChannelConversionAttribution 4.0.5, MultiSites, Overlay, PagePerformance, PrivacyManager, Provider 4.0.3, Proxy, QueuedTracking 4.0.2, Referrers, Resolution, RollUpReporting 4.0.3, ScheduledReports, SearchEngineKeywordsPerformance 4.3.8, SegmentEditor, SitesManager, TagManager, Transitions, UserCountry, UserCountryMap, UserId, UserLanguage, UsersFlow 4.0.4, UsersManager, VisitFrequency, VisitTime, VisitorInterest, VisitsSummary, WebsiteMeasurable, Widgetize, WooCommerceAnalytics 4.0.5

[Attached screenshots: memory_leak, memory_leak1, memory_usage_archiver]

@sgiehl
Member

sgiehl commented Apr 11, 2022

@avkarenow Do you also have any premium plugins installed?

@avkarenow
Contributor Author

The only additional plugin I'm using is https://plugins.matomo.org/LoginTokenAuth

@Pflegusch

Did you solve this issue? I'm having the same problem right now. With Matomo 3.14.1, 16GB was enough. Now I'm running 4.10.1, and with less than 20% of the archiving completed it already consumes more than 32GB and aborts at the memory limit. I disabled the CustomDimensions plugin, but memory usage is still extreme: 32GB used with less than 33% of the archiving done. Is there anything one can do? I don't think it's normal to require this much memory.

@bx80
Contributor

bx80 commented Jun 29, 2022

Hi @Pflegusch, thanks for reaching out. Unfortunately this issue is still ongoing as it seems to be dependent on a number of factors and is proving difficult to recreate reliably. As a workaround you could try the max-archives-to-process option detailed here: https://matomo.org/faq/how-do-i-decrease-archiving-memory-usage-when-processing-a-large-number-of-websites/

@bx80 added this to the For Prioritization milestone Jun 29, 2022
@Pflegusch

Hi @bx80, thank you for the reply. I have now tried the --force-date-range parameter, but I'm not sure it does what I think it does. I created a script that calls core:archive for each of the last 5 years, like this:

/usr/local/bin/php /var/www/html/console core:archive --force-date-range 2018-01-01,2018-12-31 --matomo-domain=https://$MATOMO_URL > /var/tmp/piwik-ego-archive-2018.log
/usr/local/bin/php /var/www/html/console core:archive --force-date-range 2019-01-01,2019-12-31 --matomo-domain=https://$MATOMO_URL > /var/tmp/piwik-ego-archive-2019.log
/usr/local/bin/php /var/www/html/console core:archive --force-date-range 2020-01-01,2020-12-31 --matomo-domain=https://$MATOMO_URL > /var/tmp/piwik-ego-archive-2020.log
/usr/local/bin/php /var/www/html/console core:archive --force-date-range 2021-01-01,2021-12-31 --matomo-domain=https://$MATOMO_URL > /var/tmp/piwik-ego-archive-2021.log
/usr/local/bin/php /var/www/html/console core:archive --force-date-range 2022-01-01,2022-12-31 --matomo-domain=https://$MATOMO_URL > /var/tmp/piwik-ego-archive-2022.log

The memory consumption now is much, much smaller (around 350 MB), but is it really the correct way of doing so? The logs above look suspiciously the same and the process takes around an hour for all of them. I'm not sure if I did it correctly. Otherwise I would try to use the --max-archives-to-process argument.

@bx80
Contributor

bx80 commented Jul 6, 2022

Hi @Pflegusch,

Generally you wouldn't need to re-archive data from previous years unless a new segment has been added and you wanted to retrospectively calculate it or historic log data has been imported directly.

The usual process is to invalidate existing archives for a date range and then run ./console core:archive to recreate the archives. However this can result in processing a large amount of data and using a lot of memory. The --max-archives-to-process option will cause the archiver to only process a set number of archives each time it is run which prevents excessive memory usage. After several runs all the archives will be recreated.

To recreate all five years of archives using this approach you could run ./console core:invalidate-report-data --dates=2018-01-01,2022-12-31 and then modify your archive cronjob to add --max-archives-to-process=5000. The scheduled archiving task will then recreate some of the invalidated archives each time it runs until the backlog is cleared.
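
If you would rather clear the backlog in one sitting than wait for cron, the same idea can be sketched as a loop of bounded runs. This is an illustration, not an official script: `run_batches` and the `ARCHIVER` variable are made-up names, the console path is borrowed from the script earlier in this thread, and only the `--max-archives-to-process` option mentioned above is used.

```shell
# ARCHIVER is an assumption: the command that starts Matomo's archiver.
ARCHIVER="${ARCHIVER:-php /var/www/html/console core:archive}"

# Run the archiver `runs` times, each run limited to `batch` archives.
# Stops early if a run exits with a non-zero status.
run_batches() {
    runs="$1"
    batch="${2:-5000}"
    i=1
    while [ "$i" -le "$runs" ]; do
        # $ARCHIVER is left unquoted on purpose so it splits into command + arguments.
        $ARCHIVER --max-archives-to-process="$batch" || return 1
        i=$((i + 1))
    done
}

# Usage: run_batches 10 5000
```

Each run starts a fresh PHP process, so memory held by one batch is released before the next one begins.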

Perhaps a better option for you might be to modify your script to invalidate the date range bit by bit, running the archiver after each invalidation, something like:

/usr/local/bin/php /var/www/html/console core:invalidate-report-data --dates=2018-01-01,2018-06-30
/usr/local/bin/php /var/www/html/console core:archive
/usr/local/bin/php /var/www/html/console core:invalidate-report-data --dates=2018-07-01,2018-12-31
/usr/local/bin/php /var/www/html/console core:archive
...
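
The remaining half-year steps can be generated rather than written out by hand. The sketch below only prints the command pairs (a dry run); `half_year_ranges` is a hypothetical helper, and the PHP and console paths are copied from the script earlier in this thread.

```shell
# Print one "start,end" range per half year, from first year to last year inclusive.
half_year_ranges() {
    year="$1"
    while [ "$year" -le "$2" ]; do
        echo "$year-01-01,$year-06-30"
        echo "$year-07-01,$year-12-31"
        year=$((year + 1))
    done
}

# Dry run: print the invalidate/archive pairs instead of executing them.
half_year_ranges 2018 2022 | while read -r range; do
    echo "/usr/local/bin/php /var/www/html/console core:invalidate-report-data --dates=$range"
    echo "/usr/local/bin/php /var/www/html/console core:archive"
done
```

Once the printed commands look right, the output can be saved to a file and run with sh.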

I hope that's some help, once again this is a work-around until the high memory usage issue is tracked down and fixed 🙂

@Pflegusch

Hi @bx80, thank you for the help. I think I slightly misunderstood something and reread the documentation about the archiving and everything should be clear now. My current approach is using the --max-archives-to-process argument which works very well now :)

@avkarenow
Contributor Author

Hello,

I upgraded to the latest Matomo 4.11.0 and changed the PHP version from 7.2 to 7.4.
Unfortunately, the problem with high memory usage still appears on my systems.
I also tried --max-websites-to-process/--max-archives-to-process options but it didn't help (errors like Allowed memory size of x bytes exhausted).
So far --php-cli-options='-dmemory_limit=12G' works mostly fine.

Can changing to PHP 8.0/8.1 lower the RAM usage?

@bx80
Contributor

bx80 commented Oct 6, 2022

Hi @avkarenow,

I can't promise that upgrading to PHP 8.1 will fix your issue, but PHP 8.1 did improve a long-standing memory leak issue, https://bugs.php.net/bug.php?id=76982, which might be beneficial.

Matomo 4.11+ introduced a new feature to show goal attributions by page. This increases memory usage for archiving, which can be a problem for sites with a lot of visits and/or goals. Matomo 4.12 includes a new option to disable this additional archiving while we work on performance improvements.
You could try upgrading to Matomo 4.12 and adding disable_archive_actions_goals = 1 to the [General] section of your config.ini.php
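
Since the setting only takes effect in the right section, a quick check after editing the file can be useful. The helper below is a rough sketch: `ini_has_setting` is a made-up name, and it does simple line matching rather than full INI parsing.

```shell
# Succeeds if "key = value" appears inside [section] of the given INI file.
ini_has_setting() {
    file="$1"; section="$2"; key="$3"; value="$4"
    awk -v section="[$section]" -v key="$key" -v value="$value" '
        /^\[/ { in_section = ($0 == section) }   # track the current section header
        in_section {
            line = $0
            gsub(/[ \t"]/, "", line)             # ignore spaces, tabs and quotes
            if (line == key "=" value) found = 1
        }
        END { exit !found }
    ' "$file"
}

# Usage (the path is a placeholder for your own install):
#   ini_has_setting /path/to/matomo/config/config.ini.php General disable_archive_actions_goals 1 \
#       && echo "goal attribution archiving is disabled"
```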

@glatzenarsch

> Hi @bx80, thank you for the help. I think I slightly misunderstood something and reread the documentation about the archiving and everything should be clear now. My current approach is using the --max-archives-to-process argument which works very well now :)

@Pflegusch, which value do you use?

Thanks

@avkarenow
Contributor Author

Re @bx80
I updated the installation to 4.12.3 and am now using PHP 8.0. As you said, memory usage is much higher:

Fatal error: Allowed memory size of 19327352832 bytes exhausted (tried to allocate 20480 bytes)

Now the problem appears even with 18G, and with PHP 7.4 too.

@peterbo
Contributor

peterbo commented Oct 12, 2023

It's an old thread, but since I still experience this from time to time, I have another idea. Do you by chance have enable_segments_subquery_cache enabled?

@mattab
Member

mattab commented Dec 12, 2023

Hi everyone, FYI

We'll leave this issue open to keep gathering feedback.

Thanks!

@mattab added the c: Performance and Stability labels Dec 12, 2023

8 participants