@parruc opened this Issue on April 19th 2021 Contributor

Expected Behavior

Archive cron does not get stuck on Applying queued rearchiving...

Current Behavior

Archive cron gets stuck during "Applying queued rearchiving..." doing queries like and writing nothing else.

SELECT option_name, option_value FROMpiwik_optionWHERE option_name LIKE '%report_to_invalidate_733_2021-01-29%"

up to a point where it goes out of memory (we have 28GB of memory).

Context

We have a very big matomo installation with more than 20K sites tracked.
We recently updated matomo from 3.12 to 4.2.1

Your Environment

  • Matomo Version: 4.2.1
  • PHP Version: 7.4
  • Server Operating System: Ubuntu
@parruc commented on April 19th 2021 Contributor

a couple of screenshot from htop and mysql show processlist minutes before out of memory:

Screenshot from 2021-04-19 14-23-45
The second archiving process is a "normal" process started while commenting out
#$this->invalidator->applyScheduledReArchiving();
to have fresh statistics while solving the issue
The first archiving process is the memory hog doing the following query

Screenshot from 2021-04-19 14-23-56

@diosmosis commented on April 20th 2021 Member

Hi @parruc, can you run SELECT COUNT(*) FROM piwik_option WHERE option_name LIKE '%report_to_invalidate_733_2021-01-29%' and see how many items it's trying to select?

@parruc commented on April 20th 2021 Contributor

Thanks for the quick response @diosmosis .

Sure thing, with that id/date combo I have none but for example:

SELECT COUNT(*) FROM piwik_option WHERE option_name LIKE '%report_to_invalidate_14006_2021%'; 6

Some other numbers that may be interesting:
SELECT COUNT(*) FROM piwik_option WHERE option_name LIKE '%report_to_invalidate_%'; 18964

SELECT COUNT(*) FROM piwik_option; 203705

SELECT COUNT(*) FROM piwik_archive_invalidations; 3927881

@diosmosis commented on April 20th 2021 Member

A couple more questions @parruc:

  • How many websites exactly are in this matomo? How many of them receive traffic? And do you track data in the past for any of them?
  • How many segments do you have in your matomo? Do you use any premium plugins and have you activated any of them recently?
  • Can you try the following query: SELECT LENGTH(option_value) WHERE option_name = 'ReArchiveList'? And if it's not too large, can you post the contents?
@parruc commented on April 20th 2021 Contributor

Sure @diosmosis :

To answer tour question our installation has 20030 users, 18 segments (8 pre-processed, 10 processed in real-time) and 19876 websites of which I would say about 70% receive daily traffic (it's a really rough estimate)

Plugins (excluded the ones from core and maotmo) are:
CustomAlerts 4.0.1 (https://github.com/matomo-org/plugin-CustomAlerts)
InvalidateReports 4.0.1 (https://github.com/innocraft/plugin-InvalidateReports)
Migration 4.0.2 (https://github.com/matomo-org/plugin-Migration)
WebsiteGroups 0.3.0 (DISABLED)
plus some custom plugins that do not mess with DB and only personalize the dashboard and some widget

SELECT LENGTH(option_value) FROM piwik_option WHERE option_name = 'ReArchiveList'; gives me 17. The full option_values is:

a:1:{i:0;s:99:"{"idSites":"all","pluginName":"VisitFrequency","report":null,"startDate":1606089600,"segment":null}";}

@diosmosis commented on April 20th 2021 Member

Hi @parruc, one thing you can do right now is DELETE the ReArchiveList option. The entry there was to fix in issue in 4.0, but since you're updating from 3.x, it's not required.

How recently did you update to 4.x? If it was relatively recent, you can likely TRUNCATE the archive_invalidations table. This should probably solve the issue for you.

@parruc commented on April 26th 2021 Contributor

Hi @diosmosis , Your suggestion worked and now the archive process does not get stuck anymore (I uncommented the incriminated row) so thank you a lot. I still see the archiver memory problem: when it runs for hours the memory consumption skyrockets, but it is probability another thing so I will open another issue if it keeps doing it ;-)

Thanks again

This Issue was closed on April 26th 2021
Powered by GitHub Issue Mirror