@WebNashr opened this Issue on August 25th 2016

My total database size was 130 MB but after the update it reached more than 1GB overnight. Changing the "Delete old archived reports" settings and "Purge DB now" didn't do a thing.

Due to database size limitations of my host provider I deleted the piwik_archive_blob and piwik_archive_numeric tables except for the last and current months, using phpMyAdmin. The total size of database became 180 MB, more than before the update, but acceptable.
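
For readers who want to see which monthly archive tables fall outside "the last and current months" before touching anything, here is a minimal sketch. It only lists table names and deletes nothing. The `piwik_` prefix, the helper name `tables_older_than_last_month`, and the sample table list are assumptions for illustration, not part of Piwik.

```python
import re
from datetime import date

# Monthly archive tables are assumed to be named like piwik_archive_blob_2016_08.
ARCHIVE_TABLE = re.compile(r"^piwik_archive_(?:blob|numeric)_(\d{4})_(\d{2})$")


def tables_older_than_last_month(table_names, today):
    """Return archive tables whose month is before the previous calendar month."""
    # Months are compared as a single index: year * 12 + (month - 1).
    # The cutoff is the index of the previous month; anything below it is "older".
    cutoff = today.year * 12 + (today.month - 1) - 1
    older = []
    for name in table_names:
        match = ARCHIVE_TABLE.match(name)
        if match is None:
            continue  # not a monthly archive table, e.g. a log table
        year, month = int(match.group(1)), int(match.group(2))
        if year * 12 + (month - 1) < cutoff:
            older.append(name)
    return sorted(older)


# Hypothetical table list, as it might appear in phpMyAdmin.
tables = [
    "piwik_archive_blob_2016_06",
    "piwik_archive_numeric_2016_06",
    "piwik_archive_blob_2016_07",
    "piwik_archive_blob_2016_08",
    "piwik_log_visit",
]
print(tables_older_than_last_month(tables, date(2016, 8, 25)))
```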

Now a day later, those deleted tables are back and are gaining size! and my database size has reached 1 GB.

[Screenshot: Piwik administration, database usage]

Running "console core:purge-old-archive-data" and "console core:run-scheduled-tasks" didn't help at all.

@mattab commented on September 27th 2016 Member

Hi @WebNashr
Are you using the latest Piwik version? It may also be normal behavior to have between 4 and 20 MB of archived data for each month.

We fixed several issues related to purging old data in https://github.com/piwik/piwik/issues/7181 - it's possible there is still some issue left, but in your screenshot it is not obvious that there is an issue, as it could be normal to have such disk space used.

How many visits/actions do you get per month?

@WebNashr commented on September 27th 2016

Hi
I'm using the latest version (2.16.2)

My total database size was 130 MB but after the update it reached more than 1GB overnight

So it's not normal for me to have a database this big.

I deleted the piwik_archive_blob and piwik_archive_numeric tables except for the last and current months, using phpMyAdmin. The total size of database became 180 MB, more than before the update, but acceptable. Now a day later, those deleted tables are back and are gaining size! and my database size has reached 1 GB.

In the picture I marked the deleted tables that are back and growing. Isn't it obvious that something is wrong?

@mattab commented on September 27th 2016 Member

Now a day later, those deleted tables are back and are gaining size!

The tables are back because that's where piwik stores the report data. It's normal and needed for these tables to be there.

How many visits/actions do you get per month?

@WebNashr commented on September 28th 2016

10,000 visits and 27,000 pageviews for 2000 sub-sites, but as I mentioned above, my regular database size before the update was 120M.
As for the tables, I had deleted the ones belonging to past months, and I have set Piwik to delete old archived reports.
In the "schedule old data deletion" setting I see "estimated database size after purge: 575.8 M", but when I press "purge db now" the db size does not change at all.

@mattab commented on May 29th 2019 Member

This is affecting a few users (already 3-4 duplicated tickets)... and customers too. And the fact that the DB is way bigger than normal will cause all sorts of issues (slow backups, increased costs...).

-> Maybe the solution to this issue would be to run the console command ./console core:purge-old-archive-data all every week or every month as a scheduled task in Matomo?

In theory it's not needed, because there is already a daily scheduled task with a similar effect to calling core:purge-old-archive-data. That task is defined here: https://github.com/piwik/piwik/blob/3.0.1/plugins/CoreAdminHome/Tasks.php#L41-L43
But the console command ./console core:purge-old-archive-data all does more work and so maybe we should also call this command in a scheduled task?
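
As a rough sketch of what running this command on a schedule from outside Matomo could look like, the wrapper below builds the exact command quoted above and runs it with `subprocess`. The function names, the default `./console` path, and the idea of calling it from a weekly cron job are assumptions for illustration; this is not how Matomo schedules its own tasks.

```python
import subprocess


def purge_command(console_path="./console"):
    """Build the purge command from this thread as an argument list."""
    # console_path is an assumption: point it at the console script of your install.
    return [console_path, "core:purge-old-archive-data", "all"]


def run_purge(console_path="./console"):
    """Run the purge command and return (exit code, combined output)."""
    result = subprocess.run(
        purge_command(console_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

A weekly cron entry could then call a small script that invokes `run_purge()` and logs the returned output, which keeps the purge separate from the regular scheduled tasks and easy to switch off.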

@tsteur commented on May 29th 2019 Member

But the console command ./console core:purge-old-archive-data all does more work and so maybe we should also call this command in a scheduled task?

If so, it would be good if this were a setting so we can disable it.

@mattab commented on November 7th 2019 Member

People keep reporting this issue, recently by email, and now in the forums at https://forum.matomo.org/t/size-of-piwik-archive-blob-files-is-astronomical/32048/2

My current answer is:

Can you please try to run the console command:
./console core:purge-old-archive-data all
and confirm if this helps?

Once people confirm this solves the problem, we could run this in Matomo (with a setting to disable it).

@tsteur commented on November 8th 2019 Member

@mattab could we please move this out of 3.13.0?

We already run pretty much everything in tasks, and I'm not seeing much that the command does beyond that. It would be only a minor improvement, and it would be good not to change anything for now.

does more work and so maybe we should also call this command in a scheduled task?

Not sure what you specifically refer to here.

@mattab commented on November 11th 2019 Member

Someone tried the command but it didn't help them:

we ran the command yesterday in the morning, but there was no significant effect. The file became bigger, but the increase was not as high as before.
To be honest, we expected the file to decrease in size significantly.
Before running ./console core:purge-old-archive-data all
41G Nov 6 09:52 piwik_archive_blob_2019_01.ibd
After running ./console core:purge-old-archive-data all
43G Nov 7 16:22 piwik_archive_blob_2019_01.ibd

Did the command help anyone at all? Maybe it doesn't help and the issue is somewhere else.

Also it seems to be a regression in 3.12.0 as we get quite a few new reports of people who specifically didn't have the issue before.

@tsteur commented on November 11th 2019 Member

It might be expected if some users have a date range as the default period to load.

Otherwise, it might be caused by https://github.com/matomo-org/matomo/issues/15086

@tsteur commented on November 12th 2019 Member

FYI: In some of our users' DBs we see a lot of archives in calendar week 44, many of them with the invalidated flag (value=4, versus value=1 for an OK archive).


Up to 3.11 we used to have temporary archives which were deleted daily. In 3.12 we removed them and only have invalidated and done/OK archives. We still run logic daily in the purgeInvalidatedArchives task to delete archives that are no longer needed, but I suppose this logic might not fully work anymore and needs to be updated. Not sure if there's maybe an issue with ArchivesToPurgeDistributedList.
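
To check how many archives in a given month carry the invalidated flag, one could count the flag rows by value. The sketch below demonstrates the idea against an in-memory SQLite table so it runs anywhere. It assumes that the flag is stored as rows whose `name` starts with `done` in the numeric archive table, and it only labels the two values mentioned above (1 and 4). The table layout, sample rows, and function name are illustrative assumptions, not Matomo's actual schema or API.

```python
import sqlite3

# Only the two flag values mentioned in this thread are labeled.
FLAG_LABELS = {1: "ok", 4: "invalidated"}


def count_done_flags(conn, table):
    """Count archive flag rows per flag value in one archive table."""
    # The table name is interpolated into the SQL, so pass only trusted names.
    rows = conn.execute(
        f"SELECT value, COUNT(*) FROM {table} "
        "WHERE name LIKE 'done%' GROUP BY value"
    ).fetchall()
    counts = {}
    for value, count in rows:
        label = FLAG_LABELS.get(int(value), f"other({int(value)})")
        counts[label] = count
    return counts


# Stand-in for a real archive table (simplified, assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE piwik_archive_numeric_2019_11 "
    "(idarchive INTEGER, name TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO piwik_archive_numeric_2019_11 VALUES (?, ?, ?)",
    [
        (1, "done", 1),         # OK archive
        (2, "done", 4),         # invalidated archive
        (3, "done", 4),         # invalidated archive
        (3, "nb_visits", 120),  # ordinary metric row, ignored by the count
    ],
)
print(count_done_flags(conn, "piwik_archive_numeric_2019_11"))
```

A large share of invalidated rows compared with OK rows in one month would match the pattern described above.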

@osantiano commented on November 12th 2019

We also noticed a massive rise in CPU, load and network usage on our DB server(s) after the 3.11 to 3.12 migration (along with the heavy disk usage mentioned here).

@tobiasnteireho commented on November 13th 2019

I noticed that the title has changed to highlight the 01_2019 blob. Although piwik_archive_blob_2019_01 is the largest blob for us, piwik_archive_blob_2019_11 is also far larger than expected at 4.7 GB, with the next largest blob being 210 MB.
