I'm still wondering how this keeps happening. My January table, piwik_archive_blob_2017_01, went from 31.9 GB to 800 MB (12 M rows to 161k rows) after running
./console core:purge-old-archive-data all
I'm well aware of ticket #7181 (Report archives have tripled in size), but with a database that went from 63 GB to 28.1 GB in total, I'm looking for a long-term solution.
Piwik 2.17.1 (can't upgrade to 3.x for a few more months)
PHP 5.5.x, Apache 2.4.x, MySQL 5.5.x
All tests green in Piwik System Check.
Active non-Core plugins: Logviewer, PlatformReport, RestrictLaguageSelection, SecurityInfo and SimpleSysMon.
Here are my last 3 months, as of the last time I ran purge-old-archive-data:
| Table | Size before / after | Rows before / after |
|---|---|---|
| piwik_archive_blob_2017_02 | 1.9 GB / 417 MB | 501,726 / 379,490 |
| piwik_archive_blob_2017_01 | 31.9 GB / 800.5 MB | 12,179,895 / 161,802 |
| piwik_archive_blob_2016_12 | 171.8 MB / 85.6 MB | 82,220 / 14,695 |
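To put the row counts above in proportion, here is a small sketch that works out the share of rows removed from each table. The helper name `pct_removed` is made up for illustration; the numbers are the ones from the table.

```shell
#!/bin/sh
# Hypothetical helper: percentage of rows removed by the purge,
# given the row count before ($1) and after ($2).
pct_removed() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

pct_removed 501726 379490     # piwik_archive_blob_2017_02
pct_removed 12179895 161802   # piwik_archive_blob_2017_01
pct_removed 82220 14695       # piwik_archive_blob_2016_12
```

That comes to roughly 24% of rows removed for February, 99% for January and 82% for December.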
No errors in my PHP or Apache log nor in Piwik.
I think the results may be a bit bigger in January because the yearly archives are stored there (I think). Also, I don't know how much data PlatformsReport archives, but that's quite a difference. I haven't looked into #7181, but I'm wondering whether you have "browser archiving" enabled and how often your cron job runs?
Thanks for answering, highly appreciated.
About Platform Report: I checked, and I'm running the latest version available. But why would that report's data be "flushed" by purge-old-archive-data?
Browser archiving has been OFF since 2013 and the cron job runs every 15 minutes. We get around 150,000 actions in total every day across 10 different sites.
My DB is on a separate MySQL server (4 cores, 12 GB of RAM) and nothing else runs on that server.
Looking at the code, it seems there is already a daily scheduled task which should have the same effect as calling
The daily scheduled task is defined here: https://github.com/piwik/piwik/blob/3.0.1/plugins/CoreAdminHome/Tasks.php#L41-L43
@gaumondp when you check your core:archive output logs for 1 or 2 days, do you see this scheduled task purgeOutdatedArchives being executed?
Here is my current cron job:
*/15 * * * * /usr/bin/php /piwik/console core:archive --url=http://stats.site.com >> /logs/piwik-console-cron217-1.log
Looking at the last 2 days, I'm seeing an error message I hadn't noticed:
INFO [2017-02-21 14:45:01] Running Piwik 2.17.1 as Super User
INFO [2017-02-21 14:45:01] ---------------------------
INFO [2017-02-21 14:45:01] NOTES
INFO [2017-02-21 14:45:01] - Reports for today will be processed at most every 600 seconds. You can change this value in Piwik UI > Settings > General Settings.
INFO [2017-02-21 14:45:01] - Reports for the current week/month/year will be refreshed at most every 3600 seconds.
INFO [2017-02-21 14:45:01] - Archiving was last executed without error 72 days 3 hours ago
INFO [2017-02-21 14:45:01] - Will process 19 other websites because the last time they were archived was on a different day (in the website's timezone) , IDs: 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
INFO [2017-02-21 14:45:01] - Will process 1 other websites because some old data reports have been invalidated (eg. using the Log Import script) , IDs: 6
INFO [2017-02-21 14:45:01] ---------------------------
INFO [2017-02-21 14:45:01] START
In fact, I did erase all my piwik_archive_blob2016 and piwik_archive_numeric2016 tables, but the only website I have a problem with is siteId 6, and only for annual reports. Other reports are OK and consistent. The cron log shows 0 visits for the annual reports (There is no data for this report.):
INFO [2017-02-21 14:47:08] Archived website id = 6, period = year, 0 segments, 0 visits in last 7 years, 0 visits this year, Time elapsed: 51.443s
INFO [2017-02-21 14:47:08] Will pre-process for website id = 6, period = range, date = last7
INFO [2017-02-21 14:47:08] - pre-processing all visits
INFO [2017-02-21 14:47:09] Archived website id = 6, period = range, 0 segments, 56352 visits in last 7 ranges, 56352 visits this range, Time elapsed: 1.083s
INFO [2017-02-21 14:47:09] Archived website id = 6, 5 API requests, Time elapsed: 102.837s [4/19 done
And today my report tables are at 13 GB for February and 4 GB for January, so it looks like purgeOutdatedArchives never runs.
Am I supposed to see anything about purgeOutdatedArchives in my cron log? Remember, I'm still using 2.17.1.
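One way to check is to search the cron log for the task name. This is a minimal sketch, assuming the log file from the cron entry above and assuming that core:archive writes the task name to its output when the task runs (not confirmed for 2.17.1). The function name `count_task_runs` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the log lines that mention a scheduled task.
# Usage: count_task_runs LOGFILE [TASK_NAME]
count_task_runs() {
    logfile=$1
    task=${2:-purgeOutdatedArchives}
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so the status is masked to keep "0" a normal result.
    grep -c -F -- "$task" "$logfile" || true
}

# Example, with the log path taken from the cron entry above:
# count_task_runs /logs/piwik-console-cron217-1.log
```

A result of 0 over a couple of days of logs would suggest either that the task is not being run or that this version does not log it by name, so it is a hint rather than proof.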