Large blob size for 2020-01 #16211
@Dave-Oz As long as you do not delete any of the log data, all archives can be rebuilt at any time. Nevertheless …
@sgiehl We have … Also, the "Schedule old data deletion" option is set to every week in the settings. Does that mean Matomo was supposed to do that for January, but it didn't for some reason (a bug in an older version)? Also, we have a cron task that runs every couple of minutes with the command `./console core:archive`. I'm just trying to understand what the difference in that table is before and after running `./console core:purge-old-archive-data january`. Which table stores the log data archives?
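(For reference, one quick way to see how large each archive table is before and after the purge is a query against `information_schema`. A minimal sketch, assuming the default `piwik_` table prefix and a database named `matomo`; adjust both, and the credentials, to your setup:)

```
# Minimal sketch: list Matomo archive table sizes, largest first.
# The `piwik_` prefix and the `matomo` database name are assumptions.
mysql -u root -p -e "
  SELECT table_name,
         ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
  FROM information_schema.tables
  WHERE table_schema = 'matomo'
    AND table_name LIKE 'piwik_archive_%'
  ORDER BY (data_length + index_length) DESC;
"
```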
If you archive data every few minutes, then over time there will be a lot of outdated reports in the DB, some of which get deleted daily, some weekly, some monthly. If the size is an issue, then I recommend running the archive command only every hour, for example. The …
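(A sketch of what an hourly archiver cron entry could look like; the install path, PHP binary, log file, and `--url` value are placeholders, not taken from this thread:)

```
# Hypothetical crontab entry: run the archiver once per hour instead of
# every few minutes. Path, PHP binary, and --url value are placeholders.
5 * * * * /usr/bin/php /var/www/matomo/console core:archive --url=https://matomo.example.org/ >> /var/log/matomo-archive.log 2>&1
```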
Yes, there was a bug in an older version which we fixed in an update, and the update should have triggered the cleanup to run eventually (but it might take a while until the cleanup task is executed the next time).
@tsteur Our only issue is the size for January 2020. All the other months are below 30-40 MB each, except January 2020, which is over 4 GB.
My only worry is whether this breaks annual reports, since they are stored in January 2020.
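(One way to check that the annual archives survive a purge is to count the year-period rows in the January tables before and after; in Matomo's schema, `period = 4` marks year archives. A sketch, again assuming the `piwik_` prefix and a `matomo` database:)

```
# Count year-period archive rows (period = 4 is "year") in the
# January 2020 archive tables. Prefix/database names are assumptions.
mysql -u root -p -e "
  SELECT 'numeric' AS tbl, COUNT(*) AS year_rows
  FROM matomo.piwik_archive_numeric_2020_01 WHERE period = 4
  UNION ALL
  SELECT 'blob', COUNT(*)
  FROM matomo.piwik_archive_blob_2020_01 WHERE period = 4;
"
```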
@tsteur Also, which cleanup task is executed? Can I do that manually? Is it equivalent to `./console core:purge-old-archive-data january`?
`./console core:purge-old-archive-data january` should do 👍 it will only delete unneeded data. You could also try …
Then it does the current month and the 2020_01 table.
@tsteur Hi, …
You could execute this task as a cron, or run the archiver less often. @Dave-Oz Running it every few minutes is quite often.
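(For the cron option, a sketch of a weekly cleanup entry; the schedule, install path, and log file are placeholders:)

```
# Hypothetical crontab entry: run the purge task once a week, Sunday 03:00.
# The install path and log file are placeholders.
0 3 * * 0 /usr/bin/php /var/www/matomo/console core:purge-old-archive-data >> /var/log/matomo-purge.log 2>&1
```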
@unkn0wn-developer I see you removed your comment; maybe it's all good?
It's all good for now. I'm monitoring the usage every week or so. I wasn't using the `ls` command in MB or GB mode; I thought I saw that the backup size was 5 GB, but it was 500 MB. I counted the digits wrong in bytes... It was a Friday evening at work, so my brain was pretty much fried at the end of the day. 😅 Thanks for checking up on me, though.
Hi,
I'm using Matomo 3.13.5 on production and 3.13.6 on development.
For production, our data jumped from 32 MB (December 2019) to 4.4 GB (January 2020), then fell back to 33 MB (February 2020).
(January 2019 was also 30-40 MB.)
That 4.4 GB of data from January eventually (it took a couple of months) broke our backup cycle: our data partition ran out of space, and our apps and the database crashed.
We were able to bring our database up again; however, I would like to fix this.
I copied our production data to my dev environment to test
`./console core:purge-old-archive-data january`
and see whether it works. After I ran that command, it took 20-30 minutes to complete, and the
`piwik_archive_blob_2020_01`
database table went from 4.4 GB to 16 MB. I wonder whether that command deleted any important information. I read in the documentation that Matomo stores annual statistics data in January of every year. I don't want to lose any important data if I run this command on production.
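(For anyone repeating this test, the flow described above could look roughly like the sketch below; all database names and paths are placeholders. Table sizes can be compared with the `information_schema` query shown earlier in the thread.)

```
# Sketch of the dev test flow: copy production data, run the purge,
# then compare the blob table size. All names/paths are placeholders.
mysqldump -u root -p matomo_prod > matomo_prod.sql
mysql -u root -p matomo_dev < matomo_prod.sql
cd /var/www/matomo-dev
./console core:purge-old-archive-data january
```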
I've seen a lot of other issues opened about the large blob size problem.
I just wanted to confirm whether it is safe to use
`./console core:purge-old-archive-data january`
for Matomo. What does it delete?
Thanks
@tsteur, @sgiehl