@James-Oakley opened this Issue on April 19th 2020

This issue is similar to #10439, but that is reported as fixed as of 3.13.0-b1, and I'm running the latest 3.13.4 which postdates that. So I'm opening this here, rather than re-opening that, assuming it's not exactly the same. If someone wants to close this as a duplicate I'll happily re-open that issue instead.

I began getting alerts that my total database size was growing too big, and I was getting mysqldump backup failures. The table that keeps failing to dump is the 2020-01 archive blob table.

Following the comments in the issue I've just linked, I tried running ./console core:purge-old-archive-data all. The total database size shrank by about 1/3, but that 2020-01 blob table remains as large as ever.
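
For anyone wanting to check their own install, a query along these lines lists the archive blob tables by size, so the effect of the purge can be verified. It's just a sketch: it assumes the Matomo database is named matomo, uses the default matomo_ table prefix, and that the console is run from the Matomo root, so adjust the database name, user and prefix to match your installation.

```
# Sketch: list the archive blob tables by size (largest first).
# Assumption: database "matomo" with the default "matomo_" table prefix.
mysql -u matomo_user -p matomo -e "
  SELECT table_name,
         table_rows,  -- InnoDB row counts here are only estimates
         ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mib
  FROM information_schema.TABLES
  WHERE table_schema = 'matomo'
    AND table_name LIKE 'matomo_archive_blob_%'
  ORDER BY (data_length + index_length) DESC;"

# Then purge and re-run the query above to compare:
./console core:purge-old-archive-data all
```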

So I looked at a backup I had managed to take of the database from 3 days ago.

  • 3 days ago, the table had 53,691 rows and was 162.1 MiB in size.
  • As of now, even having run purge-old-archive-data, the table has grown to 116,055 rows and 346.2 MiB.

So the table has doubled in size in 3 days. I don't know how the archiving works, but I'd expect the archive of January 2020 to be stable now. Fresh visits should mean the April 2020 table continues to grow, but surely the data within the January table should not be changing.

Weirdly, March 2020 is also big (59 MiB, but it hasn't got any larger compared to 3 days ago; presumably there was simply more traffic that month); April is bigger (already 201.4 MiB, grown from 56.1 MiB 3 days ago). However, February is just 10 MiB and isn't growing. So only certain months seem to be affected: January and April, but not February or March.

So this is part bug report and part support request.

  • Bug report: The database is growing, and fast, and the inability to take backups is a serious operational problem.
  • Support request: How do I get the bloated tables back down to the size they need to be? Probably even 162.1 MiB is far larger than necessary (a typical month's archive blob table is under 10 MiB), and I know for a fact that 346 MiB is too big, because the table was half that size 3 days ago. So what command do I run to remove the extraneous data from the table, without losing any ability to analyse visits to the websites being tracked?
@sgiehl commented on April 19th 2020 Member

Hi @James-Oakley. Thanks for creating the issue. We are already aware of that problem and it should be fixed with https://github.com/matomo-org/matomo/pull/15800, which will be included in a new release coming in the next few days.

@James-Oakley commented on April 19th 2020

Thanks, @sgiehl. Will #15800 shrink old archive tables back to their proper size as well as ensuring they don't grow any further?

@sgiehl commented on April 19th 2020 Member

Yes it should. If it doesn't work with the next release, please create a new issue for it.

@fvdm commented on April 28th 2020

You can force the cleanup with console database:optimize-archive-tables 2020-04.

For me the updater didn't shrink the archives, at least not right after the update. Given that the >4 GB table size was messing up a lot of things, including making the core:archive cron jobs stack up, I didn't want to wait for the scheduler.
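
Putting the two commands together, this is roughly the sequence that worked here. The month arguments are just the ones mentioned in this thread; substitute whichever archive tables are bloated on your install, and run it from the Matomo root directory.

```
# Purge stale archive rows, then optimize the affected archive tables so the
# freed space can be reclaimed. Months below are the ones from this thread.
./console core:purge-old-archive-data all
./console database:optimize-archive-tables 2020-01
./console database:optimize-archive-tables 2020-04
```

The purge removes the stale rows; the optimize step is typically what hands the space back to the filesystem, so the reported table size may not drop until it has run.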

@James-Oakley commented on April 28th 2020

I ran ./console core:purge-old-archive-data all right after the upgrade, and the tables shrank right back to where they should be (1.5 GB -> 350 MB).

@fvdm commented on April 28th 2020

@James-Oakley oops I forgot a word in my sentence: 'it' referred to the updater. The optimizer worked fine.

This Issue was closed on April 19th 2020