@rick-pri opened this Issue on August 11th 2020

PHP 7.4, Matomo 3.13.5 (currently the latest version on FreeBSD in the quarterly releases), MySQLi DB adapter.

I got an out-of-memory error with a 2GB PHP memory limit when trying to run php ./console core:archive --force-all-websites --force-all-periods=315576000 --force-date-last-n=1000 --url=<domain> after importing a few days of tracking from server logs (we moved server infrastructure between cloud hosting providers). For an idea of what I need to process: in June we had 14,185,886 visits, 67,764,365 pageviews and 93,096,415 actions.

However, I've been watching the memory usage of just the archiver creep up from 1.7% to 3% of 7.21GB of RAM over the last 30-45 minutes. I've increased the limit to 3GB now, but the same problem is going to recur, just at 3GB, and long before I get through all the archiving that needs to be done. I think this is also why the normal archiving is having issues: judging by the logs, it takes over 24 hours and never actually completes. I have two app servers and am running the archiving on one of them.

The DB is not really all that utilised: 16 cores, 55/104GB RAM, and disk I/O throughput is pretty low, under 10MB/s read most of the time and less write.

@tsteur commented on August 13th 2020 Member

Thanks for this @rick-pri

Can you maybe increase the memory further and see how much it needs? That it requires more than 2GB may not be expected, but it's hard to say. The memory often doesn't actually depend on the amount of traffic/visits/pageviews but e.g. on the number of different page URLs, page titles, referrers, etc. being tracked.
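To illustrate the point above, here is a toy sketch (in Python, not Matomo's actual PHP code) of why archiver memory tracks the number of *distinct* labels rather than the raw visit count: a report table holds one row per distinct page URL, so a million visits to one page is still one row.

```python
def report_size(rows):
    """Build an in-memory report table keyed by label, roughly as archivers do."""
    table = {}
    for label in rows:
        table[label] = table.get(label, 0) + 1
    return len(table)

# Same number of "visits", very different numbers of distinct page URLs:
few_urls = ["/home"] * 100_000                     # 100k visits, 1 distinct URL
many_urls = [f"/doc/{i}" for i in range(100_000)]  # 100k visits, 100k distinct URLs

print(report_size(few_urls))   # 1 row in the report table
print(report_size(many_urls))  # 100000 rows in the report table
```

A site uploading and serving over a million distinct documents a month would therefore produce very large page/download report tables even with modest traffic.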

@rick-pri commented on August 13th 2020

I'm also encountering the issue with 3GB of RAM as the limit (it goes for longer but then still results in the same error). The saving grace is that each time I get a little further through my list of 5k domains, so that now I'm only about 400 from completing. I ran two side by side last night and thought I had enough RAM for them to exit at the 3GB limit; however, they died after the server ran out of swap.

Edit: Not that it affects this, as the issue still exists, but while looking into this on Tuesday I realised that the DB RAM configuration was wrong, so I rectified that then. This archiving is a pain though: it takes 60s per site for the older sites, but is now down to about 30s for the newer websites (whose data doesn't go back as far). This is even after getting our retention policy changed so that we now only retain 2 years of archived data rather than 8 years' worth.

@tsteur commented on August 13th 2020 Member

Do you know if some of the sites have e.g. a lot of different page URLs, page titles, download URLs or referrer URLs? That might explain the memory. Are you saying it archives 2 years of data in 60s? That'd be pretty fast.

@rick-pri commented on August 14th 2020

Do you know if some of the sites have e.g. a lot of different page URLs, page titles, download URLs or referrer URLs? That might explain the memory.

I would guess that there are quite a number of pages in some of these websites (the platform is a CMS which these are tracking), although how many pages actually get visits and downloads is another matter. Covid has meant quite a large number of documents uploaded each month (over 1m per month across all the sites), and those documents are also being downloaded a lot (we work in the education sector, where home schooling has changed our usage patterns).

I kicked it off again just to see:

INFO [2020-08-14 08:14:44] 59411  Archived website id = 1, period = day, 0 segments, 8062660 visits in last 3660 days, 74 visits today, Time elapsed: 64.630s
INFO [2020-08-14 08:14:45] 59411  Will pre-process for website id = 1, period = week, date = last520
INFO [2020-08-14 08:14:45] 59411  - pre-processing all visits
INFO [2020-08-14 08:15:00] 59411  Archived website id = 1, period = week, 0 segments, 0 visits in last 3660 weeks, 0 visits this week, Time elapsed: 16.180s
INFO [2020-08-14 08:15:00] 59411  Will pre-process for website id = 1, period = month, date = last120
INFO [2020-08-14 08:15:00] 59411  - pre-processing all visits
INFO [2020-08-14 08:15:04] 59411  Archived website id = 1, period = month, 0 segments, 0 visits in last 3660 months, 0 visits this month, Time elapsed: 4.490s
INFO [2020-08-14 08:15:05] 59411  Will pre-process for website id = 1, period = year, date = last10
INFO [2020-08-14 08:15:05] 59411  - pre-processing all visits
INFO [2020-08-14 08:15:06] 59411  Archived website id = 1, period = year, 0 segments, 0 visits in last 3660 years, 0 visits this year, Time elapsed: 1.338s
INFO [2020-08-14 08:15:06] 59411  Archived website id = 1, 4 API requests, Time elapsed: 87.178s [1/5070 done]

This particular site will have an inordinately larger number of page titles, page URLs and referrer URLs because it's the logged-in editing part of the website (for historical reasons our users log in and are redirected to a subdomain for accessing the CMS editing system, which is site id 1 in Piwik).

Is there anything to be done about the memory leakage, though? It really should clean up after itself after every run. My guess is that garbage collection is either not happening fast enough or just not happening at all for some of these objects?

Are you saying it archives 2 years of data in 60s? That'd be pretty fast.

Hmm, we only keep 30 days of unarchived data, so it must be the reprocessing of the archives or something. That's per website, which means this takes days to run; actually longer, because the process runs out of memory and then I have to restart it.
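A back-of-envelope check, using the figures quoted in this thread (5,070 sites at roughly 30-60s each; the averaging is my assumption), shows why a full sequential pass takes days:

```python
# Rough full-pass estimate from the per-site timings mentioned above.
sites = 5070
avg_seconds_per_site = 45  # assumed midpoint of the ~30s and ~60s quoted
total_hours = sites * avg_seconds_per_site / 3600
print(round(total_hours, 1))  # ~63.4 hours, i.e. over 2.5 days, before any restarts
```

Each out-of-memory crash adds to that, since the run has to be restarted and re-walks part of the site list.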

@tsteur commented on August 16th 2020 Member

quite a large amount of documents uploaded each month (over 1m per month across all the sites)

This might explain the high memory usage if a lot of them are actually being downloaded or viewed.

It's likely not an issue with garbage collection or anything, but just a problem with how Matomo calculates weekly/monthly and yearly data. E.g. to calculate the yearly data it loads all 12 monthly reports and then aggregates them, instead of loading two reports, aggregating them with each other, then loading the next report and aggregating that one, etc. That way there would be at most 2 report tables in memory instead of 12. It's something we've looked into before, but it's not easy to change. I would assume it's that.
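The two aggregation strategies described above can be sketched like this (a simplified Python illustration, not Matomo's actual code; load_month is a hypothetical stand-in for fetching one monthly report table):

```python
from collections import Counter

def load_month(m):
    # Stand-in for loading one month's report table from the archive tables.
    return Counter({f"/page-{i}": i + m for i in range(3)})

# All-at-once: holds all 12 monthly tables in memory before aggregating.
all_tables = [load_month(m) for m in range(12)]
year_all = Counter()
for t in all_tables:
    year_all.update(t)

# Incremental: at most 2 tables (running total + current month) live at once.
year_incr = Counter()
for m in range(12):
    year_incr.update(load_month(m))  # previous month's table becomes garbage

assert year_all == year_incr  # same result, ~1/6 of the peak table count
```

Both produce identical yearly totals; the difference is only peak memory, which is why large per-month report tables (many distinct URLs) hurt so much at the year level.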

If it's not that, it's hard to say without full access to the system, profiling tools installed, etc. E.g. in the past we saw sites with hundreds of thousands of different referrers, where the query caused a lot of memory usage on the DB and returned very large result sets. Finding the leak here would very likely only work with profiling tools, as it likely depends on the specific data that was tracked.

@rick-pri commented on August 17th 2020

Okay, I'll close this off now.

This Issue was closed on August 17th 2020