@toredash opened this Issue on April 27th 2020 Contributor

Hi,

There seems to be a race condition when using multiple archiver jobs on the same node.

https://github.com/matomo-org/matomo/blob/647ac56ac210ef2125e13c1de4419aacacaa22e9/core/Filesystem.php#L426-L432

Notice that the file is checked for existence before the filesize is checked.
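
To illustrate why check-then-read is fragile: another process can delete the file after the existence check passes but before the size is read. Below is a minimal bash sketch of the alternative, which skips the separate existence check and treats a failed read as "file gone". The helper name `safe_filesize` is hypothetical and this is not Matomo's code, only an illustration of the pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Matomo): print the size of a file in bytes,
# or 0 if the file is missing or disappears before it can be read.
safe_filesize() {
  local path=$1
  local size
  # No separate existence check: attempt the read directly and treat any
  # failure as "the file is gone". The braces ensure the shell's own
  # "No such file" message from the failed redirection is suppressed too.
  if size=$( { wc -c < "$path"; } 2>/dev/null ); then
    # Some wc implementations pad the count with spaces; strip them.
    echo "${size//[[:space:]]/}"
  else
    echo 0
  fi
}
```

With this shape, a file removed by a concurrent process yields a size of 0 rather than a warning.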

Two archive jobs are started from a bash script like this:

CONCURRENT_ARCHIVERS=2
for i in $(seq 1 $CONCURRENT_ARCHIVERS)
do
  (/var/www/console core:archive -vvv --concurrent-archivers=$CONCURRENT_ARCHIVERS) &
  pids+=($!)
done

A few times a day, I get this error message for one of the two started processes:

WARNING [2020-04-27 03:39:26] 2654 /var/www/core/Filesystem.php(430): Warning - filesize(): stat failed for /var/www/tmp/climulti/archive.sharedsiteids.pid - Matomo 3.13.4

I'm not able to forcefully reproduce it, but we know it happens a few times a day because we get a notification whenever an archive process exits with a non-zero code.

We only see this in our test environment, which has many sites but no new stats added to the sites. Run time for each archive job is <3s.

I looked through the code, and the only thing that I could spot as a potential source is this:
https://github.com/matomo-org/matomo/blob/115527353a9e75e01aa4d263408956ae45403bea/core/CronArchive/SharedSiteIds.php#L119-L150

Is there something in there that could cause issues when multiple archivers each complete in less than 5s?

I don't have a suggestion for a fix right now. Attached is the output from the archive processes that ran; one of them (b) gives a WARNING:
a.log
b.log

@tsteur commented on April 27th 2020 Member

@toredash for concurrent archivers I always recommend adding a sleep and not starting them at the very same time. E.g. see https://github.com/matomo-org/matomo/issues/15267#issuecomment-565650904

E.g. something like:

CONCURRENT_ARCHIVERS=2
for i in $(seq 1 $CONCURRENT_ARCHIVERS)
do
  (sleep $i && /var/www/console core:archive -vvv --concurrent-archivers=$CONCURRENT_ARCHIVERS) &
  pids+=($!)
done

I know it is more of a workaround, but it can potentially prevent various issues.
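
For anyone who also wants to be notified when an archiver exits with a non-zero code, the staggered start can be wrapped in a function that waits for every worker. This is only a sketch: `run_staggered` and its arguments are hypothetical names, and the delay is assumed to be a whole number of seconds.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: start COUNT copies of a command, each delayed a bit
# longer than the previous one, then wait for all of them.
# Usage: run_staggered COUNT DELAY_SECONDS COMMAND [ARGS...]
run_staggered() {
  local count=$1 delay=$2
  shift 2
  local pids=() failed=0 i pid

  for (( i = 1; i <= count; i++ )); do
    # Worker i sleeps i * delay seconds so the workers never start together.
    ( sleep "$(( i * delay ))" && "$@" ) &
    pids+=("$!")
  done

  # Collect every exit status; count the workers that failed.
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=$(( failed + 1 ))
  done

  # Return 0 only if every worker succeeded.
  return "$(( failed > 0 ))"
}
```

With the values from this thread it could be called as `run_staggered "$CONCURRENT_ARCHIVERS" 1 /var/www/console core:archive -vvv --concurrent-archivers="$CONCURRENT_ARCHIVERS"`, which matches the `sleep $i` stagger above.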

@toredash commented on April 28th 2020 Contributor

@tsteur thanks, the workaround is not a huge deal but good to know :)

Is https://matomo.org/docs/ on GitHub? I would not mind opening a PR to enhance the docs, but it does not seem to be available for contribution.

I was about to open a PR to modify the output in CoreArchiver.php, but I'm afraid it isn't the correct place to add that information.

@tsteur commented on April 29th 2020 Member

It's on our website. Not sure if we already have an FAQ or other content about multiple archivers. Happy to edit the page if you have any suggestions. I guess the best page would be https://matomo.org/docs/setup-auto-archiving/

@toredash commented on April 29th 2020 Contributor

That was the page I was thinking about, but it does not seem to be hosted on GitHub. If you could add it, that would be great. If I could file a PR I would :)

@tsteur commented on April 29th 2020 Member

Updated it @toredash thanks

This Issue was closed on April 28th 2020