@tsteur opened this Pull Request on April 8th 2018 Owner

Currently, the cron archiver might start the same archiving command twice, for example when a site takes a long time to archive. To avoid running the same archiving process in parallel, we should check whether such a job is already running and, if so, not start the same archive command again.
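As a rough illustration of the check described above (not the code in this PR), the idea can be sketched as follows. The function names and the example command strings are hypothetical, and the list of running command lines is assumed to come from something like a `ps` listing on the archiving host.

```python
def is_already_running(command, running_commands):
    """Return True if `command` appears inside any running command line.

    `running_commands` is assumed to be a list of full command lines of
    the processes currently running on the host (hypothetical input).
    Whitespace is normalised so spacing differences do not hide a match.
    """
    needle = " ".join(command.split())
    return any(needle in " ".join(line.split()) for line in running_commands)


def start_if_not_running(command, running_commands, start):
    """Call `start(command)` unless the same command is already running.

    Returns True if the command was started, False if it was skipped.
    """
    if is_already_running(command, running_commands):
        return False
    start(command)
    return True
```

Note that a check followed by a start is not atomic: two archivers that look at the process list at the same moment could still both start the job, so this narrows the window rather than closing it.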

I did some simple testing on my local machine, but this might need some more testing.

This only supports CLI archiving, not web archiving.

@tsteur commented on April 9th 2018 Owner

FYI: https://github.com/matomo-org/matomo/blob/archivenottwice/core/CronArchive.php#L705-L716 might cause some trouble: a weekly archive could be skipped because the weekly is already in progress but not finished, so the archiver already starts the monthly even though the weekly is not done. This has been a problem for a while, e.g. when a weekly archive fails, but I think it might be even more of a problem now.

@mattab commented on April 9th 2018 Owner

@tsteur Good point. Should this PR fix this issue? #12708

@mattab commented on April 12th 2018 Owner

I found another use case where the same job can be triggered twice in parallel: when --force-idsites=X,Y,... is used. Someone might start archiving with the --force-idsites option, and then a regular core:archive run would start the same archiving again. Or it could be the other way around.
-> So maybe we could check for running core:archive processes that use --force-idsites with the site ID we're about to process, and skip that site if one is found.
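
A minimal sketch of that suggestion, assuming the command lines of running processes are available as plain strings. The helper names are hypothetical, and only the `--force-idsites=X,Y` form mentioned above is parsed.

```python
import re

# Matches --force-idsites=1,2,3 (or with a space instead of "=").
FORCE_IDSITES = re.compile(r"--force-idsites[=\s]+['\"]?([\d,]+)")


def forced_site_ids(command_line):
    """Return the site IDs a core:archive process was forced to archive."""
    if "core:archive" not in command_line:
        return set()
    match = FORCE_IDSITES.search(command_line)
    if match is None:
        return set()
    return {int(part) for part in match.group(1).split(",") if part}


def should_skip_site(site_id, running_commands):
    """True if another core:archive run is forcing this site right now."""
    return any(site_id in forced_site_ids(line) for line in running_commands)
```

In practice the current process would have to be left out of `running_commands`, otherwise a forced run would see its own command line and skip every site it was asked to archive.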

(For example, I just had an issue while re-processing some data after it was manually invalidated: core:archive started running again while my force-idsites run was still in progress.)

This Pull Request was closed on April 23rd 2018