Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the "scheduleReArchiving" function cannot handle bigger Matomo with many sites causing a critical failure of the upgrade due to excessive or endless amount of rows in the "archive_invalidations" table #19133

Open
MichaelRoosz opened this issue Apr 22, 2022 · 4 comments
Labels
Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced.

Comments

@MichaelRoosz
Copy link
Contributor

MichaelRoosz commented Apr 22, 2022

This migration from https://github.com/matomo-org/matomo/blob/4.x-dev/core/Updates/4.1.2-b1.php#L50 seems to cause an excessive or maybe even endless amount of rows in the table "archive_invalidations" under some conditions:

        if (!Rules::isBrowserTriggerEnabled()) {
            $dateOfMatomo4Release = Date::factory('2020-11-23');

            $cmdStr = $this->getInvalidateCommand($dateOfMatomo4Release);

            $migrations[] = new Updater\Migration\Custom(function () use ($dateOfMatomo4Release) {
                $invalidator = StaticContainer::get(ArchiveInvalidator::class);
                $invalidator->scheduleReArchiving('all', 'VisitFrequency', null, $dateOfMatomo4Release);
            }, $cmdStr);
        }

In my setup (7mil visits, 20mil pageviews, 50mil actions per month, 400 sites each with ~20 segments, all innocraft plugins installed) the archive cron ran for more than 3 days, using close to 3 GB of memory and never left this function call:

        $this->logger->debug("Applying queued rearchiving...");
        \Piwik\Tracker\Cache::withDelegatedCacheClears(function () {
            $this->invalidator->applyScheduledReArchiving();
        });

After these 3 days, table "archive_invalidations" had more than 38mil rows and auto increment was at 109mil.

I then:

  • killed the archive cron
  • removed the entry "{"idSites":"all","pluginName":"VisitFrequency","report":null,"startDate":1606089600,"segment":null}" from the array in option "ReArchiveList"
  • truncated table "archive_invalidations"

Now the archive cron runs (and finishes after some minutes) again and reports seem to get generated normally.

The issue happened right after I updated from Matomo 3.14.0 to 4.9.0.

@MichaelRoosz MichaelRoosz added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Apr 22, 2022
@Findus23
Copy link
Member

This might be related to #18281, #17474 and https://forum.matomo.org/t/matomo-on-premise-scaling/45262

@MichaelRoosz
Copy link
Contributor Author

MichaelRoosz commented Apr 22, 2022

Yes, looks like #18281, #17474 and this are the same issue.

If I understand correctly, the migration is not needed when upgrading from Matomo 3.x as there was never any version 4.x before 4.1.2 installed.

In this case I would say we should not run the migration at all.

And for all other cases where some 4.x < 4.1.2 version had been running, the problem will get bigger and bigger with time as there is this hardcoded date of "2020-11-23".

--- however --
The problem ist not directly the migration, all it does is invalidating a large number of reports.
I think we need to break this "scheduleReArchiving" step apart into multiple steps, and limit it to some maximum amount of invalidations per run.

And also we should verify if all these 38mil+ invalidations are really needed or if we are inserting some duplicates or wrong entries. I still have a backup (5GB) of the "archive_invalidations" table with 38mil rows.

I saw that there are many rows with the same idsite, date1, date2 and period.
The only difference was the hash in "done[hash].VisitFrequency" (ex.: "done6c996f8eb8c159f5497d1a12284b3aaf.VisitFrequency").

I can do some further investigation if you want to know more about the 38mil. table.

@justinvelluppillai
Copy link
Contributor

Hi @MichaelRoosz - did the fix in #18281 work for you?

@MichaelRoosz
Copy link
Contributor Author

MichaelRoosz commented May 2, 2022

Hello @justinvelluppillai ,
I have applied my own fix (see my first post)

  • killed the archive cron
  • removed the entry "{"idSites":"all","pluginName":"VisitFrequency","report":null,"startDate":1606089600,"segment":null}" from the array in option "ReArchiveList"
  • truncated table "archive_invalidations"

The fix in #18281 is basically the same idea, but it deletes everything in "ReArchiveList" instead of only the problematic invalidation.

Still these are not "real" fixes, as the cause of the problem is the "scheduleReArchiving" function, which cannot handle bigger setups with many sites at the moment.

@justinvelluppillai justinvelluppillai added this to the For Prioritization milestone Nov 14, 2022
@mattab mattab changed the title Critical issue with "VisitFrequency" migration in 4.1.2-b1.php the "scheduleReArchiving" function cannot handle bigger Matomo with many sites causing a critical failure of the upgrade due to excessive or endless amount of rows in the "archive_invalidations" table Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced.
Projects
None yet
Development

No branches or pull requests

3 participants