This is a pretty nasty denial of service bug:
Some actions, for example:
will call core\Tracker\Cache->deleteTrackerCache(), which deletes the whole cache.
This causes core\Plugin\Manager->getLicenseValidInfo() to get called for every activated premium plugin on the next tracking request or admin panel request.
getLicenseValidInfo() will send a http request to the matomo marketplace server.
On big setups like mine (>8000 tracking requests per minute) there are 350+ php workers running at the same time, many of them will call getLicenseValidInfo() roughly at the same time after a deleteTrackerCache() cache.
This means: About every 4th time I edit a users permissions or add a new site, the API request doing this admin operation will take over 60 seconds to complete, if it completes at all.
And during and after this, all 350+ php workers will get stuck, all trying to send a http request to the matomo marketplace. Bascially thousands of http request at the same time.
Our load balancer will then report for about 60s with "502 Bad Gateway" because all web servers are stuck on sending thousands of http requests to the matomo marketplace.
We have about 12 premium plugins activated, and about 700 php workers, this means up to 700*12 = 8400 http requests at the same time caused by a single admin action. These will possibly add up if editing multiple users in a short time period.
During this 60+ seconds we lose thousands of tracking requests.
I think an easy fix for now would be to introduce a new "plugin cache", that is independent from the normal "cache" and not affected by deleteTrackerCache(),
Alternative fix: do not call getLicenseValidInfo() during tracking requests.
I can try to create a PR for this, but I would like to know how you want to proceed.
Quickly looking at the tracker cache the problem looks indeed that
deleteTrackerCache deletes all lazy caches in https://github.com/matomo-org/matomo/blob/3.10.0-b1/core/Tracker/Cache.php#L216 might not be too easy to change though as it could break something maybe.
We could in tracker mode only access a cached result. Eventually an archiver or something would create the cache key again. Or if you currently delete a user, the cache would be quickly created again through the UI while you are logged in. Of course if someone uses eg the API to delete a user, and then doesn't issue another API request, and doesn't have any archiving active then it may take a while for the cache to be written again but it should all be fine. In the worst case for a little while a bit more information is tracked.
Changing user permissions takes now
I was not able to make it overload again, so the denial of service / overload problem seems to be gone.
While the problem is gone, maybe it would still be a good idea to separate the plugin license cache in the future, maybe with Matomo 4? Most of the time when I edit users or sites, I edit a bunch of them, and it seems a bit wasteful to check all plugin licenses again for every single action.
But this issue is resolved with your patch, thanks a lot :)
The cache is kind of separate. The problem is really that
Tracker\Cache should here instead only
deleteCacheGeneral() and have some feature like