Can we queue most tracking requests into one? #15050
Comments
Sounds like an impactful performance improvement 🚀
It would be interesting to know roughly what percentage of tracking requests would be POST vs GET with the new setting compared to now. Another solution could be to leave it enabled and communicate around Log Analytics replay (in https://matomo.org/faq/log-analytics-tool/faq_19221/ and the other places this FAQ is linked from) that Log Analytics won't replay all requests accurately unless the setting xyz is enabled, which forces Tracking API requests to be sent as GET so that all URL parameters of all tracking requests are preserved. Maybe people use Log Analytics replay as a backup strategy, especially around updates or maintenance of Matomo? Before upgrades (in the app, guides, and FAQs) we could mention enabling the setting temporarily. We could even display an info message in the System check in the default case, e.g. -> In the end it may indeed be easier to just leave it disabled and let people opt in to the performance improvement.
It's only needed by the few people with high traffic anyway, so it would be easiest to just have it disabled by default.
It's impossible to answer since it depends on how many requests a site tracks. If someone only has pageviews, they will mostly be GET requests. If they track events, content tracking etc., many of them will be POSTs. It probably even depends on the page.
This might be a problem though. I haven't looked at how Matomo timestamps work, but my understanding is that Matomo wouldn't know exactly at which second something happened. This is why we would need to make sure to at least ping every 5-10 seconds, otherwise the timestamps become inaccurate. Alternatively we could even group all requests for up to 30 or 60 seconds, or 5 minutes... but then we'd need some logic to restore the correct time each one happened.
AFAIK it should work fine as long as each request in the bulk request has each a |
I don't think that works, since local clocks are often wrong and you therefore can't know the time in UTC.
True.. So maybe we could have a new Tracking API for a bulk request to set the timestamp of parameters relative to each other like "5 seconds after the first one, 74 seconds after the first one". Btw another related improvement we could make is probably making Content Tracking more efficient. Taking your use case:
Maybe the 10 content tracking requests could also be merged into only one request, which would make the tracker API much more efficient and save queries. All content block impressions and interactions could be merged into one request instead of often 10 or even 20. This would be a big improvement whenever Content tracking is used and the page has blocks that become visible when scrolling. (We learnt recently of an on-premise user having to disable Content tracking because it caused too much load and made QueuedTracking lag several hours behind in processing as a result.)
Yes, the point would really be to send as many as possible together within a reasonable interval. It wouldn't save crazy heaps of resources, but a few things will be cached and therefore faster, especially when tracking a lot per page view.
BTW, queueing more requests into one has a slight disadvantage: more bulk requests will be used, meaning more transactions will happen and there is a higher risk of deadlocks. However, we are planning to take care of these locks soon anyway.
Closing for now as it increases the chance of deadlocks.
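To make the relative-timestamp idea from earlier in this comment a bit more concrete, here is a rough sketch of how the server side could restore per-request times. Everything in it is an assumption for illustration: the batch shape, `offsetsMs`, `sentAfterFirstMs` and `restoreServerTimes` are hypothetical names and not part of the Tracking API. It also assumes the client reports how long after the first request the batch was sent, and it ignores network latency.

```typescript
// Hypothetical batch shape; none of these field names exist in Matomo's Tracking API.
interface RelativeBatch {
  offsetsMs: number[];      // each request's delay after the first queued request
  sentAfterFirstMs: number; // delay between the first request and the moment the batch was sent
}

// Rebuild absolute server-side times (ms since epoch) from relative offsets.
// Only differences between client clock readings are used, so a constant
// error in the client clock cancels out. Network latency is ignored.
function restoreServerTimes(batch: RelativeBatch, receivedAtMs: number): number[] {
  return batch.offsetsMs.map(function (offsetMs) {
    // How long ago this request happened, measured at send time.
    const ageMs = batch.sentAfterFirstMs - offsetMs;
    return receivedAtMs - ageMs;
  });
}
```

With offsets of 0s, 5s and 74s and a batch sent 75s after the first request, the three requests would be dated 75s, 70s and 1s before the server received the batch.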
We should investigate whether all tracking requests could go into a queue by default and be sent together in one request when possible. Currently, only a few premium features make use of the queue. We could enable it for all kinds of tracking requests.
We currently have a queue timeout of 2.5s. If another request is queued within those 2.5s, we currently wait another 2.5s before sending the queued requests. We could add some logic to make sure the requests are sent within 10s at most, emptying the queue at that point. There's not really any risk of losing tracking requests now that we can use sendBeacon, and sending any queued requests on page unload is already implemented.
This behaviour would only apply when queuedRequests are enabled, which is currently the default. The problem is that requests will be sent using POST and can therefore no longer be replayed using log analytics etc. This means we likely need to disable this feature by default; users can enable this more efficient way of tracking if they don't use log analytics anyway.
This will make tracking more efficient since fewer tracking requests will be sent and more requests will be inserted at once. It might even reduce the chances of "0 action in visitor log" #6415 since it works somewhat like QueuedTracking but in the browser...
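A minimal sketch of that timing rule, assuming the 2.5s idle timeout described above plus the proposed 10s cap. The names (`QueueState`, `enqueue`, `flushDeadline`) are made up for illustration and this is not the tracker's actual code; it only computes when a batch should be sent and leaves timers and sendBeacon out.

```typescript
// Hypothetical names throughout; this is not Matomo's tracker code.
interface QueueState {
  firstQueuedAt: number; // ms timestamp when the oldest pending request was queued
  lastQueuedAt: number;  // ms timestamp when the newest pending request was queued
}

const IDLE_MS = 2500;      // current queue timeout mentioned above
const MAX_WAIT_MS = 10000; // proposed upper bound ("within 10s max")

// Record a newly queued request; an empty queue is represented by null.
function enqueue(state: QueueState | null, now: number): QueueState {
  return state === null
    ? { firstQueuedAt: now, lastQueuedAt: now }
    : { firstQueuedAt: state.firstQueuedAt, lastQueuedAt: now };
}

// The batch is sent at whichever comes first: the idle timeout after the
// newest request, or the hard cap after the oldest one.
function flushDeadline(state: QueueState): number {
  return Math.min(state.lastQueuedAt + IDLE_MS, state.firstQueuedAt + MAX_WAIT_MS);
}
```

Without the cap, a page that queues something every 2 seconds would keep pushing the send back indefinitely; with it, the oldest request waits at most `MAX_WAIT_MS`.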
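To illustrate why a POSTed batch can't be replayed from access logs: as far as I understand the bulk tracking format, the individual tracking query strings move out of the URL and into a JSON body with a `requests` array, so the log line only shows a POST to the tracker endpoint. The helper below is a sketch under that assumption; `buildBulkBody` is a hypothetical name and the query strings are example values.

```typescript
// Wrap queued tracking query strings in a single bulk request body.
// Assumes the bulk format is a JSON object with a "requests" array of
// query strings that each start with "?".
function buildBulkBody(queries: string[]): string {
  const requests = queries.map(function (query) {
    return query.charAt(0) === "?" ? query : "?" + query;
  });
  return JSON.stringify({ requests: requests });
}
```

Sent as GET, each of those query strings would appear in the access log and could be replayed; inside the POST body they typically don't.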
The only thing I'm not sure about is whether it's a problem with server side timestamps. Say we queue 1 pageview, 10 content tracking requests, and 3 events into one bulk request... will Matomo record the same server time for all requests instead of maybe knowing the exact time when something happened?