Allow configuration of a shorter lock wait time in tracker mode (new INI setting innodb_lock_wait_timeout) #16092
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
MySQL wait locks can cause some performance issues on high traffic/concurrency loads. For example the following scenario:
#1
updating various entries in a transaction#2
eg for same visitor comes in#2
because it's waiting for the transaction in#1
to finish.To address this issue, the thought is to configure a shorter lock wait time so these locks aren't held for too long.
When an error
1205 Lock wait timeout exceeded; try restarting transaction
happens, then MySQL rolls back the current statement. Because there might be multiple inserts or updates within one tracking request we can't really safely replay the entire tracking request (unlessinnodb_rollback_on_timeout=1
was enabled). This means for now the behaviour be as follows:We need to see later if we maybe need to respond with a success tracking request when this happens but that's to be seen (to avoid having too many error requests making instances unhealthy). If any possible we don't want to do this.
A note on deadlocks
Thought I also quickly go into deadlocks in case someone thinks that's related which it isn't quite related.
This change does not affect deadlocks which may happen as part of bulk requests from the JS tracker. Deadlocks usually don't happen for log analytics or queued tracking because they don't send bulk requests in parallel for the same visitor but process them one after another.
Depending on the MySQL distribution (MySQL, MariaDB, RDS, Aurora, ...) or version these might be detected immediately without waiting for a few seconds. For some MySQL instances this behaviour depends on the MySQL
innodb_deadlock_detect
variable. FYI: Having this variable enabled can slow down performance on a high load MySQL.When a deadlock happens, MySQL rolls back the entire transactions. This means the current behaviour is that the tracking request fails and nothing in the DB is changed. It's like the bulk request didn't happen.
We could eventually catch deadlock errors and try to reply them once. This would be probably a separate PR.
You can also workaround by configuring
[Tracker]bulk_requests_use_transaction=0
in Matomo but then you need to expect more DB CPU usage and slower average response times cause more indexes are written etc.How can a lock wait timeout happen?
start transaction
update site set name = '...' where idsite = 1
update site set name = '...' where idsite = 1
Because the session 1 is not committed yet, the
session 2
needs to wait for thesession 1
to finish before it can have the row lock to update the site.How does a deadlock happen?
Basically deadlocks happen when there are two DB sessions and they both wait for a lock that the other one holds. Basically they both want the same locks but they get them in the opposite order like this:
start transaction
start transaction
update site set name = '...' where idsite = 1
update site set name = '...' where idsite = 2
update site set name = '...' where idsite = 2
update site set name = '...' where idsite = 1
Basically
session 2
waits for the lock to updatesite 1
whichsession 1
holds, butsession 1
waits at the same time for a lock to updatesite 2
whichsession 2
has.This is my understanding of how things work. I might be wrong :)