New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix InnoDB deadlock with bulk tracking #12980
Conversation
@kachenjr as you pointed out in #6398, the invalidation of reports happens continuously per every single event. With this PR I have introduce cumulations of all the invalidated reports in a single array that is then checked at the very end of the requests, and only once per report. What do you think? I am no PHP expert, so feel free to report back on any technical issue you might find. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started looking through the code and I'm wondering if this will actually solve the problem. If the problem is with multiple processes trying to set the same row and having to wait too long, then #12733 should have fixed it since this code no longer sets the same rows concurrently.
@kaplun Does the deadlock still occur for you after applying this patch? I'm trying to understand how it would solve the issue. If it does, and the problem is just writing to the Option table, then this issue could easily reoccur later since that table is used quite often. In this case, we may have to take a broader look at the entire process.
/** | ||
* Queue of archives to invalidate | ||
*/ | ||
private $archives_to_invalidate; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to use camel case for Matomo PHP code.
$this->archives_to_invalidate = array(); | ||
} | ||
|
||
public function __destruct() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want something to be run at the end of the request, would probably want to use register_shutdown_function
than depend on __destruct
. Actually may want to use the 'Tracker.end'
since this will be called before the DB connection is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might explain why currently my patch doesn't actually write in the db (although there are no failures).
Option::set($mykey, '1'); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good if this logic was in a private method named after what it does (instead of __destruct
, something like insertArchivesToInvalidateOptions()
.
I'm going to try and reproduce this locally. EDIT: Unfortunately, not able to reproduce, appears to work fine for me. The table in the error message you posted in the issue mentions log_visit, maybe the reason for this deadlock has to do w/ that table? EDIT 2: Perhaps https://github.com/matomo-org/matomo/blob/3.x-dev/core/Db.php#L721 or something similar could be used in the Tracker Db class to get more information. |
Yeah unfortunately as you have identified, putting the closing code in the I am not sure at this point if postponing and collection all updates at the end is enough to avoid the deadlock. For sure (assuming the code is really called properly, and not like in the current version of this patch), it should reduce the number of writing to the DB (currently the invalidation information is written down per every single event being treated, which would mean e.g. 500 times in a bulk request, instead of just once at the end). |
What I mean about being confused about this PR is that the original PR fixed this specific case:
The other PR fixed this by appending the PID, meaning request A & B above would not try to update the same option row, and thus no one would wait for a lock. This PR moves the writing to one place, but that shouldn't have an effect on the concurrency at play here. Which doesn't mean it won't or doesn't fix the issue for you, it would just be good to find & verify the reason this should fix the deadlock before going ahead and just merging. (And note: thinking about this issue has made me very interested in this, so if you investigate further, please do share :). I can't do much since I can't reproduce it...). |
@kaplun are you still working on this? Be good to have a look at @diosmosis comment maybe |
No unfortunately I am no longer having the resources to work on it. I will try to check if we still have deadlock, and see. |
@mattab I reckon we can close the PR while we can't reproduce things or while we don't know if there is still an issue? I reckon as @diosmosis suggested #12733 might have fixed the issue and this PR doesn't change much in the end re possible deadlock. |
(Closes #6398)