Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix InnoDB deadlock with bulk tracking #12980

Closed
wants to merge 1 commit into from

Conversation

kaplun
Copy link

@kaplun kaplun commented May 24, 2018

(Closes #6398)

@kaplun
Copy link
Author

kaplun commented May 24, 2018

@kachenjr as you pointed out in #6398, the invalidation of reports happens continuously per every single event. With this PR I have introduce cumulations of all the invalidated reports in a single array that is then checked at the very end of the requests, and only once per report.

What do you think? I am no PHP expert, so feel free to report back on any technical issue you might find.

@kaplun kaplun changed the title Fix InnoDB deadlock with bulk tracking WIP: Fix InnoDB deadlock with bulk tracking May 24, 2018
@kaplun kaplun changed the title WIP: Fix InnoDB deadlock with bulk tracking Fix InnoDB deadlock with bulk tracking May 24, 2018
@mattab mattab requested review from diosmosis and sgiehl May 31, 2018 19:07
@mattab mattab added the Needs Review PRs that need a code review label May 31, 2018
@mattab mattab added this to the 3.6.0 milestone May 31, 2018
Copy link
Member

@diosmosis diosmosis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started looking through the code and I'm wondering if this will actually solve the problem. If the problem is with multiple processes trying to set the same row and having to wait too long, then #12733 should have fixed it since this code no longer sets the same rows concurrently.

@kaplun Does the deadlock still occur for you after applying this patch? I'm trying to understand how it would solve the issue. If it does, and the problem is just writing to the Option table, then this issue could easily reoccur later since that table is used quite often. In this case, we may have to take a broader look at the entire process.

/**
* Queue of archives to invalidate
*/
private $archives_to_invalidate;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to use camel case for Matomo PHP code.

$this->archives_to_invalidate = array();
}

public function __destruct()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want something to be run at the end of the request, would probably want to use register_shutdown_function than depend on __destruct. Actually may want to use the 'Tracker.end' since this will be called before the DB connection is closed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might explain why currently my patch doesn't actually write in the db (although there are no failures).

Option::set($mykey, '1');
}
}
}
Copy link
Member

@diosmosis diosmosis Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good if this logic was in a private method named after what it does (instead of __destruct, something like insertArchivesToInvalidateOptions().

@diosmosis
Copy link
Member

diosmosis commented Jun 1, 2018

I'm going to try and reproduce this locally.

EDIT: Unfortunately, not able to reproduce, appears to work fine for me. The table in the error message you posted in the issue mentions log_visit, maybe the reason for this deadlock has to do w/ that table?

EDIT 2: Perhaps https://github.com/matomo-org/matomo/blob/3.x-dev/core/Db.php#L721 or something similar could be used in the Tracker Db class to get more information.

@kaplun
Copy link
Author

kaplun commented Jun 1, 2018

Yeah unfortunately as you have identified, putting the closing code in the __destructor doesn't seem to be effective (either the code is not really called, or it fails silently, because the net result is that with this path the option table is no longer updated).

I am not sure at this point if postponing and collection all updates at the end is enough to avoid the deadlock. For sure (assuming the code is really called properly, and not like in the current version of this patch), it should reduce the number of writing to the DB (currently the invalidation information is written down per every single event being treated, which would mean e.g. 500 times in a bulk request, instead of just once at the end).

@diosmosis
Copy link
Member

diosmosis commented Jun 1, 2018

What I mean about being confused about this PR is that the original PR fixed this specific case:

  1. tracker request A starts transaction & tracks action to site 1 / date 2018-05-11
  2. tracker request A updates option w/ key report_to_invalidate_1_2018-05-11
    • mysql now locks that specific option row & does not let go
  3. tracker request B starts & also tracks action to to site 1 / date 2018-05-11
  4. tracker request B tries to update to option w/ key reports_to_invalidate_1_2018-05-11, but since request A has a lock and is in a transaction, request B just sits waiting for the lock
  5. tracker request A takes a long time to finish & before it does, tracker request B gives up on getting the lock, resulting in a deadlock

The other PR fixed this by appending the PID, meaning request A & B above would not try to update the same option row, and thus no one would wait for a lock.

This PR moves the writing to one place, but that shouldn't have an effect on the concurrency at play here. Which doesn't mean it won't or doesn't fix the issue for you, it would just be good to find & verify the reason this should fix the deadlock before going ahead and just merging. (And note: thinking about this issue has made me very interested in this, so if you investigate further, please do share :). I can't do much since I can't reproduce it...).

@tsteur
Copy link
Member

tsteur commented Apr 11, 2019

@kaplun are you still working on this? Be good to have a look at @diosmosis comment maybe

@kaplun
Copy link
Author

kaplun commented Apr 12, 2019

No unfortunately I am no longer having the resources to work on it. I will try to check if we still have deadlock, and see.

@tsteur
Copy link
Member

tsteur commented Jul 4, 2019

@mattab I reckon we can close the PR while we can't reproduce things or while we don't know if there is still an issue? I reckon as @diosmosis suggested #12733 might have fixed the issue and this PR doesn't change much in the end re possible deadlock.

@tsteur tsteur closed this Jul 4, 2019
@mattab mattab modified the milestones: 3.13.0, 3.11.0 Jul 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracker: Serialization failure: 1213 Deadlock found when trying to get lock
4 participants