Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option::set() INSERT ... ON DUPLICATE KEY can be slow #12550

Merged
merged 1 commit into from Apr 2, 2018

Conversation

robocoder
Copy link
Contributor

https://www.percona.com/blog/2010/07/14/why-insert-on-duplicate-key-update-may-be-slow-by-incurring-disk-seeks/ explains:

Instead, what MySQL does is the following:

  • call handler::write_row to attempt an insertion, if it succeeds, we are done
  • if handler::write_row returns an error indicating a duplicate key, outside of the handler, apply the necessary update to the row
  • call handler::update_row to apply the update
    ...
    So, the moral of the story is this. In MySQL, “insert … on duplicate key update” is slower than “replace into”.

REPLACE INTO might be faster but the semantics are:

INSERT;
if failure then
    DELETE;
    INSERT;
fi

Since Option::set() is largely called to update the value for an existing setting, we should try:

UPDATE;

if affected_rows === 0 then
    INSERT;
fi

It's not an atomic operation but overall, this approach strikes a good balance between performance and ACID.

@robocoder robocoder force-pushed the patch-12400 branch 2 times, most recently from 93e0d53 to dd60fbd Compare February 12, 2018 03:21
@sgiehl sgiehl added the Needs Review PRs that need a code review label Feb 12, 2018
@robocoder
Copy link
Contributor Author

A side-effect of this PR is that the SQL is database agnostic.

$bind = array($name, $value, $autoLoad);

Db::query($sql, $bind);
} catch (\Exception $e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for catching exceptions here? We didn't used to do that before. Wondering if that might have any side effect always updating the all array, even if the insert failed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all?

The proposed PR is no longer an atomic operation. So, the catch is for a potential race condition with concurrent requests to INSERT the same option name; i.e., only the first will succeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some lines below the 'static' array is updated $this->all[$name] = $value;. This will now be done, even if the second query fails to update.

Db::query($sql, $bind);
} catch (\Exception $e) {
}
}

$this->all[$name] = $value;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was always the assumption that the query (INSERT...ON DUPLICATE) was successful. So, status quo behaviour for concurrent updates.

@mattab mattab modified the milestones: 3.3.1, 3.4.0, 3.4.1 Mar 19, 2018
@diosmosis
Copy link
Member

I don't think introducing potential race conditions is a good idea. Given how many different ways Options are used, it seems like this could potentially have far reaching and hard to debug effects.

Is there a reason REPLACE INTO won't work here? If it's atomic & has the same effect as the original code, then it should provide the same guarantees as the original. Or am I missing something?

I'm also not sure if Option setting is a bottleneck for Matomo. I wonder how much of an effect on performance this actually has? IMO if Option::set() really has that much of an effect on an individual request or command, then it's being called too many times.

@mattab @tsteur @sgiehl do you guys have any opinions?

@robocoder
Copy link
Contributor Author

There are tradeoffs in every approach. INSERT...ON DUPLICATE is simple but MySQL specific and has some performance problems as seen in #12400 and the linked blog post.

REPLACE INTO may be faster and atomic, but it isn't without its cons either. A snippet from the blog post's comments:

REPLACE always deletes the row, then inserts it with the new values. If you do this often enough you can end up fragmenting the innodb table space. With a fragmented table space things just get slower on rotational disks. Plus, you run the risk of running out of tablespace much faster than you would normally.

@robocoder
Copy link
Contributor Author

Like I said, there are always tradeoffs. For example, you're welcome to remove the try...catch and change the INSERT to INSERT IGNORE, but then you'll be MySQL-specific again.

@diosmosis
Copy link
Member

It's not the exception silencing itself that worries me, but the possibility for race conditions in general. If two competing processes try to insert different values into the same row, one process will win out, the other will fail, and the value in the DB may not be what we expect. Options are used in a lot of different ways, some of which are in concurrent contexts (like an option that's used as a mutex), and it worries me that there could now be a potential race condition in something that's used so widely, where we're no longer sure of what data gets written to the row. And that, to me, means potential bugs that are hard to reproduce and hard to debug (especially w/o any sort of logs to dig through).

As for whether it's important for the SQL to be database agnostic, that's a priority question & I can't answer that.

@robocoder
Copy link
Contributor Author

Matomo doesn't have cache coherence for option settings now. There's already a race condition when you have two concurrent requests that both call Option::set(). Each request has its own cached value and neither re-reads from the database after writing, so one of them is going to run to completion with an out-of-date value.

@diosmosis
Copy link
Member

diosmosis commented Mar 25, 2018

The current process will use an out of date value, but the next process execution will use the value we may not expect it to use.

@mattab mattab removed this from the 3.4.1 milestone Mar 26, 2018
@diosmosis
Copy link
Member

diosmosis commented Mar 27, 2018

Two ideas to make race conditions less of an issue here:

  • Change the semantics of Option::delete() and Option::deleteLike() (& any other delete methods if they exist) to set rows to NULL instead of actually DELETE-ing. This makes it so the race condition will only happen once, if it happens.
    I think this is do-able since Option::get() will return false if the option doesn't exist. So we can just check if the value is NULL and return false.

  • In the try-catch, when an exception is thrown, check if it is due to the row already existing. If not, re-throw. If yes, check if the existing value is different from the value we're saving & log a message including the option name & both values with log level INFO. If someone is doing log aggregation w/ Matomo, this error will at least not get lost.

This still leaves whoever is doing ops w/ something additional to consider, and the option table will be less human readable, but it will reduce the amount of times this error can happen.

I think an ideal fix would be creating a new type of component, like Options, which defines the options plugins use. Then we can ensure they always exist and ensure plugin option names don't clash. But that would be a large breaking change...

Note: this is just a proposal, not a review requirement. Waiting for other opinions.

@tsteur
Copy link
Member

tsteur commented Mar 27, 2018

Options are used in a lot of different ways, some of which are in concurrent contexts (like an option that's used as a mutex)

@diosmosis do you know where we do this? Might be worth at some point possibly to refactor this as nowadays options is often used just for like anything which it is maybe not supposed to be...

We could also if really needed have another method like setAtomic() which uses the old behaviour and it would be fine if this one is a bit slower in some cases.

Being MySQL-specific is totally fine as Matomo doesn't work with other SQLs anyway (AFAIK). So if there is a solution that provides us a fast execution that is also atomic I'd be 👍

Otherwise as @robocoder points out Option is already not really atomic anyway as the values are cached for the whole request etc (unless you call clearOptionCache or so).

@diosmosis
Copy link
Member

diosmosis commented Mar 27, 2018

@diosmosis do you know where we do this? Might be worth at some point possibly to refactor this as nowadays options is often used just for like anything which it is maybe not supposed to be...

@tsteur I think CronArchive might store a list or multiple lists in options during parallel execution of archiving? Sometimes PIDs are stored in options values to mark a command as being run by a specific process, etc. There might be some counters too. My knowledge is out of date so specifics escape me.

Otherwise as @robocoder points out Option is already not really atomic anyway as the values are cached for the whole request etc (unless you call clearOptionCache or so).

Actually thinking about it some more, I think the problem I was seeing was a phantom. I think I assumed the original SQL would throw on collision, but w/ ON DUPLICATE UPDATE, it of course wouldn't. So actually this PR is fine, and I will stop creating noise :)

This does highlight a problem in that there aren't really any atomic primitives, but I've had to take concurrency into consideration fairly often in the recent days. Not sure why GET_LOCK() wasn't used, though I remember using in some situations.

@robocoder
Copy link
Contributor Author

There are no special privileges for MySQL's GET_LOCK, but since it's a named lock in a global namespace, it would require a naming strategy for shared hosting users, and as a result, be at risk of DoS attack. It probably also requires MySQL 5.7.5 or newer, since older versions don't share locks across master-slave servers.

@mattab mattab added this to the 3.4.1 milestone Mar 27, 2018
@diosmosis
Copy link
Member

Seems like the consensus is to merge, so merging.

@diosmosis diosmosis merged commit da7aae3 into matomo-org:3.x-dev Apr 2, 2018
@mattab mattab changed the title Fixes #12400 - Option::set() INSERT ... ON DUPLICATE KEY can be slow Option::set() INSERT ... ON DUPLICATE KEY can be slow May 8, 2018
InfinityVoid pushed a commit to InfinityVoid/matomo that referenced this pull request Oct 11, 2018
@robocoder robocoder deleted the patch-12400 branch March 26, 2022 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants