
allocateNewArchiveId: Cannot get named lock allocateNewArchiveId #6417

Closed
mattab opened this issue Oct 11, 2014 · 24 comments
Labels: Bug, Major
Milestone: Piwik 2.9.0
Assignee: tsteur

mattab (Member) commented Oct 11, 2014

There have been several reports of this error: allocateNewArchiveId: Cannot get named lock allocateNewArchiveId

2014-10-10 12:47:17 ERROR [2014-10-10 12:47:17] [6c725] Got invalid response from API request: index.php?module=API&method=API.get&idSite=1681&period=day&date=last2&format=php&trigger=archivephp. Response was 'a:2:{s:6:"result";s:5:"error";s:7:"message";s:95:"allocateNewArchiveId: Cannot get named lock allocateNewArchiveId.piwik_archive_numeric_2014_10.";}'

The goal of this issue is to investigate and make sure this bug does not occur again.

@mattab mattab added the Bug For errors / faults / flaws / inconsistencies etc. label Oct 11, 2014
@mattab mattab added this to the Piwik 2.9.0 milestone Oct 11, 2014
@mattab mattab added the Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. label Oct 11, 2014
tsteur (Member) commented Oct 15, 2014

I had a look, but I'm not sure how to fix it unless we get rid of the lock completely (for instance, by using other IDs for idarchive instead of our auto increment), as it seems the error can always happen somehow. But maybe I'm wrong.

Does anyone know whether it is possible to do GET_LOCK(0.1)? I haven't found anything about this. The docs say "seconds", but I'm not sure whether it would work with something like this. Update: I did a little test and the value seems to be converted to 0 seconds. I'm not sure whether there is another way to wait for a lock for something like 100ms; I couldn't find anything. I was expecting the second parameter to be only a timeout and the lock to be released immediately, but according to a check I did, this is not the case.

I had one GET_LOCK() and released it after 30 seconds. I had another GET_LOCK(..., 100), and although the first lock was released after 30 seconds, I got the lock after 100 with my PHP 5.3 / MySQL 5.5 setup.
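Since GET_LOCK() apparently truncates fractional timeouts to whole seconds, one possible workaround is to poll a non-blocking acquire in a loop. Below is only an illustrative sketch in Python, not Piwik code: an in-process lock stands in for MySQL's named lock, and `acquire(blocking=False)` plays the role of `GET_LOCK(name, 0)`.

```python
import threading
import time

def acquire_with_ms_timeout(lock, timeout_ms, poll_ms=10):
    """Poll a non-blocking acquire until timeout_ms elapses.

    Analogous to calling MySQL's GET_LOCK(name, 0) in a loop to emulate
    a sub-second wait, given that the timeout argument of GET_LOCK()
    only works in whole seconds here.
    """
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        if lock.acquire(blocking=False):  # like GET_LOCK(name, 0) == 1
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_ms / 1000.0)

lock = threading.Lock()
lock.acquire()                             # another "connection" holds the lock
print(acquire_with_ms_timeout(lock, 100))  # False: gave up after ~100 ms
lock.release()
print(acquire_with_ms_timeout(lock, 100))  # True: acquired immediately
```

A real MySQL variant would issue `SELECT GET_LOCK(name, 0)` on each iteration and sleep between attempts, at the cost of extra round trips per retry.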

diosmosis (Member) commented:

Maybe we can use an atomic count to keep track of the max archive ID instead of using a lock? I.e., instead of getting a lock and doing a SELECT + INSERT, we could store the max idarchive in an archive row and do an update/select like this: http://stackoverflow.com/questions/15034136/select-updated-rows-in-mysql#answer-15034351 . This way we effectively reserve an ID and can safely insert afterwards. I think it might work.

tsteur (Member) commented Oct 16, 2014

I'm not sure how you would set the ID here, or rather how you would get the max idarchive?

Wouldn't this be the same as this?

 $insertSql = "INSERT INTO $numericTable "
    . " SELECT @var := IFNULL( MAX(idarchive), 0 ) + 1,
                        '" . $locked . "',
                        " . (int)$idSite . ",
                        '" . $date . "',
                        '" . $date . "',
                        0,
                        '" . $date . "',
                        0 "
    . " FROM $numericTable as tb1";
Db::get()->exec($insertSql);

$selectIdSql = "SELECT @var";
$id = Db::get()->fetchOne($selectIdSql, $locked);

The problem is that if two or more requests select MAX(idarchive) at the same time, we would again need a lock, since with InnoDB the whole table is not locked.

diosmosis (Member) commented:

My thought does not involve getting MAX(idarchive). The insertion happens in two steps:

  1. Reserve a new idarchive:

     Db::query("UPDATE archive_numeric SET value = value + 1 WHERE name = 'specialIdArchiveCount' AND @reservedId := value");
     $reservedId = Db::fetchOne("SELECT @reservedId");

  2. Use the new idarchive in the insert:

     Db::query("INSERT INTO archive_numeric (idarchive, ...) VALUES (?, ...)", array($reservedId));

Complications arise when the special archive doesn't exist, but I have not thought that far. An option could be used for each archive table as well.
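The reserve-then-insert idea above can be sketched with SQLite standing in for MySQL. All table and column names below are simplified placeholders, not Piwik's real schema, and a transaction provides the atomicity that the `@reservedId := value` trick provides in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequence (name TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO sequence VALUES ('idarchive', 0)")
conn.execute("CREATE TABLE archive_numeric (idarchive INTEGER, name TEXT)")

def reserve_id(conn, seq_name):
    # Step 1: atomically bump the counter and read the reserved value.
    # The UPDATE and SELECT run in one transaction, so no other writer
    # can interleave between them.
    with conn:
        conn.execute(
            "UPDATE sequence SET value = value + 1 WHERE name = ?", (seq_name,)
        )
        return conn.execute(
            "SELECT value FROM sequence WHERE name = ?", (seq_name,)
        ).fetchone()[0]

# Step 2: insert using the reserved ID; no named lock is needed.
new_id = reserve_id(conn, "idarchive")
conn.execute(
    "INSERT INTO archive_numeric (idarchive, name) VALUES (?, ?)", (new_id, "done")
)
print(new_id)  # → 1
```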

tsteur (Member) commented Oct 16, 2014

Using another table for incrementing the ID would work. We did this in some other projects to be able to shard if needed; that's why we didn't use the auto_increment feature in MySQL. What we did there is basically the following:

Table consisting of column name and value.

UPDATE table SET value = LAST_INSERT_ID(value + :count) WHERE name = 'archive_numeric_2014_01';

followed by

SELECT LAST_INSERT_ID()

This works because the last insert ID is managed per connection. The :count parameter would allow you to optionally reserve multiple IDs in one step if needed.
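A minimal sketch of this sequence-table pattern, assuming a simplified schema. SQLite has no LAST_INSERT_ID(), so the value is read back inside the same transaction, which is equivalent for a single connection:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequence (name TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO sequence VALUES ('archive_numeric_2014_01', 0)")

def reserve_ids(conn, name, count):
    """Reserve `count` consecutive IDs in one step, mirroring
    UPDATE ... SET value = LAST_INSERT_ID(value + :count)."""
    with conn:
        conn.execute(
            "UPDATE sequence SET value = value + ? WHERE name = ?", (count, name)
        )
        last = conn.execute(
            "SELECT value FROM sequence WHERE name = ?", (name,)
        ).fetchone()[0]
    # The reserved block is (last - count, last].
    return list(range(last - count + 1, last + 1))

print(reserve_ids(conn, "archive_numeric_2014_01", 3))  # → [1, 2, 3]
print(reserve_ids(conn, "archive_numeric_2014_01", 2))  # → [4, 5]
```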

@mattab any opinions?

diosmosis (Member) commented:

Can you do it with one table or do you need one per archive numeric table?

diosmosis (Member) commented:

Ah ok, I saw the name part of the WHERE.

diosmosis (Member) commented:

If a new table is used, I'd like to request that name refer not to the table per se, but to the name of the atomic count, so it can be used in other ways. This would require being able to reset a count; I'm not sure whether that can be done with LAST_INSERT_ID(...) being used?

tsteur (Member) commented Oct 16, 2014

I'm not sure what you meant by your first sentence?

It should work with LAST_INSERT_ID(). I'm not sure whether there is any difference between using it and @var. I only posted it because we used it this way and it worked perfectly even under load.

diosmosis (Member) commented:

I mean instead of just using it for table IDs, use it as a table for MySQL-based semaphores (so name could be, for example, 'CronArchive.someDistributedCount').

tsteur (Member) commented Oct 16, 2014

Yeah, sure, this can be done out of the box I think. Basically this is what we would use it for with idarchive. Do you have a specific use case in mind? That might help us decide whether we want to create such a table or keep the lock.

diosmosis (Member) commented:

I've already implemented something similar (it uses the Option table and value = value + 1) for #5363 locally, so using the table for this would just mean less duplication. The class I've created is called Semaphore and is located in core/Concurrency (in case someone decides to implement the table in this ticket).

@mattab mattab added the Critical Indicates the severity of an issue is very critical and the issue has a very high priority. label Oct 23, 2014
tsteur (Member) commented Oct 30, 2014

@mattab what's the thought here? Is there actually a need to put both a Critical and a Major label here? I've started ignoring those labels, as they are set so often and, to me, sometimes "randomly".

@mattab mattab removed the Critical Indicates the severity of an issue is very critical and the issue has a very high priority. label Oct 30, 2014
mattab (Member, Author) commented Oct 30, 2014

@tsteur I meant to put only one label, not two.

It's a Major bug as we regularly have it on Cloud and Enterprise. If you have an idea how it could be solved please give it a try, it would be fantastic to get rid of this issue!

tsteur (Member) commented Oct 31, 2014

As discussed, we could get rid of at least one of those two locks by having a "counter" in MySQL. I can work on this.

I'm not sure whether we can get rid of the second one (I haven't really thought about it yet). Maybe it would already occur less often, or it might even be fixed by removing the first one (probably not).

mattab (Member, Author) commented Oct 31, 2014

> as discussed we could get rid of at least one of those two by having a "counter" in MySQL. Can work on this.

+1 it's a good start to get rid of one of the two locks.

> Not sure whether we can get rid of the second one (haven't really thought about this one yet).

Maybe you can think about it once you start removing the first lock. Would of course be nice to get rid of both locks if possible 👍

tsteur (Member) commented Nov 2, 2014

I am currently implementing the "sequence" table and came across a problem that I had already expected.

Imagine we have a new table to count the sequences. When updating Piwik, we'll have to "migrate" the data to this table, meaning we have to create several initial counter values such as archive_numeric_2014_10 => 4822 (name => value/currentMaxId).

Usually we would show the SQL for the migration during the update process, but the ID might have already changed by the time someone actually executes the update (either manually, via console, or via the UI). To work around this, we could simply add +10000 to the next archive ID, so even if some more IDs are created in the meantime, it would not fail.

The problem is that the code will already be updated and will want to use the new sequence table while the SQL update has not been executed yet. So no archives can be created during this time, meaning we also cannot migrate those sequence IDs in the background, as the table won't exist yet.

So we'll probably have to detect whether the sequence table is installed, and if so, migrate in the background (this should be very fast, as we only have to get the current max archive ID and insert this value into an empty table); otherwise, if the table is not installed yet, go with the old logic. Alternatively, we could say something like "when creating the next archive table, use the sequence table, otherwise still use the lock", although I'm not sure how to implement that. I need to think a bit more about it and will check which is easier to implement.
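The detect-and-fall-back idea could look roughly like the following sketch (hypothetical names and a simplified schema; SQLite's sqlite_master plays the role of a MySQL SHOW TABLES / information_schema check):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive_numeric_2014_10 (idarchive INTEGER)")
conn.execute("INSERT INTO archive_numeric_2014_10 VALUES (4822)")

def table_exists(conn, name):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

def next_archive_id(conn, archive_table):
    if not table_exists(conn, "sequence"):
        return None  # old code path: acquire the named lock instead
    with conn:
        cur = conn.execute(
            "SELECT value FROM sequence WHERE name = ?", (archive_table,)
        )
        if cur.fetchone() is None:
            # Migrate the current max archive ID on the fly.
            max_id = conn.execute(
                f"SELECT IFNULL(MAX(idarchive), 0) FROM {archive_table}"
            ).fetchone()[0]
            conn.execute("INSERT INTO sequence VALUES (?, ?)", (archive_table, max_id))
        conn.execute(
            "UPDATE sequence SET value = value + 1 WHERE name = ?", (archive_table,)
        )
        return conn.execute(
            "SELECT value FROM sequence WHERE name = ?", (archive_table,)
        ).fetchone()[0]

print(next_archive_id(conn, "archive_numeric_2014_10"))  # → None (no sequence table yet)
conn.execute("CREATE TABLE sequence (name TEXT PRIMARY KEY, value INTEGER)")
print(next_archive_id(conn, "archive_numeric_2014_10"))  # → 4823
```

As noted above, two processes could attempt the on-the-fly migration at the same time; in a real implementation the sequence row insert would need to tolerate a duplicate-key error from the other process.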

tsteur (Member) commented Nov 2, 2014

Using the sequence table starting from the next archive table, and the lock for the current archive table, should work, but it might lead to issues if someone updates around the beginning of a month. Also, we could never get rid of the code that uses the lock, and it could lead to inconsistent IDs when the lock is used due to some DB failure even though the sequence should be used. ==> This is not an option. (I know this all sounds a bit confusing; simply ignore it if you do not understand.)

We could detect in the background whether the table exists and, if not, create it. We could probably also detect whether a sequence was initialized (meaning whether we have migrated the max archive ID to the sequence table) and, if not, migrate it on the fly. I don't think this would add any overhead, but it is possible that two processes migrate the max archive ID at the same time, which could maybe lead to errors again; I don't think so, though. I will probably go with this solution.

Otherwise, we could just create the sequence table and migrate the max archive ID during the update, but archive creation would not work between the time of updating the files and finishing the SQL update. Is this a problem? @mattab

mattab (Member, Author) commented Nov 2, 2014

> Otherwise we could just create the sequence table and migrate the max archive id during the update but any archive creation would not work between the time of updating the files and finishing the SQL update. Is this a problem?

AFAIK Piwik will return an error message in the API output when the DB schema is pending upgrades, so it should be OK to do this! (The archiving will not have been triggered anyway.)

tsteur (Member) commented Nov 3, 2014

See pull request.

I removed the lock used to create the new archive ID. The second lock is still there, but this problem should not really occur anymore: the second lock no longer locks the whole table but only the "table + archiveId" combination, which should be fine. We could maybe remove the second lock completely ( https://github.com/piwik/piwik/pull/6583/files#diff-f0dc4d4d133f84005fcebc1e10f155c0R159 ) if InnoDB is used, as there are also indexes on both columns (it should not lock the whole table), but I'm not sure whether we can remove it, as I still do not 100% understand what is happening in the archiver.

It would be nice to run some tests with the sequence table to see whether any problems occur under load (they should not, but one never knows). Otherwise, the system tests say everything still works, so I reckon the change should be fine.

tsteur added a commit that referenced this issue Nov 4, 2014
refs #6417 this should fix some tests. Start initial value by 0 so first generated id will be 1

fix omnifixture creation was broken

refs #6417 wondering if omnifixture creation worked

refs #6417 acquire a lock per archive id which should be better

refs #6417 we have to generate the SQL in the update itself otherwise the update would fail at some point in the future if the tables signature changes

refs #6417 split updates into a separate update file otherwise it would not be executed for devs using Piwik from git

refs #6417 added some documentation

refs #6417 Piwik 2.9.0-b2 was released meaning we have to move it to b3

refs #6417 Piwik 2.9.0-b3 was released meaning we have to move it to b4
mattab (Member, Author) commented Nov 4, 2014

> The second lock no longer locks the whole table but only the "table + archiveId", which should be fine.

Oh interesting, was this a bug before that we locked the whole table?

> but not sure whether we can remove it as I still do not 100% understand what is happening in the archiver.

Can you check that the following use case still works:

  • When two widgets request the same date+period+idsite at the same time
  • Expected: only one archive is processed (and other widgets "wait" for the archive to finish to grab it)
  • Not expected: both widgets each process an archive (processing the same data twice)

tsteur added a commit that referenced this issue Nov 4, 2014
tsteur (Member) commented Nov 4, 2014

It creates the same archive multiple times, but the same happens with the master branch. I'm not sure whether you verified this in #4918 or only looked at the timing. I see the same behavior when adding the 20 seconds to the archiver (the dashboard loads after about 23 seconds) as you described in that issue, but when verifying the archives, one can see it does create the same archive multiple times.

mattab (Member, Author) commented Nov 5, 2014

Likely I didn't check properly, but I intended to check (and thought it worked, but possibly I made a mistake). If it's not working in master already, then it's fine. Maybe we should create a ticket for this enhancement.

tsteur added a commit that referenced this issue Nov 6, 2014
tsteur added a commit that referenced this issue Nov 6, 2014
@tsteur tsteur self-assigned this Nov 7, 2014
mattab (Member, Author) commented Nov 7, 2014

Closing, as @tsteur has finished this nice sequence-table solution and fixed the other "deletion" lock 👍

@mattab mattab closed this as completed Nov 7, 2014