Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix detection of a known visitor when there are multiple requests at same second for same visitor #6298

Closed
wants to merge 5 commits into from

Conversation

tsteur
Copy link
Member

@tsteur tsteur commented Sep 25, 2014

#6296 This is a bug in Piwik that must be there already for quite a while. What is happening is that Piwik creates a new visitor in case there are multiple requests at the same second. This happens especially when using bulk tracking.

Problem is this line: https://github.com/piwik/piwik/blob/2.7.0/core/Tracker/Visit.php#L476

It checks whether a row was changed and if not, it thinks it did not find the visitor. But what is actually happening is that an existing visitor is found but there was simply no change in any of those values (idvisitor, visit_total_time, visit_last_action_time). So MySQL will return 0 as well.

To always have at least one column that changes there the idea was to introduce a new column that will always change (or in 99.9999% of the cases) by generating a random string each time. This way we can differentiate between visitor actually not found or no change.

@diosmosis
Copy link
Member

Maybe instead of creating a new column this could be used: http://dev.mysql.com/doc/refman/5.1/en/innodb-locking-reads.html ?

If not, maybe the column could be a TINYINT(1) and the UPDATE would be UPDATE ... SET random = ! random; ?

@tsteur
Copy link
Member Author

tsteur commented Sep 26, 2014

I was thinking about the TINYINT solution in the beginning but didn't wanna use it as I thought I have to put it in core then (which I don't like). But now I noticed I can return something like return random = !random in the dimension class so will give it a try.

@diosmosis
Copy link
Member

Might it be better to hardcode in core? This is more of a relational database issue I think, I mean I can't think of a use of the 'Random' column in any analytics function, and it would it be unnecessary for data stores that don't have this issue of not being able to get the number of rows that would be affected in an UPDATE (assuming the data store had the same concept of an UPDATE in the first place).

@tsteur
Copy link
Member Author

tsteur commented Sep 26, 2014

The random =!random solution does not work. Only if I disable transactions

@mattab
Copy link
Member

mattab commented Sep 26, 2014

It's problematic to include major DB upgrade in the next release...

Maybe a solution?

According to mysql documentation, you can change the behaviour of affected_rows by passing the MYSQLI_CLIENT_FOUND_ROWS flags while connecting using mysql_real_connect.

In this case, mysql_affected_rows returns the number of rows matched by the WHERE condition, not the number of updated rows. source

Maybe we could use this instead?

@mattab
Copy link
Member

mattab commented Sep 28, 2014

@tsteur maybe my solution in #6323 work for issue #6296 ?

@tsteur
Copy link
Member Author

tsteur commented Sep 29, 2014

I can test it soon. Please note if merging #6323 we need to mention this at least in Changelog as it can break plugins etc. We'd also have to check our code base whether it changes anything but don't think we rely on this (even though the change so far is only in tracker DB)

@tsteur
Copy link
Member Author

tsteur commented Sep 29, 2014

I will close this one and we can merge #6323 once it is fixed

@tsteur tsteur closed this Sep 29, 2014
@tsteur tsteur deleted the 6296_empty_visits branch September 29, 2014 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants