@peterbo opened this Issue on February 19th 2020 Contributor

I set a User-ID when a user visits a given site. On a certain action, I trigger a goal server-side with their User-ID as a parameter (and a token). After the update from 3.12 to 3.13.2, the server-side triggered action is not only stored in another visit (which would be OK), but also fails to recognize the web visitor with the same User-ID. For reference, the before/after screenshots:

Before (Visitor recognized -> new visit but returning visitor):
[Screenshot: userid-before1]

After (Visitor is not recognized -> Visit is not returning -> new visit and new visitor):
[Screenshot: userid1]

Serverside call:
https://example.com/piwik.php?token_auth=XXX&cdt=2019-08-07 18:56:10&idgoal=3&revenue=1234&idsite=X&rec=1&r=13454&uid=1234567890
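
The raw URL above contains an unencoded space in `cdt`. A minimal sketch of building the same request with encoded parameters, assuming Node.js; the helper name `buildGoalUrl` is hypothetical, and the parameter names are simply the ones from the URL in the report:

```javascript
// Hypothetical helper: builds the server-side goal request from the report.
// Parameter names (token_auth, cdt, idgoal, revenue, idsite, rec, r, uid) are
// taken from the URL above; the values are the report's placeholders.
function buildGoalUrl(baseUrl, params) {
  const url = new URL('piwik.php', baseUrl);
  // URLSearchParams form-encodes every value (the space in cdt becomes "+").
  url.search = new URLSearchParams(params).toString();
  return url.toString();
}

const goalUrl = buildGoalUrl('https://example.com/', {
  token_auth: 'XXX',
  cdt: '2019-08-07 18:56:10',
  idgoal: '3',
  revenue: '1234',
  idsite: 'X', // left elided, as in the report
  rec: '1',
  r: '13454',
  uid: '1234567890',
});

console.log(goalUrl);
```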

In config, trust_visitors_cookies is disabled.

The reason for that is this change: https://github.com/matomo-org/matomo/commit/ea5a14bdf8aa9608cdc2ab7d5c8236a5ff1eb3e2#diff-6700aaf1ce500fe51e284b9ec6f01b01

The change goes in the right direction, but now a User-ID is only assigned to the same visitor when the config_id also matches. This doesn't make sense, because the main use case is, for example, a user who logs into a website from different devices (GDPR aside; the User-ID is, for example, the customer ID). This user should be recognized as the same visitor (not necessarily the same visit, but at least the same visitor). @MichaelHeerklotz

Refs https://github.com/matomo-org/matomo/pull/14360

@MichaelHeerklotz commented on February 19th 2020 Contributor

This is intended. Since the referenced change, the User-ID no longer influences the visitor ID or visit ID.

@peterbo commented on February 19th 2020 Contributor

Is this really what we want? In this case, the User-ID has no more added value than a named CustomDimension.

I think the primary use-case is the cross device recognition of users. I know many instances where the User-ID is used and it is always for this case. If you want to record a User-ID for a visit, decoupled from user recognition logic, one could simply use a custom dimension.
Secondary use-case, but equally important, is to receive conversions / other actions server-side from external sites / own internal systems to measure campaign success / other KPIs for a given unique visitor/user.

Using a forced visitor-ID does not offer nearly the same flexibility as the User-ID did. And speaking of semantics: User-ID, in my opinion, implies a single user who uses multiple devices or is tracked in different places. Therefore it must be the same visitor (in analytics terms).

Dimension definitions from my point of view:

  • visitor-ID: automatic/technical ID to recognize a visitor. Implements default definitions out of the box
  • forced visitor-ID: technical - for developers to solve problems (e.g. Apps native->webview)
  • User-ID: non-technical - for the analyst to define, which information should take precedence over the default mechanism. Can be introduced easily via TagManager. Easy to understand and use.

The proposed use-case from https://github.com/matomo-org/matomo/pull/13620 ("This is useful for example when using the third party cookie, and thus all Matomo sites use the same "global" visitorId for the same device, and some Matomo sites set a userid.") is not practical anymore, because 3rd party cookies are not GDPR compliant and blocked by most new browsers anyway. AFAIK, the visitor_id is queried in connection with the site-id. In my opinion, the other use cases are far more important than this one. I don't quite follow the point of changing this feature towards edge cases (3rd party cookies, cross site-id tracking, or a user who has 10 different accounts to log into online gambling) or a case that is not relevant anymore.

@tsteur commented on February 19th 2020 Member

ping @mattab

I'm not so much into this topic. I suppose it wouldn't help to do something like setVisitorId(hashedUserId.substr(0,16)) (pseudo code)?

I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit? But indeed tracking cross device is getting more complicated.

@mattab commented on February 21st 2020 Member

Thanks for the report @peterbo!

If you want to record a User-ID for a visit, decoupled from user recognition logic, one could simply use a custom dimension.

That's a good point :thinking: In general it was on purpose to generate separate visits on each device for the same user, but in retrospect I see what you mean: it has become more like a custom dimension and maybe not as valuable.

I'm not so much into this topic. I suppose it wouldn't help to do something like setVisitorId(hashedUserId.substr(0,16)) (pseudo code)?

Yes, this could help you @peterbo if you run this code in JavaScript and then generate the same Visitor ID on the server side. It might be the easier solution in your case. Another solution would be to get the Visitor ID from your visitors, store it in your DB for each visitor/user, and then set it again when tracking the conversion server-side.
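
A minimal sketch of that suggestion, assuming Node.js and assuming SHA-1 as the hash (the thread does not name the hash function). `visitorIdFromUserId` is a hypothetical helper, `_paq` is declared locally as a stand-in for the tracker's global command queue, and `setVisitorId` and the `cid` tracking parameter are used on the assumption that both accept a 16-character hex string:

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: derive a stable 16-hex-character visitor ID from a
// User ID. SHA-1 is an assumption; any hash works as long as the browser
// and the server use the same one.
function visitorIdFromUserId(userId) {
  return crypto.createHash('sha1').update(String(userId)).digest('hex').slice(0, 16);
}

const visitorId = visitorIdFromUserId('1234567890');

// Browser side: stand-in for the global _paq command queue of the JS tracker.
const _paq = [];
_paq.push(['setVisitorId', visitorId]);
_paq.push(['trackPageView']);

// Server side: send the same ID as `cid` when tracking the conversion.
// idsite is left elided, as in the original report.
const query = new URLSearchParams({ idsite: 'X', rec: '1', idgoal: '3', cid: visitorId });
console.log(`https://example.com/piwik.php?${query}`);
```

Because both sides derive the ID from the same User ID, no per-user lookup table is needed. In a real page the hash would have to be computed in the browser or embedded by the server, since `node:crypto` is not available there.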

I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit?

Yes, it was the idea also, and an advantage of changing the implementation...

What do you think?

Pending tasks/bugs

  • In the "after" screenshot, the visit is not marked as "Returning" when the same User ID visits twice. Expected: the Visits log shows a Returning icon and the API marks the visit as a returning visitor when it had a recent visit with the same User ID. In your "After" screenshot especially, the 2nd visit was less than 30 min after the previous one, so we would have expected it to show the returning visitor icon.

  • Also, the FAQ needs updating, as it explains the old algorithm with User ID: https://matomo.org/faq/general/faq_21418/ e.g. "If a User ID is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, this User ID will be converted (hashed) into a Visitor ID hexadecimal string. The hashed User ID becomes the Visitor ID. We look first for visits where the log_visit.idvisitor matches this Visitor ID (User ID). If no visit is matched, we look for visits where the log_visit.config_id matches the visitor fingerprint."
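
The old lookup order described in that FAQ excerpt can be sketched as follows. This is an illustrative in-memory model, not Matomo's code: `findVisit`, `hashUserId`, the `visits` array, and the choice of SHA-1 truncated to 16 hex characters are assumptions; only the `idvisitor` and `config_id` field names come from the excerpt.

```javascript
const crypto = require('node:crypto');

// Assumption: the User ID is hashed into a 16-hex-character visitor ID.
function hashUserId(userId) {
  return crypto.createHash('sha1').update(String(userId)).digest('hex').slice(0, 16);
}

// Illustrative model of the old matching order from the FAQ excerpt:
// 1. match on idvisitor (the hashed User ID), 2. fall back to config_id.
function findVisit(visits, userId, configId) {
  const idvisitor = hashUserId(userId);
  return (
    visits.find((v) => v.idvisitor === idvisitor) ||
    visits.find((v) => v.config_id === configId) ||
    null
  );
}

// The same user on two devices: different fingerprints, same hashed User ID.
const visits = [{ idvisit: 1, idvisitor: hashUserId('1234567890'), config_id: 'desktop-fp' }];
const match = findVisit(visits, '1234567890', 'mobile-fp');
console.log(match ? `recognized, idvisit=${match.idvisit}` : 'new visitor');
```

In this model the mobile request is recognized through the hashed User ID even though its fingerprint differs, which is the cross-device behaviour the report describes as lost.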
@peterbo commented on February 21st 2020 Contributor

Hey @mattab thanks for the feedback!!

Yes, this could help you @peterbo if you run this code in JavaScript and then generate the same Visitor ID on the server side. It might be the easier solution in your case. Another solution would be to get the Visitor ID from your visitors, store it in your DB for each visitor/user, and then set it again when tracking the conversion server-side.

Making it work again is not a problem. Unfortunately, it's not that easy, because we can't execute business logic on the external endpoints. But I can create a plugin that changes the recognition logic. That's not really the problem.
Rather, the problem is that a key feature changed and can no longer be used natively for these modern, emerging use cases (e.g. cross device / cross API).

I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit?

That's a valid point. However, this is an adjacent use-case to all other uses, and, from my understanding, can be achieved easily with simple CustomDimensions. But also in this case, it doesn't make sense (at least to me) not to recognize the unique visitor again.

In the "after" screenshot, the visit is not marked as "Returning" when the same User ID visits twice. Expected: the Visits log shows a Returning icon and the API marks the visit as a returning visitor when it had a recent visit with the same User ID. In your "After" screenshot especially, the 2nd visit was less than 30 min after the previous one, so we would have expected it to show the returning visitor icon.

That's what I'd have expected as well. Then the feature would also work for cross device. Decoupling from Visitor-ID is not a bad idea per se, but I feel that, at the moment, the feature is drifting between use cases and is not at all easy to understand for the average user. Perhaps it would be good to have a "default" behavior which can be configured towards a use case for advanced users?

@mattab commented on February 25th 2020 Member

@peterbo

Perhaps it would be good to have a "default" behavior which can be configured towards a use case for advanced users?

Feel free to create a separate issue with your thoughts for this :+1:

Still pending tasks/bugs as part of this issue

  • In the "after" screenshot, the visit is not marked as "Returning" when the same User ID visits twice. Expected: the Visits log shows a Returning icon and the API marks the visit as a returning visitor when it had a recent visit with the same User ID. In your "After" screenshot especially, the 2nd visit was less than 30 min after the previous one, so we would have expected it to show the returning visitor icon.

  • Also, the FAQ needs updating, as it explains the old algorithm with User ID: https://matomo.org/faq/general/faq_21418/ e.g. "If a User ID is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, this User ID will be converted (hashed) into a Visitor ID hexadecimal string. The hashed User ID becomes the Visitor ID. We look first for visits where the log_visit.idvisitor matches this Visitor ID (User ID). If no visit is matched, we look for visits where the log_visit.config_id matches the visitor fingerprint."
@tsteur commented on February 25th 2020 Member

@mattab what is the benefit of the current userId behaviour over a custom dimension? If there's no clear benefit, I would 100% vote to change the behaviour back to the original behaviour, with no flags on how things work.

@peterbo commented on February 25th 2020 Contributor

Expected: the Visits log shows a Returning icon and the API marks the visit as a returning visitor when it had a recent visit with the same User ID

I doubt that Matomo Core would be able to handle that (a returning visitor flag for a different visitor ID). E.g. Visitor-Log: A visit is flagged as returning and when you open the Visitor profile, you will only see one visit. This will probably also be the case for visitor based report archiving (being flagged as returning visitor but counted as two unique visitors) -> returning visitor reports will be distorted.

To really decouple the visitor ID from the user ID and really add value, some core modifications (archiving, visitor log, etc.) would probably be necessary. This would be part of a new ticket.

So at the moment, in my opinion, rolling back or creating a config setting for a default behaviour would be the best options. What do you guys think?

@MichaelHeerklotz commented on February 26th 2020 Contributor

Why revert it? If you want the old behaviour just set the visitor id manually.
... and my first pull request actually had a setting to set the userid behaviour per-site, but you refused it. Honestly I really do not want to go back to applying patches for each Matomo update. I am thinking about forking and continuing the project under a new name. Currently userid is somewhat like a custom dimension, but one that automatically creates new visits. It's not something that can be replaced by simply using a CD.

@MichaelHeerklotz commented on February 26th 2020 Contributor

because 3rd party cookies are not GDPR compliant and blocked by most new browsers anyway. AFAIK, the visitor_id is queried in connection with the site-id.

Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50). So 3rd party cookies are in fact working very well. (and I also have invested a lot of time to fix all the bugs in Matomo related to them)

@MichaelHeerklotz commented on February 26th 2020 Contributor

To really decouple the visitor ID from the user ID and really add value, some core modifications (archiving, visitor log, etc.) would probably be necessary. This would be part of a new ticket

Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.

@peterbo commented on February 26th 2020 Contributor

For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain.

That's probably not a 3rd party cookie but a wildcard cookie that you set up with the scope *.example.org?

Currently userid is somewhat like a custom dimension, but one that automatically creates new visits. It's not something that can be replaced by simply using a CD

Generally, this would be easily possible by adding new_visit=1 once to a request that also includes a User-ID: '_paq.push(['appendToTrackingUrl', 'new_visit=1']);' - but I'd rather like to solve this for both use cases.
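
A minimal sketch of forcing a single new visit when the User-ID is set, assuming the standard `_paq` command queue of the JavaScript tracker. `trackLogin` is a hypothetical helper, and `_paq` is declared locally here as a stand-in for the global queue:

```javascript
// Stand-in for the global command queue (window._paq) created by the tracking snippet.
const _paq = [];

// Hypothetical helper: set the User-ID and start a new visit exactly once.
function trackLogin(queue, userId) {
  queue.push(['setUserId', userId]);
  // Append new_visit=1 so the next request opens a new visit.
  queue.push(['appendToTrackingUrl', 'new_visit=1']);
  queue.push(['trackPageView']);
  // Reset the appended string so later requests do not force new visits.
  queue.push(['appendToTrackingUrl', '']);
}

trackLogin(_paq, '1234567890');
console.log(_paq);
```

The reset at the end matters mostly for single-page applications, where the same tracker instance keeps sending requests after login.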

Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.

That's why we're here. To discuss options and added value, not blindly execute. Hence, it would be great if you would contribute in the discussion of use cases and how we could create a feature that is good for different use cases and not break 50% with a minor update.

@MichaelHeerklotz commented on February 26th 2020 Contributor

That's probably not a 3rd party cookie but a wildcard cookie that you set up with the scope *.example.org?

Technically yes, but it works using Matomo's "3rd Party Cookie" feature.

Generally, this would be easily possible by adding new_visit=1 once to a request that also includes a User-ID: '_paq.push(['appendToTrackingUrl', 'new_visit=1']);' - but I'd rather like to solve this for both use cases.

If one is using the official Matomo JS API, yes, but my 50+ Matomo sites are managed by many different teams, some using their own API, some using pixels, etc. It would be a big pain and take a lot of migration time to move to this new way of doing it.

That's why we're here. To discuss options and added value, not blindly execute. Hence, it would be great if you would contribute in the discussion of use cases and how we could create a feature that is good for different use cases and not break 50% with a minor update.

Yes, that is why I am here.

Basically having a per-site setting to switch between the two userid behaviors would be totally fine for me. Actually my first pull request had such a setting and even defaulted to the old behavior.
I just do not want to be forced to do it the old way.

@peterbo commented on February 26th 2020 Contributor

It would be a big pain and take a lot of migration time to move to this new way to do it.

Well, that's exactly the situation I (and probably others) find myself in now. I also service a lot of instances with around 10k sites. Just a few dozen of them use the User-ID feature, but all of them rely on recognition by User ID over visitor ID. So you can imagine the pain and work that has to be done.

Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.

Another comment on that statement: this is not an edge case but the reason the User-ID feature was introduced in the first place. So generally, it'd be good to keep it stable, especially within minor version updates. But that's something we all already agree on, so let's look ahead towards the resolution.

I'd be fine to make this a config setting - @tsteur @mattab what do you think about that?

@mattab commented on February 27th 2020 Member

We'll need to think more about it. Might take a while. @peterbo I'm not sure about a config setting. It would be better to find the optimal solution that fits most use cases. Maybe we can make (almost) everyone happy with a few tweaks to bring back the usefulness of User ID.

@tsteur commented on February 27th 2020 Member

@mattab

I'm trying to understand the thoughts here. What, in your opinion, is now the difference between userId and a custom dimension? And why was it changed?

It seems to me that 99% of users likely don't use 3rd party cookies and it was made worse for them, but maybe I'm missing something.

@MichaelHeerklotz

Currently userid is somewhat like a custom dimension, but one that automatically creates new visits.

I'm actually not sure we're doing that currently, or are you saying it should? Really just trying to understand things here. I don't really understand yet why the current behaviour is better for 3rd party cookies and why it was previously not good. Can any of this behaviour maybe be achieved with a plugin?

@mattab commented on March 2nd 2020 Member

Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50).

In that case, would you be able to use the setCookieDomain and set the domain to .your-domain.com so the cookie is 1st party yet readable on all subdomains?
Then I see that Peter suggests the same and you reply "Technically yes, but it works using Matomo's "3rd Party Cookie" feature." which does not make sense to me? Why use a 3rd party cookie if 1st party would work? Probably 3rd party is only needed when you want to do cross-domain analysis, I suppose...

@mattab what is the benefit of the current userId behaviour over a custom dimension? If there's no clear benefit, I would 100% vote to change the behaviour back to the original behaviour, with no flags on how things work.

I guess the benefit is that, a visit on mobile will appear separately from a visit on desktop. Before the change, the interactions across mobile and desktop visits were merged into one. Whether it's a benefit is not clear however... as Peter points out (and a few other people by email) it's complex to update Mobile Apps and other SDKs to set the proper Visitor ID based on the web visit (or as a hash of User ID) etc.

What would a "revert" look like?

Would reverting this be as simple as reverting this PR? https://github.com/matomo-org/matomo/commit/ea5a14bdf8aa9608cdc2ab7d5c8236a5ff1eb3e2

@mattab commented on March 2nd 2020 Member

Could we maybe assign this to 3.13.4?

@tsteur commented on March 2nd 2020 Member

@mattab there were also a few other follow-up PRs, including in the PHP SDK etc. Not too many, I think.

I guess the benefit is that, a visit on mobile will appear separately from a visit on desktop. Before the change, the interactions across mobile and desktop visits were merged into one.

I seriously thought that merging those across devices into one visit (cross-device tracking) was the purpose of userId.

Re 3.13.4 depends. Would maybe need to go in a 3.13.5 if needed

@mattab commented on March 2nd 2020 Member

I seriously thought that merging those across devices into one visit (cross-device tracking) was the purpose of userId.

:+1:

@MichaelHeerklotz commented on March 2nd 2020 Contributor

Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50).

In that case, would you be able to use the setCookieDomain and set the domain to .your-domain.com so the cookie is 1st party yet readable on all subdomains?
Then I see that Peter suggests the same and you reply "Technically yes, but it works using Matomo's "3rd Party Cookie" feature." which does not make sense to me? Why use a 3rd party cookie if 1st party would work? Probably 3rd party is only needed when you want to do cross-domain analysis, I suppose...

Different sites use different first party cookies, how would that help? How would that cause different sites to use the same visitor id?

Note: I have deleted some comments I made after this post, because I went too far with them.
However, I really would prefer a professional handling of this issue.
We could revert the changes and add a setting for it afterwards if you want to fix the issue asap.

Reverting a change that took months to get merged and telling me to use "setCookieDomain" which does not help at all is, let us say... a bit harsh.

@MichaelHeerklotz commented on March 2nd 2020 Contributor

In any case, we should keep the fix that avoids overwriting the global visitor id (_pk_uid) with the user id. If not, if any site messes up the setUserId() call (for example giving every logged out user the same id), it will break the whole Matomo setup for all sites.

@MichaelHeerklotz commented on March 2nd 2020 Contributor

A compromise could be to generate the visitor id from the user id, but to have multiple visits for each device. What do you think? @mattab @tsteur @peterbo

However, this still creates the problem that it basically breaks any per-device tracking.
How could one see what was done before / after logging in or out?

I really feel we should have a setting for this.
