@tsteur opened this Pull Request on April 25th 2020 Member

@mattab the idea here is that content impressions is not quite an interactive action and as such it should not extend the visit or maybe only less often (if user scrolls after a while on the page we could consider it maybe interactive). The reason is mostly performance related to have less concurrent log_visit updates and therefore less potential locks. Often people have say 10-30 content blocks and either all of them are sent at the same time (eg if trackAllContentBlocks) or many of them are sent in parallel (when trackVisibleBlocksOnly and many become visible while scrolling). In Matomo 4 we can start queueing this requests so they are not sent all at the same time and instead as bulk (maybe they are already not sure) but in general we want to avoid too many concurrent requests here.

The idea is so far we only extend visit if the last action was more than a minute ago. Could also completely remove it and never extend the visit (only content interactions would extend it). This can avoid many log_visit locks.

Haven't fully tested it yet, but wanted to get feedback on it as I reckon it can help.

For content impressions it will basically no longer update anything on log_visit which be good.

@mattab commented on April 26th 2020 Member

Sounds great, change makes sense. If users want accurate visit time, heartbeat timer would be a better solution. It would be fine to also remove the 1 minute logic and never update the visit. Be good to mention in developer changelog too.

@tsteur commented on April 27th 2020 Member

fyi it seems I can't make this work. It seems actually needed to update visit_last_action_time as otherwise the time_spent_ref_action is calculated wrong.

You can see here some content impressions and their time_spent_ref_action where then eg multiple content impressions get a time_spend_ref_action of 120, but only the first should get 120 and the remaining to 0 . The same for some other entries around 60s etc.

image

This means the data in the exported visitor log shows a wrong time_spent_ref_action

image

As a result, it also changes eg sum_time_spent on page where it aggregates the time_spent_ref_action and it results in an increased value as it tracks eg 120 multiple times instead of just once.

image

Technically, I noticed the page view report should maybe not include time_spent_ref_action of content impressions for a given page, but then we'd result in 0 sum_time_spent and it probably should be included.

So I basically couldn't figure out a way to make this work unfortunately.

FYI @mattab

@mattab commented on April 30th 2020 Member

@tsteur

You can see here some content impressions and their time_spent_ref_action where then eg multiple content impressions get a time_spend_ref_action of 120, but only the first should get 120 and the remaining to 0 . The same for some other entries around 60s etc.

i'm not sure i fully understand yet, but say if we "forced" content impressions' time_spent_ref_action values to be 0, would it mean we could reconsider the PR and performance improvements?

Technically, I noticed the page view report should maybe not include time_spent_ref_action of content impressions for a given page, but then we'd result in 0 sum_time_spent and it probably should be included.

it would be fine I think to have sum_time_spent as 0 in these cases, especially if it solves some major performance problem edge cases... The view would be that if people need accurate time on page, they would better use HeartBeat timer.

EDIT: there is also this issue but it's not wanted for now due to performance impacts https://github.com/matomo-org/matomo/issues/9748

@tsteur commented on April 30th 2020 Member

Yeah we'd need to implement https://github.com/matomo-org/matomo/issues/9748 to make this work which in the end would cause more issues.

If we forced content impressions to have 0 values for time_spent_ref_action, then we could use this PR. However, anyone with content impressions wouldn't get an accurate time on page anymore (which may be fine as events seem to be breaking time on page feature already anyway if I see this right)

This Pull Request was closed on April 27th 2020
Powered by GitHub Issue Mirror