Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper encoding of segment values for Actions reports. #13481

Merged
merged 17 commits into from Oct 2, 2019

Conversation

diosmosis
Copy link
Member

Changes:

I am not sure if this fix will have repercussions so I haven't modified the tests. Will do so after someone else gives their opinion.

Fixes #13460

@diosmosis diosmosis added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels Sep 25, 2018
@diosmosis diosmosis added this to the 3.6.1 milestone Sep 25, 2018
@tsteur
Copy link
Member

tsteur commented Sep 25, 2018

If it fixes your issue and tests pass probably all good to go. It's almost impossible to review it since the segments are just a mess and we never know where they are urlencoded or urldecoded etc.

@diosmosis
Copy link
Member Author

It will change a lot of the test output I think (another encoding in the segment property in API output). Think this won't be a BC break? at least not a potentially huge one.

@tsteur
Copy link
Member

tsteur commented Sep 25, 2018

Think this won't be a BC break?

I really don't know. As long as we can still handle the segments urlencoded and urldecoded it should be fine.

@fdellwing
Copy link
Contributor

Applied the patch locally and it resolves the issue for me.

@diosmosis diosmosis added Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. and removed Needs Review PRs that need a code review labels Oct 1, 2018
@diosmosis diosmosis modified the milestones: 3.6.1, 3.7.0 Oct 10, 2018
@tsteur
Copy link
Member

tsteur commented Dec 6, 2018

#13460 (comment)

Don't have a fix yet, but found the cause of the error. Actions are stored urlencoded in the log_action table, but the value in the segment when opening the segmented visitor log/transitions modal/etc. is not.

I'm not sure this is the case? That might have been the case quite a while ago but shouldn't be the case anymore? at least for me it was stored normally
image

or you mean the URL is encoded after tracking?

@@ -26,6 +26,6 @@
<exit_rate>100%</exit_rate>
<avg_time_generation>0.333</avg_time_generation>
<url>http://www.example.org/page</url>
<segment>pageUrl==http%3A%2F%2Fwww.example.org%2Fpage</segment>
<segment>pageUrl==http%253A%252F%252Fwww.example.org%252Fpage</segment>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this double encoded on purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, because one decode is for the condition, and one for the value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's correct, would expect to have here either:
pageUrl==http%253A%252F%252Fwww.example.org%252Fpage (current behavior)
or
pageUrl%3D%3Dhttp%253A%252F%252Fwww.example.org%252Fpage (current behavior URL encoded)

but pageUrl==http%253A%252F%252Fwww.example.org%252Fpage wouldn't be a correct value since Matomo would then segment based on Page URL http%3A%2F%2Fwww.example.org%2Fpage which wouldn't exist (didn't check but it's possible it currently works anyway since we try the url decoded version in

matomo/core/Segment.php

Lines 99 to 108 in 2020b12

}
// First try with url decoded value. If that fails, try with raw value.
// If that also fails, it will throw the exception
try {
$this->initializeSegment(urldecode($segmentCondition), $idSites);
} catch (Exception $e) {
$this->initializeSegment($segmentCondition, $idSites);
}
}
but still: we wouldn't want to show this double encoded URL in the API output)

Copy link
Member Author

@diosmosis diosmosis Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required to have the value encoded 3 times because it is urldecoded 3 times, see the PR description for more info.

Most URLs don't need to be double encoded, but if the value includes a value that can be decoded, like a plus sign, then it will be interpreted incorrectly if not double encoded. In this case the old API output would be wrong.

@tsteur
Copy link
Member

tsteur commented Dec 6, 2018

I've looked at the PR for quite a while but the PR is quite complex and not trivial to understand especially re side effects etc. Left some comments though. Do you think there would be otherwise a solution where we don't need to change anything except for maybe the segment expression?

@diosmosis diosmosis added Needs Review PRs that need a code review and removed Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. labels Dec 12, 2018
@diosmosis
Copy link
Member Author

I'm not sure this is the case? That might have been the case quite a while ago but shouldn't be the case anymore?

It's not the case, it looked like it was, but I think I tracked something wrong.

@diosmosis diosmosis added Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. and removed Needs Review PRs that need a code review labels Dec 28, 2018
@diosmosis diosmosis modified the milestones: 3.8.0, 3.9.0 Jan 6, 2019
@mattab mattab modified the milestones: 3.9.0, 3.10.0 Mar 18, 2019
@mattab
Copy link
Member

mattab commented May 28, 2019

@diosmosis Feedback:

  • Would it be possible to create a failing UI tests in 3.x-dev branch (in a separate PR) which shows the bug(s) currently in Matomo
  • Then could you update this PR and solve conflicts and show that the bug(s) are now fixed.

Looking forward to merging this 👍

@diosmosis
Copy link
Member Author

This change turned out to be rather complex, I doubt I'll be able to finish it soon, and it's not as high value as other tickets. Would vote to put it in the priority backlog.

@diosmosis
Copy link
Member Author

@mattab / @tsteur I added a system test for checking the segment values and found a couple more segment related bugs (which are fixed).

@diosmosis
Copy link
Member Author

Nevermind, now all the other tests fail.

@tsteur
Copy link
Member

tsteur commented Jul 17, 2019

@diosmosis let me know when it's ready for a review

@mattab mattab modified the milestones: 3.11.0, 3.12.0 Jul 23, 2019
@mattab mattab added the Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. label Aug 1, 2019
@diosmosis diosmosis added Needs Review PRs that need a code review and removed Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. labels Oct 2, 2019
@tsteur
Copy link
Member

tsteur commented Oct 2, 2019

image

works 👍 code looks good too 👍

@tsteur tsteur merged commit 9e34ecd into 3.x-dev Oct 2, 2019
@tsteur tsteur deleted the 13460-segment-encoding branch October 2, 2019 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segmented Visits log empty if url contains whitespaces
4 participants