Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of regex groups in CustomDimension extraction #16141

Closed
danielfett opened this issue Feb 9, 2016 · 2 comments · Fixed by #20548
Closed

Improve handling of regex groups in CustomDimension extraction #16141

danielfett opened this issue Feb 9, 2016 · 2 comments · Fixed by #20548
Assignees
Labels
c: Custom Dimensions For issues related to the Custom Dimensions plugin. Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
Milestone

Comments

@danielfett
Copy link

Bug:
There may be more than one "(" or ")" in regular expression used for dimension extraction although there is only one capturing group. Examples include non-capturing groups, modificators like "(?i)", and escaped character sequences like "(". Regular expressions with more than one "(" or ")" are rejected currently.

Example:
(?i)example.com/(?:SOME_PART/)?([a-z]{2})(?:$|/)
This is captures exactly one group, yet is not allowed. (This is what I wanted to use in my project, actually.)

Root cause:
Counting the number of "(" or ")" as is done in https://github.com/piwik/plugin-CustomDimensions/blob/master/Dimension/Extraction.php#L46 is just wrong: It does not ensure that there is exactly one capturing group.

Better solutions:
Short term: Remove this useless test. Add an example or warning regarding the number of groups.
Long term: Include "live-preview" of matches as described in another bug report. Warn when more than one group captures.

@tsteur tsteur transferred this issue from matomo-org/plugin-CustomDimensions Jun 29, 2020
@tsteur tsteur added c: Custom Dimensions For issues related to the Custom Dimensions plugin. Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. labels Jun 29, 2020
@9joshua
Copy link
Contributor

9joshua commented Apr 2, 2023

I have another request for this. A regex expression like this should work: ((?:media|stream)\/\d+)

But Matomo returns this error message:
You need to group exactly one part of the regular expression inside round brackets, eg 'index_(.+).html'

As @danielfett pointed out, Matomo's parsing of regex expressions incorrectly considers a grouping like this (?.....) to be an additional capture group.

@9joshua
Copy link
Contributor

9joshua commented Apr 3, 2023

PR for this: #20548

@sgiehl sgiehl linked a pull request Apr 3, 2023 that will close this issue
11 tasks
@bx80 bx80 added this to the 5.0.0 milestone Apr 18, 2023
@sgiehl sgiehl changed the title Regex should allow for more than one group Improve handling of regex groups in CustomDimension extraction May 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Custom Dimensions For issues related to the Custom Dimensions plugin. Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants