Creating/updating segments w/ values w/ encoded chars does not auto schedule rearchiving #17583

Closed
diosmosis opened this issue May 18, 2021 · 9 comments · Fixed by #17610
Labels
Regression Indicates a feature used to work in a certain way but it no longer does even though it should.

@diosmosis
Member

Expected Behavior

Creating or updating a segment with an encoded value, e.g. pageUrl==some%2Fpath, should automatically schedule rearchiving of past data. Invalidations should appear in the archive_invalidations table and should be handled by core:archive.

Current Behavior

Creating or updating such a segment and then running core:archive adds rows to archive_invalidations, but core:archive does not process them.

Possible Solution

Steps to Reproduce (for Bugs)

  1. Open the segment editor and create a segment with an encoded value. For example, Page URL is <url>
  2. Run core:archive. No jobs for the new segment will be created.
  3. Look in the archive_invalidations table. Invalidations will exist for the new segment.

Context

Your Environment

  • Matomo Version:
  • PHP Version:
  • Server Operating System:
  • Additionally installed plugins:
  • Browser:
  • Operating System:
@diosmosis diosmosis added the Regression Indicates a feature used to work in a certain way but it no longer does even though it should. label May 18, 2021
@diosmosis diosmosis added this to the 4.4.0 milestone May 18, 2021
@tsteur
Member

tsteur commented May 18, 2021

@diosmosis do you know why that is happening? Does the rearchive logic not work there?

@diosmosis
Member Author

@tsteur I didn't look that far into it, but the problem appears to be in core:archive. reArchive works to put the item in ReArchiveList. core:archive successfully processes it and puts rows in archive_invalidations. But the rest of the command fails to recognize the invalidations (they remain there after core:archive runs).

@diosmosis diosmosis modified the milestones: 4.4.0, 4.3.1 May 18, 2021
@EthanMcL

I've been following this (specifically #17138). Do we have a workaround to get segments that include encoded URL strings working? Currently they only work on single days; even a date range of yesterday to today breaks.

@diosmosis
Member Author

Hi @EthanMcL, can you tell me, is it the automatic rearchiving of past data that fails? Or just viewing data for a segment w/ encoded values? If the latter, can you tell me:

  1. Is it just range periods or also week/month/year periods?
  2. Do you have browser archiving enabled or are you primarily using core:archive?

@EthanMcL

@diosmosis It is viewing data with a segment.
Bit of background, we introduced a new Custom Report and Segment (to filter out some results) last week. We invalidated data to fill the Custom Report with historical data (just back to mid April). The segment could filter on any of the single days, but not a custom range or week/month/year.

Fast forward to this week, we still cannot use the segment on a custom range of a couple of days (say May 17-19, which is "new" data outside of the invalidation scope), or on any of the week/month/year periods.

Our segment value is like: John%2520Smith -- where we are trying to filter on names like "John%20Smith", where we actually have to further encode the % to %25 when doing the filter.

Browser archiving is enabled, and we actually do not have the cron for hourly core:archive enabled.

Hope this all makes sense / helps.

@diosmosis
Member Author

@EthanMcL that sounds like a different issue then. Can you tell me, for the segments you use as filters, do you see data in the main UI (not just in the Custom Reports they're used in)? Are there some periods where it's present, and others where it is missing?

@EthanMcL

@diosmosis Yes, I can confirm that my filter (with adaptation, see below) works in the Visitors Log section, where I can do date range / week / month / year, and filter based on the segment. It appears to work for all periods that I tested. The issue seems to be purely with the Custom Reports, and applying the segment there on anything but a single day.

Side observation: We are filtering based on a custom variable, with values like: FirstName%20LastName. When filtering in Visitor Logs, my filter has to be simply "FirstName%20LastName", whereas when filtering in Custom Reports, my filter has to be "FirstName%2520LastName". It appears Custom Reports does an extra layer of encoding.
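
For illustration, the relationship between the two filter values can be reproduced with plain percent-encoding. This is only a sketch using Python's standard library; it does not show how Matomo or Custom Reports handle the value internally, and the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# Value as recorded in the custom variable (taken from the comment above).
stored_value = "FirstName%20LastName"

# Percent-encoding the stored value once more turns "%" into "%25".
double_encoded = quote(stored_value, safe="")

# Decoding once recovers the stored value; decoding twice yields a plain space.
decoded_once = unquote(double_encoded)
decoded_twice = unquote(decoded_once)

print(double_encoded)
print(decoded_once)
print(decoded_twice)
```

A filter that is decoded once more than expected would therefore need the extra "%25" layer to still match the stored value.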

@diosmosis
Member Author

@EthanMcL ok, then that seems to be a bug in CustomReports, since the segment filter is only actively used in generating the day archives (higher periods just add together reports of lower periods). Are there other reports aggregated for those periods or is the entire period missing data?
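
A rough sketch of the point about periods, under the assumption that a report is just a count per page: the segment filter is applied only when a day report is built, and a higher period sums existing day reports. The function names and data shapes here are hypothetical and are not Matomo's archiving code.

```python
from collections import Counter

def archive_day(visits, segment_filter):
    """Build a day report: count pages for visits that match the segment."""
    return Counter(v["page"] for v in visits if segment_filter(v))

def archive_period(day_reports):
    """Build a week/month/year report by summing day reports.

    No segment filter is applied at this level.
    """
    total = Counter()
    for report in day_reports:
        total.update(report)
    return total

# Hypothetical data for two days.
monday = [{"page": "/a", "name": "John"}, {"page": "/b", "name": "Jane"}]
tuesday = [{"page": "/a", "name": "John"}]

def is_john(visit):
    return visit["name"] == "John"

week = archive_period([archive_day(monday, is_john), archive_day(tuesday, is_john)])
print(week)
```

Under this model, if the day reports are correct, a wrong or empty higher period points at the aggregation or lookup step rather than at the filter itself.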

@diosmosis
Member Author

Note: the cause of the original issue is that the segment hash we see when invalidating data is different from the hash stored in the table. The stored hash is created from the stored segment definition column, though.
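
To show how such a mismatch can arise, here is a minimal sketch in which the same segment is hashed once in its encoded form and once in its decoded form. The `segment_hash` helper and the use of MD5 are assumptions for the example, not Matomo's actual implementation, and the thread does not say which side uses which form.

```python
import hashlib
from urllib.parse import unquote

def segment_hash(definition: str) -> str:
    # Hypothetical hash: MD5 of the definition string exactly as given.
    return hashlib.md5(definition.encode("utf-8")).hexdigest()

stored_definition = "pageUrl==some%2Fpath"        # encoded form from the issue
decoded_definition = unquote(stored_definition)   # "pageUrl==some/path"

hash_from_stored = segment_hash(stored_definition)
hash_from_decoded = segment_hash(decoded_definition)

# An invalidation keyed by one hash is not found by a lookup using the other.
invalidations = {hash_from_stored: "pending"}
found = invalidations.get(hash_from_decoded)

print(hash_from_stored != hash_from_decoded, found)
```

Any code path that hashes a differently encoded copy of the definition will look up a key that no invalidation row carries, which is consistent with rows remaining in archive_invalidations after core:archive runs.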
