@diosmosis opened this Issue on May 18th 2021 Member

Expected Behavior

Creating or updating a segment with an encoded value, e.g. pageUrl==some%2Fpath, should automatically schedule rearchiving of past data. Invalidations should appear in the archive_invalidations table and should be handled by core:archive.

Current Behavior

Creating or updating a segment of this type and then running core:archive adds rows to archive_invalidations, but core:archive does not process them.

Possible Solution

Steps to Reproduce (for Bugs)

  1. Open the segment editor and create a segment with an encoded value. For example, Page URL is <url>
  2. Run core:archive. No jobs for the new segment will be created.
  3. Look in the archive_invalidations table. Invalidations will exist for the new segment.

Context

Your Environment

  • Matomo Version:
  • PHP Version:
  • Server Operating System:
  • Additionally installed plugins:
  • Browser:
  • Operating System:

@tsteur commented on May 18th 2021 Member

@diosmosis do you know why that is happening? Does the rearchive logic not work there?

@diosmosis commented on May 18th 2021 Member

@tsteur I didn't look that far into it, but the problem appears to be in core:archive. reArchive works to put the item in ReArchiveList. core:archive successfully processes it and puts rows in archive_invalidations. But the rest of the command fails to recognize the invalidations (they remain there after core:archive runs).

@EthanMcL commented on May 20th 2021

I've been following this (specifically #17138). Do we have a workaround to get segments that include encoded URL strings working? Currently they only work on single days; even a date range of yesterday to today breaks.

@diosmosis commented on May 20th 2021 Member

Hi @EthanMcL, can you tell me, is it the automatic rearchiving of past data that fails? Or just viewing data for a segment with encoded values? If the latter, can you tell me:

  1. Is it just range periods or also week/month/year periods?
  2. Do you have browser archiving enabled or are you primarily using core:archive?

@EthanMcL commented on May 20th 2021

@diosmosis It is viewing data with a segment.
A bit of background: we introduced a new Custom Report and Segment (to filter out some results) last week. We invalidated data to fill the Custom Report with historical data (just back to mid April). The segment could filter on any of the single days, but not a custom range or week/month/year.

Fast forward to this week, we still cannot use the segment on a custom range of a couple of days (say May 17-19, which is "new" data outside of the invalidation scope), or on any of the week/month/year periods.

Our segment value is like: John%2520Smith. We are trying to filter on names like "John%20Smith", so we have to further encode the % as %25 when defining the filter.

Browser archiving is enabled, and we actually do not have the cron for hourly core:archive enabled.

Hope this all makes sense / helps.

@diosmosis commented on May 20th 2021 Member

@EthanMcL that sounds like a different issue then. Can you tell me, for the segments you use as filters, do you see data in the main UI (not just in the custom reports they're used in)? Are there some periods where it's present, and others where it is missing?

@EthanMcL commented on May 21st 2021

@diosmosis Yes, I can confirm that my filter (with an adaptation, see below) works in the Visitors Log section, where I can select a date range / week / month / year and filter based on the segment. It appears to work for all periods that I tested. The issue seems to be purely with the Custom Reports, and applying the segment there on anything but a single day.

Side observation: We are filtering based on a custom variable, with values like: FirstName%20LastName. When filtering in Visitor Logs, my filter has to be simply "FirstName%20LastName", whereas when filtering in Custom Reports, my filter has to be "FirstName%2520LastName". It appears Custom Reports adds an extra layer of encoding.
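
To illustrate the extra layer of encoding, here is a small Python sketch using only the standard library's urllib.parse. It is not Matomo code; the variable names are made up for illustration, and the value is the example from this comment.

```python
from urllib.parse import quote, unquote

raw = "FirstName LastName"   # human-readable name (illustrative)
stored = quote(raw)          # one layer of encoding, as in the custom variable
double = quote(stored)       # a second layer: "%" itself becomes "%25"

print(stored)   # FirstName%20LastName
print(double)   # FirstName%2520LastName

# Decoding once removes only one layer of encoding.
assert unquote(double) == stored
assert unquote(stored) == raw
```

If a component decodes the filter value once before comparing it, the double-encoded form is the one that ends up matching the stored value, which would be consistent with the behaviour described above.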

@diosmosis commented on May 21st 2021 Member

@EthanMcL ok, then that seems to be a bug in CustomReports, since the segment filter is only actively used when generating the day archives (higher periods just add together the reports of lower periods). Are there other reports aggregated for those periods or is the entire period missing data?
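
The point about day archives versus higher periods can be sketched roughly as follows. This is a simplified illustration in Python, not Matomo's archiving code; the function names and the visit structure are hypothetical.

```python
from collections import Counter

def archive_day(visits, segment_value):
    """Build a day report. The segment filter is applied here, on raw visits."""
    report = Counter()
    for visit in visits:
        if visit["name"] == segment_value:   # hypothetical segment condition
            report[visit["page"]] += 1
    return report

def archive_period(day_reports):
    """Build a week/month/year report by summing day reports. No filtering here."""
    total = Counter()
    for report in day_reports:
        total.update(report)   # Counter.update adds counts together
    return total

# Two days of made-up visits.
days = [
    [{"name": "John%20Smith", "page": "/a"}, {"name": "Other", "page": "/a"}],
    [{"name": "John%20Smith", "page": "/a"}, {"name": "John%20Smith", "page": "/b"}],
]
day_reports = [archive_day(d, "John%20Smith") for d in days]
week = archive_period(day_reports)
print(dict(week))   # {'/a': 2, '/b': 1}
```

Under this model, if the day reports for a segment are present, a missing higher-period report points to the aggregation step or the lookup of the lower-period archives rather than to the segment filter itself.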

@diosmosis commented on May 21st 2021 Member

Note: the cause of the original issue is that the segment hash we see when invalidating data is different from the hash stored in the table, even though it is created from the stored segment definition column.
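
A minimal sketch of how such a hash mismatch can arise, assuming the hash is an MD5 digest of the segment definition string and that one code path URL-decodes the definition before hashing while another does not. Both assumptions are for illustration only; segment_hash is a hypothetical helper, not Matomo's implementation.

```python
import hashlib
from urllib.parse import unquote

def segment_hash(definition: str) -> str:
    """Hypothetical helper: MD5 hex digest of a segment definition string."""
    return hashlib.md5(definition.encode("utf-8")).hexdigest()

stored_definition = "pageUrl==some%2Fpath"   # example definition from this issue

# One code path hashes the definition exactly as stored...
hash_as_stored = segment_hash(stored_definition)
# ...while another (assumed) path decodes it first: "pageUrl==some/path".
hash_after_decoding = segment_hash(unquote(stored_definition))

# The two hashes differ, so a lookup by one will not find rows keyed by the other.
assert hash_as_stored != hash_after_decoding
```

For a segment without encoded characters, decoding is a no-op and both paths produce the same hash, which would explain why only segments with encoded values are affected.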

This Issue was closed on May 23rd 2021