@Situphen opened this Issue on July 4th 2022

I am trying to get some statistics such as nb_visits for a specific label (corresponding to a URL) over a specific range of dates, but sometimes Matomo's response is just an empty list. While investigating, I found that if there is no data for one of the dates in the range, the response is an empty list.

Request with period=day

Request parameters:

/index.php?module=API&format=JSON&idSite=6&period=day&date=2022-06-28,2022-07-02&method=Actions.getPageUrls&label=tutoriels+>+3645+>+demontrer-par-labsurde

Current and correct response:

{
  "2022-06-28": [],
  "2022-06-29": [],
  "2022-06-30": [
    {
      "label": "demontrer-par-labsurde",
      "nb_visits": 8,
      ...
    }
  ],
  "2022-07-01": [
    {
      "label": "demontrer-par-labsurde",
      "nb_visits": 7,
      ...
    }
  ],
  "2022-07-02": [
    {
      "label": "demontrer-par-labsurde",
      "nb_visits": 3,
      ...
    }
  ]
}

Request with period=range

Request parameters:

/index.php?module=API&format=JSON&idSite=6&period=range&date=2022-06-28,2022-07-02&method=Actions.getPageUrls&label=tutoriels+>+3645+>+demontrer-par-labsurde

Current Behavior

Current response:

[]

Expected Behavior

Correct response:

[
  {
    "label": "demontrer-par-labsurde",
    "nb_visits": 18,
    ...
  }
]

Steps to Reproduce (for Bugs)

I guess any request with method=Actions.getPageUrls, a valid label, period=range, and a date range that has at least one date without data.
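To make the two requests easier to compare, here is a minimal Python sketch that builds both query strings from the same parameters. The helper name `build_query` is hypothetical, the host and any authentication parameter are left out as in the requests above, and `urlencode` percent-encodes `>` and `,`, so the output is not character-for-character identical to the URLs quoted in this issue.

```python
from urllib.parse import urlencode


def build_query(period: str, label: str,
                date: str = "2022-06-28,2022-07-02",
                id_site: int = 6) -> str:
    """Build the path and query string for an Actions.getPageUrls request.

    Hypothetical helper: only the parameter names and values come from
    the requests quoted in this issue.
    """
    params = {
        "module": "API",
        "format": "JSON",
        "idSite": id_site,
        "period": period,
        "date": date,
        "method": "Actions.getPageUrls",
        "label": label,
    }
    # urlencode turns spaces into '+' and '>' into '%3E'.
    return "/index.php?" + urlencode(params)


LABEL = "tutoriels > 3645 > demontrer-par-labsurde"
day_query = build_query("day", LABEL)
range_query = build_query("range", LABEL)
```

The only difference between `day_query` and `range_query` is the `period` parameter, which is the point of the reproduction.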

Context

See the first paragraph.

Your Environment

  • Matomo Version: 4.10.1
  • PHP Version: at least 7.3
  • Server Operating System: Debian 10
  • Additionally installed plugins: only Google Analytics Importer as far as I know
@peterhashair commented on July 5th 2022 Contributor

@Situphen thank you for reporting this, our product team will review it.

@justinvelluppillai commented on July 5th 2022 Member

@peterhashair it'd be good to confirm we can reproduce this issue and check it's not a regression.

@peterhashair commented on July 6th 2022 Contributor

@Situphen I am trying to reproduce this, but it seems to work on my local instance. I noticed that your label param is tutoriels+>+3645+>+demontrer-par-labsurde, while the label in the actual response is demontrer-par-labsurde. Can you try a request with label=demontrer-par-labsurde?

@Situphen commented on July 6th 2022

I tried a request with label=demontrer-par-labsurde as you suggested and I got an empty response: {"2022-06-28":[],"2022-06-29":[],"2022-06-30":[],"2022-07-01":[],"2022-07-02":[]} with period=day and [] with period=range.

@peterhashair commented on July 6th 2022 Contributor

@Situphen how about a request without label, then searching the page (Ctrl+F) for demontrer, just to confirm what the actual label is in the response? It could be caused by special characters.

@Situphen commented on July 6th 2022

With period=day

Request parameters without label=... but with expanded=1:

/index.php?module=API&format=JSON&idSite=6&period=day&date=2022-06-28,2022-07-02&method=Actions.getPageUrls&expanded=1

Correct response (3 rows with the label demontrer-par-labsurde as before and 2 with the label /demontrer-par-labsurde.pdf when filtering with demontrer in Firefox):

image

With period=range

Request parameters without label=... but with expanded=1:

/index.php?module=API&format=JSON&idSite=6&period=range&date=2022-06-28,2022-07-02&method=Actions.getPageUrls&expanded=1

Incorrect response (nothing when filtering with demontrer in Firefox):

image
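Instead of filtering by hand in the browser, the same search can be sketched in Python. This assumes that with expanded=1 each row's child rows are nested under a "subtable" key in the JSON output (an assumption to verify against your own response), and that a full label is the parent labels joined with " > ", as in the label parameter used earlier. `find_labels` is a hypothetical helper name.

```python
from typing import Any, Iterator


def find_labels(rows: list[dict[str, Any]], needle: str,
                parents: tuple[str, ...] = ()) -> Iterator[str]:
    """Yield the full ' > ' path of every row whose label contains needle.

    Assumes nested rows live under the "subtable" key of their parent.
    """
    for row in rows:
        path = (*parents, str(row.get("label", "")))
        if needle in path[-1]:
            yield " > ".join(path)
        # Recurse into child rows, if any.
        yield from find_labels(row.get("subtable") or [], needle, path)


# Hand-written sample shaped like an expanded response (not real data).
sample_rows = [
    {"label": "tutoriels", "subtable": [
        {"label": "3645", "subtable": [
            {"label": "demontrer-par-labsurde", "nb_visits": 8},
        ]},
    ]},
]
matches = list(find_labels(sample_rows, "demontrer"))
```

For a period=day response, which is keyed by date, the function would be called once per date on that date's list of rows.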

@peterhashair commented on July 7th 2022 Contributor

@Situphen thanks for providing this, I will investigate further.

@Situphen commented on July 7th 2022

As you were not able to reproduce this behavior, I investigated further: for some date ranges it works and for others it doesn't. So I downloaded the data with period=day for a date range of 30 days (filename data_day_*.json) as well as the data with period=range for all possible date range combinations (filename data_range_*.json) with a Python script (get_data_range.py). For each date range combination, I compared the nb_visits result I got from period=range to the sum of the daily nb_visits with another Python script (analysis.py). I sorted the date ranges into two lists: correct and incorrect (filename analysis_*.json). Everything is inside the zipfile below. I was not able to find a strict pattern common to all incorrect date ranges, but maybe you will be able to?

test-matomo.zip
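For readers without the zipfile, the comparison described above can be sketched as follows. This is not the actual analysis.py, only a minimal illustration of the idea. The function names are hypothetical, and the sample values are the ones quoted in the first message of this issue.

```python
from typing import Any, Optional

Row = dict[str, Any]


def daily_total(daily: dict[str, list[Row]], label: str) -> int:
    """Sum nb_visits for label over a period=day response keyed by date."""
    return sum(
        row.get("nb_visits", 0)
        for rows in daily.values()
        for row in rows
        if row.get("label") == label
    )


def range_total(range_rows: list[Row], label: str) -> Optional[int]:
    """Return nb_visits for label in a period=range response, or None."""
    for row in range_rows:
        if row.get("label") == label:
            return row.get("nb_visits")
    return None


def classify(daily: dict[str, list[Row]], range_rows: list[Row],
             label: str) -> str:
    """Label a date range 'correct' when both totals agree."""
    expected = daily_total(daily, label)
    actual = range_total(range_rows, label) or 0
    return "correct" if actual == expected else "incorrect"


LABEL = "demontrer-par-labsurde"
daily = {
    "2022-06-28": [],
    "2022-06-29": [],
    "2022-06-30": [{"label": LABEL, "nb_visits": 8}],
    "2022-07-01": [{"label": LABEL, "nb_visits": 7}],
    "2022-07-02": [{"label": LABEL, "nb_visits": 3}],
}
observed = classify(daily, [], LABEL)
expected = classify(daily, [{"label": LABEL, "nb_visits": 18}], LABEL)
```

With the empty period=range response from this issue the range is classified as incorrect, and with the expected response (18 visits) it is classified as correct.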

@peterhashair commented on July 7th 2022 Contributor

@Situphen thank you very much for providing the additional info and scripts, it really helps, I will come back to you ASAP.

@sgiehl commented on July 7th 2022 Member

@Situphen I haven't had a look in detail, but that might be an issue of data truncation. When reports are archived, the aggregated data is limited to a certain number of records. For actions the default is 500 for the base report and 100 for all subtables. Depending on the number of pages you are tracking, pages that are rarely visited might be summarized into an Others row. So in theory a page can be visible in each daily report, but if the set of tracked pages varies from day to day, a report for a longer period (e.g. week, month, year or range) might not contain a certain row because the report would have too many records.
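The effect described above can be illustrated with a toy model. This is a simplified sketch, not Matomo's archiving code: it assumes rows are ranked by visits, the top `limit` rows are kept, and everything else is folded into a single Others row. The limit of 2 is chosen for readability instead of the defaults of 500 and 100, and the page names are made up.

```python
from collections import Counter

OTHERS = "Others"


def truncate(rows: dict[str, int], limit: int) -> dict[str, int]:
    """Keep the `limit` most visited rows and fold the rest into Others."""
    ranked = sorted(rows.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = dict(ranked[:limit])
    rest = sum(visits for _, visits in ranked[limit:])
    if rest:
        kept[OTHERS] = rest
    return kept


def aggregate(days: list[dict[str, int]], limit: int) -> dict[str, int]:
    """Sum the daily reports, then truncate the combined report."""
    total: Counter[str] = Counter()
    for day in days:
        total.update(day)  # adds visit counts per label
    return truncate(dict(total), limit)


# Page X is visited every day, next to a different popular page each day.
days = [
    {"A": 5, "X": 1},
    {"B": 6, "X": 1},
    {"C": 7, "X": 1},
]
daily_reports = [truncate(day, limit=2) for day in days]
range_report = aggregate(days, limit=2)
```

In this model X appears in every daily report, but in the combined report it falls below the cutoff and is only counted inside the Others row, which matches the empty result for a label filter on the longer period.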
