@feikede opened this Issue on February 27th 2020

This call randomly returns an empty [] json. I.e. currently:

/?method=Contents.getContentNames&label=anwalt%3A6122&date=2020-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e...384

returns [] with headers


Cache-Control | must-revalidate
-- | --
Connection | Keep-Alive
Content-Length | 2
Content-Type | application/json; charset=utf-8
Date | Thu, 27 Feb 2020 08:55:27 GMT
Keep-Alive | timeout=5, max=200
Server | Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.3.10
X-Matomo-Request-Id | 85f8e
X-Powered-By | PHP/7.3.10

While the exact same call with segmentation by label returns a (the correct) result:
/?method=Contents.getContentNames&segment=contentName==anwalt%3A6122&label=anwalt%3A6122&date=2020-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e...


Cache-Control | must-revalidate
-- | --
Connection | Keep-Alive
Content-Length | 201
Content-Type | application/json; charset=utf-8
Date | Thu, 27 Feb 2020 08:46:51 GMT
Keep-Alive | timeout=5, max=200
Server | Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.3.10
X-Matomo-Request-Id | 5fd41
X-Powered-By | PHP/7.3.10

[{"label":"anwalt:6122","nb_visits":6,"nb_impressions":10,"nb_interactions":10,"sum_daily_nb_uniq_visitors":5,"interaction_rate":"100\u00a0%","segment":"contentName==anwalt%3A6122","idsubdatatable":1}]

Think this is a bug, or am I missing something?

BTW: This seems to be related to any kind of caching. Sometimes the call WITH segmentation delivers wrong results - sometimes without - and sometimes it's all gone one day later (have the cron running every hour, have about 10K req/day). Have latest V 3.13.3 installed.

@tsteur commented on February 27th 2020 Member

Random thought here.

  • Data is archived, eg shortly after midnight
  • First tracking requests with content comes in
  • Previous data is invalidated
  • Data is archived
  • Previous data is invalidated
  • API is fetched before the next archive finished, it's maybe not using the latest invalidated archive for some reason but the first one for example.
  • Data is archived

I haven't tried to reproduce this. Looking at the code we would use the most recent invalidated archive though because of https://github.com/matomo-org/matomo/blob/3.13.3/core/DataAccess/Model.php#L243 and https://github.com/matomo-org/matomo/blob/3.13.3/core/DataAccess/ArchiveSelector.php#L128-L136 where we should use the first idArchive which is the most recent. Quickly looked into it and would say it almost can't be that.

Could be some other race condition as I've never heard of it before. Like not sure if at some point maybe a done flag is written to early for example.

@feikede do you know if you have browser archiving enabled? You can see this in Admin -> General Settings. Also for how long is the empty result returned if you know? Like are we talking few seconds here or few minutes?
Thanks for mentioning the issue and your help 👍

@feikede commented on February 28th 2020

@tsteur Thanks for your analysis. As for your questions:

  • we are talking about some hours here (not minutes or seconds)
  • browser archiving is not enabled

Is it normal to have about 2500 tables in the database?

Maybe the matomo-archive.log helps?

INFO [2020-02-28 12:05:01] 26904  ---------------------------
INFO [2020-02-28 12:05:01] 26904  INIT
INFO [2020-02-28 12:05:01] 26904  Running Matomo 3.13.3 as Super User
INFO [2020-02-28 12:05:01] 26904  ---------------------------
INFO [2020-02-28 12:05:01] 26904  NOTES
INFO [2020-02-28 12:05:01] 26904  - Async process archiving supported, using CliMulti.
INFO [2020-02-28 12:05:01] 26904  - Reports for today will be processed at most every 900 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO [2020-02-28 12:05:01] 26904  - Reports for the current week/month/year will be requested at most every 3600 seconds.
INFO [2020-02-28 12:05:01] 26904  - Archiving was last executed without error 59 min 54s ago
INFO [2020-02-28 12:05:01] 26904  - Will invalidate archived reports for 2020-02-28 for following websites ids: 1
INFO [2020-02-28 12:05:01] 26904  ---------------------------
INFO [2020-02-28 12:05:01] 26904  START
INFO [2020-02-28 12:05:01] 26904  Starting Matomo reports archiving...
INFO [2020-02-28 12:05:01] 26904  - tracking data found for website id 1 since 2020-02-28 11:05:07 UTC (since the last successful archiving)
INFO [2020-02-28 12:05:01] 26904  Will pre-process for website id = 1, period = day, date = last1
INFO [2020-02-28 12:05:01] 26904  - pre-processing all visits
INFO [2020-02-28 12:05:02] 26904  Archived website id = 1, period = day, 0 segments, 472 visits in last 2 days, 113 visits today, Time elapsed: 1.200s
INFO [2020-02-28 12:05:03] 26904  Will pre-process for website id = 1, period = week, date = last1
INFO [2020-02-28 12:05:03] 26904  - pre-processing all visits
INFO [2020-02-28 12:05:04] 26904  Archived website id = 1, period = week, 0 segments, 0 visits in last 2 weeks, 0 visits this week, Time elapsed: 1.172s
INFO [2020-02-28 12:05:04] 26904  Will pre-process for website id = 1, period = month, date = last1
INFO [2020-02-28 12:05:04] 26904  - pre-processing all visits
INFO [2020-02-28 12:05:05] 26904  Archived website id = 1, period = month, 0 segments, 0 visits in last 2 months, 0 visits this month, Time elapsed: 1.231s
INFO [2020-02-28 12:05:05] 26904  Will pre-process for website id = 1, period = year, date = last1
INFO [2020-02-28 12:05:05] 26904  - pre-processing all visits
INFO [2020-02-28 12:05:06] 26904  Archived website id = 1, period = year, 0 segments, 0 visits in last 2 years, 0 visits this year, Time elapsed: 1.354s
INFO [2020-02-28 12:05:06] 26904  Archived website id = 1, 4 API requests, Time elapsed: 4.972s [1/1 done]
INFO [2020-02-28 12:05:06] 26904  Done archiving!
INFO [2020-02-28 12:05:06] 26904  ---------------------------
INFO [2020-02-28 12:05:06] 26904  SUMMARY
INFO [2020-02-28 12:05:06] 26904  Total visits for today across archived websites: 113
INFO [2020-02-28 12:05:06] 26904  Archived today's reports for 1 websites
INFO [2020-02-28 12:05:06] 26904  Archived week/month/year for 1 websites
INFO [2020-02-28 12:05:06] 26904  Skipped 0 websites
INFO [2020-02-28 12:05:06] 26904  - 0 skipped because no new visit since the last script execution
INFO [2020-02-28 12:05:06] 26904  - 0 skipped because existing daily reports are less than 900 seconds old
INFO [2020-02-28 12:05:06] 26904  - 0 skipped because existing week/month/year periods reports are less than 3600 seconds old
INFO [2020-02-28 12:05:06] 26904  Total API requests: 4
INFO [2020-02-28 12:05:06] 26904  done: 1/1 100%, 113 vtoday, 1 wtoday, 1 wperiods, 4 req, 5060 ms, no error
INFO [2020-02-28 12:05:06] 26904  Time elapsed: 5.060s
INFO [2020-02-28 12:05:06] 26904  ---------------------------
INFO [2020-02-28 12:05:06] 26904  SCHEDULED TASKS
INFO [2020-02-28 12:05:06] 26904  Starting Scheduled tasks... 
INFO [2020-02-28 12:05:06] 26904  Scheduler: executing task Piwik\Plugins\CustomPiwikJs\Tasks.updateTracker...
INFO [2020-02-28 12:05:06] 26904  Scheduler: finished. Time elapsed: 0.001s
INFO [2020-02-28 12:05:06] 26904  Scheduler: executing task Piwik\Plugins\PrivacyManager\Tasks.deleteLogData...
INFO [2020-02-28 12:05:06] 26904  Scheduler: finished. Time elapsed: 0.001s
INFO [2020-02-28 12:05:06] 26904  Scheduler: executing task Piwik\Plugins\PrivacyManager\Tasks.anonymizePastData...
INFO [2020-02-28 12:05:06] 26904  Scheduler: finished. Time elapsed: 0.000s
INFO [2020-02-28 12:05:06] 26904  done
INFO [2020-02-28 12:05:06] 26904  ---------------------------

I'tt try to track down the next case in detail.

@tsteur commented on February 29th 2020 Member

2500 tables is not quite normal, is it maybe a rather old installation? Possible there was a bug a while ago.

Not available for few hours doesn't sound normal. Is in the same time the result not available in the UI either?

@feikede commented on February 29th 2020

Ah, sorry, it's only 107 tables, I looked at the wrong number.
And the result is available in the UI while it's not correct in the reporting API at the same time. I'll try to find another case today to track that down.

@tsteur commented on March 1st 2020 Member

Thanks @feikede . Is there maybe any caching active in your server set up or so? I'm asking because the UI is also just using the API and I can't think of a reason why the UI should be showing data while the API doesn't. Not sure what your server setup looks like or if your webserver sends maybe some caching headers.

@feikede commented on March 4th 2020

No, there's no caching, it's a plain matomo LAMP install on CentOS 7.7. I have another case today (exactly now):
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3

returns empty array [], while

/?method=Contents.getContentNames&segment=contentName==anwalt:2695&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3

returns

[{"label":"anwalt:2695","nb_visits":10,"nb_impressions":12,"nb_interactions":2,"sum_daily_nb_uniq_visitors":10,"interaction_rate":"16,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}]

and

/?method=Contents.getContentNames&segment=contentName==anwalt:2695&label=anwalt:2695&date=last30&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3

returns

{"2020-02-04":[],"2020-02-05":[],"2020-02-06":[],"2020-02-07":[],"2020-02-08":[],"2020-02-09":[],"2020-02-10":[],"2020-02-11":[],"2020-02-12":[],"2020-02-13":[],"2020-02-14":[],"2020-02-15":[],"2020-02-16":[],"2020-02-17":[],"2020-02-18":[],"2020-02-19":[],"2020-02-20":[],"2020-02-21":[],"2020-02-22":[],"2020-02-23":[],"2020-02-24":[],"2020-02-25":[],"2020-02-26":[],"2020-02-27":[],"2020-02-28":[],"2020-02-29":[],"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}],"2020-03-04":[{"label":"anwalt:2695","nb_uniq_visitors":1,"nb_visits":1,"nb_impressions":3,"nb_interactions":2,"interaction_rate":"66,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}]}

while

/?method=Contents.getContentNames&label=anwalt:2695&date=last30&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3
returns

{"2020-02-04":[],"2020-02-05":[],"2020-02-06":[],"2020-02-07":[],"2020-02-08":[],"2020-02-09":[],"2020-02-10":[],"2020-02-11":[],"2020-02-12":[],"2020-02-13":[],"2020-02-14":[],"2020-02-15":[],"2020-02-16":[],"2020-02-17":[],"2020-02-18":[],"2020-02-19":[],"2020-02-20":[],"2020-02-21":[],"2020-02-22":[],"2020-02-23":[],"2020-02-24":[],"2020-02-25":[],"2020-02-26":[],"2020-02-27":[],"2020-02-28":[],"2020-02-29":[],"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}],"2020-03-04":[]}

The UI shows 2 Impressions on 3.3.20 (4.3 is missing here, and no impr. before march) - and it's totally different from above's "label":"anwalt:2695","nb_visits":10,"nb_impressions":12

So in summary, this seems pretty useless. Do you know what's the suggested way to count content impressions and API-report them in matomo?

@tsteur commented on March 4th 2020 Member

@sgiehl @diosmosis maybe any idea right away how UI and API might report different results?

@diosmosis commented on March 5th 2020 Member

@feikede Does this only happen with dates that include today? Or does it happen w/ earlier dates? Does this affect the export link provided by the UI as well? Does it only affect ranges?

@tsteur I don't know why the UI would and API would report different results, seems very strange, but seems likely related to range archiving in some way.

@feikede commented on March 5th 2020

Today
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e3..
still returns [], matching access_log entry is
2.x.y.180 - - [05/Mar/2020:08:41:45 +0100] "GET /?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=yy HTTP/1.1" 200 2 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0"

@diosmosis
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,2020-03-04&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..
also returns []

While
/?method=Contents.getContentNames&label=anwalt:2695&date=last5&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e3..
returns
{"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}],"2020-03-04":[{"label":"anwalt:2695","nb_uniq_visitors":1,"nb_visits":1,"nb_impressions":3,"nb_interactions":2,"interaction_rate":"66,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":145}],"2020-03-05":[]}
now.

@diosmosis How should I test the "export link" issue?

@feikede commented on March 5th 2020

btw:
/?method=Contents.getContentNames&label=anwalt:2695&date=2020-03-03&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e returns
[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}]
maybe it's a period=range issue?

@diosmosis commented on March 5th 2020 Member

@feikede at the bottom of each graph there is an icon that can be used to get an API URL for a report:

image

clicking on it would show:

image

The link it generates should be directly to the API, I'm wondering if this feature is broken for you as well for content reports.

maybe it's a period=range issue?

It very well could be, though it's still extremely strange, and I'm not immediately sure what could be causing it.

@feikede commented on March 31st 2020

Today, on 2020-03-31, I have the following buggy behavior:

/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,today&period=range&format=JSON
empty result

/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-30&period=range&format=JSON
empty result

/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-29&period=range&format=JSON
correct result ([{"label":"anwalt:6122","nb_visits":7,"nb_impressions":11,"nb_interactions":10,"sum_daily_nb_uniq_visitors":6,"interaction_rate":"90,91\u00a0%","segment":"contentName==anwalt%3A6122","idsubdatatable":844}])

/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-28&period=range&format=JSON
correct result ([{"label":"anwalt:6122","nb_visits":7,"nb_impressions":11,"nb_interactions":10,"sum_daily_nb_uniq_visitors":6,"interaction_rate":"90,91\u00a0%","segment":"contentName==anwalt%3A6122","idsubdatatable":844}])

etc.

I give up using this now, it's not reliable. Thanks for your help

@tsteur commented on March 31st 2020 Member

Sorry to hear @feikede . We would be maybe able to have a look should we have full server access (incl db) but otherwise this be hard to identify the issue. We can't possibly imagine how this is happening since Matomo uses the same API's in the reporting UI as well. It does somehow suggest there is some issue somewhere else in the server setup but of course there could be also just as well be some edge case issue in Matomo.

Powered by GitHub Issue Mirror