Labels: duplicate (for issues that already existed in our issue tracker and were reported previously), Task (indicates an issue is neither a feature nor a bug and is purely a "technical" change).
Example use case: there's missing data for one of the periods (e.g. a week) or for one of 100 defined segments.
It's only one case, but it is something that happens really often on production servers, and it is random. I know that the main goal should be to improve the Piwik platform to fix such bugs, but right now that's not possible (I don't know how to reproduce the issue with randomly vanishing archives).
Proposed solution:
A scheduled task triggered after archiving would do the following:
Iterate over recently archived periods (these could probably be cached during archiving, to avoid excessive checking).
For each done flag, count the number of numeric values and blobs.
Blobs should be counted only at level 0 (as we cannot predict how many subtables there can be for each period).
We should be able to calculate the expected number of blobs and numerics (maybe based on the list of active plugins, or just by taking the maximum number for a given period, because it's not likely that ALL sites have broken archives; at least one site should always have a complete list of metrics).
If the number of numerics is smaller than expected, then that idarchive is suspected to be incomplete.
There should be a number of allowed discrepancies, e.g. no ecommerce report if the site isn't e-commerce enabled, or 0 in nb_visits (in which case it is OK for the archive to be empty), etc. In particular, this task should be aware that some values are not inserted into archives (e.g. empty datatables, zeroes in numerics).
After iterating over all recently archived ids, a report should be logged listing which idsites, periods, segment names and dates seem to be invalid.
It would also be nice to perhaps log some values that would make it easy to invalidate such archives, along with the piwik_option timestamps, so that the next archiving run reprocesses them.
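The steps above can be sketched roughly as follows. This is only an illustration in Python, not Piwik code: the `Archive` record, `find_suspect_archives` and `format_report` are hypothetical names, and the data is assumed to have already been loaded from the archive tables into memory. The expected counts use the "take the maximum for a given period" idea, and the only allowed discrepancy modelled here is an archive with no visits and no blobs.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Hypothetical in-memory view of one archive (one done flag)."""
    idarchive: int
    idsite: int
    period: str                 # e.g. "day", "week"
    date: str                   # e.g. "2014-10-06"
    segment: str = ""           # empty string means no segment
    numerics: dict = field(default_factory=dict)        # metric name -> value
    top_level_blobs: set = field(default_factory=set)   # level-0 blob names only


def find_suspect_archives(archives, tolerance=0):
    """Return (archive, missing_numerics, missing_blobs) for suspect archives.

    The expected count for a period type is the maximum seen among all
    archives of that period type (assumption: at least one is complete).
    """
    expected = defaultdict(lambda: [0, 0])  # period -> [numerics, blobs]
    for a in archives:
        e = expected[a.period]
        e[0] = max(e[0], len(a.numerics))
        e[1] = max(e[1], len(a.top_level_blobs))

    suspects = []
    for a in archives:
        # Allowed discrepancy (assumption): no visits and no blobs means the
        # archive is legitimately empty, since zeroes are not stored.
        if a.numerics.get("nb_visits", 0) == 0 and not a.top_level_blobs:
            continue
        exp_num, exp_blob = expected[a.period]
        missing_num = exp_num - len(a.numerics)
        missing_blob = exp_blob - len(a.top_level_blobs)
        if missing_num > tolerance or missing_blob > tolerance:
            suspects.append((a, missing_num, missing_blob))
    return suspects


def format_report(suspects):
    """Build one log line per suspect archive."""
    return [
        f"idarchive={a.idarchive} idsite={a.idsite} period={a.period} "
        f"date={a.date} segment={a.segment or '(none)'} "
        f"missing_numerics={mn} missing_blobs={mb}"
        for a, mn, mb in suspects
    ]
```

A real task would need a richer list of allowed discrepancies (for example, sites without e-commerce), and `tolerance` is just a crude stand-in for that. Note also that an archive which lost all of its numerics and all of its blobs cannot be told apart from a genuinely empty one with this check.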
Such a scheduled task would be very useful for ensuring high data availability in Piwik, even when edge-case bugs may be present.
In Piwik 2.8.3 there are reasons to hope that there will be fewer cases of "missing data".
(That's why I won't set the Major label until the bug is confirmed to occur again quite regularly.)
Since the issue was created we have released some better tools, such as the core:purge-old-archive-data console command in #7377, #7181, which seems to have solved the problem for users 👍
refs #5805