@diosmosis opened this Pull Request on April 9th 2020 Member

we only care about the newest usable one

@tsteur commented on April 15th 2020 Member

@diosmosis I just ran the task for us and I think it didn't clean up anything, maybe because there are no reports marked as to be invalidated.


and it seems it would only remove the ones that were invalidated


Not sure whether this is still an issue. If it is, I suppose the only solution would be to iterate over all archive tables, which we may not want since it could be quite resource intensive.

Not sure we're doing it already. Maybe it would help to invalidate all previous archives for the same site, date & period when we write a new OK archive?
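
To make the idea concrete, here is a minimal sketch of "keep only the newest usable archive per site, date and period". It is not Matomo's actual code: the `Archive` record, its `usable` flag, and the assumption that a higher `idarchive` means a newer archive are all hypothetical simplifications for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Archive(NamedTuple):
    # Hypothetical, simplified view of one archive row.
    idarchive: int   # assumed to grow with creation time
    idsite: int
    date1: str
    date2: str
    period: int
    usable: bool     # stand-in for "finished OK", not a real column


def superseded_archive_ids(archives):
    """Return ids of archives older than the newest usable one in their group."""
    groups = defaultdict(list)
    for a in archives:
        groups[(a.idsite, a.date1, a.date2, a.period)].append(a)

    stale = []
    for group in groups.values():
        usable = [a for a in group if a.usable]
        if not usable:
            continue  # nothing usable yet, so keep everything
        newest = max(usable, key=lambda a: a.idarchive)
        # Only archives older than the newest usable one are superseded;
        # anything newer (e.g. still in progress) is left alone.
        stale.extend(a.idarchive for a in group if a.idarchive < newest.idarchive)
    return sorted(stale)
```

In this sketch the returned ids would be the candidates to invalidate (or delete) right after a new OK archive is written.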

@diosmosis commented on April 15th 2020 Member

@tsteur The fix is to run the command manually; otherwise we'd have to somehow add one date per existing archive table to ArchivesToPurgeDistributedList. Could we do that in an update?
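
A rough sketch of what "one date per existing archive table" could look like, assuming the usual `archive_numeric_YYYY_MM` / `archive_blob_YYYY_MM` table naming. The function name is hypothetical, and the format that ArchivesToPurgeDistributedList actually expects is not shown in this thread, so the sketch simply returns the first day of each month.

```python
import re
from datetime import date

# Matches the year and month suffix of an archive table name,
# with or without a table prefix in front.
ARCHIVE_TABLE_RE = re.compile(r"archive_(?:numeric|blob)_(\d{4})_(\d{2})$")


def one_date_per_archive_month(table_names):
    """Return one date (the 1st) for every month that has an archive table."""
    months = set()
    for name in table_names:
        match = ARCHIVE_TABLE_RE.search(name)
        if match:
            months.add((int(match.group(1)), int(match.group(2))))
    return [date(year, month, 1) for year, month in sorted(months)]
```

An update could then feed each of these dates to whatever queues a month for purging.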

@diosmosis commented on April 15th 2020 Member

> Maybe it would help to invalidate all previous archives for the same site, date & period when we write a new OK archive?

I guess in the purge task, we can always do today/yesterday, that would solve the biggest source of dupes.

@tsteur commented on April 15th 2020 Member

Doing it in an update sounds good 👍 If I read our DB correctly, only March and April seem impacted, so we would only need to add two dates, one for each of these months?

Currently executing this command to potentially fix it for now for us:

console core:purge-old-archive-data "2020-04-15" "2020-03-15" --exclude-ranges --skip-optimize-tables --include-year-archives

@tsteur commented on April 15th 2020 Member

> I guess in the purge task, we can always do today/yesterday, that would solve the biggest source of dupes.

👍 That should work and be quite simple. I guess today would be enough. Yesterday could be done maybe if it belongs to a different month? purgeInvalidatedArchivesFrom looks at the entire month anyway.
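
The rule discussed here (always purge for today, and for yesterday only when it falls in a different month) can be sketched as follows. The function name is hypothetical; it only illustrates the date selection, not the purge task itself.

```python
from datetime import date, timedelta


def dates_to_always_purge(today):
    """Today, plus yesterday only if yesterday is in another month."""
    yesterday = today - timedelta(days=1)
    dates = [today]
    # The purge works per month, so yesterday only matters when it
    # points at a different monthly archive table than today.
    if (yesterday.year, yesterday.month) != (today.year, today.month):
        dates.append(yesterday)
    return dates


print(dates_to_always_purge(date(2020, 4, 15)))  # [datetime.date(2020, 4, 15)]
print(dates_to_always_purge(date(2020, 4, 1)))   # [datetime.date(2020, 4, 1), datetime.date(2020, 3, 31)]
```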

@diosmosis commented on April 15th 2020 Member

> I guess today would be enough. Yesterday could be done maybe if it belongs to a different month?

👍

This Pull Request was closed on April 14th 2020