Adding the ability to only archive sites that were viewed within a time period #8263

Closed
jackellenberger opened this issue Jul 1, 2015 · 1 comment
Labels
duplicate For issues that already existed in our issue tracker and were reported previously.

Comments

@jackellenberger

Background: Our Piwik instance currently has 13,000 sites associated with it, of which only 100-200 are active at any given time (receiving views and having their reports / analytics viewed). As a result, our core:archive output looks like this:

INFO CoreConsole[2015-06-30 14:25:31] - Will process 101 websites with new visits since 1 days 0 hours , IDs: [...]
INFO CoreConsole[2015-06-30 14:25:31] - Will process 91 other websites because some old data reports have been invalidated (eg. using the Log Import script) , IDs: [...]
INFO CoreConsole[2015-06-30 14:25:31] - Will process 13212 other websites because the last time they were archived was on a different day (in the website's timezone) , IDs:

Processing 100-200 sites efficiently is pretty easy, but processing all 13,000 takes about 7 days (!).

Proposed Solution: Add a flag to the core:archive command, such as "--ignore-untouched-sites", that would skip sites whose only reason for being archived is that they were last archived on a different day, provided they have had no views or actions. If these sites do get views later, archiving them may take longer, but I would rather have the option to take the time savings up front and ignore sites that have not been interacted with in the last day (or whatever time period is specified by --force-all-periods).
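
To illustrate the idea, here is a minimal Python sketch of the selection logic such a flag would imply. It is not Piwik code: `Site`, its fields, and `select_sites_to_archive` are hypothetical names made up for this example, and the three conditions simply mirror the three groups in the log output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical per-site state; not a Piwik data structure."""
    site_id: int
    has_new_visits: bool            # visits or actions within the window
    has_invalidated_reports: bool   # e.g. after a log import
    archived_on_other_day: bool     # last archive ran on a different day


def select_sites_to_archive(sites, ignore_untouched_sites=False):
    """Return the IDs of the sites an archive run would process.

    Sites with new visits or invalidated reports are always selected.
    Sites that only qualify because their last archive ran on a
    different day are skipped when ignore_untouched_sites is True.
    """
    selected = []
    for site in sites:
        if site.has_new_visits or site.has_invalidated_reports:
            selected.append(site.site_id)
        elif site.archived_on_other_day and not ignore_untouched_sites:
            selected.append(site.site_id)
    return selected


# Example data (made up): one site per case.
sites = [
    Site(1, has_new_visits=True, has_invalidated_reports=False, archived_on_other_day=True),
    Site(2, has_new_visits=False, has_invalidated_reports=True, archived_on_other_day=True),
    Site(3, has_new_visits=False, has_invalidated_reports=False, archived_on_other_day=True),
    Site(4, has_new_visits=False, has_invalidated_reports=False, archived_on_other_day=False),
]

default_run = select_sites_to_archive(sites)
flagged_run = select_sites_to_archive(sites, ignore_untouched_sites=True)
```

With the flag off, site 3 is still processed even though nothing happened on it; with the flag on, only sites 1 and 2 remain. In our case that would be the difference between roughly 13,000 sites and 100-200.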

Let me know what you think!

Edit: it appears I can't label this, but if I could it'd be a feature request / enhancement!

@tsteur added the duplicate label Jul 1, 2015
@tsteur
Member

tsteur commented Jul 1, 2015

I think this is a duplicate of #5922. Please comment/reopen if not. Cheers!

@tsteur closed this as completed Jul 1, 2015