Currently, archive.sh simply loops over the periods and triggers archiving for all websites: https://github.com/piwik/piwik/blob/master/misc/cron/archive.sh
Because there are memory leak issues during archiving (see #766), when a Piwik install has hundreds or thousands of websites, memory usage peaks and can reach ridiculous values.
Until #766 is addressed, and as a useful safeguard, we should loop over each website and trigger archiving separately for each one. This is a quick fix that will help many users get past the memory usage issues.
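A rough sketch of what the per-website loop could look like, not the actual patch. The PHP CLI invocation style, the index.php path, the token placeholder, the `date=last52` range and the helper names (`piwik_api`, `numeric_ids`, `archive_each_site`) are all assumptions for illustration; the API methods are assumed to be `SitesManager.getAllSitesId` and `VisitsSummary.getVisits`.

```shell
#!/bin/sh
# Sketch: archive each website in its own PHP process, so memory used for
# one site is released before the next site starts.

# Hypothetical placeholders: adjust to your installation.
PHP_BIN="${PHP_BIN:-php}"
PIWIK_INDEX="${PIWIK_INDEX:-/path/to/piwik/index.php}"
TOKEN_AUTH="${TOKEN_AUTH:-changeme}"

# Run one API request through the PHP CLI (assumed invocation style).
# stdin is redirected so the call never consumes the id list being read.
piwik_api() {
    "$PHP_BIN" -q "$PIWIK_INDEX" -- "module=API&token_auth=$TOKEN_AUTH&$1" < /dev/null
}

# Keep only lines that are plain integers (drops CSV headers, blank lines).
numeric_ids() {
    tr -d '\r' | grep -E '^[0-9]+$'
}

# Read site ids on stdin; trigger archiving once per site and per period.
archive_each_site() {
    while read -r id; do
        for period in day week year; do
            piwik_api "method=VisitsSummary.getVisits&idSite=$id&period=$period&date=last52&format=xml" > /dev/null
        done
    done
}
```

It would be wired together roughly like this: `piwik_api "method=SitesManager.getAllSitesId&format=csv&convertToUnicode=0" | numeric_ids | archive_each_site`. The trade-off is one PHP start-up per site and period, which is slower overall but keeps the peak memory bounded by the largest single site.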
Subscribing to this ticket as I would really appreciate that method. Let me know if I can / should test something :)
Thanks for your work on this. Overall I would say it's better. It's still achingly slow running over the sites, and my most active site (15k visits / 20k actions per day, out of around 150k per day) is taking 2 GB to process the yearly stats.
I tried the new archive.sh script, but it didn't work for me: the list of site IDs seems to be UTF-encoded, so it doesn't pass the isnumeric test. The convertToUnicode=0 param to the API call doesn't seem to do anything here.
alx, did you apply the full patch? Applying only a subset will obviously fail. Does it work for you with the full patch?
Oops, my bad. I didn't apply the full patch, sorry 'bout that.