Initial proposal of params to introduce to extend CoreArchiver usability #319
Conversation
Just a side note: maybe the parameter responsible for concurrent segment requests could be generalized? Then we wouldn't use the server's full processing power only for segments; if there are "free" threads, they could process further idsites.
Currently the threading option is only used for segments, so a single big idsite can block multi-threaded processing for a long time while other threads could be working on further idsites. The current implementation can also be dangerous when the concurrent segment count is big. For example, if you launch 5 separate core:archive commands, each of them can launch another 3 threads, giving 15 threads on the server instead of the desired 5.
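The generalization suggested above amounts to one shared worker pool fed by both per-site and per-segment jobs, with a single global concurrency cap. A minimal sketch in Python (all names here — `archive_job`, the job tuples, `MAX_WORKERS` — are illustrative assumptions, not Piwik/Matomo API):

```python
# Sketch: one bounded pool processes site- and segment-level jobs alike,
# so N workers stay busy whether the remaining work is more idsites or
# more segments, and the cap is global (5), not 5 commands x 3 threads = 15.
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 5  # single global concurrency cap

def archive_job(idsite, segment):
    # placeholder for one archiving HTTP request
    return (idsite, segment)

# every (idsite, segment) pair becomes an independent unit of work
jobs = [(site, seg) for site in (1, 2, 3) for seg in (None, "countryCode==PL")]

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    results = list(pool.map(lambda j: archive_job(*j), jobs))

print(len(results))  # 6 jobs, never more than 5 running at once
```

The design point is that the cap lives in the scheduler, not in each command invocation, which avoids the multiplicative thread count described above.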
private function disablePurgeIfNeeded()
{
    if ($this->startDate !== false) {
        Rules::$purgeDisabledByTests = true; // TODO: replace this hack with some generic solution
this cannot work because this flag will be set in the core:archive CLI process, and not in the actual archiving PHP process
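The reviewer's point is that each PHP process has its own memory, so a static flag set in the `core:archive` CLI process is invisible to the separately spawned archiving process; the state would have to be handed over explicitly (for example via an environment variable or a request parameter). A small Python sketch of the same principle (the variable name `DISABLE_ARCHIVE_PURGE` is a made-up illustration, not a real Piwik setting):

```python
# Sketch: in-memory state of a parent process does not propagate to a
# child process by itself; it must be passed explicitly, e.g. through
# the environment or a query parameter on the archiving request.
import os
import subprocess
import sys

os.environ["DISABLE_ARCHIVE_PURGE"] = "1"  # explicit hand-off to the child

child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('DISABLE_ARCHIVE_PURGE', '0'))"],
    capture_output=True, text=True,
)
print(child.stdout.strip())  # the child sees it only because we exported it
```

Without the `os.environ` line the child would print `0`, which is exactly why setting `Rules::$purgeDisabledByTests` in the CLI process alone cannot work.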
Then maybe we could introduce an additional flag to prevent dropping of outdated archives? The idea is to prevent this, especially when we're computing past, completed periods like 2-3 months ago, without affecting current data. It would be great if you could introduce such a flag or method to be triggered. Alternatively I could give it a try, but I would need a hint where to start.
IIRC, to work around this issue we implemented the possibility to run several core:archive commands in parallel. Each process will then calculate data for a different website: http://dev.piwik.org/trac/ticket/4903
+1, this is a limitation of our algorithm; maybe we can solve this problem separately? Feel free to create a ticket if it's important we investigate 👍
As for parallel threading: yes, it is possible to launch several commands. However this is very inconvenient, as each of them calculates only one idsite but can calculate multiple segments, which makes it pretty hard to tune archiving in a consistent way (there should always be the same number of threads, regardless of whether we're going through sites or segments). I'll create a ticket on Trac to improve this part, as it will be a big step forward for archiving tuning. edit: created a ticket on Trac: http://dev.piwik.org/trac/ticket/5363
$command->addOption('end-date', null, InputOption::VALUE_OPTIONAL, "If set, also requires start date to be defined. Will force Piwik to re-archive up till this date");
$command->addOption('force-periods', null, InputOption::VALUE_OPTIONAL, "Allows to define what periods only should be archived. Can be comma-separated like day,week,month");
$command->addOption('disable-outdated-archives-purge', null, InputOption::VALUE_OPTIONAL, "Allows to disable purging of outdated archives (i.e. today, this week etc.) if you are archiging past periods");
$command->addOption('nb-concurent-segment-requests', null, InputOption::VALUE_OPTIONAL, "Allows define how many segments do you want to process at a time");
small typo: concurrent
In the case that several core:archive commands are triggered, maybe it's best to always use
I think it could be a temporary workaround for the problem. Generally it will be quite problematic to manage such a cronjob with e.g. 45 core:archive commands ;) Also, one wouldn't be able to easily launch archiving manually with N threads (for example while benchmarking); instead it would require launching N screen instances / N terminal connections to the server.
@mgazdzik all items you've suggested have been fixed. cheers |