Just a side note - maybe the parameter responsible for concurrent segment requests could be generalized? So that we don't use the full server processing power only for segments, but if there are "free" threads, we also process further idsites.
For example: currently the threading option is only used for segments, so a single big idsite can block multi-threaded processing for a long time, while other threads could be carrying on with further idsites.
Also, the current implementation might be dangerous when the concurrent segment count is big. For example, when you launch 5 separate core:archive commands, each of them can possibly launch another 3 threads, which gives 15 threads on the server instead of the desired 5.
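The thread-count arithmetic above can be sketched quickly (the variable names are just for illustration; none of them is a real core:archive parameter):

```python
# Worst-case server concurrency when each core:archive process also spawns
# its own segment threads (numbers taken from the example above).
archive_commands = 5              # separate core:archive processes launched
segment_threads_per_command = 3   # concurrent segment requests per process
desired_threads = 5               # what the operator actually intended

worst_case_threads = archive_commands * segment_threads_per_command
print(worst_case_threads)  # 15 threads hitting the server, not the desired 5
```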
> Currently the threading option is only used for segments, so a single big idsite can block multi-threaded processing for a long time, while other threads could be carrying on with further idsites.
IIRC, to work around this issue we implemented the possibility to run several core:archive commands in parallel. Each process will then calculate data for a different website: http://dev.piwik.org/trac/ticket/4903
> current implementation might be dangerous when concurrent segments count is big
+1, this is a limitation of our algorithm; maybe we can solve this problem separately? Feel free to create a ticket if it's important we investigate :+1:
As for parallel threading - yes, it is possible to launch several commands. However, this is very inconvenient, as each of them calculates only one idsite but can calculate multiple segments, which makes it pretty hard to tune archiving in a consistent way (there should always be the same number of threads, regardless of whether we're going through sites or segments). I'll create a ticket on Trac to improve this part, as it would be big progress for archiving tuning.
edit:
created a ticket on Trac: http://dev.piwik.org/trac/ticket/5363
> Also, the current implementation might be dangerous when the concurrent segment count is big. For example, when you launch 5 separate core:archive commands, each of them can possibly launch another 3 threads, which gives 15 threads on the server instead of the desired 5.
In the case that several core:archive commands are triggered, maybe it's best to always use `--concurrent-requests-per-website=1`? This way you can be sure that each process handles one request at a time and that the maximum number of threads is predictable.
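As a sketch, that setup could look like this in a crontab (the paths, URL, schedule, and number of entries are assumptions; `--concurrent-requests-per-website` is the existing core:archive option discussed above):

```shell
# crontab fragment - one entry per desired archiver; repeat N times for N
# parallel processes. Each parallel core:archive picks a different website
# (see ticket 4903), and --concurrent-requests-per-website=1 caps each
# process at one request at a time, so total concurrency stays predictable
# (here: 3).
*/30 * * * * www-data /path/to/piwik/console core:archive --url=https://stats.example.org --concurrent-requests-per-website=1
*/30 * * * * www-data /path/to/piwik/console core:archive --url=https://stats.example.org --concurrent-requests-per-website=1
*/30 * * * * www-data /path/to/piwik/console core:archive --url=https://stats.example.org --concurrent-requests-per-website=1
```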
I think it could be a temporary workaround for the problem. Generally it would be quite problematic to manage such a cronjob with, e.g., 45 core:archive commands ;) Also, one wouldn't be able to easily launch archiving manually with N threads (for example while benchmarking); instead it would require launching N screen instances / N terminal connections to the server.
@mgazdzik all items you've suggested have been fixed. cheers