Log which segments are currently being archived #7536

Closed
mattab opened this issue Mar 25, 2015 · 9 comments
Labels
c: Platform For Matomo platform changes that aren't impacting any of our APIs but improve the core itself. c: Usability For issues that let users achieve a defined goal more effectively or efficiently. Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.

Comments

@mattab
Member

mattab commented Mar 25, 2015

The goal of this issue is to make a small improvement to the core:archive log.

Currently the core:archive output looks like:

Will pre-process for website id = 2, range period, the following 10 segments: { pageUrl!=xx, segment2here, segment3here, segment4here, .... }

The log line can get very long.
The idea would be to have an output like this:

Will pre-process for website id = 2, range period, 10 segments
Starting processing for website id = 2, range period, of these 3 Segments { pageUrl!=xx, segment2here, segment3here  } [1 / 4]
Starting processing for website id = 2, range period, of these 3 Segments { segment4here, ..., ...  } [2 / 4]
Starting processing for website id = 2, range period, of these 1 Segments { segment10here  } [4 / 4]
[...]

Note: by default 3 segments will be triggered at once (--concurrent-requests-per-website=3). If a website has e.g. 100 segments, it would issue about 34 requests. This simple log statement change will really improve the experience for Piwik administrators!
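
As an illustration of the proposed format, here is a minimal sketch in Python (not Piwik's PHP code). The function name `chunk_log_lines` and its parameters are hypothetical; it only shows how the segment list could be split into groups of `chunk_size` and how the `[i / n]` counter follows from a ceiling division.

```python
import math


def chunk_log_lines(site_id, period, segments, chunk_size=3):
    """Return the proposed log lines for one website and one period.

    chunk_size mirrors --concurrent-requests-per-website (3 by default).
    """
    # Number of batches: 10 segments -> 4 batches, 100 segments -> 34.
    total_chunks = math.ceil(len(segments) / chunk_size)
    lines = [
        f"Will pre-process for website id = {site_id}, {period} period, "
        f"{len(segments)} segments"
    ]
    for index in range(total_chunks):
        chunk = segments[index * chunk_size:(index + 1) * chunk_size]
        lines.append(
            f"Starting processing for website id = {site_id}, {period} period, "
            f"of these {len(chunk)} Segments {{ {', '.join(chunk)} }} "
            f"[{index + 1} / {total_chunks}]"
        )
    return lines


if __name__ == "__main__":
    example_segments = [f"segment{n}here" for n in range(1, 11)]
    for line in chunk_log_lines(2, "range", example_segments):
        print(line)
```

With 10 segments this yields one summary line followed by four "Starting processing" lines, the last of which lists a single segment.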

@mattab mattab added Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. c: Platform For Matomo platform changes that aren't impacting any of our APIs but improve the core itself. c: Usability For issues that let users achieve a defined goal more effectively or efficiently. labels Mar 25, 2015
@mattab mattab added this to the Piwik 2.12.1 milestone Mar 25, 2015
@mattab mattab modified the milestones: Piwik 2.13.0, Piwik 2.12.1 Mar 25, 2015
@quba
Contributor

quba commented Mar 26, 2015

It would also be good to have information about what is currently being processed. Right now we only get information about the last finished task, so we don't know what is in progress.

@mattab
Member Author

mattab commented Mar 26, 2015 via email

@mnapoli mnapoli assigned mnapoli and unassigned mnapoli Mar 27, 2015
@mnapoli
Contributor

mnapoli commented Apr 15, 2015

@quba we had a look with @mattab and there is maybe a quick win we can do: showing the progress of the current requests (when there are many segments). For example:

Will pre-process for website id = 2, range period, 30 segments
Querying URLs 1, 2, 3 of 30
Querying URLs 4, 5, 6 of 30
Querying URLs 7, 8, 9 of 30
...
[...]

Would that be an improvement over the current situation?

The problem is that a better solution (e.g. showing the names of the segments) would require much more work, because we would need to make extensive changes to how archiving works. So in the meantime we are trying to find a simpler solution that could help.

@mattab
Member Author

mattab commented Apr 16, 2015

@mnapoli I had an idea on how we could log as suggested in the issue description: instead of sending all segment URLs to CliMulti at once, we could create chunks of concurrent-requests-per-website elements, then loop through the chunks, log Starting processing for website id = 2, range period, of these 3 Segments { .... } for each one, and call CliMulti for this chunk only. Would that work?
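
A rough sketch of that control flow, written in Python for illustration only (it is not CronArchive or CliMulti code). `archive_in_chunks`, `run_requests` and `log` are hypothetical names; `run_requests` stands in for the CliMulti call and is assumed to take a list of URLs and block until all of them have finished.

```python
def archive_in_chunks(site_id, period, segment_urls, chunk_size, run_requests, log):
    """Log each chunk of segments, then run only that chunk.

    segment_urls: list of (segment, url) pairs, so the segment name stays
                  attached to its URL until the moment it is queried.
    run_requests: hypothetical stand-in for the CliMulti call.
    log:          any callable accepting one string.
    """
    responses = []
    total = -(-len(segment_urls) // chunk_size)  # ceiling division
    for start in range(0, len(segment_urls), chunk_size):
        chunk = segment_urls[start:start + chunk_size]
        names = ", ".join(segment for segment, _ in chunk)
        log(
            f"Starting processing for website id = {site_id}, {period} period, "
            f"of these {len(chunk)} Segments {{ {names} }} "
            f"[{start // chunk_size + 1} / {total}]"
        )
        # Only this chunk's URLs are handed to the runner.
        responses.extend(run_requests([url for _, url in chunk]))
    return responses


if __name__ == "__main__":
    pairs = [(f"segment{n}here", f"url{n}") for n in range(1, 6)]
    fake_run = lambda urls: [f"ok:{u}" for u in urls]  # placeholder runner
    archive_in_chunks(2, "range", pairs, 3, fake_run, print)
```

In this sketch the loop is sequential, so a chunk only starts once every request of the previous chunk has returned.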

@mnapoli
Contributor

mnapoli commented Apr 16, 2015

Yes, but that would make CliMulti's chunking and parallelizing feature obsolete, so do we remove multi-threading from CliMulti and put it in CronArchive? And if we do that, it's still not easy, as we need to find a way to keep track of the segment (or message) between the moment we list the segments and add them to the URL list, and the moment we iterate through that URL list to query each one of them. I.e. same problem as today, except we stay in the same class (and add more logic to CronArchive). It's possible, but I don't think it's especially easier or cleaner than another solution.

@diosmosis
Member

A quick change for this would be to add callbacks to CliMulti so we can execute code when an archiving job is about to be executed.
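
The callback idea could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `MultiRunner` is a hypothetical stand-in for CliMulti (which is PHP and has a different API), `on_start` is an invented hook name, and a thread pool replaces whatever mechanism the real class uses to run requests concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


class MultiRunner:
    """Hypothetical stand-in for CliMulti: runs jobs with bounded concurrency."""

    def __init__(self, concurrency, on_start=None):
        self.concurrency = concurrency
        self.on_start = on_start  # called with the URL just before it is fetched

    def _run_one(self, url, fetch):
        if self.on_start is not None:
            self.on_start(url)
        return fetch(url)

    def request(self, urls, fetch):
        # pool.map returns results in the same order as the input URLs.
        with ThreadPoolExecutor(max_workers=self.concurrency) as pool:
            return list(pool.map(lambda u: self._run_one(u, fetch), urls))


if __name__ == "__main__":
    # The caller keeps the URL -> segment mapping, so the runner itself
    # never needs to know about segments.
    segment_by_url = {
        "url-a": "entryPageTitle=@Web%20Analytics",
        "url-b": "pageTitle=@Piwik",
    }
    runner = MultiRunner(
        concurrency=3,
        on_start=lambda url: print(f"- pre-processing segment {segment_by_url[url]}"),
    )
    runner.request(list(segment_by_url), fetch=lambda url: f"done {url}")
```

The design choice is that the runner keeps its chunking and parallelism, while the caller decides what to log. Because the callback fires from worker threads, the order of the log lines within a batch is not guaranteed.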

@mnapoli
Contributor

mnapoli commented Apr 16, 2015

I have a POC that has this output:

INFO CoreConsole[2015-04-16 23:22:25] Starting Piwik reports archiving...
INFO CoreConsole[2015-04-16 23:22:25] Will pre-process for website id = 1, day period, 2 segments
INFO CoreConsole[2015-04-16 23:22:25] - pre-processing segment entryPageTitle=@Web%20Analytics
INFO CoreConsole[2015-04-16 23:22:26] - pre-processing segment pageTitle=@Piwik
INFO CoreConsole[2015-04-16 23:22:26] Archived website id = 1, period = day, 6 visits in last last2 days, 0 visits today, Time elapsed: 1.380s
INFO CoreConsole[2015-04-16 23:22:26] Will pre-process for website id = 1, week period, 2 segments
INFO CoreConsole[2015-04-16 23:22:26] - pre-processing segment entryPageTitle=@Web%20Analytics
INFO CoreConsole[2015-04-16 23:22:26] - pre-processing segment pageTitle=@Piwik
INFO CoreConsole[2015-04-16 23:22:36] Archived website id = 1, period = week, 459 visits in last last3 weeks, 301 visits this week, Time elapsed: 9.440s

@mattab & @quba would that be OK?

@mattab
Member Author

mattab commented Apr 16, 2015

It looks good to me 👍

Here is an idea: add a "- pre-processing all visits" line so we can see from the datetime how long the request without a segment took.

@mnapoli
Contributor

mnapoli commented Apr 21, 2015

PR: #7723
