Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not log SQL statements by default #7640

Closed
tsteur opened this issue Apr 7, 2015 · 19 comments
Closed

Do not log SQL statements by default #7640

tsteur opened this issue Apr 7, 2015 · 19 comments
Assignees
Labels
c: Performance For when we could improve the performance / speed of Matomo. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Milestone

Comments

@tsteur
Copy link
Member

tsteur commented Apr 7, 2015

Eg in Tracker mode 14% of my wall time is spent in Db::logSql() even though it doesn't actually log anything because of the log level. Also when archiving and thousands of queries are triggered it spends quite a lot of time there in total.

We discussed this about 2 weeks ago in the chat and an idea is to use introduce a new config setting [Debug] log_sql_queries.

Maybe in tracker we should always use the NullLogger and no handlers or so for best performance unless [Tracker] debug = 1 is enabled? Not sure!

@tsteur tsteur added the c: Performance For when we could improve the performance / speed of Matomo. label Apr 7, 2015
@tsteur tsteur added this to the Piwik 2.13.0 milestone Apr 7, 2015
@mattab mattab added the Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change. label Apr 8, 2015
@diosmosis
Copy link
Member

Do you remember how many times the method was called, total wall time and whether any specific methods called by Db::logSql took up most of the time? Also, isn't the tracker Db different from \Piwik\Db? I guess some parts call Db::, but I was under assumption most tracker code didn't.

@tsteur
Copy link
Member Author

tsteur commented Apr 9, 2015

Another run shows 120ms total run. There's a method Piwik\Tracker\Model::getIdsAction that does a fetchAll that triggered the logSql. It still uses some part of Db:: as tracker Db extends this one. Anyway, even in normal UI or when archiving it makes things slower.

@diosmosis
Copy link
Member

120ms out of what total? And that's just for logSql? Did any functions called somewhere within logSql's execution take a lot of time or is it just the if statement in Logger?

I don't really care if logSql is run or not (it was a PRO developer request from the last meetup), but it seems strange that what should essentially be just if (!shouldLog($level)) { return; } results in noticeably slower execution. If there is an issue here, it bears investigation.

Did you test w/o xhprof/xdebug while doing a large bulk track?

@tsteur
Copy link
Member Author

tsteur commented Apr 10, 2015

It takes about 14% of 120ms total time.

I tested with xhprof but without xdebug, no bulk. I do not have the run here anymore to see what exactly was slow.

@diosmosis
Copy link
Member

For posterity/future work: Tracker\Db.php doesn't inherit from Piwik\Db.php, it's possible something's using Piwik\Db in the tracker incorrectly.

@diosmosis
Copy link
Member

I did a bulk tracking test (300 actions / 100 visits), and Db::logSql is only called 20 times and took .07% of wall time. It's only called when autoloading Option values in Cache::getCacheGeneral. Making sure it uses the correct Db will fix this.

@tsteur
Copy link
Member Author

tsteur commented Apr 17, 2015

Why bulk tracking? Can you try with a normal tracking request? And xdebug disabled + development mode enabled and caches warmed up and it such have a bigger effect. Anyhow, if it is fixed when using the Tracker\Db then it is fine. It also had an effect when archiving and using normal Db see issue description but don't remember how much it was.

@diosmosis
Copy link
Member

I used bulk tracking to see if it would have a bigger effect. I used w/ xdebug disabled + development mode enabled. I don't know what you mean by caches warmed up?

FYI, I used blackfire, so I think I can send you a link.

It makes sense that archiving would have an issue, it doesn't use the Tracker\Db class.

@diosmosis
Copy link
Member

W/ one tracking request, it still is only called by Option::autoload & only called 20 times. It represents a bigger portion of the time, but still only 1.3%

@diosmosis
Copy link
Member

On further examination, Option::autoload is only called by the tracker when running scheduled tasks. So nothing to do tracker side.

I'm looking at core:archive profiles.

@diosmosis
Copy link
Member

During a core:archive command, logSql takes up .07% of the total time. My related config is:

[Development]
enabled = 1
disable_merged_assets = 1

[log]
log_writers[] = file
log_level = INFO

Can you confirm the results you last found, and perhaps share the profile?

@tsteur
Copy link
Member Author

tsteur commented Apr 19, 2015

You should really disable development mode when profiling anything as it gives you wrong results. It will spend more time in things that are usually cached etc. With warming up the cache I mean make sure to perform one or two requests before actually profiling to make sure everything is cached.

I don't use blackfire but used it to quickly upload one run from xhprof https://blackfire.io/profiles/b17f20df-2ec4-4c4f-b930-636768eefc6e/graph

trackerprofile

When you test the archiver, do you have many sites, with like 1mio actual urls, many different actions, also lots of other data etc? Which period did you profile? Sometimes archiving of one period for one site takes 50 minutes. Even 1% can be actually quite a bit then although it is surely not much.

@diosmosis
Copy link
Member

You should really disable development mode when profiling anything as it gives you wrong results.

Did not know that, I assumed from your comment above I should run it with development mode enabled.

I don't use blackfire but used it to quickly upload one run from xhprof https://blackfire.io/profiles/b17f20df-2ec4-4c4f-b930-636768eefc6e/graph

From your profile, I see that logSql is called when autoloading Options, and only when remembering to invalidate reports. I also see that most of the time comes not from logSql, but Log::getInstance(). So going through DI to setup the logger is the problem here, at least for the tracker. I'll look into why the normal Db is being used when loading options.

When you test the archiver, do you have many sites, with like 1mio actual urls, many different actions, also lots of other data etc? Which period did you profile?

I had 20 sites, and was using demo2db's data, but only doing archiving for one month of it, so it wasn't a lot. I think doing the profile you did would be rather complicated to do again, do you still have the profile? In the tracker the issue is setting up the logger, but this should already be done for archiving, so I'm not sure why an if (logLevel > levelToLog) would be even a slightly expensive operation in this case. The profile would help in figuring this out.

In the future, it would be very useful for these profiles to be uploaded and linked in issues like these.

@tsteur
Copy link
Member Author

tsteur commented Apr 20, 2015

Sorry I meant development mode disabled and caches warmed up.

I don't have the other profile from archiving anymore but I should still have the profile from this morning of the tracker if you meant this? Can send it on slack.

I can upload profiles in the future, in this case I thought it wouldn't be a big deal as it shouldn't be a problem adding the requested option. In most cases the logged sql statements are not needed at all and when needed it could be simply enabled I thought. It would help of course when a wrong DB is used. Didn't think it would use the wrong one.

@diosmosis
Copy link
Member

I don't have the other profile from archiving anymore but I should still have the profile from this morning of the tracker if you meant this? Can send it on slack.

The tracker is easy to get, the archiving one would be good to see. I'll try and replicate.

I can upload profiles in the future, in this case I thought it wouldn't be a big deal as it shouldn't be a problem adding the requested option. In most cases the logged sql statements are not needed at all and when needed it could be simply enabled I thought.

I'm not sure I see the reason for adding the option, since those logs should only be logged via DEBUG level. Adding the option might help us when debugging (to clear a lot of the crap), but putting it in right now might hide the real problem we haven't found yet (at least for archiving).

I'll try to run a profile on EC2 for archiving. Did you use a specific data dump (maybe the data on testing-big?)?

@tsteur
Copy link
Member Author

tsteur commented Apr 20, 2015

yeah testing big. Warning it can take some hours to archive :)

If adding this option is a problem I suggest to just move this issue to mid term or so. In my mind it was ok to just enable that option if needed but if this is not the case, the time spent on this issue is probably not worth it and we can improve somewhere else even more. I mean re the archiver.

The tracker should be fixed though.

@diosmosis diosmosis self-assigned this Apr 20, 2015
@diosmosis
Copy link
Member

Adding the INI option is fine, but I want to find the underlying performance issue first, just so we don't accidentally cover it up.

@mnapoli
Copy link
Contributor

mnapoli commented Apr 20, 2015

By the way remember also to optimize the autoloader (composer dump-autoload -o) it makes a very big difference in performances too.

@diosmosis
Copy link
Member

Added some optimizations in #7711, feel free to review/merge.

diosmosis added a commit that referenced this issue Apr 21, 2015
Refs #7640, optimize tracker by making sure production tracker does not use a real logger. Includes change to ContainerFactory that allows plugins to provide environment specific DI config & introduces the tracker.php DI environment. Moved logging related config in cli.php environment config to Monolog container. Also has update to php-di.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Performance For when we could improve the performance / speed of Matomo. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Projects
None yet
Development

No branches or pull requests

4 participants