Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Live.getCounters times out on very large instance #6758

Closed
mattab opened this issue Dec 1, 2014 · 6 comments
Closed

Live.getCounters times out on very large instance #6758

mattab opened this issue Dec 1, 2014 · 6 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. c: Performance For when we could improve the performance / speed of Matomo.
Milestone

Comments

@mattab
Copy link
Member

mattab commented Dec 1, 2014

On a very large Piwik instance, there are several performance challenges. One issue is that the Real time counter displayed on top of the Real time visitors widget times out on a very large DB.

Query is:

SELECT
                count(*) as visits, COUNT(DISTINCT log_visit.idvisitor) as visitors
            FROM
                piwik_log_visit AS log_visit
            WHERE
                log_visit.idsite in ('2') AND log_visit.visit_last_action_time >= '2014-11-24 10:58:45'

the problem likely is the COUNT(DISTINCT) which doesn't scale.

Possible solution: let user disable the counting of unique users via config file. For example we could reuse the existing enable_processing_unique_visitors_day and if it is se to 0 then the Live API would not issue COUNT(DISTINCT idvisitor.

@mattab mattab added Bug For errors / faults / flaws / inconsistencies etc. c: Performance For when we could improve the performance / speed of Matomo. labels Dec 1, 2014
@mattab mattab added this to the Short term milestone Dec 1, 2014
@quba
Copy link
Contributor

quba commented Dec 1, 2014

For example we could reuse the existing enable_processing_unique_visitors_day and if it is se to 0 then the Live API would not issue COUNT(DISTINCT idvisitor.

+1

But this does not solve the problem for most of users (the don't want to disable unique visitors processing during archiving).

@tsteur
Copy link
Member

tsteur commented Mar 9, 2015

I can confirm the problem is caused by COUNT(DISTINCT log_visit.idvisitor) but couldn't find a way to speed it up. FYI: it uses the index index_idsite_idvisitor. Using any other index is much slower. The only way would be probably to add another index for this but doubt that's worth it.

@mattab
Copy link
Member Author

mattab commented Mar 9, 2015

The only way would be probably to add another index for this but doubt that's worth it.

just to confirm, this feature is not worth a new index.

tsteur added a commit that referenced this issue Mar 9, 2015
tsteur added a commit that referenced this issue Mar 9, 2015
tsteur added a commit that referenced this issue Mar 10, 2015
* If we know there are 0 visits, do not execute any further query.
* Added showColumns as it is more appropriate
@tsteur
Copy link
Member

tsteur commented Mar 10, 2015

As enable_processing_unique_visitors_day is enabled by default I agree that it is not very useful etc.

The problem is, that the current getCounter API method is used for the "last 3 minutes" but also for the "last 24 hours". A query that performs much slower over the last 24 hours might perform fast when only checking over the last 3 minutes.

If we only look at the last 3 minutes in that widget we could write a separate method that is faster for this use case maybe. I reckon in 3 minutes we're talking about max 150k visitors that we have to deal with?

Currently, querying the unique visitors means it uses the index index_idsite_idvisitor and has to check each row for date/time. This is faster when checking over 24 hours. But over 3 minutes we could maybe use the index on last_action_date_time reducing the results to check quite a bit. We would need to try this on a bigger instance maybe. A subselect etc. could be faster

@tsteur
Copy link
Member

tsteur commented Mar 10, 2015

Edit: I just verified MYSQL will automatically choose the index_idsite_datetime index once the result set is lower (eg < 200k visitors). We do not have to force a different index. So the Counter widget should not be a problem. The RealTime widget as described in the issue should be fast now as well see pull request. All good!

@mattab mattab modified the milestones: Piwik 2.12.0, Short term Mar 11, 2015
@mattab
Copy link
Member Author

mattab commented Mar 11, 2015

well done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. c: Performance For when we could improve the performance / speed of Matomo.
Projects
None yet
Development

No branches or pull requests

3 participants