Is it maybe possible to optimize count(distinct) SQL queries? #10188

mattab · 2016-05-28T02:52:36Z

I was reading this very interesting article: https://www.periscopedata.com/blog/use-subqueries-to-count-distinct-50x-faster.html and I am wondering whether its findings could be applied to our SQL queries in Piwik.

We have had performance issues with processing COUNT(DISTINCT field) on our very large datasets, so maybe there is a way to improve archiving performance here. Help is most welcome!
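To make the idea concrete, here is a minimal sketch of the kind of rewrite the article describes: deduplicate in a subquery first, then count the already-distinct rows. The `visits` table and its `site_id` / `visitor_id` columns are hypothetical and are not Piwik's actual archiving queries. SQLite is used only so the snippet runs on its own. Whether the rewrite is actually faster on MySQL with our data would need to be benchmarked.

```python
import sqlite3

# In-memory database with a hypothetical table (not Piwik's real schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (
        id INTEGER PRIMARY KEY,
        site_id INTEGER,
        visitor_id TEXT
    );
    INSERT INTO visits (site_id, visitor_id) VALUES
        (1, 'a'), (1, 'a'), (1, 'b'),
        (2, 'a'), (2, 'c'), (2, 'c'), (2, 'd');
""")

# Straightforward form: COUNT(DISTINCT ...) evaluated per group.
direct = conn.execute("""
    SELECT site_id, COUNT(DISTINCT visitor_id) AS uniq_visitors
    FROM visits
    GROUP BY site_id
    ORDER BY site_id
""").fetchall()

# Rewritten form: deduplicate (site_id, visitor_id) pairs in a subquery,
# then count the remaining rows. COUNT(visitor_id) rather than COUNT(*)
# keeps the same NULL handling as COUNT(DISTINCT visitor_id).
rewritten = conn.execute("""
    SELECT site_id, COUNT(visitor_id) AS uniq_visitors
    FROM (SELECT DISTINCT site_id, visitor_id FROM visits) AS deduped
    GROUP BY site_id
    ORDER BY site_id
""").fetchall()

print(direct)
print(rewritten)
```

Both queries return the same counts. The only open question is which form the database's query planner executes more cheaply, so comparing `EXPLAIN` output and timings on a large real dataset would be the next step.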


tsteur · 2016-05-30T02:12:33Z

I'm pretty sure I tried this the last time I looked into it, but it is definitely worth having another look at some point.

andristeiner · 2022-03-11T15:58:54Z

Today I stumbled upon this as well while debugging slow reports that use custom date ranges. I was able to identify the count(distinct) queries as the culprit, then found this article: https://www.sisense.com/blog/use-subqueries-to-count-distinct-50x-faster/, and finally this existing ticket.

Let me know if we can assist you with some testing.

mattab · 2023-12-10T12:48:45Z

The team will re-create this issue later if needed.

mattab added the "c: Performance" label (For when we could improve the performance / speed of Matomo.) on May 28, 2016

mattab added this to the 2.16.x (LTS) milestone on May 28, 2016

mattab modified the milestones: 3.0.0, 2.16.x (LTS) on Jul 7, 2016

mattab closed this as not planned (won't fix, can't repro, duplicate, stale) on Dec 10, 2023

sgiehl added the "wontfix" label (If you can reproduce this issue, please reopen the issue or create a new one describing it.) on Dec 11, 2023
