TiDB currently cannot push `COUNT(DISTINCT x)` queries to TiFlash, so on large datasets these queries can perform slowly and use a lot of memory. Reworking `COUNT(DISTINCT x)` queries to use sub-queries instead solves this issue for TiDB, but sub-queries perform ~2.5x slower on MySQL, so a hybrid solution will be required.

To support TiDB as an alternative database, we need to provide the option for certain queries to be generated using sub-queries instead of `COUNT(DISTINCT x)`. This could be a beneficial optimization for other databases too, so it would be good to implement it in a generic manner.
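
As a rough illustration of the two query shapes discussed above, the sketch below runs both against an in-memory SQLite database and checks that they return the same count. The table and column names (`log_visit`, `idvisitor`) are placeholders chosen for the example, and SQLite is used only to show that the two forms are logically equivalent; it says nothing about how either form performs on TiDB or MySQL.

```python
import sqlite3

# Placeholder table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_visit (idvisitor INTEGER)")
conn.executemany(
    "INSERT INTO log_visit (idvisitor) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (None,)],
)

# Original form: aggregate directly with COUNT(DISTINCT x).
count_distinct_sql = "SELECT COUNT(DISTINCT idvisitor) FROM log_visit"

# Sub-query form: de-duplicate first, then count the remaining rows.
# COUNT(DISTINCT x) ignores NULLs, so the sub-query filters them out as well.
subquery_sql = (
    "SELECT COUNT(*) FROM ("
    "SELECT DISTINCT idvisitor FROM log_visit WHERE idvisitor IS NOT NULL"
    ") AS distinct_values"
)

distinct_result = conn.execute(count_distinct_sql).fetchone()[0]
subquery_result = conn.execute(subquery_sql).fetchone()[0]
```

The `IS NOT NULL` filter matters: without it, the sub-query form would count a `NULL` group that `COUNT(DISTINCT x)` skips, so any rewrite has to account for nullable columns.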
- Identify the uses of `COUNT(DISTINCT x)` (33 occurrences in non-test code, mostly in …)
- Add a `preferSubqueries` method to the PDO adapter and implement it on the standard Matomo MySQL adapter to return false. A future TiDB PDO adapter can return true.
- For each query that uses `COUNT(DISTINCT x)`, rework the query generation code to optionally replace `COUNT(DISTINCT x)` with an appropriate sub-query if the PDO adapter `preferSubqueries` option is set.
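
The adapter-driven switch described above could look roughly like the following. This is a minimal sketch in Python for illustration only, not Matomo code: the class names `MysqlAdapter` and `TidbAdapter`, the `prefer_subqueries` method, and the `count_distinct_sql` helper are all hypothetical stand-ins for whatever the real adapters and query generation code end up using.

```python
class MysqlAdapter:
    """Hypothetical stand-in for the standard MySQL adapter."""

    def prefer_subqueries(self) -> bool:
        # MySQL keeps the COUNT(DISTINCT x) form.
        return False


class TidbAdapter(MysqlAdapter):
    """Hypothetical stand-in for a future TiDB adapter."""

    def prefer_subqueries(self) -> bool:
        # TiDB opts in to the sub-query form.
        return True


def count_distinct_sql(adapter: MysqlAdapter, column: str, table: str) -> str:
    """Build a distinct-count query in the form the adapter prefers.

    The column and table names are interpolated directly, so this sketch
    assumes they are trusted identifiers, never user input.
    """
    if adapter.prefer_subqueries():
        return (
            f"SELECT COUNT(*) FROM (SELECT DISTINCT {column} FROM {table} "
            f"WHERE {column} IS NOT NULL) AS distinct_values"
        )
    return f"SELECT COUNT(DISTINCT {column}) FROM {table}"


mysql_sql = count_distinct_sql(MysqlAdapter(), "idvisitor", "log_visit")
tidb_sql = count_distinct_sql(TidbAdapter(), "idvisitor", "log_visit")
```

Keeping the decision behind a single adapter method means the calling code does not need to know which database it is talking to, which fits the goal of making the option generic enough for other databases to opt in later.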