My Piwik database is 5.9MB and carries 1.1MB of overhead. How about implementing an "OPTIMIZE" routine in Piwik to keep the overhead from growing huge?
What tables have overhead?
If they are piwikarchive* tables, we could probably add an OPTIMIZE statement in the postCompute method at: https://github.com/piwik/piwik/blob/master/core/ArchiveProcessing/Period.php#L280
Do we know how this overhead grows as the database grows? Is it closer to constant (e.g., 1.1MB), a percentage (e.g., 18%), exponential, or logarithmic?
If the overhead is on the piwik_log_archive table (which I suspect it is), then I believe it is linearly proportional to the number of rows deleted by the query at https://github.com/piwik/piwik/blob/master/core/ArchiveProcessing/Period.php#L280, which in turn is linearly proportional to how often archiving runs for the Piwik install (whether cron is set up, etc.).
OPTIMIZE is probably costly to run, but the DELETEs run at most once a day, so we might be OK running OPTIMIZE just after them. Alternatively, we could run it in this same code path only once a week (e.g. every Sunday).
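A minimal shell sketch of the "once a week, in the same code path" idea: gate the OPTIMIZE on the day of the week. The function name `optimize_stmt` and the table name `piwik_log_archive` are assumptions for illustration; it only prints the statement rather than executing it.

```shell
# Hypothetical sketch: emit the OPTIMIZE statement only on Sundays,
# so the costly defragmentation runs at most once a week after the
# daily DELETEs. Table name piwik_log_archive is assumed here.
optimize_stmt() {
    # $1: ISO day of week, 1=Monday .. 7=Sunday
    # (in a real script you would pass in "$(date +%u)")
    if [ "$1" -eq 7 ]; then
        echo "OPTIMIZE TABLE piwik_log_archive;"
    fi
}
```

In real use the emitted statement would be piped to the mysql client; the equivalent check in Piwik itself would live in PHP next to the DELETE.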
An OPTIMIZE run can be issued on an entire database with "mysqlcheck -o <db>", which can be put in a crontab. I guess it might be enough to document this in the FAQ.
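For the FAQ, a crontab sketch of this could look as follows. The database name "piwik" and the credentials file path are assumptions; mysqlcheck -o runs OPTIMIZE TABLE on every table in the named database.

```cron
# Hypothetical crontab entry: defragment the Piwik database
# every Sunday at 03:00 (db name and option file are examples).
0 3 * * 0 mysqlcheck --defaults-extra-file=/etc/mysql/piwik.cnf -o piwik
```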
I think it's enough to put this into the FAQ. Including it in the core would make it harder to implement other database backends in the future (OPTIMIZE is MySQL-specific).
The DBStats plugin is already backend-specific.