Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Piwik more efficient: sharding the data in several databases #471

Closed
mattab opened this issue Dec 9, 2008 · 9 comments
Closed

Piwik more efficient: sharding the data in several databases #471

mattab opened this issue Dec 9, 2008 · 9 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. wontfix If you can reproduce this issue, please reopen the issue or create a new one describing it.

Comments

@mattab
Copy link
Member

mattab commented Dec 9, 2008

In Piwik, all data is stored in a monolithic database. That is a problem when you have a huge traffic to monitor with Piwik: the database server struggles, the queries take too long to finish. One solution is to record data in several databases, within the same Piwik instance. Piwik will route automatically the data to the right database using the “idsite”.

For example:
- sites 1-1000 in serverA
- site 1001-1100 in serverB

We need to have the idsite in all SQL queries (as a parameter, or in a comment) and then automatically grep their content and route to the right server. The pairs (idsite, server) are stored in a configuration file.

THIS IS NOT FINISHED AND NEEDS MORE WORK.

Johan Mathe built the first dev version of the plugin, attached in the ticket.
We updated the core to add the idsite in some queries, but there is more to do.
This plugin was developed in August 2008 and Piwik has slightly changed since, added more queries, etc.

This plugin would be incredibly useful to all the big users of Piwik; some people are using Piwik to monitor thousands of websites, millions of visits, etc.

Attached is the current development version of the plugin. This is DEV only (it won’t work with current trunk). It is helpful to give an idea on how it could work.

Please post a comment here if you are interested in this plugin development and would like to participate.

@mattab
Copy link
Member Author

mattab commented Dec 9, 2008

Attachment: DEV version of sharding Piwik plugin
[Sharding.zip](http://issues.piwik.org/attachments/471/Sharding.zip)

@anonymous-matomo-user
Copy link

SKype provide its engine for sharding database with PostgreSQL. It is exactly deisgned for the purpose here.
Let’s have a look for those interested : [https://developer.skype.com/SkypeGarage/DbProjects/PlProxy]

And it can be better to dispatch the idsite in a non linear way. The above link provide an example based on a hash.

@robocoder
Copy link
Contributor

This plugin will need to be updated to reflect db abstraction changes.

In the absence of sharding, consider providing an option to remove the sharding comments to workaround a query cache bug on older MySQL versions.

@mattab
Copy link
Member Author

mattab commented Oct 12, 2009

We should not remove the comments. Even though the plugin is now not in a working state, it is there as a proof of concept. Sharding in Piwik would be a must have feature for high traffic piwik instances.

What is the issue with mysql cache? if it is fixed in stable mysql releases, it is not a blocker for us.

@robocoder
Copy link
Contributor

Thought: investigate using the Spider storage engine for MySQL as a more transparent method for partitioning/sharding.

@mattab
Copy link
Member Author

mattab commented Nov 24, 2010

Sharding as such is not the way to go... we can open specific tickets for specific implementations (eg. Mysql spider storage) if someone starts work on it.

@mattab mattab added this to the Future releases milestone Jul 8, 2014
@erickhuang17
Copy link

So,how's it going now?

@eramirezprotec
Copy link
Contributor

@tsteur
Copy link
Member

tsteur commented Mar 7, 2020

It wasn't implemented. We now support a reader though: https://matomo.org/faq/how-to-install/faq_35746/

Also in case you did need sharding you could look if you can configure something through MySQL directly.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical. wontfix If you can reproduce this issue, please reopen the issue or create a new one describing it.
Projects
None yet
Development

No branches or pull requests

7 participants