@mattab opened this Issue on January 11th 2011 Member

We have done a lot of work on unit testing and integration testing, and have come up with an excellent Hudson setup. These have proved incredibly useful and time-saving.
Now, it would be great to be able to automatically tell whether Piwik's overall performance is impacted by a change or set of changes.

There are a few facts that we want to keep an eye on, and learn more about:

  • Throughput of the Tracker: how much data can the Piwik Tracker handle per second before it becomes too slow or breaks?
  • Concurrency: how much is the Tracker affected by concurrent connections?
  • Archiving performance: when does the archiving process start to get too slow? This depends on the number of websites, the number of visits, unique URLs, etc.
  • How does Archiving/Tracker performance degrade over time, i.e. as more and more data is added to the DB?
  • What are Piwik's bottlenecks in the Tracker and Archiving?

To help assess whether Piwik performance has improved following a change, and to generally help developers be more aware of Piwik's performance in a high-load environment, we could automate performance testing completely.

This is my proposal for continuous performance testing in Piwik. Please comment if you have any feedback or ideas.

Performance test script

  • We will prepare a very large, real-life 'piwik log format' log of Piwik visits, pages, downloads and outlinks across several websites. This log format is implemented in #134. For a start, we have a 7GB piwiklog* dump that we could anonymize and reuse.
  • Write a 'log replay' script with inputs 'speed', 'duration' and 'concurrency' that could replay the logs
    • at a given speed (e.g. 10 times faster than they really happened),
    • and/or for a given amount of time (e.g. replay logs 10 times faster for 1 hour),
    • with a given number of concurrent HTTP connections,
    • to a target Piwik instance.
  • Build a 'performance test' script that would
    • call this log replay script,
    • then call archive.sh on the target Piwik.
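The log replay script described above could look roughly like the following minimal Python sketch. The log-line format (an epoch timestamp followed by a tracker query string), the `replay` function name, and the default target URL are all assumptions for illustration, not part of Piwik:

```python
import concurrent.futures
import time
import urllib.request


def replay(log_lines, speed=1.0, concurrency=4, max_duration=None,
           send=None, target="http://localhost/piwik.php"):
    """Replay pre-recorded tracker requests against a target Piwik instance.

    Each log line is assumed to look like "<epoch_seconds> <query_string>".
    `speed` scales the gaps between requests (10.0 = ten times faster than
    they really happened); `max_duration` (seconds) stops the replay early.
    Returns the number of requests submitted.
    """
    if send is None:  # default sender: fire the HTTP request, discard the body
        def send(query):
            urllib.request.urlopen(target + "?" + query, timeout=10).read()

    start = time.monotonic()
    first_ts = None
    sent = 0
    with concurrent.futures.ThreadPoolExecutor(concurrency) as pool:
        for line in log_lines:
            ts_str, _, query = line.partition(" ")
            ts = float(ts_str)
            if first_ts is None:
                first_ts = ts
            # When this request is due, in replayed (time-scaled) seconds:
            due = (ts - first_ts) / speed
            delay = due - (time.monotonic() - start)
            if delay > 0:
                time.sleep(delay)
            if max_duration is not None and time.monotonic() - start > max_duration:
                break
            pool.submit(send, query)
            sent += 1
    return sent
```

Passing a custom `send` callable makes the pacing logic testable without a running Piwik; in production the default HTTP sender would hit the tracker endpoint directly.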

Manually run this script

A manual run of the script with a very high speed and many concurrent connections is equivalent to a stress test. It will highlight the limit of traffic Piwik can handle.

Continuously run this script

The goal is to run this script as part of our continuous integration process.

  • Run this as a nightly Hudson build (and on demand).
  • Ensure we can get metrics and monitoring:
    • Install monitoring on the Piwik instance hit by this script, so we can review all important metrics: CPU, RAM, etc.
    • For awesomeness, we could install XHProf and keep runs between builds. Then we could compare the runs of each build with each other, literally comparing each commit's impact on overall performance. This will help find out what caused an issue, or what the bottleneck is. XHProf could profile a few random Tracker requests, as well as profile all Archiving requests (see tutorial).
  • Establish metrics that define whether the build 'passes' or 'fails'. This is important so that the build fails if it becomes slower than expected, uses too much memory, or exhibits other issues. We will come up with these metrics once everything else is in place.
  • Ensure that all these metrics and graphs are archived and kept on disk, so that we can e.g. visually compare graphs (and tables) from last night with those from a few weeks ago.
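The pass/fail decision mentioned above could start as a simple threshold comparison against the previous night's metrics. Everything here (the metric names, the 10% regression budget, the `check_build` helper) is a hypothetical sketch, not an existing Piwik or Hudson tool:

```python
def check_build(current, baseline, max_regression=0.10):
    """Decide whether a nightly performance build passes.

    `current` and `baseline` map metric names (e.g. 'tracker_req_ms',
    'archiving_s', 'peak_mem_mb' -- all hypothetical) to numbers where
    lower is better. A metric fails when it exceeds the baseline by more
    than `max_regression` (a fraction), or is missing from the current run.
    Returns (passed, list_of_failure_messages).
    """
    failures = []
    for name, base in baseline.items():
        value = current.get(name)
        if value is None:
            failures.append("%s: missing from current run" % name)
        elif value > base * (1 + max_regression):
            failures.append("%s: %s vs baseline %s (+%.0f%%)"
                            % (name, value, base, (value / base - 1) * 100))
    return (not failures, failures)
```

A Hudson job could call this after the replay and archiving steps, print the failure messages, and exit non-zero so the build turns red.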

Other notes

  • Maybe there are tools that I'm not aware of that could help with some of these tasks. As a start, JMeter looks useful and worth a look.
  • The Piwik instance should run the latest version of PHP (at least 5.3).
  • Ultimately, the instance should perhaps be on a different box than the Hudson box. But to keep things simple, at first we will run the log replay script and the target Piwik instance on the same server (i.e. the Hudson server). Once everything is in place, or if other builds impact the performance testing build, we can move the Piwik instance to a separate server and keep the log replay script on Hudson.
  • I focused on the Tracker and Archiving since these are the bottlenecks so far. API and UI speed is usually great, as we have done a lot of good work in this area.
@robocoder commented on January 12th 2011 Contributor

qa.piwik.org:8080/hudson currently runs php-cgi 5.3.5 (latest). jetty + CGI is not optimized for speed; it's geared towards flexibility (e.g. switching PHP versions on demand without restarting jetty; no build dependencies, e.g. mod_php5) and use with Hudson.

I think we can set up the performance test as a remote job, and have it report its "build" status to the Hudson dashboard.

I suggest setting up lighttpd or nginx (on another port, or replacing Apache httpd on port 80), plus php-fcgi-fpm.

@mattab commented on May 22nd 2011 Member

See also http://www.mnot.net/blog/2011/05/18/http_benchmark_rules

See also: Plugin for Jenkins? #3280

@mattab commented on September 20th 2012 Member

XHProf is now integrated into our testing suite; see the commits at: #3177

The next logical step will be to have it run on Jenkins and run a daily real-life performance test (log import), with web access to the XHProf reports in the morning. :-) Stay tuned!

@mattab commented on September 2nd 2014 Member

@mattab commented on December 18th 2014 Member

I'm closing this ticket for now, as it's not directly related to Piwik core. We're working on such setup internally as we need infrastructure to monitor performance. Good times!

This Issue was closed on December 18th 2014