Screenshot test CI job is running too long #8222
Comments
I worked on integrating Docker in Piwik lately. It could have a lot of nice outcomes (replace Vagrant, automatically deploy branches in staging/demo very cheaply - e.g. 100 different branches/demos deployed on a simple VPS, since a container that is not being used has 0 overhead, etc.), but the main one is targeted at tests. I now have Piwik fully working in Docker and have been trying out running tests in parallel: Docker offers isolation between "VMs" (not really VMs, that's the point) with the advantage of being very cheap and fast to set up and run (no overhead). I tried running 4 specs (between 30s and 60s each):
The machine I used to run the tests has 2 cores (so I'm pushing it a bit…) and 4 GB of RAM. Each spec is 20% to 40% slower when run in parallel: I expect this is because I run 4 tests on only 2 cores. With one spec per core I think they shouldn't be any slower, but even at 20%-40% slower the total time is still better, so it's fine. The good thing with this solution is that it scales. We could buy a 12-core server and run 12 tests in parallel (over-simplification: 40 minutes/12 = 3.3 minutes…), with the setup time being constant and minimal: we set up Piwik only once (unlike if we were to split into different jobs on Travis). That could also help make this build much faster. Also, there are other very interesting technologies like Docker Machine that let you run the same command, but on a remote machine (like Vagrant and its providers). Running the tests on AWS or whatever (e.g. our CI instance if it existed) wouldn't require anything other than configuring Docker Machine (no code to write, no separate command, etc…). And it would be extremely fast (because of the parallelization). But here I am looking far ahead… So to sum up, yes, that means some work. But UI tests and Travis have been a problem forever, and when we sum all the time spent on this, it would be much more worthwhile to spend time on a nice solution :) If interested I can push my current branch with Docker. I'll keep looking into this in the meantime. |
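To make the "slower per spec but faster overall" argument concrete, here is a minimal Python sketch. The four durations are hypothetical values picked from the 30s-60s range mentioned above, and the model assumes all specs start at the same time and are each slowed down by the same percentage; neither is a measurement from the actual run.

```python
def sequential_time(durations):
    """Wall time when the specs run one after another."""
    return sum(durations)


def parallel_time(durations, slowdown_percent):
    """Wall time when all specs start together.

    Simplified model (an assumption): every spec is slowed down by the same
    percentage, so the wall time is the slowest spec after the slowdown.
    """
    return max(durations) * (100 + slowdown_percent) / 100


# Hypothetical spec durations in seconds, within the 30s-60s range.
specs = [30, 40, 50, 60]

print("sequential:", sequential_time(specs), "s")
print("parallel at 40% slowdown:", parallel_time(specs, 40), "s")
```

Under these assumptions the parallel run finishes in less than half the sequential time even with the worst-case 40% slowdown.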
I would seriously not want to maintain and host our own CI environment. When I needed a quick result I just used our Before doing any more on this I'd do 2 things:
|
FYI: I tried to find out what is slow without having to sum the times manually, using remote debugging (https://drupalize.me/blog/201410/using-remote-debugger-casperjs-and-phantomjs), but it didn't work for me. Update: I kinda got it working and will see if I get something out of this. |
Well done - let us know what you find! For sure your changes to make remote debugging possible would be nice (and also a short README extract explaining how to remote debug, or a link to a URL that explains it) |
The UI tests CI job runs for between 35min and 40min and regularly times out. Hopefully we can find something soon to help this situation - or at least split the CI job across two jobs as a quick fix (unless you have another suggestion for a quick fix, since we will be hit by this problem soon). |
What are we gonna do here? Does anyone have a problem with splitting them into multiple jobs (2 or more)? |
I'm just going to link what I worked on a few weeks ago so that it's not lost if anybody wants to reuse it: https://github.com/piwik/piwik/compare/docker This branch contains the following changes:
The command will run each UI test suite in an isolated container (own tmp filesystem, own database, etc.). There is a The command works in a very non-optimized way: it runs 12 test suites in parallel and waits for all of them to finish, then starts 12 again. That means if one test suite takes 30s and another 1min 30s, then for 1 minute a container will do nothing. I haven't taken the time to work on it, but optimizing it to start the next suite as soon as a container is free would improve the run time a lot (some test suites are really fast). It would be even better to run the longest test suites first and the shortest ones at the end.

Results

With the current implementation I was able to run the UI tests in 6 minutes instead of 40 on an 8-core machine. (40 minutes was on AWS.) I think it would be possible to run it in 3-4 minutes with more cores and optimized parallelization, including git clone, etc. Compare that with 40-45 minutes on Travis.

Long term idea

I think an ideal solution would be to have a CI server that only runs the UI tests and contains the UI build artifacts viewer. Yes, it would be a lot of work to maintain, but a lot of work has been spent anyway on the build-artifacts UI, the Travis config, the "how to make it faster", the "run on AWS" command, etc. So it's not much more than what we already invest today. Also, considering all the time lost because of Travis and UI tests, it would be worth it. The UI tests would run either on push to GitHub (web hook) or through a console command (which would replace the "run on AWS" command). It would also be useful to look into Docker Machine, as it's exactly what "run on AWS" is about, so maybe it can be used with minimum effort. On top of being fast (both for CI and "locally" if we use the remote run feature), Docker would also guarantee exactly the same environment between CI and local runs, thus easier (and much faster) debugging.

It would also simplify running UI tests, as one could either run them with Docker (which requires installing Docker) or through the remote command (the replacement for "run on AWS"). Also, having a staging deployment for each branch would be easier thanks to Docker (that's something I mentioned would be very useful to validate new features or review UI changes). To conclude, I'm just leaving this here to present the results and explain the idea. Feel free to reuse it or not. I'm not arguing for anything, I'm just documenting. |
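The difference between the batch approach described above and the suggested optimization (start the next suite as soon as a container is free, longest suites first) can be sketched with a small simulation. This is not the code from the Docker branch; the function names and the suite durations are hypothetical, and only wall time is modelled, with no container setup cost.

```python
import heapq


def batch_makespan(durations, workers):
    """Run `workers` suites at a time and wait for the whole batch to finish."""
    total = 0
    for i in range(0, len(durations), workers):
        # A batch lasts as long as its slowest suite; faster containers sit idle.
        total += max(durations[i:i + workers])
    return total


def pool_makespan(durations, workers, longest_first=True):
    """Start the next suite as soon as any worker becomes free."""
    order = sorted(durations, reverse=True) if longest_first else list(durations)
    free_at = [0] * workers  # min-heap: time at which each worker is free
    heapq.heapify(free_at)
    for duration in order:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical suite durations in seconds (a mix of 30s and 1min 30s suites).
suites = [30, 90, 30, 30, 90, 30]

print("batches of 2:", batch_makespan(suites, 2), "s")
print("pool of 2, longest first:", pool_makespan(suites, 2), "s")
```

With these made-up numbers the batch strategy takes 210s while the pool takes 150s, which is also the lower bound for 300s of work on 2 workers. Sorting longest-first is a greedy heuristic, so it will not always hit the optimum on other inputs.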
@diosmosis do you see a problem with splitting UI tests into two parts like this: https://github.com/piwik/piwik/compare/8222?expand=1#diff-354f30a63fb0907d4ad57269548329e3R33 Not sure about possible side effects for plugins etc. FYI: I only disabled the comparison of travis-yml https://github.com/piwik/piwik/compare/8222?expand=1#diff-354f30a63fb0907d4ad57269548329e3L93 for test purposes, to keep things simple. One job running all UI tests usually takes between 35 and 45 minutes, I think. Running 2 test suites running about 50% of the tests each took about It's not a perfect solution, but it looks easy to do and would help right now. |
I don't think it's necessary to split the build for plugins, just for core. So you don't need to edit the travis scripts to add the job for plugins, just add the command line options to the test system, and change the matrix in the core .travis.yml file. |
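One way the "split the specs across N jobs" idea could work is sketched below in Python. This is only an illustration: the function name, the 1-based group number, and the spec names are hypothetical, and it is not the option that was actually added to the Piwik test system. Each CI job would pass its own group number and run only the specs selected for it.

```python
def select_group(specs, group, total_groups):
    """Return the specs that job `group` (1-based) of `total_groups` should run.

    Specs are sorted first so that every job computes the same partition,
    then dealt out round-robin.
    """
    if not 1 <= group <= total_groups:
        raise ValueError("group must be between 1 and total_groups")
    ordered = sorted(specs)
    return [spec for i, spec in enumerate(ordered) if i % total_groups == group - 1]


# Hypothetical spec names, for illustration only.
all_specs = ["Dashboard", "Overlay", "UIIntegration", "Login"]

print(select_group(all_specs, 1, 2))
print(select_group(all_specs, 2, 2))
```

Dealing specs out by name does not balance run time; if one group ends up with most of the slow specs, the split could instead be weighted by measured durations.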
Problem: Our Screenshot tests CI job is running too long.
Goal:
Solution: