class: center, middle # WPT, Amazon & Sauce Labs 2018-03-13 .softer[ (press the
p
key to view presenter's notes) ] .license[
This presentation is licensed under a
Creative Commons Attribution-ShareAlike 4.0 International License
. ] --- class: center, middle # Part 1: Parallelization .softer[ # Part 2: Troubleshooting # Part 3: Next Steps ] --- # Parallelization: Original Layout .-- EC2 --. | | .--- | WPT CLI | WebDriver .- Sauce Labs -. | | | <-------> | Edge | | '---------' | & | | .-- EC2 --. <-------> | Safari | | | | WebDriver '--------------' +--- | WPT CLI | | | | .-- GCP --. | '---------' | wpt.fyi | <-+ .Bytemark-. '---------' | | WPT CLI | +--- | & | | | Firefox | | '---------' | .Bytemark-. | | WPT CLI | '--- | & | | Chrome | '---------' --- # Parallelization: New Layout .-- EC2 --. | WPT CLI |. .- Sauce Labs -. .-- GCP --. .-- EC2 --. | & ||. WebDriver | | | wpt.fyi | <-- | Leader | <---> | Firefox ||| <-------> | Edge | '---------' '---------' | & ||| WebDriver | & | | Chrome ||| <-------> | Safari | '---------'|| WebDriver | | '-- x10 --'| <-------> '--------------' '---------' --- # Parallelization: Implemented with Buildbot ![Buildbot logo](img/buildbot-logo.svg) > Buildbot is an open-source framework for automating software build, test, and > release processes. Used by: Python, WebKit, Mozilla, Chromium https://buildbot.net --- # Parallelization: Buildbot "Workers" ![](img/buildbot-workers.png) --- # Parallelization: Buildbot "Builders" ![](img/buildbot-builders.png) ??? We've implemented three "builders" for WPT: - Daily initiator - identifies a revision of WPT to use for results collection; triggers builds for that revision - Chunked Runner - runs a specified subset of WPT in a specified browser; conditionally triggers results uploading - Uploader - publishes the results to Google Cloud Platform --- # Parallelization: Daily Initiator ![](img/buildbot-build-daily-steps.png) ??? After determining which revision of WPT should be tested, this builder triggers a large number of partial tests. Currently, it is configured to segment WPT into 10 so-called "chunks" and schedule one build for each of those chunks in Firefox, Chrome, Safari, and Edge. --- # Parallelization: Daily Initiator (continued) ![](img/buildbot-build-daily-properties.png) ??? Each build is parameterized with "properties". The most salient property for this build is `got_revision`. This property is passed along to all the "chunked" builds which are triggered by the builder. --- # Parallelization: Chunked Runner ![](img/buildbot-build-chunk-steps.png) ??? - These are the steps performed by the "Chunked Runner" builder - "WPT Run" is configured as an allowed failure because a browser's performance on the tests doesn't affect our desire to collect and publish results - "Validate results" is how we can interpret the meaning of the failed "WPT Run" command (i.e. whether it reflects test failures or an error in the infrastructure) - The build "master" collects the chunk results, but only uploads when results are available for every chunk (more on this later) --- # Parallelization: Chunked Runner (continued) ![](img/buildbot-build-chunk-command.png) ??? As you would expect from any continuous integration solution, logs of each command are available for review. --- # Parallelization: Authentication ![](img/buildbot-auth-1.png)
![](img/buildbot-auth-2.png) ??? Buildbot also offers integration with popular OAuth providers. In the current deployment, only members of the "bocoup" and "w3c" organizations on GitHub.com may influence builds via the web UI. Because test runs continue to crash intermittently, we can use this interface to re-try builds on demand. The "upload when all chunks are complete" heuristic described earlier means that these manually-triggered build can automatically provoke uploading. --- class: center, middle .softer[ # Part 1: Parallelization ] # Part 2: Troubleshooting .softer[ # Part 3: Next Steps ] ??? - You may have noticed a lot of failed builds being reported in the preceding screenshots. - The truth is that this new system has never produced complete results for Edge or Safari - Understanding why took a bit of research --- ## Amazon EC2 CPU Credits ![](img/ec2-cpu-credit-pricing.png) --- ## Usage during tests - single instance ![](img/cpu-credit-balance-one.png) ??? Seemingly-giant dip in the CPU credits --- ## Usage during tests - single instance (scaled) ![](img/cpu-credit-balance-one-scaled.png) Phew. ??? Relatively speaking, it was a drop in the bucket --- ## Usage during tests - all instances ![](img/cpu-credit-balance-all.png) ??? All instances tell a similar story --- ## Usage during tests - all instance (scaled) ![](img/cpu-credit-balance-all-scaled.png) ??? - This demonstrates that CPU was not the bottleneck - This was a false avenue, anyway: the utilization spikes coincided with tests running in Chrome and Firefox - Running the tests against Safari and Edge via Sauce Labs did not trigger any perceptible change in CPU utilization --- ## Sauce Labs API Rate Limiting ![](img/sauce-labs-rate-limiting.png) https://wiki.saucelabs.com/display/DOCS/Rate+Limits+for+the+Sauce+Labs+REST+API ??? - The number 10 is suspicious--that's the number of concurrent testing sessions we were attempting to run - I initially guessed that we were exhausting the 10-requests-per-second limit and getting docked by Sauce Labs with an extended "cool off" state - I implemented a resource lock with Buildbot so the build master would throttle itself and avoid this penalty - That did not work, so I began to research other sources of failure --- ## Network traffic ![](img/network-out-one.png) --- ## Network traffic (including subsequent run) ![](img/network-out-all.png) --- ## Network traffic (including previous run) ![](img/network-out-with-prior.png) ??? - Each run included a single build ("chunk" number 9 out of 10) with a huge amount of traffic - I was unaccustomed to looking at WPT from this perspective, so I didn't know if/how this indicated infrastructure errors --- ## Overlaying tests ![](img/network-spike-traffic.png) ![](img/network-spike-tests.png) HTTP**S** :/ ??? I ran the tests locally and waited for the traffic spike. It's all encrypted as it is sent to Sauce Labs, so I couldn't look at the data directly. But I was able to review the tests themselves. --- ## Prime suspect [css/css-text/i18n/css3-text-line-break-jazh-421.html](https://github.com/w3c/web-platform-tests/blob/de0c4ad1634bac712eadecf9a1a170eaf375ae8e/css/css-text/i18n/css3-text-line-break-jazh-421.html#L12-L16) ```css @font-face { font-family: 'mplus-1p-regular'; src: url('/fonts/mplus-1p-regular.woff') format('woff'); /* filesize: 803K */ } ``` ```shell $ grep mplus-1p-regular.woff css/css-text/i18n -r | wc -l 834 ``` ??? - That's a big font. - Running this "chunk" of tests involves sending a 800KB file to Sauce Labs over 800 times. - Unfortunately, cancelling that build did not increase the reliability of the other builds. In other words: the aberrant network utilization was a red herring, but at least we know why it's happening. --- class: center, middle > In this case I think restructuring the way your tests start to just have one > tunnel connecting and sending all the traffic would be best. \- Enrique Gonzalez, Sauce Labs support technician --- class: center, middle .softer[ # Part 1: Parallelization # Part 2: Troubleshooting ] # Part 3: Next Steps --- # Next step: Avoid Sauce Labs Rate Limits ## Alternative #1 .-- EC2 --. | WPT CLI |. .- EC2 -. .-- GCP --. .-- EC2 --. | & ||. WebDriver | | | wpt.fyi | <-- | Leader | <---> | Firefox ||| <-------> | | '---------' '---------' | & ||| WebDriver | Proxy | | Chrome ||| <-------> | | '---------'|| WebDriver | | '-- x10 --'| <-------> '-------' '---------' ^ WebDriver | v .- Sauce Labs -. | Edge | | & | | Safari | '--------------' ??? - We can't naively scale the solution that WPT uses for communication with Sauce Labs. - We're being rate limited because we are creating so many concurrent "tunnels" - To avoid this Sauce Labs, we would need to designate a small number of servers to manage this connection, and direct all testing traffic through that proxy. - De-multiplexing the traffic coming back from Sauce Labs will increase system complexity further (DNS records, etc.) --- # Next step: Avoid Sauce Labs Rate Limits ## Alternative #2 .-- ??? --. | WPT CLI |. .-> | & ||. | | Safari ||| | '---------'|| | '-- x N --'| | '---------' | .-- EC2 --. | | WPT CLI |. .-- GCP --. .-- EC2 --. | | & ||. | wpt.fyi | <-- | Leader | <-+-> | Firefox ||| '---------' '---------' | | & ||| | | Chrome ||| | '---------'|| | '-- x N --'| | '---------' | .-- ??? --. | | WPT CLI |. '-> | & ||. | Eddge ||| '---------'|| '-- x N --'| '---------' ??? Because we have another goal which Sauce Labs cannot help us reach (collecting results from unstable browsers), we're currently working to solve this by running machines without that service. The requirements to run a Buildbot workers are relatively easy to satisfy: - Python - Port 8898 We can distribute the load without Sauce Labs by installing the workers directly on the machines running the target platforms. --- # Next step: Avoid Sauce Labs Rate Limits ## Alternative #3 .-- EC2 --. .- ??? -. | WPT CLI |. WebDriver | Edge |. .-- GCP --. .-- EC2 --. | & ||. <-------> '-------'|. | wpt.fyi | <-- | Leader | <---> | Firefox ||| '- x N -'| '---------' '---------' | & ||| '-------' | Chrome ||| .-- ??? --. '---------'|| WebDriver | Safari |. '-- x N --'| <-------> '---------'|. '---------' '-- x N --'| '---------' ??? We might also choose to expose WebDriver endpoints instead. This requires more network configuration, so it is probably better to defer for the time being. It's worth shooting for because it makes the division of responsibility more clear. --- class: center, middle # Thanks! mike@bocoup.com