2019-08-19
This presentation is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Laysan albatross and Midway sunset is licensed CC BY-NC 2.0 by USFWS - Pacific Region
server layout / error recovery / submission previews
.-- W3C server --.
|    wptserve    |
'----------------'
+ publicly accessible
- TLS certificate maintained manually and at cost
- regularly offline
- recovery required human intervention
Why? wptserve is built on Python's SimpleHTTPServer.
Warning: SimpleHTTPServer is not recommended for production. It only implements basic security checks.
https://docs.python.org/2.7/library/simplehttpserver.html
.- Amazon EC2 instance --.
|        systemd         |
|   wptserve   Certbot   |
'------------------------'
+ Free and automated certificate renewal thanks to Let's Encrypt (via Certbot)
+ Improved uptime thanks to systemd
- Still regularly falling offline (just recovering faster)
.----- GCP instance -----. .----- GCP instance -----.
| .- Docker container -. | | .- Docker container -. |
| |      wptserve      | | | |      wptserve      | |
| '--------------------' | | '--------------------' |
'------------------------' '------------------------'

             .----- GCP instance -----.
             | .- Docker container -. |
             | |      Certbot       | |
             | '--------------------' |
             '------------------------'

* GCP - Google Cloud Platform
+ Improved uptime further
- Significantly more complex
Each container is actually running a number of processes, all managed by the supervisord init system. That's typically frowned upon by Docker users, so the rationale is included in the project documentation.
*Let's Encrypt*                                         *GitHub*
       |                                                    |
[TLS certificate]                                   [WPT source code]
       |                                  .------------.    |
       V                              .-->| wpt server |<---+
.--------------.    +++++++++++++++   |   '------------'    |
| cert-renewer |--->+ certificate +---+                     |
'--------------'    +    store    +   |   .------------.    |
                    +++++++++++++++   '-->| wpt server |<---'
                                          '------------'

Legend

                .---.             +++++
* *  external   |   |  GCE        +   +  object   [ ]  message
     service    '---'  instance   +++++  store         contents
The server is run by multiple Google Compute Engine (or "GCE") instances deployed in parallel. Many of the web-platform-tests concern the semantics of the HTTP protocol, so load balancing is provided at the TCP level in order to avoid interference.
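In addition to serving the web-platform-tests, each server performs a few tasks on a regular interval. As the diagram above shows, these include retrieving the latest WPT source code from GitHub and retrieving the current TLS certificate from the certificate store.
When any of these periodic tasks completes, the web-platform-tests server process is restarted in order to apply the changes.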
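The "run a task, then restart the server" cycle described above can be sketched in Python. This is only an illustration, not the project's implementation: the function name `run_periodic_tasks`, its parameters, and the idea of passing the tasks and the restart action as callables are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable


def run_periodic_tasks(
    tasks: Iterable[Callable[[], None]],
    restart_server: Callable[[], None],
    interval_seconds: float,
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run every task once per cycle, restarting the server after each one.

    Returns the number of restarts requested. A real deployment would loop
    forever; `cycles` exists only so that this sketch terminates.
    """
    task_list = list(tasks)
    restarts = 0
    for _ in range(cycles):
        for task in task_list:
            task()            # hypothetical task, e.g. refresh local files
            restart_server()  # apply the change by restarting the server
            restarts += 1
        sleep(interval_seconds)
    return restarts
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without waiting for a real interval to elapse.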
A separate Google Compute Engine instance interfaces with the Let's Encrypt service to retrieve TLS certificates for the WPT servers. It integrates with Google Cloud Platform's DNS management in order to prove ownership of the system's domain name. It stores the certificates in a Google Cloud Platform Storage bucket for retrieval by the web-platform-tests servers.
server layout / error recovery / submission previews
    container   GCE Instance
        |             |
        x             |
   err! .---restart---'
        |             |
        |
       okay
If the WPT server fails (as indicated by its process exiting), then Docker, running on the Google Compute Engine instance, automatically restarts the container.
Restarting the container completely refreshes runtime state, and this is expected to resolve many potential problems in the deployment.
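The restart-on-exit behavior can be illustrated with a small Python sketch. It is not how Docker implements its restart handling; it only shows the idea of treating process exit as the failure signal. The function name `run_with_restarts` and the `max_restarts` limit are assumptions introduced for the example.

```python
import subprocess
import sys
from typing import List, Sequence


def run_with_restarts(command: Sequence[str], max_restarts: int) -> List[int]:
    """Run `command`, starting it again each time it exits.

    Returns the exit code of every run. A real restart policy would loop
    indefinitely; `max_restarts` exists only so that this sketch terminates.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        completed = subprocess.run(command)  # blocks until the process exits
        exit_codes.append(completed.returncode)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical stand-in for the server: a process that fails immediately.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_with_restarts(failing, max_restarts=2))
```

Every run starts from a fresh process, which mirrors the point above: in-memory state from the failed run does not carry over.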
    container   GCE Instance   GCE Managed Group
        |             |                 |
        x             x                 |
   err!               .-----restart-----'
                      |                 |
        .---restart---'                 |
        |             |                 |
        |             |                 |
       okay          okay
In the case of the web-platform-tests server, an additional layer of error recovery is provided via a Google Cloud Platform "health check." If the Google Compute Engine instance fails to respond to HTTP requests, then it will be destroyed and a new one created in its place. That new instance will subsequently create a Docker container to run the WPT server.
This second recovery mechanism guards against more persistent problems, e.g. those stemming from state on disk (since even a running GCE instance will fail HTTP health checks if restarting the Docker container has no effect).
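The health-check idea can be sketched in Python as follows. This is an assumption-laden illustration rather than Google Cloud Platform's actual mechanism: the function names, the notion of an `unhealthy_threshold`, and the choice to treat any status below 400 as healthy are all hypothetical.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable


def responds_to_http(url: str, timeout: float = 5.0) -> bool:
    """Return True if `url` answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def needs_replacement(probe: Callable[[], bool], unhealthy_threshold: int) -> bool:
    """Return True if `probe` fails `unhealthy_threshold` times in a row."""
    for _ in range(unhealthy_threshold):
        if probe():
            return False  # one healthy response is enough to keep the instance
    return True
```

A caller might combine the two as `needs_replacement(lambda: responds_to_http(url), unhealthy_threshold=3)`, where `url` is whatever address the instance serves. Requiring several consecutive failures avoids replacing an instance whose container is merely in the middle of a restart.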
server layout / error recovery / submission previews
----------------------------- GitHub ------------------------------
                 |                      |         |         |
             [master]              [pr#13451]     |         |
                 |                      |    [pr#13452]     |
                 |                      |         |     [ etc. ]
                 v                      v         v         v
-------------------------- w3c-test.org ---------------------------
                 v                      v         v         v
editors, implementors & developers       WPT contributors
w3c-test.org automatically publishes the contents of many patches that are submitted to the project through GitHub.
We had to replicate this feature before our system would be considered a viable replacement.
It's a fundamentally insecure feature because patches may include arbitrary Python code, and the server has to run that code.
Because this deployment is the canonical location to run the tests on the web, we expect it to be referenced from web specifications, wpt.fyi, and more. We want it to be as stable as possible.
----------------------------- GitHub ------------------------------
                 |                      |         |         |
             [master]              [pr#13451]     |         |
                 |                      |    [pr#13452]     |
                 |                      |         |     [ etc. ]
                 v                      v         v         v
---- web-platform-tests.live ----- --- web-platform-tests.pr ---
                 v                      v         v         v
editors, implementors & developers       WPT contributors
One of the strongest decisions we made was to deploy submission previews to a separate server. Instability resulting from untrusted patches can only annoy contributors; it won't diminish availability for the wider audience.
Contributor           GitHub.com     w3c-test.org         git repository
     |                     |               |                     |
     '---[pull request]---.|               |                     |
                           v               |                     |
                           '--[web hook]--.|                     |
                                           v                     |
                                           '------[git fetch]----.
                                           .---------------------'
                                           |
                                           |
                                           V
Main problems:
Contributor           GitHub.com   git repository    web-platform-tests.live
     |                     |              |                     |
     |                     |              .------[git fetch]----'
     |                     |              '---------------------.
     '---[pull request]---.|              |                     |
                           v              |                     |
                           '--[git tag]--.|                     |
                                          v                     |
                                          |                     |
                                          .------[git fetch]----'
                                          '---------------------.
                                                                V
(fetching continues on a regular interval)
wpt-server-submissions.Dockerfile:

    FROM web-platform-tests-live-wpt-server-tot
    COPY src/mirror-pull-requests.sh /usr/local/bin/
    COPY src/supervisord-pull-requests.conf /etc/supervisor/conf.d/
+ Easier to maintain than a standalone implementation
+ Safer than branching on runtime flags
This server could have been built completely standalone from the "tot" (or "tip-of-tree") server. There would have been a lot of duplication, though, and that's hard to maintain.
Alternatively, we could have built a single server that had all functionality, and simply disabled the "submission previews" part in the "tot" deployment. Runtime flags are too easy to toggle, so this would be susceptible to accidental enabling of the "submission previews" functionality.
Docker and supervisord both offer clean extension mechanisms. That allows us to define a distinct image for the submissions server in terms of the "tot" (or "tip-of-tree") server.
Source code & documentation:
https://github.com/bocoup/web-platform-tests.live
server layout / error recovery / submission previews