Fix an XSS vulnerability in our Markdown rendering.
Review Request #2119 — Created April 14, 2021 and updated — Latest diff uploaded
We used to depend on Python-Markdown and some extensions to sanitize any content going into or out of Python-Markdown, but this hasn't actually worked in a while. Python-Markdown itself got rid of its sanitization feature, and the extensions didn't do enough.

We're now using `bleach`, a popular Python HTML sanitization library, along with `bleach-allowlist`, which contains a list of known safe tags and attributes for Python-Markdown. Everything we render via Markdown is run through `bleach`.

To preserve the XHTML output we expect and require (for parsing capabilities) from Markdown, we override the settings of `bleach`'s internal `html5lib` serializer to keep self-closing short tags. Unfortunately, `bleach` itself does not give us this compatibility directly, but unit tests can ensure we don't regress.

We have unit tests rendering an extensive collection of Markdown code that could trigger vulnerabilities in some Markdown libraries. Each rendered result is compared to its expected sanitized output. We can extend this collection as new attack methods are found, and we'll catch any future regressions that might come up.

Testing Done:
All unit tests passed.

Ran through a wide assortment of known Markdown vulnerabilities. Verified that the generated HTML was properly sanitized.

We're running this build on the server right now.

Reviewed at https://reviews.reviewboard.org/r/11554/
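For reviewers unfamiliar with the test approach described above, here is a minimal sketch of a table-driven regression check: each known payload is rendered and compared to the output we expect after sanitizing. This is not the code in this change. `Case`, `CASES`, `stand_in_render`, and `find_regressions` are hypothetical names, and `stand_in_render` uses the standard library's `html.escape` as a placeholder for the real Markdown and `bleach` pipeline so the sketch has no third-party dependencies.

```python
import html
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Case(NamedTuple):
    """One known attack payload and the output expected after sanitizing."""
    name: str
    source: str
    expected: str


def stand_in_render(source: str) -> str:
    # Hypothetical placeholder for the real Markdown + bleach pipeline.
    # It escapes all markup, which is stricter than an allowlist, but it
    # is enough to show how the comparison works.
    return html.escape(source, quote=True)


# Hypothetical payloads for illustration only.
CASES = [
    Case("script tag",
         "<script>alert(1)</script>",
         "&lt;script&gt;alert(1)&lt;/script&gt;"),
    Case("event handler",
         '<img src="x" onerror="alert(1)">',
         "&lt;img src=&quot;x&quot; onerror=&quot;alert(1)&quot;&gt;"),
    Case("plain text", "hello", "hello"),
]


def find_regressions(
    render: Callable[[str], str],
    cases: Iterable[Case],
) -> List[Tuple[Case, str]]:
    """Return (case, actual) pairs whose rendered output differs from expected."""
    failures = []
    for case in cases:
        actual = render(case.source)
        if actual != case.expected:
            failures.append((case, actual))
    return failures


if __name__ == "__main__":
    for case, actual in find_regressions(stand_in_render, CASES):
        print(f"REGRESSION in {case.name!r}: got {actual!r}")
```

Adding coverage for a newly discovered attack is then a one-line change: append another `Case` with the payload and its expected sanitized form. Passing the renderer in as a parameter keeps the table independent of whichever sanitizer backs it.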