Always treat diffs in commits as byte strings.
Review Request #917 — Created Oct. 17, 2017 and discarded
Information | |
---|---|
guest2117 | |
Review Board | |

Reviewers | |
---|---|
demo | |
aaa | |
`Commit.diff` used to be more than happy to accept any Unicode or byte strings thrown at it, and some code actually expected these to be Unicode strings, attempting to unconditionally encode the contents as a UTF-8 byte string. If a hosting service set the diff as a byte string, and there was Unicode content within the diff, this could lead to a crash.

Now, `Commit.diff` always stores a byte string, handling the encoding from Unicode if needed. Callers can and should now always treat this as a byte string.

This fixes a crash when posting existing commits from Bitbucket with Unicode content.

Testing Done:
Unit tests pass. Manually tested that the diff content no longer gets improperly re-encoded and crashes in the customer case.

Reviewed at https://reviews.reviewboard.org/r/9148/
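For illustration, the normalization described in this change might look roughly like the following. This is a minimal sketch and not the actual Review Board code: the class name `CommitSketch` and its constructor are hypothetical, and it is written for Python 3, where `str` is the Unicode type and `bytes` is the byte string type.

```python
class CommitSketch:
    """Hypothetical stand-in for a commit object (not Review Board's class)."""

    def __init__(self, diff=None):
        self._diff = None
        self.diff = diff  # Goes through the setter below.

    @property
    def diff(self):
        """Return the stored diff, which is always bytes or None."""
        return self._diff

    @diff.setter
    def diff(self, value):
        if isinstance(value, str):
            # Unicode input is encoded exactly once, at assignment time.
            value = value.encode('utf-8')
        elif value is not None and not isinstance(value, bytes):
            raise TypeError('diff must be a str, bytes, or None')

        self._diff = value


# Both forms of input end up stored as the same byte string.
from_unicode = CommitSketch(diff='+caf\u00e9\n')
from_bytes = CommitSketch(diff='+caf\u00e9\n'.encode('utf-8'))
assert from_unicode.diff == from_bytes.diff
```

Because the conversion happens in one place, callers never need to call `.encode()` on the diff themselves, which is the kind of repeated encoding step the description identifies as the source of the crash.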