Recently, I spotted a remarkable anomaly on the leaves of a lime tree near my house. Consulting the almighty internet, I learned that it was “Eriophyes tiliae”, a gall mite, that had caught my eye. Or rather its chemically induced shielding, composed of plant tissue but controlled by the mite. I found it quite nifty that a parasite not only sucked sap from the leaf, but also exploited an attack vector to make the tree provide it with shelter and even concentrate resources for it to feed on.

At the same time, one of our clients did a security assessment on our system and revealed a Cross-Site Scripting (XSS) vulnerability in our HTML5 video player. Having encountered XSS before, I was not really impressed, but soon found out that there were similarities with the mite; more nifty exploits than I could think of and no immediate solution.

Standing on the shoulders of giants

I read what the experts had to say on the subject: hacker blogs, the utterances of security officers, and so on. Usually, the bottom-line advice was to “properly escape user input”.

Our problem, though, was a bit more complex, since we wanted to allow “bona fide” HTML to be passed in, for instance <strong>Yeah!</strong> for a media clip title. Therefore, simply escaping the HTML special characters (<, >, &, " and ') was not an option, because it would turn the legitimate markup into visible text as well.
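To make that concrete, here is a minimal sketch of the usual “escape everything” advice. The name escapeHtml and the lookup table are my own illustration, not code from our player.

```javascript
// Replacement table for the five characters that are special in HTML text and attributes.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical helper: escapes every special character in the input.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

// The wanted markup is neutralised together with any unwanted markup.
console.log(escapeHtml('<strong>Yeah!</strong>'));
// prints: &lt;strong&gt;Yeah!&lt;/strong&gt;
```

The clip title would show up with literal angle brackets instead of bold text, which is exactly what we did not want.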

And attackers have learned to circumvent simple removal of <script> strings by using tricks like <scr<script>ipt>.
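The trick is easy to reproduce. The sketch below uses a deliberately naive filter (naiveStripScriptTags is a made-up name, and the regular expression is my assumption of what such a filter typically looks like) that removes script tags in a single pass.

```javascript
// Deliberately naive filter: removes every literal <script> or </script> in one pass.
function naiveStripScriptTags(html) {
  return html.replace(/<\/?script>/gi, '');
}

// Crafted input: each tag is split in two by a sacrificial inner tag.
const crafted = '<scr<script>ipt>alert(1)</scr</script>ipt>';

// Removing the inner tags glues the outer halves back together.
console.log(naiveStripScriptTags(crafted));
// prints: <script>alert(1)</script>
```

Running the filter repeatedly until nothing changes would defeat this particular input, but it remains text manipulation, and attackers have plenty of other encodings to try.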

The point is that no matter how weirdly crafted the XSS attack string may be, by the time it is about to be inserted into the DOM it has to be HTML that the browser will parse. This is true even for multi-vector XSS attacks (attacks that rely on two or more insertions to form a single exploit).

Hence, parsing this HTML into a DOM tree and then traversing it, sanitizing each node, before actually inserting it into the document is a good idea. As of version 1.8, jQuery has a method ‘parseHTML’ that serves as the basis for our solution. By default (that is, unless its ‘keepScripts’ argument is set to true), it removes script nodes as a first line of defense. But that of course is not enough (although jQuery suggests parseHTML might become more strict in the future).
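A rough sketch of the parse-then-traverse idea follows. It is not our production code: the names parseUntrusted and walkElements are hypothetical, and passing an inert document from document.implementation.createHTMLDocument as the context is my own assumption, chosen so that parsed elements should not load resources or fire handlers before they have been cleaned. parseUntrusted needs a browser with jQuery 1.8 or later; walkElements only relies on nodeType and childNodes.

```javascript
// Parses untrusted HTML into detached DOM nodes (browser with jQuery 1.8+ only).
function parseUntrusted(html) {
  // Assumption: a separate, inert document keeps parsed elements away from the live page.
  const inertDocument = document.implementation.createHTMLDocument('');
  // Third argument is keepScripts; false drops <script> elements.
  // Older jQuery versions return null for empty input, hence the fallback.
  return jQuery.parseHTML(html, inertDocument, false) || [];
}

// Depth-first walk that calls visit(element) for every element node.
function walkElements(nodes, visit) {
  Array.from(nodes).forEach(function (node) {
    if (node.nodeType !== 1) {
      return; // skip text nodes, comments and the like
    }
    visit(node);
    walkElements(node.childNodes, visit);
  });
}
```

In a browser, walkElements(parseUntrusted(input), cleanElement) would hand each element to a cleaning function (cleanElement is a placeholder here) before the nodes are appended to the page.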

The reason is that HTML allows script in attributes as well: in event handler attributes like ‘onerror’ or ‘onload’, but also in ‘src’, ‘href’ and ‘action’ attributes through the ‘javascript:’ URL scheme, and indirectly through the ‘data:’ URL scheme and the obscure ‘srcdoc’ attribute. Emptying these attributes is the only safe approach.
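The attribute rules above could be sketched roughly like this. The function names and the attribute list are illustrative assumptions rather than our actual implementation, and the list only covers the attributes named in this post, so treat it as a starting point and not as a complete filter.

```javascript
// URL-valued attributes named above; an assumption, not an exhaustive list.
const URL_ATTRIBUTES = ['src', 'href', 'action'];

// Returns true if the attribute could carry script.
function isDangerousAttribute(name, value) {
  const attr = name.toLowerCase();
  if (attr.startsWith('on')) {
    return true; // event handlers: onerror, onload, ...
  }
  if (attr === 'srcdoc') {
    return true; // inline document for an iframe
  }
  if (URL_ATTRIBUTES.includes(attr)) {
    // Browsers tolerate whitespace and control characters around and inside the
    // scheme, so remove them all before testing. This errs on the strict side.
    const url = String(value).replace(/[\u0000-\u0020]/g, '').toLowerCase();
    return /^(javascript|data):/.test(url);
  }
  return false;
}

// Empties every dangerous attribute on a single element.
function emptyDangerousAttributes(element) {
  // Copy the attribute list first so that changes do not disturb the iteration.
  Array.from(element.attributes).forEach(function (attr) {
    if (isDangerousAttribute(attr.name, attr.value)) {
      element.setAttribute(attr.name, '');
    }
  });
}
```

Because the check runs on parsed attribute values, character references such as &#106; have already been decoded by the parser, which is the advantage of working on the DOM instead of on the raw string. Calling emptyDangerousAttributes on every element of the parsed tree, before insertion, completes the picture.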


The coders’ mantra “avoid manipulating XML as text” proves valuable again. That said, I’m not claiming at all that we completely secured the area, because you never can. We closed the holes that were pointed out to us (and a great many more), so we are pretty confident that the parasites won’t easily find new ones. But if they do, we’ll be able to respond quickly…