Monday, July 23, 2007

Attribute-Based Cross-Site Scripting

A couple of weeks ago I posted sections from one of our WhiteHat customer newsletters that focused on HTTP Response Splitting. Newsletters are one way we keep customers informed of important industry trends and improvements to the Sentinel Service. Judging from the blog traffic and comments, it was well received. So this time I’ll highlight Attribute-Based Cross-Site Scripting, which Arian Evans (WhiteHat’s Director of Operations) has been spending a lot of R&D time getting to work properly. Enjoy.

New Vulnerability Detection
Attribute-Based Cross-Site Scripting is one of the hardest types of Cross-Site Scripting to find in an automated fashion. Today, no desktop scanner does a good job at this; most don't even attempt it because false-positives skyrocket – except for the WhiteHat Sentinel Service. Naturally.

WhiteHat Sentinel implemented our second-generation attribute injections last week. Many of you have seen new XSS attack vectors being pushed on your sites, and for quite a few it is a result of these tests. The example we most often push is sourcing in JavaScript via an injected STYLE attribute (an attack vector for Internet Explorer).

Attribute injection is when user-controlled data lands inside an HTML tag, specifically inside an attribute value, where notorious characters like “<” and “>” may not be required for XSS exploitation. For example:

HTTP GET request (not actual Sentinel test - this is an example for exploitation):

Will result in this example tag in the HTTP Response:

<td>
<a href="/index.cfm?sessionid=12345678901&hid="" STYLE="background-image: expression(alert('Is_XSS_HERE?'))">
<img src="" width="274" height="83" border="0">
</a>
</td>

This is a perfect example of an XSS vulnerability in which the attacker wouldn't need HTML tags or meta characters like < and >. All you need in this case is a double-quote, a colon, and some parentheses to begin your attack. From here the exploit can be carried out in many ways (e.g., malicious JavaScript). The ability to detect these issues accurately will grow exponentially with the advanced conditional logic currently being implemented into the Sentinel Service.
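To make the mechanics concrete, here is a minimal Python sketch of how such a tag can come about. The `render_link` template and the payload are assumptions made up for illustration (the payload is not the actual request or a Sentinel test); the point is only that an unencoded value dropped into a quoted attribute can close the quote and add an attribute of its own.

```python
def render_link(hid: str) -> str:
    """Hypothetical vulnerable template: hid is interpolated with no encoding."""
    return f'<a href="/index.cfm?sessionid=12345678901&hid={hid}">'


# Illustrative payload (an assumption): note that it contains no "<" or ">".
payload = "\" STYLE=\"background-image: expression(alert('Is_XSS_HERE?'))"

rendered = render_link(payload)
print(rendered)
```

The leading double-quote in the payload terminates the href value, so everything after it is parsed as a new STYLE attribute on the same tag.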

WhiteHat Website Vulnerability Management Practice Tips

Q. How do I stop an XSS attack that lands in an HTML tag?

A. For most attribute-based attacks to work, the attacker needs at least single or double-quotes. Double-quotes are what is most often needed – from what we see at WhiteHat. You could try escaping, removing, or substituting single and double-quotes on input.

Alternatively, you could encode any user-supplied data safely on output. This is the safest approach. Barring international-language sites, there are a minimum of four alternate encoding types for all Latin-ASCII code page characters: Unicode, Decimal, Hexadecimal, and Named. This can jump to 18 variants for something as simple as a double-quote, if you factor in international-language code pages.
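As a small illustration of the encoding variants, the Python sketch below lists three HTML character-reference forms of the double-quote and checks that each decodes back to the same character. It uses the standard library's `html.unescape`; the "Unicode" form is left out because its exact notation is not spelled out here.

```python
import html

# Three HTML character-reference spellings of the double-quote character.
DOUBLE_QUOTE_VARIANTS = {
    "named": "&quot;",
    "decimal": "&#34;",
    "hexadecimal": "&#x22;",
}

for kind, encoded in DOUBLE_QUOTE_VARIANTS.items():
    # html.unescape decodes character references like an HTML parser would.
    decoded = html.unescape(encoded)
    print(f"{kind:12} {encoded:8} -> {decoded!r}")
```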

Q. How do I encode my output safely?

A. If you encode double-quotes as their named-entity references, you will remove most of your attribute XSS issues. If you encode single-quotes using Decimal (works across the most browsers) or named-entity reference, this should solve the problem, as well (by breaking the initial escape sequence the attacker needs to take over the tag and begin scripting).
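A minimal sketch of that advice in Python follows. The function name `encode_attr` is hypothetical, and the sketch assumes the template always wraps the attribute value in quotes; it encodes double-quotes as the named entity and single-quotes as the decimal entity, plus `&`, `<`, and `>` for good measure.

```python
# Order matters: "&" is replaced first so later entities are not double-encoded.
_REPLACEMENTS = [
    ("&", "&amp;"),
    ('"', "&quot;"),  # named entity for the double-quote
    ("'", "&#39;"),   # decimal entity for the single-quote
    ("<", "&lt;"),
    (">", "&gt;"),
]


def encode_attr(value: str) -> str:
    """Encode untrusted text for use inside a quoted HTML attribute value."""
    for raw, encoded in _REPLACEMENTS:
        value = value.replace(raw, encoded)
    return value


untrusted = "\" STYLE=\"background-image: expression(alert('x'))"
print(f'<a href="/index.cfm?hid={encode_attr(untrusted)}">')
```

With the quotes encoded, the injected text stays inside the href value instead of closing it, which is the "breaking the initial escape sequence" effect described above.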

A nice reference page for more on entity-encoding values can be found here:

Q. What is this Unicode craziness you speak of?

A. A great place to start is here:


Alexander Berezhnoy said...

Fortify SCA, which I use, can also usually find such stuff.
What is more difficult to detect is the ugly practice of run-time JavaScript generation or (to be closer to the topic) run-time generation of style attribute values from user input.

Milan Cvejić said...

I am wondering, is there some way for this to work on Firefox?

I tested it, and it works only on IE

Jordan said...

Milan -- that particular example uses a trick of IE, but there are lots of other variants that would work in Firefox. For example:

Which produces:
<td>
<a href="/index.cfm?sessionid=12345678901&hid=""
<img src="" width="274" height="83" border="0">
</a>
</td>

That requires the user to mouse-over the link, but you could use a stylesheet to cause the link to fill the page, or use a number of other similar mechanisms.

For example, if the user-input is going into an image url, you can use the trick Jeremiah used in a previous post to force a bogus image URL, and then add an onError attribute which will be executed when the request for the image fails.

Andy, ITGuy said...

Jeremiah, Thanks for the info and also for giving tips on how to fix this problem. Too often people write about something and then leave it at that. We need to get the information out to others so it can be used.

Milan Cvejić said...

Jordan, thanks for the answer...

Jeremiah Grossman said...

@Buben Razuma, ... while we meant black box scanners predominantly, this is good to know just the same. Would you be able to provide some screen shots of that in action? Obfuscated or otherwise. That'd be really interesting.

@Milan, Jordan has it exactly right. While it's possible to exec JS from a style attribute in FF, it does so without a DOM, so it's pretty useless. The next step is to try some event handlers that the user will trigger. To improve the odds of mousing over, or something like that, you could update the size of the object via a style sheet or attribute so they can't help but mouse over by mistake. Or, as Jordan mentioned, if you land in an image tag, onerror is a great auto-fire mechanism, so it really all just depends.

@andy, sure thing! As a service provider the service wouldn't be of much value unless I provided solutions. :)

Anonymous said...


Just thought you should know that AppScan has had these kinds of XSS test variants for quite some time (years now).

Specifically, AppScan parses the returning HTTP response, analyzes the DOM, and figures out if the JavaScript code is interpreted properly by a browser. Not a big deal to automate as you have mentioned, if you have the right technology.

I've seen other scanners add these kinds of tests later on, but as you have said, they introduced many false positives, since they didn't analyze the HTML in the response properly, but rather used Regexps.
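To illustrate the difference between parsing the response and pattern-matching it, here is a rough Python sketch. It is not how AppScan or Sentinel works; the probe name `xssprobe123` and the class `AttrInjectionFinder` are made up for the example. The idea is to inject a harmless attribute-shaped probe and report a hit only if the HTML parser sees it as a real attribute name.

```python
from html.parser import HTMLParser

MARKER = "xssprobe123"  # hypothetical probe, sent as: " xssprobe123="1


class AttrInjectionFinder(HTMLParser):
    """Records tags where the probe shows up as an attribute NAME."""

    def __init__(self, marker: str):
        super().__init__()
        self.marker = marker
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            # The parser lowercases attribute names before calling us.
            if name == self.marker:
                self.hits.append(tag)


def find_injection(response_html: str) -> list:
    finder = AttrInjectionFinder(MARKER)
    finder.feed(response_html)
    finder.close()
    return finder.hits


vulnerable = '<a href="/index.cfm?hid=" xssprobe123="1">link</a>'
encoded = '<a href="/index.cfm?hid=&quot; xssprobe123=&quot;1">link</a>'

print(find_injection(vulnerable))  # probe parsed as an attribute
print(find_injection(encoded))     # probe stayed inside the href value
```

A plain substring or regexp search for `xssprobe123=` matches both responses, which is the kind of false positive described above; the parser-based check flags only the first.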

Anonymous said...

Hey Ory -- whilst Appscan may have these tests, I have at least three clients using your latest and greatest that are finding somewhere between none and a third of what we are finding.

This could of course be a biased sample, i.e., I wouldn't know for sure how effective your mechanism is unless I could run AppScan fully configured against our roughly 600 sites and compare the accuracy of detection by comparing our results and your results....

We have a new way of doing this coming down the pipe that should put our detection into the 99% range (is my gut feeling -- caveat: of course I've been wrong before :).

This all said... you certainly do a better job at detection for these types of issues than any other black box scanner I've seen besides our own Sentinel. Sadly, some of the other scanners are crazy-out-to-lunch right now. Some have actually gotten *worse* in the last year or two.

It's clear, when you see their interfaces and portals, that they put most of their dollars and resources into flashy reports and shiny, reflective buttons in the interface.

We have a little less of a "cute, shiny bauble" approach to feature development over here at WHS than some other folks appear to.

Anonymous said...

Watch out for untrusted content landing in attributes that are interpreted as URIs. Replacing characters with HTML entities doesn't prevent XSS there:


Anonymous said...

Apparently blogger didn't much like my HTML. Here's an example of the attack:

<a href="javascript:alert('xss')">foo</a>
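One possible way to handle this case is to check the URI scheme against an allow-list before the value is written into the attribute. The Python sketch below is an assumption-laden illustration: `is_safe_href` and the `ALLOWED_SCHEMES` policy are hypothetical, and it relies on the standard library's `urllib.parse.urlsplit` to extract the scheme.

```python
import html
import re
from urllib.parse import urlsplit

# Hypothetical policy: http, https, or no scheme at all (a relative URL).
ALLOWED_SCHEMES = {"http", "https", ""}

# Leading control characters and spaces (0x00-0x20), which browsers ignore.
_LEADING_JUNK = "".join(map(chr, range(0x21)))


def is_safe_href(value: str) -> bool:
    """Return True if the value's scheme is on the allow-list."""
    # Browsers also drop tab, CR and LF anywhere in a URL, so remove them first.
    cleaned = re.sub(r"[\t\r\n]", "", value).lstrip(_LEADING_JUNK)
    return urlsplit(cleaned).scheme.lower() in ALLOWED_SCHEMES


attack = "javascript:alert('xss')"

# Entity-encoding the four basic characters and quotes leaves the scheme intact.
print(html.escape(attack, quote=True))
print(is_safe_href(attack))              # scheme is "javascript"
print(is_safe_href("/index.cfm?hid=1"))  # relative URL, no scheme
```

The check is deliberately strict: anything with a scheme outside the list is rejected rather than repaired.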

Anonymous said...

That isn't a very good example. Properly encoding output would block that one. If you read our recommendations, it would solve for that, and other attribute-based injections.

Anonymous said...

Actually I should be more specific --

HTML Named Entity Encoding will not help you if you are using a class library that encodes the four basic chars ( " <> & ).

However, if you Decimal Entity encode the colon and the parentheses like I recommended, it's sure not going to get interpreted as a URI.

Problem solved.

Anonymous said...

But what if our payload is going into a tag, something like value="$X", and moreover " is being converted to &quot;?

If we pass 'xss', the result will be something like value="'xss'", which again will be useless as no execution can be done. No tag can be used and no event can be called unless the " encoding is bypassed.
