Coerced to make Xerces think
More and more I’m thinking that XML is evil! This is the third part of my series on why, as a software engineer, it is very useful to think about the potentially dangerous combination of outdated libraries and XML.
I recently carried out a review of the dependency scanning results, which flagged the following:
Apache Xerces2 Java allows remote attackers to cause a denial of service (CPU consumption) via a crafted message to an XML service, which triggers hash table collisions.
Now, I had come across this previously but wasn’t able to find much information about how to create this crafted message until talking with members of the Equal Experts security practice led me to find the following two articles:
- https://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
- https://stackoverflow.com/q/8669946/227140
That was all the info I needed!
What’s a hash collision?
Simply put, a hash collision is when two different Strings have the same hash code. In Java 8, String.hashCode() starts from 0 and, for each character in turn, multiplies the running value by 31 and adds the character’s value.
So a string, long or short, is reduced to an integer value. HashMaps and HashSets use those integer values to organise the data structure.
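As a minimal sketch (not the JDK source itself), the documented formula s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] can be re-implemented in a few lines. The class and method names here are hypothetical, chosen only for illustration.

```java
// Sketch: re-implements the documented String.hashCode() formula
// using plain int arithmetic. Names are hypothetical.
class StringHashDemo {

    // Computes the same value as String.hashCode() for the given string.
    static int javaStringHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow simply wraps around
        }
        return h;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known colliding pair.
        System.out.println(javaStringHash("Aa") + " " + "Aa".hashCode());
        System.out.println(javaStringHash("BB") + " " + "BB".hashCode());
    }
}
```

Both lines print `2112 2112`: 65*31 + 97 and 66*31 + 66 come out to the same integer, so "Aa" and "BB" collide.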
Why are hash collisions bad?
If strings have the same hash code, a HashMap stores them in the same bucket, as a list. Finding one of them then means walking that list and comparing entries one by one, so HashMaps are only efficient if the hash codes are sufficiently different.
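To make the cost concrete, here is a toy chained hash set that counts how many key comparisons its inserts need. It is an assumption-laden sketch: it is neither java.util.HashMap nor the Xerces SymbolTable, just a plain "array of lists" table with hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy chained hash set: each bucket is a plain list, so a crowded
// bucket degrades to a linear scan. For illustration only.
class ChainedSetDemo {
    private final List<List<String>> buckets = new ArrayList<>();
    long comparisons = 0; // equals() calls made while inserting

    ChainedSetDemo(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    void add(String key) {
        int index = (key.hashCode() & 0x7FFFFFFF) % buckets.size();
        List<String> chain = buckets.get(index);
        for (String existing : chain) {
            comparisons++;
            if (existing.equals(key)) {
                return; // already present
            }
        }
        chain.add(key);
    }

    // Inserts the keys into a fresh 1024-bucket table and reports the comparison count.
    static long comparisonsFor(List<String> keys) {
        ChainedSetDemo set = new ChainedSetDemo(1024);
        for (String key : keys) {
            set.add(key);
        }
        return set.comparisons;
    }

    public static void main(String[] args) {
        List<String> colliding = Arrays.asList("AaAa", "AaBB", "BBAa", "BBBB");
        List<String> ordinary = Arrays.asList("ab", "cd", "ef", "gh");
        System.out.println("colliding: " + comparisonsFor(colliding));
        System.out.println("ordinary: " + comparisonsFor(ordinary));
    }
}
```

This prints `colliding: 6` and `ordinary: 0`. With n colliding keys the inserts cost n(n-1)/2 comparisons in total, which is where the quadratic blow-up comes from.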
How does this affect XML parsing?
Xerces stores “symbols” (i.e. XML elements, attributes, etc) in a SymbolTable (https://svn.apache.org/repos/asf/xerces/java/tags/Xerces-J_2_11_0/src/org/apache/xerces/util/SymbolTable.java) - which essentially is a HashMap.
The CCC article describes how forcing a HashMap implementation to deal with collisions can lead to the CPU being spiked for quite a while, which makes such things the ideal candidate for Denial of Service attacks.
The attack
Once I was armed with the knowledge about what hash collisions were and how they could be exploited, I set about creating a malicious XML file. I used the information from the Stack Overflow question to write a little piece of code that generates it.
The resulting XML file contains many elements, and each of the element names has the same hash code!
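The original generator is not reproduced here. As a sketch of the general idea, and assuming the well-known "Aa"/"BB" pair is used as the building block, colliding element names can be produced like this. The class, method, and `root` element names are hypothetical, and the example is kept deliberately tiny.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a generator for element names that all share one
// String.hashCode(). Not the author's code; names are hypothetical.
class CollidingXmlDemo {

    // "Aa" and "BB" hash to the same value, so any concatenation of
    // `blocks` such pairs collides too. Yields 2^blocks distinct names.
    static List<String> collidingNames(int blocks) {
        List<String> names = new ArrayList<>();
        names.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(names.size() * 2);
            for (String prefix : names) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            names = next;
        }
        return names;
    }

    // Wraps each name in an empty element under a single root element.
    static String toXml(List<String> names) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<root>\n");
        for (String name : names) {
            xml.append("  <").append(name).append("/>\n");
        }
        return xml.append("</root>\n").toString();
    }

    public static void main(String[] args) {
        // 2 blocks -> 4 names, just enough to see the shape of the file.
        System.out.print(toXml(collidingNames(2)));
    }
}
```

With two blocks the document contains the elements `<AaAa/>`, `<AaBB/>`, `<BBAa/>` and `<BBBB/>`, all of which share a single hash code.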
Attacking File Uploads
I found that anything that processed XML payloads (and was using Xerces 2.11.0) was vulnerable. By simultaneously uploading as few as 5 files of 2MB each, I was able to cause a CPU load of 100% for up to about a minute. Sustain this for long enough, and it might cause a rather inconvenient Denial of Service, probably without being caught by the traditional DDoS protections…
Now, applications that allow file uploads aren’t that common, so how bad could it be?
Attacking Play
I have been working with Play 2.5 a fair bit recently and found that it too was pulling in Xerces 2.11.0.
This means that something in play-2.5 is using an XML library that is vulnerable to the hash collision attack. Further digging led me to ContentTypes.scala: https://github.com/playframework/playframework/blob/2.5.x/framework/src/play/src/main/scala/play/api/mvc/ContentTypes.scala#L609-L641
That code is used by Play to parse any request. This means that even if a Play service is not designed to accept XML, by passing in a content type of “text/xml” or “application/xml” we can “coerce” the Play service to parse the payload as XML. The only restriction is that the maximum length of content that will be parsed is DefaultMaxTextLength (which maps to the play.http.parser.maxMemoryBuffer setting).
This means that I can POST XML to Play services even if they weren’t designed to accept XML!
In my case I found a service that was configured to allow bigger POST payloads than the default of 100K.
So I just fired off a small attack (having asked permission first, of course!): 10 requests in parallel. It wouldn’t matter whether the requests were successful or not, as Play is processing the XML either way…
And the following was the CPU: [figure: CPU usage during the attack]
The mitigation
The fix was simple - I found that there is a later version of Xerces that had an implementation of SymbolTable that changed the hashing algorithm to something other than the standard Java hashing if too many collisions occurred:
Compare:
- https://svn.apache.org/repos/asf/xerces/java/tags/Xerces-J_2_11_0/src/org/apache/xerces/util/SymbolTable.java
- https://svn.apache.org/repos/asf/xerces/java/tags/Xerces-J_2_12_0/src/org/apache/xerces/util/SymbolTable.java
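To see why switching hash functions helps, here is an illustration that is explicitly not the Xerces 2.12.0 implementation (read the linked source for that). It assumes the fallback is simply the same polynomial hash with a different multiplier, and shows that keys precomputed to collide under String.hashCode() stop colliding. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: swapping the hash function defeats keys that were
// precomputed against String.hashCode(). This is not the Xerces code.
class AlternateHashDemo {

    // Same polynomial shape as String.hashCode(), with a configurable multiplier.
    static int hash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }

    // Counts how many different hash values the keys produce.
    static int distinctHashes(List<String> keys, int multiplier) {
        Set<Integer> seen = new HashSet<>();
        for (String key : keys) {
            seen.add(hash(key, multiplier));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("AaAa", "AaBB", "BBAa", "BBBB");
        System.out.println("multiplier 31: " + distinctHashes(keys, 31) + " distinct hash(es)");
        System.out.println("multiplier 33: " + distinctHashes(keys, 33) + " distinct hash(es)");
    }
}
```

This prints 1 distinct hash for multiplier 31 and 4 for multiplier 33. An attacker who cannot predict which function the table will fall back to cannot prepare colliding names in advance.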
So, the fix is to make sure you use the following dependency (note also that Play >= 2.6 has updated the dependency):
"xerces" % "xercesImpl" % "2.12.0" withSources()
or (in Maven land) the same coordinates: groupId xerces, artifactId xercesImpl, version 2.12.0.
Also useful is the Maven Enforcer plugin, whose bannedDependencies rule can fail the build if Xerces 2.11.0 sneaks in transitively.
Moral of the Story
- Even if you do not design your service to process XML, it may be vulnerable to XML attack vectors
- Do not use outdated libraries
- Know what dependencies you are using
- XML is evil
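On knowing what you depend on: besides inspecting the build's dependency tree, a running JVM can be asked which Xerces-J it actually loaded. The sketch below uses reflection so it also runs when Xerces is absent. It assumes the Xerces-J class org.apache.xerces.impl.Version with its static getVersion() method is what is on the classpath; the wrapper class name is hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Sketch: ask whichever Xerces-J is on the classpath for its version string.
class XercesVersionCheck {

    // Returns the reported version, or empty if Xerces-J cannot be loaded.
    static Optional<String> xercesVersion() {
        try {
            Class<?> versionClass = Class.forName("org.apache.xerces.impl.Version");
            Method getVersion = versionClass.getMethod("getVersion");
            return Optional.ofNullable((String) getVersion.invoke(null));
        } catch (ReflectiveOperationException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(xercesVersion().orElse("Xerces-J not found on the classpath"));
    }
}
```

Logging this at startup makes it harder for a vulnerable version to slip in unnoticed via a transitive dependency.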