<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xi="http://www.w3.org/2001/XInclude"
<title>HTTP Protocol Removal Full Disclosure - Security - HTML Purifier</title>
<xi:include href="common-meta.xml" xpointer="xpointer(/*/node())" />
<meta name="description" content="Full disclosure security page detailing the HTTP protocol removal attack." />
<meta name="keywords" content="HTMLPurifier, HTML Purifier, HTML, filter, filtering, standards, compliant, 3.1.0, attack, full disclosure, http, xss, security" />
<xi:include href="common-header.xml" xpointer="xpointer(/*/node())" />
<h1 id="title">HTTP Protocol Removal Full Disclosure</h1>
An error in <code>HTMLPurifier_URI->validate()</code> allowed
an attacker to craft a specially formed <abbr>URI</abbr> that, once processed by HTML
Purifier, became an active JavaScript <abbr>URI</abbr>. If a user clicked on the malicious
link, or used a browser that automatically evaluates JavaScript <abbr>URI</abbr>s in
image tags, the attacker could execute arbitrary JavaScript in the context
of the website the <abbr>HTML</abbr> was served on.
This vulnerability was reported via full disclosure by
<a href="http://sla.ckers.org/forum/read.php?14,21598#msg-21604">Gareth Heyes</a>,
and brought to the attention of the vendor by CrYpTiC_MauleR.
No active exploits are currently known.
This vulnerability was fixed in HTML Purifier 3.1.0 and 2.1.4. No hot-patch
is currently available.
<h2 id="Details">Details</h2>
According to <a href="http://tools.ietf.org/html/rfc3986#page-37">RFC 3986</a>,
a relative <abbr>URI</abbr> with the same scheme name as the
base <abbr>URI</abbr> is discouraged, but allowed for backwards compatibility.
As HTML Purifier's goal is to be standards-compliant in all aspects
of its output, HTML Purifier converts such <abbr>URI</abbr>s to their
correct form by removing the scheme. Thus, <code>http:dir/dir2</code>
becomes <code>dir/dir2</code>.
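<p>The scheme removal described above can be sketched roughly as follows. This is
an illustrative Python model, not HTML Purifier's actual PHP code: the names
<code>split_scheme</code> and <code>strip_redundant_scheme</code>, the simplified
scheme pattern, and the default base scheme of <code>http</code> are all assumptions
made for the example.</p>

```python
import re

# Simplified RFC 3986 scheme pattern: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")


def split_scheme(uri):
    """Return (scheme, rest); scheme is None when the URI has no scheme."""
    match = SCHEME_RE.match(uri)
    if match is None:
        return None, uri
    return match.group(1).lower(), uri[match.end():]


def strip_redundant_scheme(uri, base_scheme="http"):
    """Drop the scheme from a relative reference that repeats the base scheme.

    Hypothetical helper: only references without an authority (no "//"
    after the scheme) are rewritten; absolute URIs are returned unchanged.
    """
    scheme, rest = split_scheme(uri)
    if scheme == base_scheme and not rest.startswith("//"):
        return rest
    return uri


print(strip_redundant_scheme("http:dir/dir2"))         # dir/dir2
print(strip_redundant_scheme("http://example.com/a"))  # http://example.com/a
```

<p>Note that the helper returns the remainder of the string as-is, which is
exactly the kind of step that needs care if the remainder is later read as a
<abbr>URI</abbr> again.</p>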
Doing this bypasses HTML Purifier's safeguards against JavaScript
<abbr>URI</abbr>s. Normally, a URI is parsed
and its scheme extracted from the original. Thus, a normal
<code>javascript:xss()</code> is identified as having a <code>javascript</code>
scheme and is removed. Common bypasses of this check, such as
<code>java<strong>\n</strong>script</code>, are avoided because HTML Purifier
does not find the scheme in its list of allowed schemes. However, once
parsing and this initial scheme check are performed, parsing is not
Removal of the scheme causes a URI like <code>http:javascript:xss()</code>
to become <code>javascript:xss()</code>, and now <code>javascript</code> is
the new scheme, although in the original, <code>javascript:xss()</code> was
<td>javascript:xss()</td>
<td>javascript:xss()</td>
The appropriate fix can be determined by figuring out how to convert the
last column into a <abbr>URI</abbr> that will be parsed into the same form.
Simple concatenation doesn't work; the key is percent-encoding
the path. Instead of <code>javascript:xss()</code>, <code>javascript%3Axss()</code>
HTML Purifier's fix also percent-encodes any other reserved character in
each segment of a URI. This was actually a previously identified area
of relaxed standards compliance, and strictly enforcing the rules eliminated
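<p>To make the attack and the fix concrete, here is a minimal Python sketch of
both behaviours. It is a model under stated assumptions, not HTML Purifier's
implementation: <code>naive_filter</code>, <code>fixed_filter</code>, the
<code>ALLOWED_SCHEMES</code> set and the <code>PATH_SAFE</code> character set are
hypothetical names and choices made for illustration.</p>

```python
import re
from urllib.parse import quote

SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")
ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}  # illustrative list only

# Characters left literal in the path: "/", the RFC 3986 sub-delims, "@" and "%".
# ":" is deliberately absent, so it is always encoded as %3A.
# Assumption: existing percent escapes are passed through without validation.
PATH_SAFE = "/!$&'()*+,;=@%"


def split_scheme(uri):
    """Return (scheme, rest); scheme is None when the URI has no scheme."""
    match = SCHEME_RE.match(uri)
    if match is None:
        return None, uri
    return match.group(1).lower(), uri[match.end():]


def naive_filter(uri, base_scheme="http"):
    """Model of the vulnerable behaviour: check the scheme once, then strip it."""
    scheme, rest = split_scheme(uri)
    if scheme is not None and scheme not in ALLOWED_SCHEMES:
        return None  # a plain javascript:xss() is rejected here
    if scheme == base_scheme and not rest.startswith("//"):
        return rest  # the remainder is emitted without being checked again
    return uri


def fixed_filter(uri, base_scheme="http"):
    """Model of the fix: percent-encode the path before emitting it."""
    scheme, rest = split_scheme(uri)
    if scheme is not None and scheme not in ALLOWED_SCHEMES:
        return None
    if scheme == base_scheme and not rest.startswith("//"):
        return quote(rest, safe=PATH_SAFE)
    return uri


print(naive_filter("javascript:xss()"))       # None
print(naive_filter("http:javascript:xss()"))  # javascript:xss()
print(fixed_filter("http:javascript:xss()"))  # javascript%3Axss()
```

<p>In this sketch, the output of <code>fixed_filter</code> no longer parses as
having a scheme, because the colon that would have ended the scheme name is now
<code>%3A</code>.</p>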
<h2 id="History">History</h2>
The vulnerability was reported on March 25, 2008, although not directly to
the vendor. A patch was committed to the public repository on May 13, 2008,
ostensibly as a <q>revamp [of] URI handling of percent encoding and validation.</q>
HTML Purifier 3.1.0 was released on May 18, 2008. This was the first security
vulnerability in HTML Purifier's core, and the second in all of HTML Purifier's
We would have strongly preferred that Gareth Heyes contact us through
private channels before publicly disclosing the vulnerability. We actually
did not realize that the post was illustrating vulnerabilities in
HTML Purifier until CrYpTiC_MauleR asked why the exploit worked on
May 13, 2008 (an <code>http:javascript:</code> URI doesn't actually work by itself; HTML Purifier
must munge off the http scheme to activate the attack). This accounts in
part for the large gap between the first disclosure
and the committing of a fix. Still, we greatly appreciate Gareth Heyes' report
and sincerely hope that he will continue to help weed out bugs in HTML Purifier.
We apologize for not crediting him immediately in the changelog.
Full disclosure is generally a good idea, just not before the vendor
has gotten a chance to release a fix (please don't be afraid to use it to
light a fire under our butts and get a security bug fixed). We have therefore released
this document along with the next point release of HTML Purifier, hopefully
having given projects and end users enough time to upgrade their installations.
We hope to do this for all future vulnerabilities in HTML Purifier, especially
for the two which were fixed in the most recent point release.