<html>
<head>
<title>Internal design of khtml</title>
<style>
dt { font-weight: bold; }
</style>
</head>
<body bgcolor=white>
<h1>Internal design of khtml</h1>

<p>
This document gives a short overview of the internal design of the khtml
library. I've written it because the lib has gotten quite big, and it is hard at first to find your
way around the source code. This doesn't mean that you'll understand khtml after reading this
document, but it will hopefully make it easier for you to read the source code.
</p>
<p>
The library is built up out of several different parts. Basically, when you use the lib, you
create an instance of a KHTMLPart and feed data to it. That's more or less all you need to
know if you want to use khtml in another application. If you want to start hacking khtml,
here's a sketch of the objects that get constructed when, e.g., running testkhtml with
a URL argument.
</p>
<p>
In the following I'll assume that you're familiar with all the buzzwords used in current web
technology. In case you aren't, here's a more or less complete list of references:
</p>
<blockquote>
<p>
<b>Document Object Model (DOM):</b><br>
<a href="http://www.w3.org/DOM/">DOM Level 1 and 2</a><br>
At the moment we support DOM Level 2 except for the events model.
</p>
<p>
<b>HTML:</b><br>
<a href="http://www.w3.org/TR/html4/">HTML4 specs</a><br>
<a href="http://www.w3.org/TR/xhtml1/">xhtml specs</a><br>
We support almost all of HTML4 and xhtml.
</p>
<p>
<b>Cascading style sheets (CSS):</b><br>
<a href="http://www.w3.org/TR/REC-CSS2/">CSS2 specs</a><br>
We support almost all of CSS1, and most parts of CSS2.
</p>
<p>
<b>Javascript:</b><br>
<a href="http://msdn.microsoft.com/workshop/author/dhtml/reference/objects.asp">Microsoft javascript bindings</a><br>
<a href="http://docs.sun.com/source/816-6408-10/index.html">Netscape javascript reference</a><br>
Netscape's javascript bindings are outdated. We shouldn't follow them. Let's focus on getting the bindings
compatible with IE.<br>
<a href="http://mozilla.org/docs/dom/domref/">Mozilla JS/DOM reference</a>
</p>
</blockquote>

<p>
<a href="khtml_part.h">KHTMLPart</a> creates one instance of a
<a href="khtmlview.h">KHTMLView</a> (derived from QScrollView),
the widget showing the whole thing. At the same time a DOM tree
is built up from the HTML or XML found in the specified file.
<p>
Let me describe this with an example.
<p>
khtml makes use of the document object model (DOM) for storing the document
in a tree-like structure. Imagine some HTML like
<pre>
&lt;html&gt;
    &lt;head&gt;
        &lt;style&gt;
            h1 { color: red; }
        &lt;/style&gt;
    &lt;/head&gt;
    &lt;body&gt;
        &lt;h1&gt;
            some red text
        &lt;/h1&gt;
        more text
        &lt;p&gt;
            a paragraph with an
            &lt;img src="foo.png"&gt;
            embedded image.
        &lt;/p&gt;
    &lt;/body&gt;
&lt;/html&gt;
</pre>
<p>
In the following I'll show how this input is processed step by step to generate the visible output
you finally see on your screen. I describe things as if they happen one after the other,
to make the principle clearer. In reality, to get visible output on the screen as soon as possible,
all these things (from tokenization to building and laying out the rendering tree) happen
more or less in parallel.

<h2>Tokenizer and parser</h2>
<p>
The first thing that happens when you start parsing a new document is that a
DocumentImpl* (for XML documents) or an HTMLDocumentImpl* object gets
created by the Part (in begin() in khtml_part.cpp). A Tokenizer*
object is created as soon as DocumentImpl::open() is called by the part, also
in begin() (it can be either an XMLTokenizer or an HTMLTokenizer).
<p>
The XMLTokenizer uses the QXML classes in Qt to parse the document, and their SAX interface
to build khtml's DOM from it.
<p>
For HTML, the tokenizer is located in khtmltokenizer.cpp. The tokenizer takes the contents
of an HTML file as input and breaks it up into a linked list of
tokens. It recognizes HTML entities and HTML tags. Text between
start and end tags is handled differently for several tags. The differences
are in how spaces, linefeeds, HTML entities and other tags are
handled.
<p>
The tokenizer is completely state-driven and works character by character.
All text passed to the tokenizer is tokenized immediately. A complete
HTML file can be passed to the tokenizer as a whole, character by character
(not very efficient), or in blocks of any (variable) size.
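<p>
To make the state-driven, block-wise behaviour concrete, here is a minimal sketch in C++.
It is not the code from khtmltokenizer.cpp: the names <code>SketchTokenizer</code> and
<code>Token</code> are made up for illustration, and the sketch only separates tags from text,
ignoring entities, comments, attributes and everything else the real tokenizer deals with.
The only point is that all state lives in the object, so the input can be cut anywhere.
</p>
<pre>
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical token type; khtml's real tokens carry much more information.
struct Token {
    enum Kind { Text, Tag };
    Kind kind;
    std::string data;
};

// Character-driven tokenizer sketch. All state is kept between write()
// calls, so the input may arrive in blocks of any size.
class SketchTokenizer {
public:
    // Feed the next block of input; may be called any number of times.
    void write(const std::string&amp; chunk) {
        for (char c : chunk) {
            if (state_ == InText) {
                if (c == '&lt;') { flush(Token::Text); state_ = InTag; }
                else buffer_ += c;
            } else {  // InTag
                if (c == '&gt;') { flush(Token::Tag); state_ = InText; }
                else buffer_ += c;
            }
        }
    }

    // Signal the end of the input and emit any pending text.
    void finish() { if (state_ == InText) flush(Token::Text); }

    const std::vector&lt;Token&gt;&amp; tokens() const { return tokens_; }

private:
    enum State { InText, InTag };

    // Turn the buffered characters into a token; empty buffers are dropped.
    void flush(Token::Kind kind) {
        if (!buffer_.empty()) tokens_.push_back(Token{kind, buffer_});
        buffer_.clear();
    }

    State state_ = InText;
    std::string buffer_;
    std::vector&lt;Token&gt; tokens_;
};
</pre>
<p>
In this sketch, feeding <code>&lt;h1&gt;some red text&lt;/h1&gt;</code> in one call, or split at
arbitrary positions over several <code>write()</code> calls, yields the same three tokens.
</p>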
<p>
The HTMLTokenizer creates an HTMLParser, which
interprets the stream of tokens provided by the tokenizer
and constructs the tree of Nodes representing the document according
to the Document Object Model.

<h2>The DOM in khtml</h2>
<p>
Parsing the document given above gives the following DOM tree:

<pre>
HTMLDocumentElement
  |--> HTMLHeadElement
  |       \--> HTMLStyleElement
  |              \--> CSSStyleSheet
  \--> HTMLBodyElement
         |--> HTMLHeadingElement
         |      \--> Text
         |--> Text
         \--> HTMLParagraphElement
                |--> Text
                |--> HTMLImageElement
                \--> Text
</pre>
<p>
Actually, the classes mentioned above are the interfaces for accessing the
DOM. The actual data is stored in *Impl classes, which provide the implementation
for all of the elements mentioned above. So internally we have a tree
that looks like this:
<pre>
HTMLDocumentElementImpl*
  |--> HTMLHeadElementImpl*
  |       \--> HTMLStyleElementImpl*
  |              \--> CSSStyleSheetImpl*
  \--> HTMLBodyElementImpl*
         |--> HTMLHeadingElementImpl*
         |      \--> TextImpl*
         |--> TextImpl*
         \--> HTMLParagraphElementImpl*
                |--> TextImpl*
                |--> HTMLImageElementImpl*
                \--> TextImpl*
</pre>
<p>
We use a refcounting scheme to ensure that all the objects get deleted when
the root element gets deleted (as long as no interface class
holds a pointer to the implementation).
<p>
The interface classes (the ones without the Impl) are defined in the <code>dom/</code>
subdirectory and are not used by khtml itself at all. The only place they are used is in the
javascript bindings, which use them to access the DOM tree. The big advantage of
separating interface classes from implementation classes is that we can have several
interface objects pointing to the same implementation. This implements the
explicit sharing required by the DOM specs.
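<p>
The following C++ sketch shows the general idea of a thin interface object sharing a
refcounted implementation. It is only an illustration under simplifying assumptions:
<code>NodeSketch</code> and <code>NodeImplSketch</code> are hypothetical names, not khtml classes,
and the real classes have many more members and a different layout.
</p>
<pre>
#include &lt;string&gt;
#include &lt;utility&gt;

// Hypothetical implementation class: owns the data and a reference count.
class NodeImplSketch {
public:
    explicit NodeImplSketch(std::string name) : name_(std::move(name)) {}

    void ref() { ++refCount_; }
    void deref() { if (--refCount_ == 0) delete this; }
    int refCount() const { return refCount_; }

    const std::string&amp; name() const { return name_; }
    void setName(const std::string&amp; name) { name_ = name; }

private:
    ~NodeImplSketch() {}  // private: only deref() may delete the object
    std::string name_;
    int refCount_ = 0;
};

// Hypothetical interface class: a handle that shares the implementation.
class NodeSketch {
public:
    explicit NodeSketch(NodeImplSketch* impl) : impl_(impl) { if (impl_) impl_-&gt;ref(); }
    NodeSketch(const NodeSketch&amp; other) : impl_(other.impl_) { if (impl_) impl_-&gt;ref(); }
    NodeSketch&amp; operator=(const NodeSketch&amp; other) {
        if (other.impl_) other.impl_-&gt;ref();  // ref first: safe for self-assignment
        if (impl_) impl_-&gt;deref();
        impl_ = other.impl_;
        return *this;
    }
    ~NodeSketch() { if (impl_) impl_-&gt;deref(); }

    std::string name() const { return impl_ ? impl_-&gt;name() : std::string(); }
    void setName(const std::string&amp; name) { if (impl_) impl_-&gt;setName(name); }

private:
    NodeImplSketch* impl_;
};
</pre>
<p>
Copying a <code>NodeSketch</code> copies only the pointer, so a change made through one copy is
visible through all others, and the implementation object goes away when the last handle does.
</p>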
<p>
Another advantage is that, as the implementation classes are not exported, it gives us a lot
more freedom to make changes in the implementation without breaking binary compatibility.
<p>
You will find an almost one-to-one correspondence between the interface classes and the implementation
classes. In the implementation we have added a few more intermediate classes that cannot
be seen from the outside, for various reasons (to make the implementation of shared features easier,
or to reduce memory consumption).
<p>
In C++, you can access the whole DOM tree from outside khtml by using the interface classes.
For a description see the <a href="http://developer.kde.org/documentation/library/kdeqt/kde3arch/khtml/index.html">introduction to khtml</a> on <a href="http://developer.kde.org/">developer.kde.org</a>.
<p>
Two things have been omitted in the discussion above: the style sheet defined inside the
<code>&lt;style&gt;</code> element (as an example of a style sheet) and the image element
(as an example of an external resource that needs to be loaded). They are covered in the following
two sections.

<h2>CSS</h2>
<p>
The contents of the <code>&lt;style&gt;</code> element (in this
case the <code>h1 { color: red; }</code> rule) get passed to the
<a href="html/html_headimpl.h">HTMLStyleElementImpl object</a>. This object creates a
<a href="css/cssstylesheetimpl.h">CSSStyleSheetImpl object</a> and passes the
data to it. The <a href="css/cssparser.h">CSS parser</a> takes
the data and parses it into a DOM structure for CSS (similar to the one for
HTML, see also the DOM Level 2 specs). This is later used to define the
look of the HTML elements in the DOM tree.
<p>
Actually, "later" is relative: as we will see, this happens partly in parallel with
the building of the DOM tree.

<h2>Loading external objects</h2>
<p>
Some HTML elements (such as <code>&lt;img&gt;, &lt;link&gt;, &lt;object&gt;, etc.</code>) contain
references to external objects that have to be loaded. This is done by the
Loader and related classes (misc/loader.*). Objects that might need to load external objects
inherit from <a href="misc/loader_client.h">CachedObjectClient</a>, and can ask
the <a href="misc/loader.h">loader</a> (which also acts as a memory cache) to
download the object they need from the web.
<p>
Once the <a href="misc/loader.h">loader</a> has the requested object ready, it notifies the
<a href="misc/loader_client.h">CachedObjectClient</a>, and the client can
then process the received data.
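<p>
The request/notify pattern described above can be sketched in C++ roughly as follows. This is
an illustration only: <code>LoaderSketch</code>, <code>ClientSketch</code>,
<code>notifyFinished()</code> and <code>dataArrived()</code> are hypothetical names and do not
reflect the actual interface in misc/loader.h, and the network transfer is simulated by calling
<code>dataArrived()</code> by hand.
</p>
<pre>
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical client interface, playing the role of CachedObjectClient.
class ClientSketch {
public:
    virtual ~ClientSketch() {}
    virtual void notifyFinished(const std::string&amp; url, const std::string&amp; data) = 0;
};

// Hypothetical loader that also acts as a memory cache.
class LoaderSketch {
public:
    // Ask for an object. Cached objects are delivered at once,
    // otherwise the client is queued until the data arrives.
    void request(const std::string&amp; url, ClientSketch* client) {
        auto hit = cache_.find(url);
        if (hit != cache_.end()) {
            client-&gt;notifyFinished(url, hit-&gt;second);
            return;
        }
        pending_[url].push_back(client);
    }

    // Called when the (simulated) download has completed.
    void dataArrived(const std::string&amp; url, const std::string&amp; data) {
        cache_[url] = data;
        auto waiting = pending_.find(url);
        if (waiting == pending_.end()) return;
        std::vector&lt;ClientSketch*&gt; clients = waiting-&gt;second;
        pending_.erase(waiting);
        for (ClientSketch* client : clients) client-&gt;notifyFinished(url, data);
    }

private:
    std::map&lt;std::string, std::string&gt; cache_;
    std::map&lt;std::string, std::vector&lt;ClientSketch*&gt; &gt; pending_;
};

// Example client standing in for an image element.
class ImageClientSketch : public ClientSketch {
public:
    void notifyFinished(const std::string&amp;, const std::string&amp; data) override {
        received = data;
    }
    std::string received;
};
</pre>
<p>
A second client asking for the same URL after the data has arrived is served straight from the
cache in this sketch, without another download.
</p>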

<h2>Making it visible</h2>

<p>
Now, once we have the DOM tree and the associated style sheets and external objects, how
do we actually get the stuff displayed on the screen?
<p>
For this we have a rendering engine that is completely based on CSS. The first
thing that is done is to collect all style sheets that apply to the document
and create a nice list of style rules that need to be applied to the
elements. This is done in the <a href="css/cssstyleselector.h">CSSStyleSelector</a> class.
It takes the <a href="css/html4.css">default HTML style sheet</a> (defined in css/html4.css),
an optional user-defined style sheet, and all style sheets from the document,
and combines them into a list of parsed style rules (optimised for fast
lookup). The exact rules for how these style sheets are applied to HTML
or XML documents can be found in the CSS2 specs.
<p>
Once we have this list, we can get a <a
href="rendering/render_style.h">RenderStyle object</a>
for every DOM element from the <a
href="css/cssstyleselector.h">CSSStyleSelector</a> by calling
<code>styleForElement(DOM::ElementImpl *)</code>.
The style object describes in a compact form all the
<a href="css/css_properties.in">CSS properties</a>
that should get applied to the Node.
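<p>
As a rough C++ illustration of combining the three style sheets into one lookup structure,
consider the sketch below. It is heavily simplified and makes assumptions of its own: selectors
match on the element name only, and a later sheet simply overrides an earlier one, whereas the
real cascade in the CSS2 specs also involves specificity and <code>!important</code>. The names
<code>StyleSelectorSketch</code>, <code>RuleSketch</code>, <code>SheetSketch</code> and
<code>StyleSketch</code> are hypothetical.
</p>
<pre>
#include &lt;initializer_list&gt;
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical parsed rule: "tag { property: value; }".
struct RuleSketch {
    std::string tag;
    std::string property;
    std::string value;
};

typedef std::vector&lt;RuleSketch&gt; SheetSketch;
typedef std::map&lt;std::string, std::string&gt; StyleSketch;

class StyleSelectorSketch {
public:
    // Collect the rules once, in cascade order, indexed by tag for fast lookup.
    StyleSelectorSketch(const SheetSketch&amp; defaults,
                        const SheetSketch&amp; user,
                        const SheetSketch&amp; author) {
        for (const SheetSketch* sheet : {&amp;defaults, &amp;user, &amp;author})
            for (const RuleSketch&amp; rule : *sheet)
                rulesByTag_[rule.tag].push_back(rule);
    }

    // Apply all matching rules in order; later rules override earlier ones.
    StyleSketch styleForElement(const std::string&amp; tag) const {
        StyleSketch style;
        auto found = rulesByTag_.find(tag);
        if (found != rulesByTag_.end())
            for (const RuleSketch&amp; rule : found-&gt;second)
                style[rule.property] = rule.value;
        return style;
    }

private:
    std::map&lt;std::string, std::vector&lt;RuleSketch&gt; &gt; rulesByTag_;
};
</pre>
<p>
With a default sheet that makes <code>h1</code> bold and black and a document sheet containing
<code>h1 { color: red; }</code>, the style for <code>h1</code> in this sketch ends up bold and red.
</p>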
<p>
After that, a rendering tree gets built up. Using the style object, the
<a href="xml/dom_nodeimpl.h">DOM Node</a> creates an appropriate render object
(all these are defined in the rendering subdirectory) and adds it to the
rendering tree. This gives another tree-like structure that resembles the
DOM tree in its general structure, but might have some significant
differences too. First of all, so-called
 <a href="http://www.w3.org/TR/REC-CSS2/visuren.html#anonymous-block-level">anonymous boxes</a> (see the
 <a href="http://www.w3.org/TR/REC-CSS2/">CSS specs</a>) that
have no DOM counterpart might get inserted into the rendering tree to satisfy
CSS requirements. Second, the display property of the style affects which type
of rendering object is chosen to represent the current DOM object.

<p>
In the above example we would get the following rendering tree:
<pre>
RenderRoot*
  \--> RenderBody*
         |--> RenderFlow* (&lt;H1&gt;)
         |      \--> RenderText* ("some red text")
         |--> RenderFlow* (anonymous box)
         |      \--> RenderText* ("more text")
         \--> RenderFlow* (&lt;P&gt;)
                |--> RenderText* ("a paragraph with an")
                |--> RenderImage*
                \--> RenderText* ("embedded image.")
</pre>

<p>
A call to <a href="rendering/render_root.cpp">layout()</a> on the
<a href="rendering/render_root.h">RenderRoot</a> (the root of the rendering tree)
object causes the rendering tree to lay itself out in the available space
(width) given by the KHTMLView. After that, the drawContents() method of
KHTMLView can call RenderRoot->print() with appropriate parameters to actually
paint the document. This is not 100% correct when parsing incrementally, but
it is exactly what happens when you resize the document.

<p>
As you can see, the conversion to the rendering tree removed the head part of
the HTML code and inserted an anonymous render object around the string "more
text". For an explanation of why this is done, see the CSS specs.
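<p>
The anonymous box around "more text" can be illustrated with a small C++ sketch. It is a
simplified reading of the anonymous block box rule in the CSS2 specs, not khtml's rendering
code: when a block container has both block-level and inline children, each run of inline
children gets wrapped in an anonymous block. <code>BoxSketch</code> and
<code>wrapInlineRuns()</code> are hypothetical names.
</p>
<pre>
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical render box: a label, two flags and a list of children.
struct BoxSketch {
    std::string label;
    bool isBlock;
    bool isAnonymous;
    std::vector&lt;BoxSketch&gt; children;
};

// If block-level and inline children are mixed, wrap every run of inline
// children in an anonymous block so that all children are block-level.
std::vector&lt;BoxSketch&gt; wrapInlineRuns(const std::vector&lt;BoxSketch&gt;&amp; children) {
    bool hasBlock = false;
    for (const BoxSketch&amp; child : children) hasBlock = hasBlock || child.isBlock;
    if (!hasBlock) return children;  // inline content only: nothing to wrap

    std::vector&lt;BoxSketch&gt; result;
    bool runOpen = false;  // true while the last box in result collects inlines
    for (const BoxSketch&amp; child : children) {
        if (child.isBlock) {
            result.push_back(child);
            runOpen = false;
            continue;
        }
        if (!runOpen) {
            result.push_back(BoxSketch{"anonymous box", true, true, {}});
            runOpen = true;
        }
        result.back().children.push_back(child);
    }
    return result;
}
</pre>
<p>
Applied to the children of the body in the example (a heading, the text "more text" and a
paragraph), this sketch returns three block-level boxes, the middle one being anonymous and
containing the text.
</p>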

<h2>Directory structure</h2>

<p>
A short explanation of the subdirectories in khtml.
<dl>
<dt><a href="css/">css:</a>
<dd>Contains all the stuff relevant to the CSS part of DOM Level 2 (implementation classes only),
the <a href="css/cssparser.h">CSS parser</a>, and the stuff to create
RenderStyle objects out of Nodes and the CSS style sheets.
<dt><a href="dom/">dom:</a>
<dd>Contains the external DOM API (the DOM interface classes) for all of the DOM.
<dt><a href="ecma/">ecma:</a>
<dd>The javascript bindings to the DOM and khtml.
<dt><a href="html/">html:</a>
<dd>The HTML subpart of the DOM (implementation only), the HTML tokenizer and parser, and a class
that defines the DTD to use for HTML (used mainly in the parser).
<dt><a href="java/">java:</a>
<dd>Java related stuff.
<dt><a href="misc/">misc:</a>
<dd>Some misc stuff needed in khtml. Contains the image loader, some misc definitions and the
decoder class that converts the incoming stream to Unicode.
<dt><a href="rendering">rendering:</a>
<dd>Everything that's related to bringing a DOM tree with CSS declarations to the screen. Contains
the definition of the objects used in the rendering tree, the layout code, and the RenderStyle objects.
<dt><a href="xml/">xml:</a>
<dd>The XML part of the DOM implementation and the XML tokenizer.
</dl>

<h2>Exception handling</h2>
<p>
To save on library size, C++ exceptions are only enabled in the dom/ subdirectory,
since exceptions are mandated by the DOM API. In the rest of khtml's code,
we pass an error flag (usually called "exceptionCode"), and the class that
is part of dom/* checks this flag and throws the exception.
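<p>
In C++ the split between flag-passing implementation code and a throwing interface layer might
look like the sketch below. All names in it (<code>TextImplSketch</code>, <code>TextSketch</code>,
<code>DomExceptionSketch</code> and the error constants) are hypothetical stand-ins, not the
classes in dom/ or xml/. Only the "exceptionCode" out-parameter follows the convention named above.
</p>
<pre>
#include &lt;cstddef&gt;
#include &lt;stdexcept&gt;
#include &lt;string&gt;
#include &lt;utility&gt;

// Hypothetical error codes; 0 means "no error".
enum ErrorSketch { NoErrorSketch = 0, IndexSizeErrSketch = 1 };

// Implementation side: never throws, reports errors through a flag.
class TextImplSketch {
public:
    explicit TextImplSketch(std::string data) : data_(std::move(data)) {}

    std::string substringData(std::size_t offset, std::size_t count,
                              int&amp; exceptionCode) const {
        exceptionCode = NoErrorSketch;
        if (offset &gt; data_.size()) {
            exceptionCode = IndexSizeErrSketch;
            return std::string();
        }
        return data_.substr(offset, count);
    }

private:
    std::string data_;
};

// Hypothetical exception type thrown by the interface layer.
struct DomExceptionSketch : std::runtime_error {
    explicit DomExceptionSketch(int c)
        : std::runtime_error("DOM exception"), code(c) {}
    int code;
};

// Interface side: checks the flag and turns it into an exception.
class TextSketch {
public:
    explicit TextSketch(const TextImplSketch* impl) : impl_(impl) {}

    std::string substringData(std::size_t offset, std::size_t count) const {
        int exceptionCode = NoErrorSketch;
        std::string result = impl_-&gt;substringData(offset, count, exceptionCode);
        if (exceptionCode) throw DomExceptionSketch(exceptionCode);
        return result;
    }

private:
    const TextImplSketch* impl_;
};
</pre>
<p>
Only the thin wrapper needs to be compiled with exceptions enabled. Everything behind it can
stay exception-free and simply fill in the flag.
</p>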

<h2>Final words...</h2>
<p>
All of the above is meant to give you a quick introduction to the way khtml brings an HTML/XML file to the screen.
It is by no means complete or even 100% correct. I left out many topics, which I will perhaps add either on request
or when I find some time to do so. Let me name some of the missing things:
<ul>
<li>The decoder to convert the incoming stream to Unicode
<li>interaction with konqueror/applications
<li>javascript
<li>dynamic reflow and how to use the DOM to manipulate khtml's visual output
<li>mouse/event handling
<li>real interactions when parsing incrementally
<li>java
</ul>

Still, I hope that this short introduction will make it easier for you to get a first hold of khtml and the way it works.
<p>
Now, before I finish, let me add a small <b>warning</b> and some <b>advice</b> for all of you who plan to hack on khtml:
<p>
khtml is by now quite a big library, and it takes some time to understand how it works. Don't let yourself get frustrated
if you don't understand it immediately. On the other hand, of all the libraries that
get used a lot, it is by now probably the one with the biggest number of remaining bugs (even though it's sometimes hard to
know if some behavior is really a bug).
<blockquote>
Some parts of its code are, however, <b>extremely touchy</b> (especially the layout algorithms),
and making changes there (that might fix a bug on one web page) might introduce severe bugs.
All the people developing khtml have already spent huge amounts of time searching for such bugs,
which only showed up on some web pages and thus were found only a week after the change that
introduced them was made. This can be very frustrating for us, and we'd appreciate it if people
who are not completely familiar with khtml post changes touching these critical regions to kfm-devel
for review before applying them.
</blockquote>

<div style="margin-top: 2em; font-size: large;">
And now have fun hacking khtml.
<div style="margin-left: 10em; margin-bottom: 1em;">Lars</div>
</div>
</body>
</html>