documentation/manual/en/module_specs/Zend_Markup-Parsers.xml

   1 <?xml version="1.0" encoding="UTF-8"?>
   2 <!-- Reviewed: no -->
   3 <sect1 id="zend.markup.parsers">
   4     <title>Zend_Markup Parsers</title>
   5
   6     <para>
   7         <classname>Zend_Markup</classname> currently ships with two parsers: a BBCode parser
   8         and a Textile parser.
   9     </para>
  10
  11     <sect2 id="zend.markup.parsers.theory">
  12         <title>Theory of Parsing</title>
  13
  14         <para>
  15             The <classname>Zend_Markup</classname> parsers are classes that convert text with
  16             markup into a token tree. Although the BBCode parser is used as the example here, the
  17             idea of the token tree is the same across all parsers. Consider the following piece
  18             of BBCode:
  19         </para>
  20
  21         <programlisting language="text"><![CDATA[
  22 [b]foo[i]bar[/i][/b]baz
  23 ]]></programlisting>
  24
  25         <para>
  26             The BBCode parser takes that value, breaks it apart, and creates the following
  27             tree:
  28         </para>
  29
  30         <itemizedlist>
  31             <listitem>
  32                 <para>[b]</para>
  33
  34                 <itemizedlist>
  35                     <listitem>
  36                         <para>foo</para>
  37                     </listitem>
  38
  39                     <listitem>
  40                         <para>[i]</para>
  41
  42                         <itemizedlist>
  43                             <listitem>
  44                                 <para>bar</para>
  45                             </listitem>
  46                         </itemizedlist>
  47                     </listitem>
  48                 </itemizedlist>
  49             </listitem>
  50
  51             <listitem>
  52                 <para>baz</para>
  53             </listitem>
  54         </itemizedlist>
  55
  56         <para>
  57             Notice that the closing tags are gone: they do not show up as content in the
  58             tree structure, because a closing tag is not part of the actual content. This
  59             does not mean that the closing tag is lost, however; it is stored inside the tag
  60             information for the tag itself. Also note that this is a simplified view of the
  61             tree. The actual tree contains more information, such as each tag's name and
  62             attributes.
  63         </para>
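
        <para>
            To make the tree structure more concrete, the following standalone sketch builds a
            comparable tree for the example above. It is not the code that
            <classname>Zend_Markup</classname> uses: <classname>SketchToken</classname>,
            <methodname>sketchParse()</methodname> and <methodname>sketchDump()</methodname> are
            hypothetical names invented for this illustration, and only plain '[name]' and
            '[/name]' tags without attributes are handled.
        </para>

        <programlisting language="php"><![CDATA[
<?php
// Hypothetical node class for illustration only; real tokens carry more data.
class SketchToken
{
    public $name;              // tag name, or null for the root node
    public $stopper  = null;   // text of the closing tag, once it has been seen
    public $children = array();

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Hypothetical parser: turns simple BBCode into a tree of SketchToken objects.
function sketchParse($input)
{
    $root    = new SketchToken(null);
    $current = $root;
    $parents = array(); // stack of ancestors of $current

    // Split the input into tags and text, keeping the tags themselves.
    $parts = preg_split('/(\[\/?[a-z]+\])/', $input, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $part) {
        if (preg_match('/^\[([a-z]+)\]$/', $part, $m)) {
            // Opening tag: add a child node and descend into it.
            $token = new SketchToken($m[1]);
            $current->children[] = $token;
            $parents[] = $current;
            $current   = $token;
        } elseif (preg_match('/^\[\/([a-z]+)\]$/', $part, $m)
            && $m[1] === $current->name
        ) {
            // Matching closing tag: store it on the tag itself, then ascend.
            $current->stopper = $part;
            $current = array_pop($parents);
        } else {
            // Anything else is plain text and becomes a leaf.
            $current->children[] = $part;
        }
    }

    return $root;
}

// Renders the tree as indented text, one node per line.
function sketchDump($node, $depth = 0)
{
    $out = '';
    foreach ($node->children as $child) {
        if ($child instanceof SketchToken) {
            $out .= str_repeat('  ', $depth) . '[' . $child->name . "]\n";
            $out .= sketchDump($child, $depth + 1);
        } else {
            $out .= str_repeat('  ', $depth) . $child . "\n";
        }
    }
    return $out;
}

echo sketchDump(sketchParse('[b]foo[i]bar[/i][/b]baz'));
]]></programlisting>

        <para>
            The dump mirrors the list above: 'foo' and '[i]' are nested under '[b]', 'bar' is
            nested under '[i]', and 'baz' sits at the top level. Each closing tag is kept on its
            own token (here in the hypothetical <varname>$stopper</varname> property) rather than
            appearing as a child.
        </para>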
  64     </sect2>
  65
  66     <sect2 id="zend.markup.parsers.bbcode">
  67         <title>The BBCode parser</title>
  68
  69         <para>
  70             The BBCode parser is a <classname>Zend_Markup</classname> parser that converts BBCode to
  71             a token tree. The syntax of all BBCode tags is:
  72         </para>
  73
  74         <programlisting language="text"><![CDATA[
  75 [name(=(value|"value"))?( attribute=(value|"value"))*]
  76 ]]></programlisting>
  77
  78         <para>
  79             Some examples of valid BBCode tags are:
  80         </para>
  81
  82         <programlisting language="text"><![CDATA[
  83 [b]
  84 [list=1]
  85 [code file=Zend/Markup.php]
  86 [url="http://framework.zend.com/" title="Zend Framework!"]
  87 ]]></programlisting>
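
        <para>
            As a rough illustration of this syntax, the sketch below splits a single opening tag
            into its name, optional value and attributes with a regular expression. The tag value
            is treated as optional, as the '[b]' example implies.
            <methodname>sketchParseTag()</methodname> is a hypothetical helper written for this
            example; it is not part of <classname>Zend_Markup</classname> and makes no claim to
            match how the real parser tokenizes its input.
        </para>

        <programlisting language="php"><![CDATA[
<?php
// Hypothetical helper for illustration only; not part of Zend_Markup.
function sketchParseTag($tag)
{
    // A value is either double-quoted, or a run without spaces, quotes or "]".
    $valuePattern = '(?:"([^"]*)"|([^\s"\]]+))';
    $tagPattern   = '/^\[([a-z]+)(?:=' . $valuePattern . ')?'
                  . '((?:\s+[a-z]+=' . $valuePattern . ')*)\]$/i';

    if (!preg_match($tagPattern, $tag, $m)) {
        return null; // not a valid opening tag under this sketch
    }

    // Quoted and unquoted values land in different groups; merge them.
    // An empty quoted value is indistinguishable from "no value" here.
    $tagValue = $m[2] . $m[3];

    $result = array(
        'name'       => $m[1],
        'value'      => ($tagValue === '') ? null : $tagValue,
        'attributes' => array(),
    );

    // Walk the " attribute=value" pairs that were captured as one string.
    preg_match_all('/\s+([a-z]+)=' . $valuePattern . '/i', $m[4], $pairs, PREG_SET_ORDER);
    foreach ($pairs as $pair) {
        $unquoted = isset($pair[3]) ? $pair[3] : '';
        $result['attributes'][$pair[1]] = $pair[2] . $unquoted;
    }

    return $result;
}

print_r(sketchParseTag('[code file=Zend/Markup.php]'));
print_r(sketchParseTag('[url="http://framework.zend.com/" title="Zend Framework!"]'));
]]></programlisting>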
  88
  89         <para>
  90             By default, all tags are closed by using the format '[/tagname]'.
  91         </para>
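
        <para>
            In practice the parser is usually reached through a renderer. The following is a
            minimal usage sketch, assuming Zend Framework is available on the include path:
            <methodname>Zend_Markup::factory()</methodname> creates a renderer (HTML unless another
            one is named) that wraps the requested parser, and <methodname>render()</methodname>
            parses the input and renders the resulting token tree.
        </para>

        <programlisting language="php"><![CDATA[
<?php
// Assumes the Zend Framework library directory is on the include path.
require_once 'Zend/Markup.php';

// With no renderer named, the factory pairs the BBCode parser with the HTML renderer.
$bbcode = Zend_Markup::factory('Bbcode');

// Parse the BBCode into a token tree and render that tree in one step.
echo $bbcode->render('[b]foo[i]bar[/i][/b]baz');
]]></programlisting>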
  92     </sect2>
  93
  94     <sect2 id="zend.markup.parsers.textile">
  95         <title>The Textile parser</title>
  96
  97         <para>
  98             The Textile parser is a <classname>Zend_Markup</classname> parser that converts Textile
  99             to a token tree. Because Textile does not have a uniform tag structure, the following
 100             table lists example tags:
 101         </para>
 102
 103         <table id="zend.markup.parsers.textile.tags">
 104             <title>List of basic Textile tags</title>
 105
 106             <tgroup cols="2" align="left" colsep="1" rowsep="1">
 107                 <thead>
 108                     <row>
 109                         <entry>Sample input</entry>
 110                         <entry>Sample output</entry>
 111                     </row>
 112                 </thead>
 113
 114                 <tbody>
 115                     <row>
 116                         <entry>*foo*</entry>
 117                         <entry><![CDATA[<strong>foo</strong>]]></entry>
 118                     </row>
 119
 120                     <row>
 121                         <entry>_foo_</entry>
 122                         <entry><![CDATA[<em>foo</em>]]></entry>
 123                     </row>
 124
 125                     <row>
 126                         <entry>??foo??</entry>
 127                         <entry><![CDATA[<cite>foo</cite>]]></entry>
 128                     </row>
 129
 130                     <row>
 131                         <entry>-foo-</entry>
 132                         <entry><![CDATA[<del>foo</del>]]></entry>
 133                     </row>
 134
 135                     <row>
 136                         <entry>+foo+</entry>
 137                         <entry><![CDATA[<ins>foo</ins>]]></entry>
 138                     </row>
 139
 140                     <row>
 141                         <entry>^foo^</entry>
 142                         <entry><![CDATA[<sup>foo</sup>]]></entry>
 143                     </row>
 144
 145                     <row>
 146                         <entry>~foo~</entry>
 147                         <entry><![CDATA[<sub>foo</sub>]]></entry>
 148                     </row>
 149
 150                     <row>
 151                         <entry>%foo%</entry>
 152                         <entry><![CDATA[<span>foo</span>]]></entry>
 153                     </row>
 154
 155                     <row>
 156                         <entry>PHP(PHP Hypertext Preprocessor)</entry>
 157
 158                         <entry>
 159                             <![CDATA[<acronym title="PHP Hypertext Preprocessor">PHP</acronym>]]>
 160                         </entry>
 161                     </row>
 162
 163                     <row>
 164                         <entry>"Zend Framework":http://framework.zend.com/</entry>
 165
 166                         <entry>
 167                             <![CDATA[<a href="http://framework.zend.com/">Zend Framework</a>]]>
 168                         </entry>
 169                     </row>
 170
 171                     <row>
 172                         <entry>h1. foobar</entry>
 173                         <entry><![CDATA[<h1>foobar</h1>]]></entry>
 174                     </row>
 175
 176                     <row>
 177                         <entry>h6. foobar</entry>
 178                         <entry><![CDATA[<h6>foobar</h6>]]></entry>
 179                     </row>
 180
 181                     <row>
 182                         <entry>!http://framework.zend.com/images/logo.gif!</entry>
 183
 184                         <entry>
 185                             <![CDATA[<img src="http://framework.zend.com/images/logo.gif" />]]>
 186                         </entry>
 187                     </row>
 188                 </tbody>
 189             </tgroup>
 190         </table>
 191
 192         <para>
 193             The Textile parser also wraps all tags in paragraphs. A paragraph ends with two
 194             newlines; if more tags follow, a new paragraph is started.
 195         </para>
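
        <para>
            A minimal usage sketch for the Textile parser, again assuming Zend Framework is on
            the include path. The input contains two blocks separated by an empty line, so each
            block is expected to end up in its own paragraph; the exact markup produced depends on
            the renderer and is not reproduced here.
        </para>

        <programlisting language="php"><![CDATA[
<?php
// Assumes the Zend Framework library directory is on the include path.
require_once 'Zend/Markup.php';

// Pair the Textile parser with the default HTML renderer.
$textile = Zend_Markup::factory('Textile');

// Two blocks separated by two newlines.
$input = "*foo* and _bar_\n\n??baz??";

echo $textile->render($input);
]]></programlisting>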
 196
 197         <sect3 id="zend.markup.parsers.textile.lists">
 198             <title>Lists</title>
 199
 200             <para>
 201                 The Textile parser also supports two types of lists: numbered lists, using the "#"
 202                 character, and bulleted lists, using the "*" character. An example of both lists:
 203             </para>
 204
 205             <programlisting language="text"><![CDATA[
 206 # Item 1
 207 # Item 2
 208
 209 * Item 1
 210 * Item 2
 211 ]]></programlisting>
 212
 213             <para>
 214                 The above generates two lists: the first numbered, the second bulleted. Inside
 215                 list items, you can use inline tags such as strong (*) and emphasis (_). Tags that
 216                 must start on a new line (such as 'h1') cannot be used inside lists.
 217             </para>
 218         </sect3>
 219     </sect2>
 220 </sect1>