.\" $Vendor-Id: mdoc.3,v 1.55 2011/01/07 15:07:21 kristaps Exp $
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fa "struct mdoc *mdoc"
.Fa "const struct tbl_span *span"
.Fa "struct regset *regs"
.Fn mdoc_endparse "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Fa "struct mdoc *mdoc"
.Fn mdoc_reset "struct mdoc *mdoc"
library parses lines of
into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used to reset the parser for another input
sequence.
Its values are only used privately within the library.
.It Vt struct mdoc_node
.Sx Abstract Syntax Tree
return 0, calls to any function but
will raise an assertion.
Add a table span to the parsing stream.
Returns 0 on failure, 1 on success.
Allocates a parsing structure.
Always returns a valid pointer.
The pointer must be freed with
Reset the parser for another parse routine.
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
Free all resources of a parser.
The pointer is no longer valid after invocation.
Parse a NUL-terminated line of input.
The line must not contain its trailing newline.
Returns 0 on failure, 1 on success.
is modified by this function.
Signals that the parse is complete.
Returns 0 on failure, 1 on success.
Returns the first node of the parse.
Returns the document's parsed meta-data.
.It Va mdoc_macronames
An array of string-ified token names.
An array of string-ified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
.It BODY
\(<- mnode* [ENDBODY mnode*]
.It TEXT
\(<- [[:printable:],0x1e]*
.El
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
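.Pp
As a rough illustration of how an application might walk child and
sibling links in a tree of this shape, the following self-contained
sketch counts the nodes of one type.
The
.Vt struct xnode
type, its
.Va type ,
.Va child
and
.Va next
fields, the
.Vt enum xtype
constants and the
.Fn count_type
function are hypothetical stand-ins invented for this example; they are
not the library's own declarations.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types, named after the productions in the text. */
enum xtype { X_ROOT, X_BLOCK, X_HEAD, X_BODY, X_ELEMENT, X_TEXT };

/* Hypothetical stand-in node; NOT the library's struct mdoc_node. */
struct xnode {
	enum xtype	 type;
	struct xnode	*child;	/* first child, or NULL */
	struct xnode	*next;	/* next sibling, or NULL */
};

/*
 * Count the nodes of the given type among n, its following siblings
 * and all of their descendants.
 */
static size_t
count_type(const struct xnode *n, enum xtype type)
{
	size_t total = 0;

	assert(type <= X_TEXT);	/* reject values outside the enum */
	for (; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += count_type(n->child, type);
	}
	return total;
}
```
.Ed
.Pp
In this sketch, passing the root of a tree and a type yields the number
of nodes of that type anywhere beneath it.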
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, like in the following example:
.Bd -literal -offset indent
\&.Ao ao
\&.Bo bo ac
\&.Ac bc
\&.Bc end
.Ed
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
.Ed
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged :
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
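.Pp
The condition that makes a block badly nested, namely that it ends while
one of its child blocks is still open, can be pictured with a small stack.
The following sketch is only an illustration of that condition and makes
no claim about how the library detects it.
The
.Vt struct ev
type, the
.Dv MAXOPEN
limit and the
.Fn count_bad_closes
function are hypothetical names invented for this example.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

#define MAXOPEN 32	/* arbitrary limit chosen for this sketch */

/* One event: a block with the given id opens (open != 0) or closes. */
struct ev {
	int open;
	int id;
};

/*
 * Return how many closing events end a block that is not the
 * innermost open block, i.e. while a child block is still open.
 */
static int
count_bad_closes(const struct ev *evs, size_t n)
{
	int stack[MAXOPEN];
	int top = 0, bad = 0;
	int j, k;
	size_t i;

	for (i = 0; i < n; i++) {
		if (evs[i].open) {
			assert(top < MAXOPEN);
			stack[top++] = evs[i].id;
			continue;
		}
		/* Find the most recently opened block with this id. */
		for (j = top - 1; j >= 0; j--)
			if (stack[j] == evs[i].id)
				break;
		if (j < 0)
			continue;	/* stray close: ignored here */
		if (j != top - 1)
			bad++;		/* a child block is still open */
		/* Drop entry j; blocks opened after it stay open. */
		for (k = j; k < top - 1; k++)
			stack[k] = stack[k + 1];
		top--;
	}
	return bad;
}
```
.Ed
.Pp
Fed the sequence open Ao, open Bo, close Ao, close Bo, this sketch counts
one such close; with the two closes swapped it counts none.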
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example does not error-check nor free memory upon failure.
.Bd -literal -offset indent
struct regset regs;
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;	/* signed: getline() returns -1 at end of input */
int line;

bzero(&regs, sizeof(struct regset));
line = 1;
mdoc = mdoc_alloc(&regs, NULL, NULL);
buf = NULL;
alloc_len = 0;
while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
	/* strip the trailing newline before parsing */
	if (len && buf[len - 1] == '\en')
		buf[len - 1] = '\e0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}
if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");
.Ed
To compile this, execute
.Dl % cc main.c libmdoc.a libmandoc.a
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .