! Copyright (C) 2005, 2009 Daniel Ehrenberg
! See http://factorcode.org/license.txt for BSD license.
USING: help.markup help.syntax xml.data io strings ;

IN: xml

HELP: string>xml
{ $values { "string" string } { "xml" xml } }
{ $description "Converts a string into an " { $link xml }
    " tree for further processing." } ;

HELP: read-xml
{ $values { "stream" "an input stream" } { "xml" xml } }
{ $description "Exhausts the given stream, reading an XML document from it. The input should be a binary stream (one without an encoding), because the encoding is automatically detected." } ;

HELP: file>xml
{ $values { "filename" string } { "xml" xml } }
{ $description "Opens the given file, reads it in as XML, closes the file, and returns the corresponding XML tree. The encoding is automatically detected." } ;

{ string>xml read-xml file>xml } related-words

HELP: read-xml-chunk
{ $values { "stream" "an input stream" } { "seq" "a sequence of elements" } }
{ $description "Rather than parsing a whole document, as " { $link read-xml } " does, this word parses and returns a sequence of XML elements (tags, strings, etc.), i.e. a document fragment. This is useful for pieces of XML which may have more than one main tag. The encoding is not automatically detected, so a stream with an encoding (i.e. one which returns strings from " { $link read } ") should be used as input." }
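{ $examples
    "A minimal sketch, assuming the input is an in-memory string wrapped with " { $snippet "<string-reader>" } " from " { $vocab-link "io.streams.string" } ", which yields strings as this word requires. The sample fragment is made up for illustration and has two top-level tags:"
    { $code
        "USING: io.streams.string prettyprint xml ;"
        "\"<a>1</a><b>2</b>\" <string-reader> read-xml-chunk ."
    }
}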
{ $see-also read-xml } ;

HELP: each-element
{ $values { "stream" "an input stream" } { "quot" "a quotation ( xml-elem -- )" } }
{ $description "Parses the XML document and calls the quotation with each event (a tag piece, comment, processing instruction, directive or string element) on the stack as it is encountered. The quotation is fully responsible for handling the event properly. The encoding of the stream is automatically detected, so a binary input stream should be used." }
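{ $examples
    "A minimal sketch that prints every event of a document. The filename " { $snippet "example.xml" } " is hypothetical, and the file is opened with the " { $snippet "binary" } " descriptor from " { $vocab-link "io.encodings.binary" } " so that the encoding can be detected:"
    { $code
        "USING: io.encodings.binary io.files prettyprint xml ;"
        "\"example.xml\" binary <file-reader> [ . ] each-element"
    }
}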
{ $see-also read-xml } ;

HELP: pull-xml
{ $class-description "Represents the state of a pull-parser for XML. Has one slot, " { $snippet "scope" } ", which is a namespace containing all relevant state information." }
{ $see-also <pull-xml> pull-event pull-elem } ;

HELP: <pull-xml>
{ $values { "pull-xml" pull-xml } }
{ $description "Creates an XML pull-based parser which reads from " { $link input-stream } ", executing all initial XML commands to set up the parser." }
{ $see-also pull-xml pull-elem pull-event } ;

HELP: pull-elem
{ $values { "pull" "an XML pull parser" } { "xml-elem/f" "an XML tag, string, or f" } }
{ $description "Gets the next XML element from the given XML pull parser. Returns f upon exhaustion." }
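{ $examples
    "A minimal sketch of draining a pull parser until it returns f. The word name " { $snippet "print-elems" } " is hypothetical and only serves the illustration; the parser itself would be created with " { $link <pull-xml> } " while " { $link input-stream } " is bound to the document:"
    { $code
        "USING: kernel prettyprint xml ;"
        ": print-elems ( pull -- )"
        "    [ dup pull-elem dup ] [ . ] while 2drop ;"
    }
}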
{ $see-also pull-xml <pull-xml> pull-event } ;

HELP: pull-event
{ $values { "pull" "an XML pull parser" } { "xml-event/f" "an XML tag event, string, or f" } }
{ $description "Gets the next XML event from the given XML pull parser. Returns f upon exhaustion." }
{ $see-also pull-xml <pull-xml> pull-elem } ;

HELP: read-dtd
{ $values { "stream" "an input stream" } { "dtd" dtd } }
{ $description "Exhausts a stream, producing a " { $link dtd } " from the contents." } ;

HELP: file>dtd
{ $values { "filename" string } { "dtd" dtd } }
{ $description "Reads a file in UTF-8, converting it into an XML " { $link dtd } "." } ;

HELP: string>dtd
{ $values { "string" string } { "dtd" dtd } }
{ $description "Interprets a string as an XML " { $link dtd } "." } ;

{ read-dtd file>dtd string>dtd } related-words

ARTICLE: { "xml" "reading" } "Reading XML"
"The following words read XML documents, document fragments, and DTDs from strings, streams, and files:"
{ $subsection string>xml }
{ $subsection read-xml }
{ $subsection read-xml-chunk }
{ $subsection string>xml-chunk }
{ $subsection file>xml }
{ $subsection read-dtd }
{ $subsection file>dtd }
{ $subsection string>dtd } ;

ARTICLE: { "xml" "events" } "Event-based XML parsing"
"In addition to DOM-style parsing based around " { $link read-xml } ", the XML module also provides SAX-style event-based parsing. This uses many of the same data structures as DOM-style parsing, with the exception of the classes " { $link xml } " and " { $link tag } ", so the " { $vocab-link "xml.data" } " documentation may be useful in learning how to process documents in this way. Other useful words are:"
{ $subsection each-element }
{ $subsection opener }
{ $subsection closer }
{ $subsection contained }
"There is also pull-based parsing to complement the push-based parsing of SAX. This is probably easier to use and more logical. It uses the same parsing objects as the event-based style above, except that string elements are always wrapped in arrays, for example { \"\" }. Relevant pull-parsing words are:"
{ $subsection <pull-xml> }
{ $subsection pull-xml }
{ $subsection pull-event }
{ $subsection pull-elem } ;

ARTICLE: "xml" "XML parser"
"The " { $vocab-link "xml" } " vocabulary implements the XML 1.0 and 1.1 standards, converting strings of text into XML and vice versa. The parser checks for well-formedness but is not validating. There is only partial support for processing DTDs."
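"As a minimal sketch of that round trip, the following parses a string and writes it back out. The document string is made up for illustration, and " { $snippet "xml>string" } " is taken from " { $vocab-link "xml.writer" } ":"
{ $code
    "USING: io xml xml.writer ;"
    "\"<greeting>hello</greeting>\" string>xml xml>string print"
}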
{ $subsection { "xml" "reading" } }
{ $subsection { "xml" "events" } }
{ $vocab-subsection "Writing XML" "xml.writer" }
{ $vocab-subsection "XML parsing errors" "xml.errors" }
{ $vocab-subsection "XML entities" "xml.entities" }
{ $vocab-subsection "XML data types" "xml.data" }
{ $vocab-subsection "Utilities for processing XML" "xml.utilities" }
{ $vocab-subsection "Dispatch on XML tag names" "xml.dispatch" } ;