Emacs NYC talk
1 %% honey-demo.tex - documentation for honey.el
3 %% Copyright (C) 2010 Raymond S. Puzio
5 %% This program is free software: you can redistribute it and/or modify
6 %% it under the terms of the GNU Affero General Public License as published by
7 %% the Free Software Foundation, either version 3 of the License, or
8 %% (at your option) any later version.
10 %% This program is distributed in the hope that it will be useful,
11 %% but WITHOUT ANY WARRANTY; without even the implied warranty of
12 %% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 %% GNU Affero General Public License for more details.
15 %% You should have received a copy of the GNU Affero General Public License
16 %% along with this program. If not, see <http://www.gnu.org/licenses/>.
19 \def\comment{;\it}
20 {\obeyspaces\gdef {\ }}
21 {\catcode`;=\active
22 \gdef\lisp{\everypar={\tt}%
23 \catcode`\#=12\catcode`;=\active\let;=\comment%
24 \tt\obeyspaces\obeylines}}%
25 %\openout1=honey-demo.el
26 \def\boxit#1{\hbox{\vbox{\hrule{\hbox{\vrule\kern2pt
27 \vbox{\kern2pt\hbox{#1}\kern2pt}\kern2pt\vrule}}\hrule}\kern-2.5pt}}
29 \centerline{\bf Higher Order NEsted Yarnknotter}
30 \centerline{\sl An introduction by example}
32 \beginsection Introduction
34 This document introduces the HONEY platform for networks by working
35 through examples, starting with the basics and gradually working up to
36 advanced features. ``HONEY'' stands for ``Higher Order NEsted
37 Yarnknotter'' and is a system for creating, maintaining, and
38 interacting with higher-order nested semantic networks. By ``higher
39 order'', it is meant that links can point to other links. By
40 ``nested'', it is meant that the nodes which make up a network may
41 contain other networks, which themselves may have nodes containing yet
42 other networks, etc., ad nauseam or ad infinitum, whichever comes
43 first. In addition, all the links are bidirectional and no
44 fundamental distinction is made between links and nodes. This
45 sophistication makes the package suitable as a platform to construct
46 knowledge bases for such advanced applications as hyperreal
47 dictionaries.
49 For convenience and flexibility, this package has been designed in a
50 modular and extensible fashion. The basic functionality is provided
51 by a back-end which manages the data of the network. In the
52 implementation described here, this data is stored in hash tables, but
53 one could instead have it stored in, say, a relational database or a
54 file system by writing a suitable back-end and using that instead.
55 Higher-level access to the data is mediated by a handful
56 of primitive commands and, as long as these commands behave similarly,
57 it does not matter to the end user how they are implemented or how the
58 data is represented internally.
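
The separation between primitive commands and storage can be sketched in a few lines of Python. This is only an illustrative model: the class and method names below are invented for the sketch and are not the interface of honey.el. Two back ends store articles differently, yet a routine written purely against the primitives gives the same answer with either.

```python
from typing import Any, Dict, List, Protocol, Tuple


class Backend(Protocol):
    """Hypothetical primitive interface; these names are not honey.el's."""

    def put(self, source: int, sink: int, content: Any) -> int: ...
    def get(self, uid: int) -> Tuple[int, int, Any]: ...


class DictBackend:
    """Keeps articles in an in-memory dict (a stand-in for hash tables)."""

    def __init__(self) -> None:
        self._articles: Dict[int, Tuple[int, int, Any]] = {}
        self._next_uid = 0

    def put(self, source: int, sink: int, content: Any) -> int:
        uid = self._next_uid
        self._next_uid += 1
        self._articles[uid] = (source, sink, content)
        return uid

    def get(self, uid: int) -> Tuple[int, int, Any]:
        return self._articles[uid]


class LogBackend:
    """Keeps articles in an append-only list: same interface, other storage."""

    def __init__(self) -> None:
        self._log: List[Tuple[int, int, Any]] = []

    def put(self, source: int, sink: int, content: Any) -> int:
        self._log.append((source, sink, content))
        return len(self._log) - 1

    def get(self, uid: int) -> Tuple[int, int, Any]:
        return self._log[uid]


def sink_content(backend: Backend, uid: int) -> Any:
    """Higher-level routine that touches the data only via the primitives."""
    _, sink, _ = backend.get(uid)
    return backend.get(sink)[2]


def demo(backend: Backend) -> Any:
    first = backend.put(0, 0, "mammal")    # receives uid 0 and points at itself
    second = backend.put(0, first, None)   # an article whose sink is the first
    return sink_content(backend, second)
```

Calling the demo with either back end returns the same content, which is the property the end user relies on.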
60 Since these primitive commands are, well, primitive, restricting
61 oneself to them would be quite a tedious way to interact with the
62 knowledge base except for basic maintenance operations. Thus, on top
63 of the built-in layer, one would normally add a layer or two of
64 application packages which accomplish more sophisticated tasks such as
65 rendering documents, updating links en masse, or implementing different
66 types of documents as semantic nets. In the interests of integrity of
67 the knowledge base and modularity with respect to its representation,
68 these routines should not access the underlying databases directly but
69 call upon the primitive access functions to do their bidding. In a
70 finished application, one might even have several layers, say a middle
71 end which implements layered services upon the database and a
72 front-end which provides a user-friendly interface to these services.
74 The basic unit from which we will construct our networks is called an
75 ``article''. Articles are data objects which serve to encode both
76 links and nodes and are characterized by the following components:
77 \item{$\bullet$} Identifier
78 \item{$\bullet$} Source
79 \item{$\bullet$} Sink
80 \item{$\bullet$} Content
82 Each article has a unique identifier which the computer assigns to it
83 when it is entered into the database. This identifier is used by the
84 access functions to refer to the article for subsequent operations.
85 Since talking about, say, article 12 pointing to article 5 is not
86 intuitive for most users, the system also has a mechanism for
87 assigning names to articles so that one could instead refer to
88 ``walrus'' pointing to ``mammal''.
90 Source and sink are pointers to other articles and content is a place
91 in which to store a lisp object, which might be some text, or an
92 expression, or maybe a number or maybe something else depending on
93 what one intends to represent with one's network and how one intends to
94 use it. In particular, the content of an article can even be another
95 network constructed out of more articles.
97 In order to provide a base for building up networks, there is a
98 special article called ``ground''. It is the
99 zero object of our system; its source and sink are both itself and its
100 content is nil. To enter an article which is meant as a node, we set
101 its source and its sink to ground. We can also set the source to
102 ground and sink to a non-ground article or vice-versa; this provides a
103 means for creating ``stickies'' which attach directly to articles, a
104 construction for which we shall find use from time to time.
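
The distinction between nodes, links, and stickies follows entirely from which ends of an article rest on ground. The Python sketch below illustrates that rule; it is a model for exposition rather than code from honey.el, and it assumes that ground has identifier 0 (as the examples later on suggest). The article contents, including the note attached to the walrus, are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any

GROUND = 0  # assumed uid of the ground article


@dataclass
class Article:
    uid: int
    source: int
    sink: int
    content: Any = None


def kind(article: Article) -> str:
    """Classify an article by which of its ends rest on ground."""
    if article.uid == GROUND:
        return "ground"
    grounded_source = article.source == GROUND
    grounded_sink = article.sink == GROUND
    if grounded_source and grounded_sink:
        return "node"
    if grounded_source or grounded_sink:
        return "stickie"
    return "link"


ground = Article(GROUND, GROUND, GROUND)          # points to itself, content nil
walrus = Article(2, GROUND, GROUND, "walrus")     # both ends grounded: a node
mammal = Article(3, GROUND, GROUND, "mammal")
isa = Article(4, walrus.uid, mammal.uid)          # neither end grounded: a link
note = Article(5, GROUND, walrus.uid, "has tusks")  # one end grounded: a stickie
```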
106 This system of articles, each of which points to two other articles, and
107 a ground article is similar to the cons cells of lisp, but expands the
108 paradigm in two important ways --- the cells carry content and the
109 links are bidirectional. These differences allow the system under
110 consideration to accomplish things which would not be feasible with
111 plain old s-expressions.
113 Bidirectional linking means that the links are stored and accessed in
114 a manner which makes it no more costly to find and traverse links in
115 the backward than in the forward direction. This is what makes it
116 practical to make links as mentioned above and done explicitly below.
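
One common way to make backward traversal as cheap as forward traversal is to index every article under both its source and its sink. The Python sketch below shows that idea; it is an assumption about how such a scheme could work, not a description of how honey.el stores its links, and the identifiers used are arbitrary.

```python
from collections import defaultdict


class LinkIndex:
    """Indexes every article under both its source and its sink."""

    def __init__(self):
        self.forward = defaultdict(set)   # source uid -> uids of articles leaving it
        self.backward = defaultdict(set)  # sink uid -> uids of articles arriving at it

    def add(self, uid, source, sink):
        self.forward[source].add(uid)
        self.backward[sink].add(uid)

    def leaving(self, uid):
        """Articles whose source is the given uid (forward direction)."""
        return sorted(self.forward.get(uid, ()))

    def arriving(self, uid):
        """Articles whose sink is the given uid (backward direction)."""
        return sorted(self.backward.get(uid, ()))


index = LinkIndex()
index.add(4, source=2, sink=3)   # a link from article 2 to article 3
index.add(8, source=6, sink=2)   # another article pointing at article 2
```

Both directions are answered by a single table lookup, so asking what points at an article costs no more than asking what it points to.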
118 Storing content in articles is what allows us to construct nodes.
119 While, in lisp, {\lisp(cons nil nil)} is certainly valid, it's not
120 terribly useful since it doesn't carry much information. However,
121 when we can attach information to a cell, then each such object is
122 able to serve as a distinct carrier of information and can be quite
123 useful. In the case of links, this room for extra information can be
124 used to do things like, say, specify what part of a story is being
125 considered in a link to a comment on a story.
127 \beginsection Getting Started
129 In this section, we will show how the system works by starting up an
130 instance, entering a simple example, and saving it.
132 The first step is to load the Honey package. We fire up Emacs in
133 lisp mode and type in the following command:
134 \smallskip
135 {\lisp
136 (load-file "honey.el")
137 $\Rightarrow$ t}
138 \smallskip\noindent
139 Then we need to make a network with which to work. We can do this
140 as follows:
141 \smallskip
142 {\lisp
143 (set-net (new-net root-level))
144 $\Rightarrow$
145 (*network*
147 #<hash-table 'equal nil 2/65 0xa770c68>
148 #<hash-table 'equal nil 1/65 0xa2f18c0>
149 #<hash-table 'equal nil 1/65 0xa3fd678>
150 #<hash-table 'equal nil 2/65 0xa710ad0>
151 #<hash-table 'equal nil 2/65 0xa74eec8> 0)}
152 \smallskip\noindent
153 The command ``{\tt new-net}'' sets up the database for a new network
154 and initializes it. The argument ``{\tt root-level}'' means that this
155 network resides at the fundamental level of the system as opposed to
156 residing within a node of some other network. Finally, the command
157 ``{\tt set-net}'' selects this net as the one to which the commands
158 for entering and viewing data will apply. The output is the internal
159 representation of this network, which ``{\tt set-net}'' has passed
160 through for possible use by some other function.
162 Now, we can start entering data. For our first example, we begin with
163 the following basic network of two nodes connected by a link which
164 expresses the fact that walruses are mammals:
165 $$
166 \matrix{\lower3pt\boxit{\tt walrus} &
167 \hbox to 40pt{\rightarrowfill} &
168 \lower3pt\boxit{\tt mammal}}
169 $$
170 As we proceed, we will expand this to something non-trivial but first
171 the fundamentals.
173 In the example, there are two nodes, ``walrus'' and ``mammal''. In
174 our system, nodes are represented by articles whose source and sink
175 are both ``ground''. To construct these nodes, we will need the
176 identifier of the special article ``ground'', which we can find by
177 using the command ``label2uid'':
178 \smallskip
179 {\lisp
180 (label2uid "ground")
181 $\Rightarrow$ 0}
182 \smallskip\noindent
183 We proceed to construct the two articles using the command
184 ``put-article'' as follows:
185 \smallskip
186 {\lisp
187 (put-article 0 0 "walrus")
188 $\Rightarrow$ 2
189 (put-article 0 0 "mammal")
190 $\Rightarrow$ 3}
191 \smallskip\noindent
192 The first two arguments of the command are the source and sink of the
193 article, respectively, which, in this case are both ground. The third
194 argument is the content of the article, which we chose to be the
195 obvious text string. The values returned are the identifiers of the
196 newly made articles, which we shall use in the next step.
198 The link from ``walrus'' to ``mammal'' will be represented by an
199 article whose source is ``walrus'' and whose sink is ``mammal''.
200 Using the identifiers we noted above, the command to make
201 this article goes as follows:
202 \smallskip
203 {\lisp
204 (put-article 2 3 nil)
205 $\Rightarrow$ 4} \smallskip
207 For our convenience, we will name our two nodes. We do this using the
208 command ``label-article'' which takes the numerical identifier of the
209 article and the string by which we will refer to it as arguments:
210 \smallskip
211 {\lisp
212 (label-article 2 "walrus")
213 $\Rightarrow$ 2
214 (label-article 3 "mammal")
215 $\Rightarrow$ 3}
216 \smallskip\noindent
217 Just as we did above with ``ground'', so too we can now look up the
218 identifier of one of our nodes using the command ``label2uid'':
219 {\lisp
220 (label2uid "mammal")
221 $\Rightarrow$ 3}
222 \smallskip\noindent
223 Conversely, we can use ``uid2label'' to find the name of one of our
224 nodes from its numerical identifier:
225 \smallskip
226 {\lisp
227 (uid2label 2)
228 $\Rightarrow$ "walrus"}
229 \smallskip\noindent
231 We may examine our network by using access functions to view the
232 content, source, and sink of articles:
233 \smallskip
234 {\lisp
235 (get-content 2)
236 $\Rightarrow$ "walrus"
237 (get-source 4)
238 $\Rightarrow$ 2
239 (get-sink 4)
240 $\Rightarrow$ 3}
241 \smallskip\noindent
242 These can be combined with the previous commands to make for more
243 user-friendly interaction:
244 \smallskip
245 {\lisp
246 (uid2label (get-sink 4))
247 $\Rightarrow$ "mammal"
248 (get-content (label2uid "walrus"))
249 $\Rightarrow$ "walrus"}
250 \smallskip
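
For readers who want to trace the bookkeeping, the session so far can be imitated with a toy Python model. The class below is a stand-in written for this illustration; its internals are assumptions and do not reflect honey.el's actual representation. It does reproduce the identifiers seen above, on the assumption that a fresh net already holds ``ground'' and ``article-type'' as articles 0 and 1.

```python
class Net:
    """Toy stand-in for a network; not honey.el's internal representation."""

    def __init__(self):
        self.articles = {}   # uid -> (source, sink, content)
        self.uid_of = {}     # label -> uid
        self.label_of = {}   # uid -> label
        self.next_uid = 0
        # Assumed initial state: two special articles, labeled on creation.
        for name in ("ground", "article-type"):
            self.label_article(self.put_article(0, 0, None), name)

    def put_article(self, source, sink, content):
        uid = self.next_uid
        self.next_uid += 1
        self.articles[uid] = (source, sink, content)
        return uid

    def label_article(self, uid, label):
        self.uid_of[label] = uid
        self.label_of[uid] = label
        return uid

    def get_sink(self, uid):
        return self.articles[uid][1]


net = Net()
walrus = net.label_article(net.put_article(0, 0, "walrus"), "walrus")
mammal = net.label_article(net.put_article(0, 0, "mammal"), "mammal")
link = net.put_article(walrus, mammal, None)
```

Keeping the label tables separate from the article table mirrors the way names are an optional convenience layered on top of numerical identifiers.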
252 Having entered our example net and performed a few operations on it, we
253 will now save it as ``simple-example'':
254 \smallskip
255 {\lisp
256 (setq simple-example (download-en-masse))
257 $\Rightarrow$
258 ((0 "ground" 0 0)
259 (1 "article-type" 0 0)
260 (2 "walrus" 0 0 . "walrus")
261 (3 "mammal" 0 0 . "mammal")
262 (4 nil 2 3))}
263 \smallskip\noindent
264 The result of this operation is a compact representation of our
265 network as a list of quintuplets of the form
266 {\lisp (uid label source sink . content)}. We can store this away in
267 a file or wherever for safekeeping and later use it to rebuild our
268 knowledge base. To demonstrate how this goes, we will first use the
269 command ``reset-net'' which resets the net to the state in which it
270 was when newly created, removing everything added subsequently:
271 \smallskip
272 {\lisp
273 (reset-net)
274 $\Rightarrow$ nil}
275 \smallskip\noindent
276 We can check that this command acts as advertised by calling
277 ``download-en-masse'' once more:
278 \smallskip
279 {\lisp
280 (download-en-masse)
281 $\Rightarrow$
282 ((0 "ground" 0 0)
283 (1 "article-type" 0 0))}
284 \smallskip\noindent
285 The walrus and the rest of what we typed in are gone, sure enough. To
286 bring them back, we will use the command ``{\tt upload-en-masse}''
287 which takes a list of quintuplets generated by ``{\tt
288 download-en-masse}'' and constructs the database recorded therein:
289 \smallskip
290 {\lisp
291 (upload-en-masse simple-example)
292 $\Rightarrow$ t}
293 \smallskip\noindent
294 When we ask about the nodes now, we see that the walrus has returned:
295 \smallskip
296 {\lisp
297 (uid2label (get-source 4))
298 $\Rightarrow$ "walrus"}
299 \smallskip\noindent
300 To be sure, we can double check by comparing the saved value from
301 earlier with what we get by downloading now:
302 \smallskip
303 {\lisp
304 (equal simple-example (download-en-masse))
305 $\Rightarrow$ t}
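
The save and restore cycle amounts to flattening the network into a list of records and rebuilding the tables from that list. A minimal Python sketch of such a round trip follows. The function names and the two-table layout are assumptions made for the illustration, and Python tuples stand in for the dotted lists shown above.

```python
def download(articles, labels):
    """Flatten a net into a sorted list of (uid, label, source, sink, content)."""
    return [
        (uid, labels.get(uid), source, sink, content)
        for uid, (source, sink, content) in sorted(articles.items())
    ]


def upload(records):
    """Rebuild the article and label tables from a list made by download."""
    articles, labels = {}, {}
    for uid, label, source, sink, content in records:
        articles[uid] = (source, sink, content)
        if label is not None:
            labels[uid] = label
    return articles, labels


# The walrus example, written out by hand in the assumed table layout.
articles = {
    0: (0, 0, None),
    1: (0, 0, None),
    2: (0, 0, "walrus"),
    3: (0, 0, "mammal"),
    4: (2, 3, None),
}
labels = {0: "ground", 1: "article-type", 2: "walrus", 3: "mammal"}

saved = download(articles, labels)
restored_articles, restored_labels = upload(saved)
```

Because the records carry the identifiers explicitly, the rebuilt net uses the same numbers as the original, which is what lets the final equality check succeed.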
307 \beginsection Types, adding, and editing
309 Having seen how the basics go, we will now examine more features of
310 the fundamental system --- types of articles, primitive commands for
311 editing content, and mass addition of articles.
313 In a typical semantic network, the links and nodes have types which
314 are important for the semantics. For instance, in our example, the
315 link would have type ``isa'' and the nodes might have type ``thing'':
316 $$
317 \matrix{\mathop{\lower3pt\boxit{\tt walrus}}\limits_{\rm thing} &
318 \mathop{\hbox to 40pt{\rightarrowfill}}\limits^{\rm isa} &
319 \mathop{\lower3pt\boxit{\tt mammal}}\limits_{\rm thing}}
320 $$
321 In order to type articles in a systematic way which is easily
322 accessible to processes which depend on article type, we make use of
323 the special article {\tt article-type} (which is automatically created
324 by the system; e.g. see the output of {\tt download-en-masse} above).
325 To this special article are attached articles corresponding to the
326 different types; in our example these will include {\tt isa} and {\tt
327 thing}. To record the type of a given article, we make a link from
328 the home node for the type to the article in question.
330 So we begin by making the home nodes for our types. For convenience,
331 we combine article creation and labeling and use user-friendly names.
332 \smallskip
333 {\lisp
334 (label-article
335 (put-article (label2uid "article-type")
336 (label2uid "ground")
337 nil)
338 "isa-type")
339 $\Rightarrow$ 5
340 (label-article
341 (put-article (label2uid "article-type")
342 (label2uid "ground")
343 nil)
344 "thing-type")
345 $\Rightarrow$ 6}
346 \smallskip\noindent
347 Note that the source of these articles is {\tt ground} and that they
348 are attached to the main node {\tt article-type} by having it as their
349 sink. This is an example of a ``stickie'' of the sort mentioned in
350 the introduction. Such a construction is useful in cases like this
351 where we want to attach one article to another but don't need to have
352 a link between them as a separate object.
354 Whoopsie! We said the source was {\tt ground} and the sink was {\tt
355 article-type}, but when we look at what we told the machine, we got it
356 backwards:
357 \smallskip
358 {\lisp
359 (uid2label (get-source 5))
360 $\Rightarrow$ "article-type"
361 (uid2label (get-sink 5))
362 $\Rightarrow$ "ground"}
363 \smallskip\noindent
364 As sometimes happens when writing exposition, your author has made
365 a serendipitous mistake which will be used for illustrating a point,
366 in this case how to edit articles. Even more serendipitously, he has
367 made the mistake twice, providing an occasion to illustrate two
368 different methods.
370 The first method is to erase the incorrect article and enter it again
371 correctly. To effect the erasure, we use the command {\tt
372 remove-article}:
373 \smallskip
374 {\lisp
375 (remove-article 5)
376 $\Rightarrow$ nil}
377 \smallskip\noindent
378 Then we enter it again as it should have been:
379 \smallskip
380 {\lisp
381 (label-article
382 (put-article (label2uid "ground")
383 (label2uid "article-type")
384 "isa")
385 "isa-type")
386 $\Rightarrow$ 7}
387 \smallskip\noindent
388 While we were at it, we took advantage of the occasion to put
389 something more interesting than {\tt nil} in the content slot. Also
390 note that the new article now has uid 7 as opposed to uid 5 for the
391 one we deleted. This, of course, is no surprise because there is no
392 reason to suppose that the machine would happen to assign the same
393 number to the new article as belonged to the deleted article it is
394 intended to replace. Because, as yet, nothing links to this article,
395 this change of identifying number does not have any effect and we can
396 refer to the new article using the same label.
398 The other way of rectifying the booboo is to change the source, sink,
399 and content of the erroneous article. We carry this out by using a
400 trio of appropriately named commands, as illustrated below:
401 \smallskip
402 {\lisp
403 (update-source 6 (label2uid "ground"))
404 $\Rightarrow$ ((6 . 0) (7 . 0))
405 (update-sink 6 (label2uid "article-type"))
406 $\Rightarrow$ ((6 . 1) (6 . 0))
407 (update-content 6 "thing")
408 $\Rightarrow$ (0 1 . "thing")}
409 \smallskip\noindent
410 Making the change this way ensures that the article retains the same
411 identifier. As stated above, while this is unimportant for the
412 current example, it would be a different matter if there were other
413 articles pointing to the article being edited since deleting it and
414 replacing it with a new article would mean losing those links.
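
The difference between the two repair methods can be made concrete with a small Python model. This is a sketch under stated assumptions: identifiers are handed out by a counter and never reused, and removing an article leaves anything that pointed at it dangling. Whether honey.el cleans up such links on removal is not specified here, so that behavior belongs to the model only. The misspelled content is invented for the illustration.

```python
articles = {}    # uid -> [source, sink, content]
counter = [0]    # next uid to hand out; uids are never reused in this model


def put(source, sink, content):
    uid = counter[0]
    counter[0] += 1
    articles[uid] = [source, sink, content]
    return uid


def dangling():
    """Uids of articles whose source or sink no longer exists."""
    return sorted(uid for uid, (source, sink, _) in articles.items()
                  if source not in articles or sink not in articles)


ground = put(0, 0, None)
wrong = put(ground, ground, "thnig")    # an article entered with a mistake
pointer = put(ground, wrong, None)      # something else attached to it

# Method 1: delete and re-enter. The replacement gets a fresh uid,
# so the article that pointed at the old one is left dangling.
del articles[wrong]
replacement = put(ground, ground, "thing")
after_replace = dangling()

# Method 2: edit in place. Undo method 1, then change only the content;
# the uid is unchanged, so nothing that points at the article is lost.
del articles[replacement]
articles[wrong] = [ground, ground, "thnig"]
articles[wrong][2] = "thing"
after_update = dangling()
```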
416 Having straightened out this flap over the article types, we now
417 proceed to add types to our articles. To do this, we make links from
418 our newly-created nodes isa-type and thing-type to the nodes to be
419 assigned types:
420 \smallskip
421 {\lisp
422 (put-article (label2uid "thing-type")
423 (label2uid "walrus")
424 nil)
425 $\Rightarrow$ 8
426 (put-article (label2uid "thing-type")
427 (label2uid "mammal")
428 nil)
429 $\Rightarrow$ 9
430 (put-article (label2uid "isa-type")
431 4
432 nil)
433 $\Rightarrow$ 10}
434 \smallskip\noindent
435 Note that, in the case of our link which states that a walrus is a
436 mammal, recording its type entails making a link to a link.
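
Once types are recorded this way, looking up the type of an article is a matter of finding the articles that arrive at it from a type home node. The Python sketch below illustrates the lookup on a hand-written copy of the network built so far. It assumes that the last typing link targets article 4, the link from walrus to mammal, and the table layout is invented for the illustration rather than taken from honey.el.

```python
ARTICLE_TYPE = 1

articles = {
    0: (0, 0, None),       # ground
    1: (0, 0, None),       # article-type
    2: (0, 0, "walrus"),
    3: (0, 0, "mammal"),
    4: (2, 3, None),       # the link from walrus to mammal
    6: (0, 1, "thing"),    # home node for thing, stuck onto article-type
    7: (0, 1, "isa"),      # home node for isa, stuck onto article-type
    8: (6, 2, None),       # thing-type -> walrus
    9: (6, 3, None),       # thing-type -> mammal
    10: (7, 4, None),      # isa-type -> the link itself (a link to a link)
}


def type_homes():
    """Home nodes are the articles whose sink is article-type."""
    return {uid for uid, (_, sink, _) in articles.items()
            if sink == ARTICLE_TYPE}


def types_of(uid):
    """Contents of the type home nodes that link to the given article."""
    homes = type_homes()
    return sorted(articles[source][2]
                  for source, sink, _ in articles.values()
                  if sink == uid and source in homes)
```

The same function answers the question for nodes and for links, since the model makes no distinction between the two when following typing links.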
439 \beginsection Query and Search
441 \end
443 \hbox{\vbox{\hrule% upper side
444 \hbox{\vrule% left side
445 \kern3pt\vbox{\kern22pt% top margin
446 \hbox{walrus}\kern5pt %bottom margin
447 }\kern3pt % right margin
448 \vrule% right side
449 }\hrule%bottom side
450 \kern-5.5pt}}