2 # -*- coding: mule-utf-8-unix -*-
5 #+TAGS: OPTIMIZE PRETTIER
7 #+TITLE: DMV/CCM -- todo-list / progress ARCHIVED ENTRIES
8 #+AUTHOR: Kevin Brubeck Unhammer
9 #+EMAIL: K.BrubeckUnhammer at student uva nl
12 #+SEQ_TODO: TOGROK TODO DONE
15 Archived entries from file /Users/kiwibird/dmvccm/DMVCCM.org
16 * DONE [#A] test and debug my brilliant idea
17 CLOSED: [2008-06-08 Sun 10:28]
19 :ARCHIVE_TIME: 2008-06-08 Sun 12:55
20 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
21 :ARCHIVE_OLPATH: Adjacency and combining it with inner()
22 :ARCHIVE_CATEGORY: DMVCCM
25 * DONE implement my brilliant idea.
26 CLOSED: [2008-06-01 Sun 17:19]
28 :ARCHIVE_TIME: 2008-06-08 Sun 12:55
29 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
30 :ARCHIVE_OLPATH: Adjacency and combining it with inner()
31 :ARCHIVE_CATEGORY: DMVCCM
34 [[file:src/dmv.py::def%20e%20s%20t%20LHS%20Lattach%20Rattach][e(sti) in dmv.py]]
36 * DONE [#A] test inner() on sentences with duplicate words
38 :ARCHIVE_TIME: 2008-06-08 Sun 12:55
39 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
40 :ARCHIVE_OLPATH: Adjacency and combining it with inner()
41 :ARCHIVE_CATEGORY: DMVCCM
Works with e.g. the sentence "h h h".
45 * DONE [#A] How do we only count from completed trees?
46 CLOSED: [2008-06-13 Fri 11:40]
48 :ARCHIVE_TIME: 2008-06-15 Sun 23:52
49 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
50 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)
51 :ARCHIVE_CATEGORY: DMVCCM
54 Use c(s,t,Node); inner * outer / P_sent
56 * DONE [#A] c(s,t,Node)
57 CLOSED: [2008-06-13 Fri 11:38]
59 :ARCHIVE_TIME: 2008-06-15 Sun 23:52
60 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
61 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)
62 :ARCHIVE_CATEGORY: DMVCCM
65 = inner * outer / P_sent
67 implemented as inner * outer / inner_sent
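A minimal sketch of the count above, assuming the inner and outer charts are plain dictionaries keyed by (s, t, node). The dictionary layout, the function name and the chart values are made up for illustration and are not taken from dmv.py.

```python
def expected_count(inner, outer, p_sent, s, t, node):
    """c(s,t,Node): expected number of times Node spans s..t in the sentence."""
    return inner[s, t, node] * outer[s, t, node] / p_sent

# Made-up chart values for a two-word sentence, purely for illustration.
inner = {(0, 1, 'ROOT'): 0.25, (0, 0, '_h_'): 0.2}
outer = {(0, 1, 'ROOT'): 1.0, (0, 0, '_h_'): 0.5}

# P_sent is the inner probability of ROOT over the whole sentence
# ("implemented as inner * outer / inner_sent").
p_sent = inner[0, 1, 'ROOT']

c_root = expected_count(inner, outer, p_sent, 0, 1, 'ROOT')
c_h = expected_count(inner, outer, p_sent, 0, 0, '_h_')
```

As a sanity check, the count for ROOT over the whole sentence comes out as 1, since its outer probability is 1 and its inner probability is P_sent.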
* DONE if loc_h == t, no need to try right-attachment rules, and vice versa :OPTIMIZE:
69 CLOSED: [2008-06-10 Tue 14:34]
71 :ARCHIVE_TIME: 2008-06-15 Sun 23:52
72 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
73 :ARCHIVE_OLPATH: Deferred
74 :ARCHIVE_CATEGORY: DMVCCM
77 (and if loc_h == s, no need to try left-attachment rules.)
79 Modest speed increase (5%).
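The check can be sketched as follows. The helper name and the 'L'/'R' labels are hypothetical; the real test presumably sits inline in inner().

```python
def attachment_directions(loc_h, s, t):
    """Directions worth trying for a head at loc_h inside the span s..t."""
    directions = []
    if loc_h > s:      # there are words to the left of the head
        directions.append('L')
    if loc_h < t:      # there are words to the right of the head
        directions.append('R')
    return directions

only_left = attachment_directions(2, 0, 2)    # head is the last word of the span
only_right = attachment_directions(0, 0, 2)   # head is the first word of the span
both = attachment_directions(1, 0, 2)
```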
80 * DONE io.debug parameters should not call functions :OPTIMIZE:
81 CLOSED: [2008-06-10 Tue 12:26]
83 :ARCHIVE_TIME: 2008-06-15 Sun 23:52
84 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
85 :ARCHIVE_OLPATH: Deferred
86 :ARCHIVE_CATEGORY: DMVCCM
Replaced all io.debug(str,'level') calls with statements of the form:
90 :if 'level' in io.DEBUG:
and got an almost threefold speed increase on inner(), since the message arguments are no longer evaluated when the level is disabled.
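The effect can be reproduced with a small sketch. DEBUG, debug() and expensive_repr() below are hypothetical stand-ins for io.DEBUG, io.debug() and whatever string-building the real calls did.

```python
# Hypothetical stand-ins for io.DEBUG and io.debug; the real module may differ.
DEBUG = set()    # enabled debug levels, e.g. {'inner'}
calls = []       # records how often the expensive formatter runs

def expensive_repr(chart):
    """Pretend this is a costly string-building call."""
    calls.append(1)
    return ", ".join("%s=%.4f" % kv for kv in sorted(chart.items()))

def debug(msg, level):
    """Old style: msg has already been built by the time the level is checked."""
    if level in DEBUG:
        print(msg)

chart = {"a": 0.5, "b": 0.25}

# Old style: the argument is evaluated even though 'inner' is disabled.
debug(expensive_repr(chart), 'inner')
eager_calls = len(calls)

# New style: the guard skips the formatting work entirely.
if 'inner' in DEBUG:
    print(expensive_repr(chart))
guarded_calls = len(calls) - eager_calls
```

With no level enabled, the old style still runs the formatter once per call, while the guarded form never does.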
94 * DONE inner_dmv() should disregard rules with heads not in sent :OPTIMIZE:
95 CLOSED: [2008-06-08 Sun 10:18]
97 :ARCHIVE_TIME: 2008-06-15 Sun 23:52
98 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
99 :ARCHIVE_OLPATH: Deferred
100 :ARCHIVE_CATEGORY: DMVCCM
If the sentence is "nn vbd det nn", we should not even look at rules where
105 : rule.head() not in "nn vbd det nn".split()
106 This is ruled out by getting rules from g.rules(LHS, sent).
108 Also, we optimize this further by saying we don't even recurse into
109 attachment rules where
110 : rule.head() not in sent[ s :r+1]
111 : rule.head() not in sent[r+1:t+1]
112 meaning, if we're looking at the span "vbd det", we only use
113 attachment rules where both daughters are members of ['vbd','det']
114 (although we don't (yet) care about removing rules that rewrite to the
115 same tag if there are no duplicate tags in the span, etc., that would
116 be a lot of trouble for little potential gain).
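A rough sketch of the span filter, assuming each attachment rule exposes the head tags of its two daughters. The Rule record and its field names are invented here; the real rule objects only need something like rule.head().

```python
from collections import namedtuple

# Hypothetical rule record: the head tags of the left and right daughters.
Rule = namedtuple("Rule", "left_tag right_tag")

def usable_rules(rules, sent, s, r, t):
    """Keep rules whose daughters can occur in sent[s..r] and sent[r+1..t]."""
    left_span = sent[s:r + 1]
    right_span = sent[r + 1:t + 1]
    return [rule for rule in rules
            if rule.left_tag in left_span and rule.right_tag in right_span]

sent = "nn vbd det nn".split()
rules = [Rule("vbd", "det"), Rule("det", "vbd"),
         Rule("nn", "det"), Rule("vbd", "nn")]

# The span "vbd det" is s=1, t=2, with the only split point r=1.
kept = usable_rules(rules, sent, 1, 1, 2)
```

Only the rule with 'vbd' on the left and 'det' on the right survives for this span.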
117 * DONE Problem with this formula:
119 :ARCHIVE_TIME: 2008-06-23 Mon 15:10
120 :ARCHIVE_FILE: ~/V08/Probability/dmvccm/DMVCCM.org
121 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)/Implement P_CHOOSE formula.
122 :ARCHIVE_CATEGORY: DMVCCM
123 :ARCHIVE_TODO: TOGROK
125 On calculating P_{CHOOSE}(det | vbd, L) from the 1-sentence corpus
126 "det nn vbd", there are two ways in which 'det' could be left
127 attached; one is where it is attached non-adjacently (after 'vbd' has
129 :>>> c_L = c(0,0,(SEAL,g.tagnum('det')),0,g,'det nn vbd'.split(),{},{})
131 :0.62669683257918563 # so far so good
133 :>>> c_R = c(1,2,(RGO_L,g.tagnum('vbd')),2,g,'det nn vbd'.split(),{},{})
:0.11312217194570134 # still seems an OK probability
137 :>>> c_M = c(0,2,(RGO_L,g.tagnum('vbd')),2,g,'det nn vbd'.split(),{},{})
139 :0.31674208144796384 # and this seems good
141 :>>> c_L / (c_R * c_M)
142 :17.490571428571432 # but this is Way off...
144 * DONE L&Y formula (20) or c()-formula?
145 CLOSED: [2008-07-23 Wed 10:52]
147 :ARCHIVE_TIME: 2008-07-23 Wed 10:52
148 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
149 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)
150 :ARCHIVE_CATEGORY: DMVCCM
153 For P_CHOOSE, use this formula to get a[h_,_a_,h_]:
154 | w_sent = 1/P_sent * \sum_{r} prob(h_->_a_ h_) * e(s,r,_a_) * e(r+1,t, h_) * f(s,t,h_) |
156 | v_sent = 1/P_sent * e(s,t,h_) * f(s,t,h_) = c(s,t,h_) |
158 then divide a[h_,_a_,h_] by sum of a[h_,_x_,h_] for all x. \\
159 Similarly, P_CHOOSE(a|h,R) = a[h,h,_a_] / \sum_{x} a[h,h,_x_].
162 For stop rules, on the other hand, we use the following:\\
P_STOP(h|left,...) = e.g. c(s,t,_h_) / c(s,t,h_) for certain s,t depending on adjacency \\
165 | 1/P_sent * e(s,t,_h_) * f(s,t,_h_) |
167 | 1/P_sent * e(s,t, h_) * f(s,t, h_) |
169 A direct translation of h_->STOP h_ into L&Y formula (20) would give:
171 | 1/P_sent * prob(_ h_ -> STOP h_) * e(s,t,h_) * f(s,t,_h_) |
173 | 1/P_sent * e(s,t,_h_) * f(s,t,_h_) |
175 But we don't want that, since what we're really after is the
176 "upside-down" probability of stopping when "generating upwards" in the
177 PCFG tree, so just keep using c()/c() like we've been doing.
179 In stop rules, the prob() is PSTOP, while for
180 attachment rules, it's PCHOOSE*(1-PSTOP).
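In code, the last remark amounts to something like the sketch below. The function name and signature are hypothetical; p_stop stands for P_STOP(STOP | h, dir, adj) of the relevant head, direction and adjacency.

```python
def rule_prob(p_stop, p_choose=None):
    """prob() of a rule: P_STOP for stop rules, P_CHOOSE * (1 - P_STOP) otherwise."""
    if p_choose is None:              # stop rule
        return p_stop
    return p_choose * (1.0 - p_stop)  # attachment rule

stop_rule = rule_prob(0.3)
attach_rule = rule_prob(0.3, p_choose=0.5)
```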
181 * DONE [#A] Implement P_CHOOSE formula.
183 :ARCHIVE_TIME: 2008-07-23 Wed 10:52
184 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
185 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)
186 :ARCHIVE_CATEGORY: DMVCCM
Earlier I was assuming the following, but it has to be changed into the above configurations:
191 | P_{CHOOSE}(a : h,R) = | \sum_{corpus} \sum_{s=loc(h)} \sum_{t > loc(h)} \sum_{loc(h) < r <= t} c(r,t,_a_) |
192 | | \sum_{corpus} \sum_{s=loc(h)} \sum_{t > loc(h)} \sum_{loc(h) < r <= t} c(s,t,h) * c(s, r-1, h_) |
194 | P_{CHOOSE}(a : h,L) = | \sum_{corpus} \sum_{s<loc(h)} \sum_{t>=loc(h)} \sum_{r<loc(h)} c(s,r,_a_) |
195 | | \sum_{corpus} \sum_{s<loc(h)} \sum_{t>=loc(h)} \sum_{r<loc(h)} c(s,t,h_) * c(r+1, t, h_) |
t >= loc(h) since there are many possibilities for right-attachments
below, and each of them alone gives a lower probability (through
multiplication) to the upper tree (so add them all).
The reason we have to check /both/ children of the attachments is that we
have to make sure they are contiguous (otherwise we would have no way
of ruling out e.g. h_->_b_, _b_->b_->_a_, where h_ covers *s* to *t*, _b_ is
from *s* to *x<r* and _a_ is from *s* to *r*).
205 * DONE P_STOP formulas for various dir and adj:
206 CLOSED: [2008-06-15 Sun 23:40]
208 :ARCHIVE_TIME: 2008-07-23 Wed 10:52
209 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
210 :ARCHIVE_OLPATH: P_STOP and P_CHOOSE for IO/EM (reestimation)
211 :ARCHIVE_CATEGORY: DMVCCM
216 | P_{STOP}(STOP : h,L,non_adj) = | \sum_{corpus} \sum_{s<loc(h)} \sum_{t>=loc(h)} c(s,t,_h_) |
217 | | \sum_{corpus} \sum_{s<loc(h)} \sum_{t>=loc(h)} c(s,t,h_) |
219 | P_{STOP}(STOP : h,L,adj) = | \sum_{corpus} \sum_{s=loc(h)} \sum_{t>=loc(h)} c(s,t,_h_) |
220 | | \sum_{corpus} \sum_{s=loc(h)} \sum_{t>=loc(h)} c(s,t,h_) |
222 | P_{STOP}(STOP : h,R,non_adj) = | \sum_{corpus} \sum_{s=loc(h)} \sum_{t>loc(h)} c(s,t,h_) |
223 | | \sum_{corpus} \sum_{s=loc(h)} \sum_{t>loc(h)} c(s,t,h) |
225 | P_{STOP}(STOP : h,R,adj) = | \sum_{corpus} \sum_{s=loc(h)} \sum_{t=loc(h)} c(s,t,h_) |
226 | | \sum_{corpus} \sum_{s=loc(h)} \sum_{t=loc(h)} c(s,t,h) |
228 (And P_{STOP}(-STOP|...) = 1 - P_{STOP}(STOP|...) )
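The two left-stop formulas can be sketched for a single sentence as below. This assumes the counts c(s,t,Node) are available as a dictionary, uses the node labels '_h_' and 'h_' from these notes, and leaves out the outer sum over the corpus. The names, the toy counts and the choice to return 0 for an empty denominator are all assumptions for illustration.

```python
def p_stop_left(c, loc_h, n, adjacent):
    """P_STOP(STOP : h, L, adj or non_adj) from the counts of one sentence.

    c maps (s, t, node) to an expected count; node is '_h_' or 'h_'.
    """
    starts = [loc_h] if adjacent else range(loc_h)   # s = loc(h) vs. s < loc(h)
    num = den = 0.0
    for s in starts:
        for t in range(loc_h, n):                    # t >= loc(h)
            num += c.get((s, t, '_h_'), 0.0)
            den += c.get((s, t, 'h_'), 0.0)
    return num / den if den else 0.0                 # assumption: 0 when unseen

# Toy counts for a head at position 1 of a three-word sentence.
c = {(1, 1, '_h_'): 0.3, (1, 1, 'h_'): 0.5,
     (1, 2, '_h_'): 0.1, (1, 2, 'h_'): 0.3,
     (0, 1, '_h_'): 0.2, (0, 1, 'h_'): 0.2,
     (0, 2, '_h_'): 0.4, (0, 2, 'h_'): 0.4}

adj = p_stop_left(c, 1, 3, adjacent=True)
non_adj = p_stop_left(c, 1, 3, adjacent=False)
```

The right-stop cases would follow the same pattern with 'h_' over 'h' and the corresponding ranges for *s* and *t*.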
229 * DONE COMMENT write out tex formulas for outer
231 :ARCHIVE_TIME: 2008-07-23 Wed 10:53
232 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
233 :ARCHIVE_OLPATH: outer probabilities
234 :ARCHIVE_CATEGORY: DMVCCM
236 [[file:tex/formulas.tex::P_%20OUTSIDE%20SEAL%20w%20i%20j%20P_%20STOP%20stop%20w%20left%20adj%20i][formulas.tex]]
237 * DONE outer probabilities
238 CLOSED: [2008-06-12 Thu 11:11]
240 :ARCHIVE_TIME: 2008-07-23 Wed 10:55
241 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
242 :ARCHIVE_CATEGORY: DMVCCM
246 See also [[http://www.student.uib.no/~kun041/dmvccm/tex/formulas.pdf][pdf of P_{OUTER}]], in the style of Klein's thesis appendix.
248 ** outer probabilities -- the algorithm
249 When looping through the rules which rewrite to Node, there are 6
250 different configurations, based on what the above (mother) node is,
251 and what the Node for which we're computing is.
253 Here *r* is not between *s* and *t* as in inner(), but an /outer/ index. *loc_N*
254 is the location of the Node head in the sentence, *loc_m* for the head
255 of the mother of Node.
257 + mother is a RIGHT-stop:
258 - outer(*s, t*, mother.LHS, *loc_N*), no inner-call
259 - adjacent iff *t* == *loc_m*
260 + mother is a LEFT-stop:
261 - outer(*s, t*, mother.LHS, *loc_N*), no inner-call
262 - adjacent iff *s* == *loc_m*
264 + Node is on the LEFT branch (mother.L == Node)
265 * and mother is a LEFT attachment:
266 - *loc_N* will be in the LEFT branch, can be anything here.
267 - In the RIGHT, non-attached, branch we find inner(*t+1, r*, mother.R,
268 *loc_m*) for all possible *loc_m* in the right part of the sentence.
269 - outer(*s, r*, mother.LHS, *loc_m*).
270 - adjacent iff *t+1* == *loc_m*
271 * and mother is a RIGHT attachment:
273 - In the RIGHT, attached, branch we find inner(*t+1, r*, mother.R, *loc_R*) for
274 all possible *loc_R* in the right part of the sentence.
275 - outer(*s, r*, mother.LHS, *loc_N*).
276 - adjacent iff *t* == *loc_m*
278 + Node is on the RIGHT branch (mother.R == Node)
279 * and mother is a LEFT attachment:
281 - In the LEFT, attached, branch we find inner(*r, s-1*, mother.L, *loc_L*) for
282 all possible *loc_L* in the left part of the sentence.
283 - outer(*r, t*, mother.LHS, *loc_m*).
284 - adjacent iff *s* == *loc_m*
285 * and mother is a RIGHT attachment:
286 - *loc_N* will be in the RIGHT branch, can be anything here.
287 - In the LEFT, non-attached, branch we find inner(*r, s-1*, mother.L, *loc_m*) for
288 all possible *loc_m* in the left part of the sentence.
289 - outer(*r, t*, mother.LHS, *loc_N*).
290 - adjacent iff *s-1* == *loc_m*
292 [[file:outer_attachments.jpg]]
294 : in notes: in code (constants): in Klein thesis:
295 :-------------------------------------------------------------------
296 : _h_ SEAL bar over h
297 : h_ RGO_L right-under-left-arrow over h
298 : h GO_R right-arrow over h
300 : LGO_R left-under-right-arrow over h
301 : GO_L left-arrow over h
303 Also, unlike in [[http://bibsonomy.org/bibtex/2b9f6798bb092697da7042ca3f5dee795][Lari & Young]], non-ROOT ('S') symbols may cover the
whole sentence, but ROOT may /only/ appear if it covers the whole sentence.
308 * DONE P_STOP and P_CHOOSE for IO/EM (reestimation)
310 :ARCHIVE_TIME: 2008-07-23 Wed 10:55
311 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
312 :ARCHIVE_CATEGORY: DMVCCM
314 [[file:src/dmv.py::DMV%20probabilities][dmv-P_STOP]]
315 Remember: The P_{STOP} formula is upside-down (left-to-right also).
316 (In the article..not the [[http://www.eecs.berkeley.edu/~klein/papers/klein_thesis.pdf][thesis]])
317 * DONE Separate initialization to another file? :PRETTIER:
318 CLOSED: [2008-06-08 Sun 12:51]
320 :ARCHIVE_TIME: 2008-07-23 Wed 11:12
321 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
322 :ARCHIVE_OLPATH: Initialization
323 :ARCHIVE_CATEGORY: DMVCCM
326 [[file:src/harmonic.py::harmonic%20py%20initialization%20for%20dmv][harmonic.py]]
327 * DONE DMV Initialization probabilities
329 :ARCHIVE_TIME: 2008-07-23 Wed 11:12
330 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
331 :ARCHIVE_OLPATH: Initialization
332 :ARCHIVE_CATEGORY: DMVCCM
335 (from initialization frequency)
336 * DONE DMV Initialization frequencies
337 CLOSED: [2008-05-27 Tue 20:04]
339 :ARCHIVE_TIME: 2008-07-23 Wed 11:12
340 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
341 :ARCHIVE_OLPATH: Initialization
342 :ARCHIVE_CATEGORY: DMVCCM
346 P_{STOP} is not well defined by K&M. One possible interpretation given
347 the sentence [det nn vb nn] is
348 : f_{STOP}( STOP|det, L, adj) +1
349 : f_{STOP}(-STOP|det, L, adj) +0
350 : f_{STOP}( STOP|det, L, non_adj) +1
351 : f_{STOP}(-STOP|det, L, non_adj) +0
352 : f_{STOP}( STOP|det, R, adj) +0
353 : f_{STOP}(-STOP|det, R, adj) +1
355 : f_{STOP}( STOP|nn, L, adj) +0
356 : f_{STOP}(-STOP|nn, L, adj) +1
357 : f_{STOP}( STOP|nn, L, non_adj) +1 # since there's at least one to the left
358 : f_{STOP}(-STOP|nn, L, non_adj) +0
361 : f[head, 'STOP', 'LN'] += (i_h <= 1) # first two words
362 : f[head, '-STOP', 'LN'] += (not i_h <= 1)
363 : f[head, 'STOP', 'LA'] += (i_h == 0) # very first word
364 : f[head, '-STOP', 'LA'] += (not i_h == 0)
365 : f[head, 'STOP', 'RN'] += (i_h >= n - 2) # last two words
366 : f[head, '-STOP', 'RN'] += (not i_h >= n - 2)
367 : f[head, 'STOP', 'RA'] += (i_h == n - 1) # very last word
368 : f[head, '-STOP', 'RA'] += (not i_h == n - 1)
370 : # this one requires some additional rewriting since it
371 : # introduces divisions by zero
372 : f[head, 'STOP', 'LN'] += (i_h == 1) # second word
373 : f[head, '-STOP', 'LN'] += (not i_h <= 1) # not first two
374 : f[head, 'STOP', 'LA'] += (i_h == 0) # first word
375 : f[head, '-STOP', 'LA'] += (not i_h == 0) # not first
376 : f[head, 'STOP', 'RN'] += (i_h == n - 2) # second-to-last
377 : f[head, '-STOP', 'RN'] += (not i_h >= n - 2) # not last two
378 : f[head, 'STOP', 'RA'] += (i_h == n - 1) # last word
379 : f[head, '-STOP', 'RA'] += (not i_h == n - 1) # not last
381 : f[head, 'STOP', 'LN'] += (i_h == 1) # second word
382 : f[head, '-STOP', 'LN'] += (not i_h == 1) # not second
383 : f[head, 'STOP', 'LA'] += (i_h == 0) # first word
384 : f[head, '-STOP', 'LA'] += (not i_h == 0) # not first
385 : f[head, 'STOP', 'RN'] += (i_h == n - 2) # second-to-last
386 : f[head, '-STOP', 'RN'] += (not i_h == n - 2) # not second-to-last
387 : f[head, 'STOP', 'RA'] += (i_h == n - 1) # last word
388 : f[head, '-STOP', 'RA'] += (not i_h == n - 1) # not last
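For reference, the first of the three variants above can be run as is once it is wrapped in a loop over the sentence. The wrapper function and the defaultdict are additions for this sketch; the eight update lines are the ones from the notes.

```python
from collections import defaultdict

def stop_frequencies(sent):
    """Count STOP / -STOP events per (head, decision, direction+adjacency)."""
    f = defaultdict(int)
    n = len(sent)
    for i_h, head in enumerate(sent):
        f[head, 'STOP', 'LN'] += (i_h <= 1)            # first two words
        f[head, '-STOP', 'LN'] += (not i_h <= 1)
        f[head, 'STOP', 'LA'] += (i_h == 0)            # very first word
        f[head, '-STOP', 'LA'] += (not i_h == 0)
        f[head, 'STOP', 'RN'] += (i_h >= n - 2)        # last two words
        f[head, '-STOP', 'RN'] += (not i_h >= n - 2)
        f[head, 'STOP', 'RA'] += (i_h == n - 1)        # very last word
        f[head, '-STOP', 'RA'] += (not i_h == n - 1)
    return f

f = stop_frequencies("det nn vb nn".split())
```

For 'det' this reproduces the counts listed earlier: a left stop in both the adjacent and non-adjacent case, and no adjacent right stop.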
390 "all words take the same number of arguments" interpreted as
392 : p_STOP(head, 'STOP', 'LN') = 0.3
393 : p_STOP(head, 'STOP', 'LA') = 0.5
394 : p_STOP(head, 'STOP', 'RN') = 0.4
395 : p_STOP(head, 'STOP', 'RA') = 0.7
396 (which we easily may tweak in init_zeros())
398 Go through the corpus, counting distances between heads and
399 arguments. In [det nn vb nn], we give
400 - f_{CHOOSE}(nn|det, R) +1/1 + C
401 - f_{CHOOSE}(vb|det, R) +1/2 + C
402 - f_{CHOOSE}(nn|det, R) +1/3 + C
- If this were the full corpus, P_{CHOOSE}(nn|det, R) would be
  (1+1/3+2C) / sum_a f_{CHOOSE}(a|det, R)
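The distance-based counting can be sketched as follows. The function names and the key order (arg, head, direction) are made up for this example, and C is treated as a constant added to every count, which is how the list above reads; the value 0.1 is arbitrary.

```python
from collections import defaultdict

def choose_frequencies(sent, C=0.0):
    """f_CHOOSE(arg | head, dir): inverse head-argument distance plus C."""
    f = defaultdict(float)
    for i_h, head in enumerate(sent):
        for i_a, arg in enumerate(sent):
            if i_a == i_h:
                continue
            direction = 'L' if i_a < i_h else 'R'
            f[arg, head, direction] += 1.0 / abs(i_h - i_a) + C
    return f

def p_choose(f, arg, head, direction):
    """Normalise over all arguments of the same head and direction."""
    total = sum(v for (a, h, d), v in f.items()
                if h == head and d == direction)
    return f[arg, head, direction] / total

f = choose_frequencies("det nn vb nn".split(), C=0.1)
nn_det_R = f['nn', 'det', 'R']            # 1/1 + 1/3 + 2C
p_nn = p_choose(f, 'nn', 'det', 'R')
```

With C = 0.1 the numerator for 'nn' is 1 + 1/3 + 0.2 and the denominator adds the 1/2 + 0.1 for 'vb'.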
406 The ROOT gets "each argument with equal probability", so in a sentence
407 of three words, 1/3 for each (in [nn vb nn], 'nn' gets 2/3). Basically
408 just a frequency count of the corpus...
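As a sketch, the ROOT counts for one sentence are just relative tag frequencies. The function name is hypothetical.

```python
from collections import Counter

def root_frequencies(sent):
    """Each word is an equally likely ROOT argument, so tags get count / n."""
    n = len(sent)
    return {tag: count / n for tag, count in Counter(sent).items()}

root = root_frequencies("nn vb nn".split())
```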
410 In a sense there are no terminal probabilities, since an /h/ can only
411 rewrite to an 'h' anyway (it's just a check for whether, at this
412 location in the sentence, we have the right POS-tag).
* DONE Expectation Maximization in IO/DMV-terms
415 :ARCHIVE_TIME: 2008-07-23 Wed 11:16
416 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
417 :ARCHIVE_CATEGORY: DMVCCM
outer(i,j,Node) and inner(i,j,Node) together calculate the expected number of
trees (CNF-)headed by Node from =i= to =j= (sentence locations). This uses
the P_STOP and P_CHOOSE values.
423 When re-estimating, we use the expected values from outer() and
424 inner() to get new values for P_STOP and P_CHOOSE. When we've
425 re-estimated for the entire corpus, we copy the new P_STOP and
426 P_CHOOSE probabilities into our DMV_Grammar(), so that in the next
round we use new probN and probA to find outer- and inner-probabilities.
430 Since "adjacency" is not captured in regular CNF rules, we need two
probabilities for each "rule", and outer() and inner() have to know when
434 * DONE [#A] Reestimate P_ROOT
435 CLOSED: [2008-07-23 Wed 14:42]
437 :ARCHIVE_TIME: 2008-07-23 Wed 14:42
438 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
439 :ARCHIVE_CATEGORY: DMVCCM
442 Should be easy, assuming my new [[file:tex/formulas.pdf][formula]] (section 4.4) is correct.
444 * DONE Make inner() and outer() also allow left-first attachment
445 CLOSED: [2008-07-23 Wed 13:57]
447 :ARCHIVE_TIME: 2008-07-23 Wed 14:42
448 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
449 :ARCHIVE_OLPATH: Combine CCM with DMV
450 :ARCHIVE_CATEGORY: DMVCCM
453 Using P_{ORDER}(/left-first/ | w) etc.
455 Time increased only from .8 to .11 sec, seems good.
456 * DONE Alternate CNF-style rules
458 :ARCHIVE_TIME: 2008-07-30 Wed 00:39
459 :ARCHIVE_FILE: ~/dmvccm/DMVCCM.org
460 :ARCHIVE_OLPATH: Alternative CNF for DMV
461 :ARCHIVE_CATEGORY: DMVCCM
464 : h[RA] Non-Terminal, attaching for the first time to the right
465 : h[RN] Non-Terminal, attaching non-adjacently to the right
466 : h_[RA] Non-Terminal, stopping to the right adjacently
467 : h_[RN] Non-Terminal, stopping to the right non-adjacently
468 : h_[LA] Non-Terminal, attaching for the first time to the left
469 : h_[LN] Non-Terminal, attaching non-adjacently to the left
470 : _h_[LA] Non-Terminal, stopping to the left adjacently
471 : _h_[LN] Non-Terminal, stopping to the left non-adjacently
473 : h[RA] -> h _a_[LA] # adjacent right attachment must go to "terminal"
474 : h[RA] -> h _a_[LN] # adjacent right attachment must go to "terminal"
476 : h[RN] -> h[RA] _a_[LA] # already attached to right
477 : h[RN] -> h[RN] _a_[LN]
479 : h_[RA] -> h STOP # adjacent right stop must go to "terminal"
480 : h_[RN] -> h[RN] STOP # o/w non-adjacent
481 : h_[RN] -> h[RA] STOP
483 : h_[LA] -> _a_[LA] h_[RA] # adjacent left attachment must
484 : h_[LA] -> _a_[LN] h_[RN] # go to mothers of stop rules
486 : h_[LN] -> _a_[LA] h_[LN] # already attached to left
487 : h_[LN] -> _a_[LN] h_[LA]
489 : _h_[LA] -> STOP h_[RA] # adjacent left stop goes
490 : _h_[LA] -> STOP h_[RN] # straight to a right stop
492 : _h_[LN] -> STOP h_[LA] # non-adjacent left stop
493 : _h_[LN] -> STOP h_[LN] # goes to a left attachment rule
495 The reestimation function still has to sum over the various
496 possibilities of N's and A's; but it seems to be simpler than the
497 loc_h-method altogether.
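To get a feel for the size of this grammar, the rule templates listed above can be expanded mechanically. The sketch below only transcribes those templates; the tuple representation and the function name are invented for illustration.

```python
# (LHS, left daughter, right daughter); {h} is the head tag, {a} the argument tag.
ATTACH = [
    ("{h}[RA]",  "{h}",       "_{a}_[LA]"),
    ("{h}[RA]",  "{h}",       "_{a}_[LN]"),
    ("{h}[RN]",  "{h}[RA]",   "_{a}_[LA]"),
    ("{h}[RN]",  "{h}[RN]",   "_{a}_[LN]"),
    ("{h}_[LA]", "_{a}_[LA]", "{h}_[RA]"),
    ("{h}_[LA]", "_{a}_[LN]", "{h}_[RN]"),
    ("{h}_[LN]", "_{a}_[LA]", "{h}_[LN]"),
    ("{h}_[LN]", "_{a}_[LN]", "{h}_[LA]"),
]
STOPS = [
    ("{h}_[RA]",  "{h}",     "STOP"),
    ("{h}_[RN]",  "{h}[RN]", "STOP"),
    ("{h}_[RN]",  "{h}[RA]", "STOP"),
    ("_{h}_[LA]", "STOP",    "{h}_[RA]"),
    ("_{h}_[LA]", "STOP",    "{h}_[RN]"),
    ("_{h}_[LN]", "STOP",    "{h}_[LA]"),
    ("_{h}_[LN]", "STOP",    "{h}_[LN]"),
]

def cnf_rules(tags):
    """Expand the templates for every head and every (head, argument) pair."""
    rules = []
    for h in tags:
        rules += [tuple(sym.format(h=h) for sym in rule) for rule in STOPS]
        for a in tags:
            rules += [tuple(sym.format(h=h, a=a) for sym in rule)
                      for rule in ATTACH]
    return rules

rules = cnf_rules(["nn", "vbd"])
```

Per head this gives 7 stop rules plus 8 attachment rules for each possible argument, so two tags already yield 46 rules.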
499 One might reduce the number of rules a tiny bit, by having eg. unary rules
502 etc. (although that might just make it all more confusing)