<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="en" xml:lang="en">
<title>DMV/CCM – todo-list / progress</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2008-09-21 19:00:58 CEST"/>
<meta name="author" content="Kevin Brubeck Unhammer"/>
<style type="text/css">
  html { font-family: Times, serif; font-size: 12pt; }
  .title { text-align: center; }
  .done { color: green; }
  .tag { background-color: lightblue; font-weight: normal; }
  .timestamp { color: grey; }
  .timestamp-kwd { color: CadetBlue; }
  p.verse { margin-left: 3%; }
  pre {
    border: 1pt solid #AEBDCC;
    background-color: #F3F5F7;
    font-family: courier, monospace;
  }
  table { border-collapse: collapse; }
  td, th { vertical-align: top; }
  dt { font-weight: bold; }
</style><link rel="stylesheet" type="text/css" href="http://www.student.uib.no/~kun041/org.css">
<!-- override with local style.css: -->
<link rel="stylesheet" type="text/css" href="./style.css">
<h1 class="title">DMV/CCM – todo-list / progress</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#sec-1">1 DMV/CCM report and project</a></li>
<li><a href="#sec-2">2 Notation</a></li>
<li><a href="#sec-3">3 Testing the dependency parsed WSJ</a>
<ul>
<li><a href="#sec-3.1">3.1 [#A] Should <code>def evaluate</code> use add_root?</a></li>
</ul>
</li>
<li><a href="#sec-4">4 Combine CCM with DMV</a></li>
<li><a href="#sec-5">5 Reestimate P_ORDER ?</a></li>
<li><a href="#sec-6">6 Most Probable Parse</a>
<ul>
<li><a href="#sec-6.1">6.1 Find MPP with CCM</a></li>
<li><a href="#sec-6.2">6.2 Find Most Probable Parse of given test sentence, in DMV</a></li>
</ul>
</li>
<li><a href="#sec-7">7 Initialization</a>
<ul>
<li><a href="#sec-7.1">7.1 CCM Initialization</a></li>
</ul>
</li>
<li><a href="#sec-8">8 [#C] Alternative CNF for DMV</a>
<ul>
<li><a href="#sec-8.1">8.1 [#A] Make and implement an equivalent grammar that's <i>pure</i> CNF</a></li>
<li><a href="#sec-8.2">8.2 [#A] convert L&amp;Y-based reestimation into P_ATTACH and P_STOP values</a></li>
<li><a href="#sec-8.3">8.3 [#C] move as much as possible into common_dmv.py</a></li>
<li><a href="#sec-8.4">8.4 L&amp;Y-based reestimation for cnf_dmv</a></li>
<li><a href="#sec-8.5">8.5 dmv2cnf re-estimation formulas</a></li>
<li><a href="#sec-8.6">8.6 inner and outer for cnf_dmv.py, also cnf_harmonic.py</a></li>
</ul>
</li>
<li><a href="#sec-9">9 [#C] Deferred</a>
<ul>
<li><a href="#sec-9.1">9.1 Clean up reestimation code</a></li>
<li><a href="#sec-9.2">9.2 [#A] compare speed of w_left/right(…) and w(LEFT/RIGHT,…)</a></li>
<li><a href="#sec-9.3">9.3 when reestimating P_STOP etc, remove rules with p &lt; epsilon</a></li>
<li><a href="#sec-9.4">9.4 inner_dmv, short ranges and impossible attachment</a></li>
<li><a href="#sec-9.5">9.5 clean up the module files</a></li>
<li><a href="#sec-9.6">9.6 Some (tagged) sentences are bound to come twice</a></li>
<li><a href="#sec-9.7">9.7 tags as numbers or tags as strings?</a></li>
</ul>
</li>
<li><a href="#sec-10">10 Adjacency and combining it with the inside-outside algorithm</a>
<ul>
<li><a href="#sec-10.1">10.1 Possible alternate type of adjacency</a></li>
</ul>
</li>
<li><a href="#sec-11">11 Python-stuff</a></li>
<li><a href="#sec-12">12 Git</a></li>
</ul>
</div>
</div>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">1 DMV/CCM report and project</h2>
<p><span class="timestamp-kwd">DEADLINE: </span> <span class="timestamp">2008-09-21 Sun</span><br/>
</p>
<ul>
<li><a href="http://www.student.uib.no/~kun041/dmvccm/report.pdf">report.pdf</a> – Draft report for the whole project, including formulas for the full algorithms</li>
<li><a href="src/main.py">main.py</a> – evaluation, corpus likelihoods</li>
<li><a href="src/wsjdep.py">wsjdep.py</a> – corpus reader for the dependency parsed WSJ</li>
<li><a href="src/loc_h_dmv.py">loc_h_dmv.py</a> – DMV-IO and reestimation</li>
<li><a href="src/loc_h_harmonic.py">loc_h_harmonic.py</a> – DMV initialization</li>
<li><a href="src/common_dmv.py">common_dmv.py</a> – various functions used by loc_h_dmv and others</li>
<li><a href="src/io.py">io.py</a> – non-DMV IO</li>
<li><a href="src/cnf_dmv.py">cnf_dmv.py</a> – cnf-like implementation of DMV</li>
<li><a href="src/cnf_harmonic.py">cnf_harmonic.py</a> – initialization for cnf_dmv</li>
</ul>
<p><a href="http://www.student.uib.no/~kun041/dmvccm/DMVCCM_archive.html">Archived entries</a> from this file.
</p>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">2 Notation</h2>
<pre class="example">
old notes:   new notes:   in tex/code (constants):   in Klein thesis:
--------------------------------------------------------------------------------------
_h_          _h_          SEAL                       bar over h
h_           h&gt;&lt;          RGOL                       right-under-left-arrow over h
h            h&gt;           GOR                        right-arrow over h

             &gt;&lt;h          LGOR                       left-under-right-arrow over h
             &lt;h           GOL                        left-arrow over h
</pre>
<p>These are represented in the code as pairs <code>(s_h,h)</code>, where <code>h</code> is an
integer (POS-tag) and <code>s_h</code> ∈ <code>{SEAL,RGOL,GOR,LGOR,GOL}</code>.
</p>
<p><code>P_ATTACH</code> and <code>P_CHOOSE</code> are synonymous, I try to use the
</p>
<pre class="example">
P_GO_AT(a|h,dir,adj) := P_ATTACH(a|h,dir)*(1-P_STOP(STOP|h,dir,adj))
</pre>
<p>(precalculated after each reestimation with <code>g.p_GO_AT = make_GO_AT(g.p_STOP,g.p_ATTACH)</code>)
</p>
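<p>A minimal sketch of how such a precalculation could look. The dictionary layouts are assumptions made for illustration (<code>p_STOP</code> keyed by <code>(h, dir, adj)</code>, <code>p_ATTACH</code> keyed by <code>(a, h, dir)</code>); the function here is only a stand-in with the same name and is not taken from the project's modules.</p>

```python
def make_GO_AT(p_STOP, p_ATTACH):
    """Precompute P_GO_AT(a|h,dir,adj) = P_ATTACH(a|h,dir) * (1 - P_STOP(STOP|h,dir,adj)).

    Assumed (hypothetical) layout:
      p_STOP[(h, direction, adj)]  -> probability of stopping
      p_ATTACH[(a, h, direction)]  -> probability of attaching a to h
    """
    p_GO_AT = {}
    for (a, h, direction), p_attach in p_ATTACH.items():
        for adj in (True, False):
            p_stop = p_STOP.get((h, direction, adj))
            if p_stop is None:
                # No STOP entry for this head/direction/adjacency: skip it.
                continue
            p_GO_AT[(a, h, direction, adj)] = p_attach * (1.0 - p_stop)
    return p_GO_AT
```

<p>Looking up one precomputed product per attachment avoids redoing the same multiplication inside every inner/outer call.</p>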
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">3 Testing the dependency parsed WSJ</h2>
<p><a href="src/wsjdep.py">wsjdep.py</a> uses NLTK (sort of) to get a dependency parsed version of
WSJ10 into the format used in mpp() in loc_h_dmv.py.
</p>
<p>As a default, <code>WSJDepCorpusReader</code> looks for the file <code>wsj.combined.10.dep</code> in
<code>../corpus/wsjdep</code>.
</p>
<p>Only <code>sents()</code>, <code>tagged_sents()</code> and <code>parsed_sents()</code> (plus a new function
<code>tagonly_sents()</code>) are implemented; the other NLTK corpus functions are
..um.. undefined…
</p>
<div id="outline-container-3.1" class="outline-3">
<h3 id="sec-3.1">3.1 <span class="todo">TODO</span> [#A] Should <code>def evaluate</code> use add_root?</h3>
<p><a href="src/main.py">main.py</a> evaluate
<a href="src/wsjdep.py">wsjdep.py</a> add_root
</p>
<p>(just has to count how many pairs are in there; Precision and Recall)
</p>
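<p>The pair counting could be sketched as below. This is an illustration only: the function name and the representation of a parse as a set of <code>(head, argument)</code> pairs are assumptions, not the actual <code>evaluate</code> in main.py.</p>

```python
def precision_recall(proposed, gold):
    """Precision and recall of proposed head-argument pairs against gold pairs.

    Both arguments are iterables of hashable pairs, e.g. (head_loc, arg_loc).
    """
    proposed, gold = set(proposed), set(gold)
    correct = len(proposed & gold)
    precision = float(correct) / len(proposed) if proposed else 0.0
    recall = float(correct) / len(gold) if gold else 0.0
    return precision, recall


# A gold parse that includes a ROOT pair (as add_root would produce) changes
# the recall denominator unless the proposed parse also has a ROOT pair.
proposed = {(1, 0), (1, 2), (2, 3)}
gold = {("ROOT", 1), (1, 0), (1, 2), (1, 3)}
p, r = precision_recall(proposed, gold)
```

<p>Whether ROOT pairs are counted on both sides is exactly what the add_root question decides, so the two sets should be built the same way before comparing.</p>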
<div id="outline-container-4" class="outline-2">
<h2 id="sec-4">4 <span class="todo">TOGROK</span> Combine CCM with DMV</h2>
<p><a name="comboquestions"> </a>
</p>
<p>Questions about the <code>P_COMBO</code> info in <a href="http://www.eecs.berkeley.edu/~klein/papers/klein_thesis.pdf">Klein's thesis</a>:
</p>
<ul>
<li>Page 109 (pdf: 125): We have to premultiply "all our probabilities"
by the CCM base product <i>Π<sub>&lt;i,j&gt;</sub> P<sub>SPAN</sub>(α(i,j,s)|false)P<sub>CONTEXT</sub>(β(i,j,s)|false)</i>; which
probabilities are included under "all"? I'm assuming this includes
<code>P_ATTACH</code> since each time <code>P_ATTACH</code> is used, <i>φ</i> is multiplied in
(pp.110-111 ibid.); but <i>φ</i> is not used for STOPs, so should we not
have our CCM product multiplied in there? How about <code>P_ROOT</code>?
(Guessing <code>P_ORDER</code> is way out of the question…)</li>
<li>For the outside probabilities, is it correct to assume we multiply
in <i>φ(j,k)</i> or <i>φ(k,i)</i> when calculating <code>inner(i,j...)</code>? (E.g., only
for the outside part, not for the whole range.) I don't understand
the notation in <code>O()</code> on p.103.</li>
</ul>
<div id="outline-container-5" class="outline-2">
<h2 id="sec-5">5 <span class="todo">TOGROK</span> Reestimate P_ORDER ?</h2>
<div id="outline-container-6" class="outline-2">
<h2 id="sec-6">6 Most Probable Parse</h2>
<div id="outline-container-6.1" class="outline-3">
<h3 id="sec-6.1">6.1 <span class="todo">TOGROK</span> Find MPP with CCM</h3>
<div id="outline-container-6.2" class="outline-3">
<h3 id="sec-6.2">6.2 <span class="done">DONE</span> Find Most Probable Parse of given test sentence, in DMV</h3>
<p><span class="timestamp-kwd">CLOSED: </span> <span class="timestamp">2008-07-23 Wed 10:56</span><br/>
inner() optionally keeps track of the highest probability children of
any node in <code>mpptree</code>. Say we're looking for <code>inner(i,j,(s_h,h),loc_h)</code> in
a certain sentence, and we find some possible left and right children;
we add to <code>mpptree[i,j,(s_h,h),loc_h]</code> the triple <code>(p, L, R)</code> where <code>L</code> and
<code>R</code> are of the same form as the key (<code>i,j,(s_h,h),loc_h</code>) and <code>p</code> is the
probability of this node rewriting to <code>L</code> and <code>R</code>,
e.g. <code>inner(L)*inner(R)*p_GO_AT</code> or <code>p_STOP</code> or whatever. We only add this
entry to <code>mpptree</code> if there wasn't a higher-probability entry there.
</p>
<p>Then, after <code>inner_sent</code> makes an <code>mpptree</code>, we find the <i>relevant</i>
head-argument pairs by searching through the tree using a queue,
adding the <code>L</code> and <code>R</code> keys of any entry to the queue as we find them
(skipping <code>STOP</code> keys), and adding any attachment entries to a set of
triples <code>(head,argument,dir)</code>. Thus we have our most probable parse, e.g.:
</p>
<pre class="example">
set([( ROOT, (vbd,2),RIGHT),
     ((vbd,2),(nn,1),LEFT),
     ((vbd,2),(nn,3),RIGHT),
     ((nn,1),(det,0),LEFT)])
</pre>
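<p>The queue search described above might look roughly like this. It is a sketch under assumptions: the <code>STOP</code> marker, the rule that the child sharing <code>loc_h</code> with its parent is the head, and the omission of the ROOT attachment are all simplifications made for illustration, not the code in loc_h_dmv.py.</p>

```python
from collections import deque

STOP = "STOP"                  # hypothetical marker for a STOP child
LEFT, RIGHT = "LEFT", "RIGHT"


def extract_attachments(mpptree, root_key):
    """Walk mpptree with a queue, collecting (head, argument, dir) triples.

    Assumes mpptree maps a key (i, j, (s_h, h), loc_h) to a triple (p, L, R),
    where L and R are keys of the same form or STOP. Terminals have no entry.
    """
    attachments = set()
    queue = deque([root_key])
    while queue:
        key = queue.popleft()
        if key == STOP or key not in mpptree:
            continue                       # STOP keys and terminals end here
        _p, left, right = mpptree[key]
        queue.extend([left, right])
        if STOP in (left, right):
            continue                       # a STOP rule, not an attachment
        # Attachment: the child that keeps the parent's loc_h is the head.
        loc_h = key[3]
        head, arg = (left, right) if left[3] == loc_h else (right, left)
        direction = LEFT if arg[3] < head[3] else RIGHT
        attachments.add(((head[2][1], head[3]), (arg[2][1], arg[3]), direction))
    return attachments


# Tiny hand-built tree for the two-word sentence "nn vbd".
A = (0, 2, ("SEAL", "vbd"), 1)
B = (0, 2, ("RGOL", "vbd"), 1)
C = (0, 1, ("SEAL", "nn"), 0)
D = (1, 2, ("RGOL", "vbd"), 1)
E = (0, 1, ("RGOL", "nn"), 0)
F = (0, 1, ("GOR", "nn"), 0)
G = (1, 2, ("GOR", "vbd"), 1)
mpptree = {A: (0.1, STOP, B), B: (0.2, C, D), C: (0.3, STOP, E),
           E: (0.4, F, STOP), D: (0.5, G, STOP)}
parse = extract_attachments(mpptree, A)
```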
<div id="outline-container-7" class="outline-2">
<h2 id="sec-7">7 Initialization</h2>
<p><a href="/Users/kiwibird/Documents/Skole/V08/Probability/dmvccm/src/dmv.py">dmv-inits</a>
</p>
<p>We go through the corpus, since the probabilities are based on how far
away in the sentence arguments are from their heads.
</p>
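<p>As a rough illustration of distance-based initialization, the sketch below weights each possible head-argument pair inversely to its distance and normalizes per head and direction. The function name, the constant <code>c</code> and the exact weighting are assumptions made for this example; the real initializer in loc_h_harmonic.py may differ.</p>

```python
from collections import defaultdict


def harmonic_attach_probs(tagged_corpus, c=1.0):
    """Distance-weighted initial attachment probabilities p(arg | head, dir).

    tagged_corpus is a list of tag sequences; c is a hypothetical smoothing
    constant added to the distance so that no weight is infinite.
    """
    weights = defaultdict(float)
    for sent in tagged_corpus:
        for loc_h, head in enumerate(sent):
            for loc_a, arg in enumerate(sent):
                if loc_a == loc_h:
                    continue
                direction = "LEFT" if loc_a < loc_h else "RIGHT"
                # Closer arguments get more weight than distant ones.
                weights[(arg, head, direction)] += 1.0 / (abs(loc_h - loc_a) + c)

    totals = defaultdict(float)
    for (arg, head, direction), w in weights.items():
        totals[(head, direction)] += w
    return {key: w / totals[(key[1], key[2])] for key, w in weights.items()}


probs = harmonic_attach_probs([["det", "nn", "vbd"]])
```

<p>With this single sentence, <code>nn</code> (distance 1) gets a larger share of the left attachments of <code>vbd</code> than <code>det</code> (distance 2).</p>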
<div id="outline-container-7.1" class="outline-3">
<h3 id="sec-7.1">7.1 <span class="todo">TOGROK</span> CCM Initialization</h3>
<p>P<sub>SPLIT</sub> used here… how, again?
</p>
<div id="outline-container-8" class="outline-2">
<h2 id="sec-8">8 <span class="todo">TODO</span> [#C] Alternative CNF for DMV</h2>
<p><a name="dmv2cnf"> </a>
</p>
<ul>
<li><a href="src/cnf_dmv.py">cnf_dmv.py</a></li>
<li><a href="src/cnf_harmonic.py">cnf_harmonic.py</a></li>
</ul>
<p>See section 5 of <a href="tex/formulas.pdf">formulas.pdf</a>.
</p>
<p>Given a grammar with certain p_ATTACH, p_STOP and p_ROOT, we get:
</p>
<pre class="example">
&gt;&gt;&gt; print testgrammar_h()
  h&gt;&lt;  --&gt;    h&gt;    STOP    [0.30]
  h&gt;&lt;  --&gt;   &gt;h&gt;    STOP    [0.40]
  _h_  --&gt;   STOP    h&gt;&lt;    [1.00]
  _h_  --&gt;   STOP   &lt;h&gt;&lt;    [1.00]
  &gt;h&gt;  --&gt;    h&gt;    _h_     [1.00]
  &gt;h&gt;  --&gt;   &gt;h&gt;    _h_     [1.00]
 &lt;h&gt;&lt;  --&gt;   _h_     h&gt;&lt;    [0.70]
 &lt;h&gt;&lt;  --&gt;   _h_    &lt;h&gt;&lt;    [0.60]
 ROOT  --&gt;   STOP   _h_     [1.00]
</pre>
<div id="outline-container-8.1" class="outline-3">
<h3 id="sec-8.1">8.1 <span class="todo">TODO</span> [#A] Make and implement an equivalent grammar that's <i>pure</i> CNF</h3>
<p>…since I'm not sure about my unary reestimation rules (section 5 of
<a href="tex/formulas.pdf">formulas</a>).
</p>
<p>For any rule where LHS is <code>_h_</code> we also have a corresponding one with
LHS <code>ROOT</code>, only difference being that we multiply in <code>p_ROOT(h)</code>.
</p>
<p>For any rule where LHS is <code>.h&gt;</code>, we use adjacent probabilities for the
left child; if LHS is <code>&lt;h.</code> we use adjacent probabilities for the right
child. Only <code>_h_</code> and <code>_h&gt;_</code> (plus <code>ROOT</code>) get to introduce the pre-terminal
<code>h</code> (where <code>h</code>, <code>ROOT</code> and <code>_h_</code> all rewrite to the terminal
<code>'h'</code>), and only <code>_h_</code> and <code>_h&gt;_</code> (plus <code>ROOT</code>) act as STOP
rules (e.g. get to multiply in <code>p(STOP)</code>).
</p>
<pre class="example">
 _h_  --&gt;  'h'        p(STOP|h,L,adj) * p(STOP|h,R,adj)
 ROOT --&gt;  'h'        p(STOP|h,L,adj) * p(STOP|h,R,adj) * p_ROOT(h)

 _h_  --&gt;  h   _a_    p(STOP|h,L,adj) * p(STOP|h,R,non) * p(a|h,R)*p(-STOP|h,R,adj)
 _h_  --&gt;  h   .h&gt;    p(STOP|h,L,adj) * p(STOP|h,R,non)
 .h&gt;  --&gt;  _a_ _b_    p(a|h,R)*p(-STOP|h,R,adj) * p(b|h,R)*p(-STOP|h,R,non)
 .h&gt;  --&gt;  _a_ h&gt;     p(a|h,R)*p(-STOP|h,R,adj)
 h&gt;   --&gt;  _a_ _b_    p(a|h,R)*p(-STOP|h,R,non) * p(b|h,R)*p(-STOP|h,R,non)
 h&gt;   --&gt;  _a_ h&gt;     p(a|h,R)*p(-STOP|h,R,non)

 _h_  --&gt;  _a_ h      p(STOP|h,L,non) * p(STOP|h,R,adj) * p(a|h,L)*p(-STOP|h,L,adj)
 _h_  --&gt;  &lt;h. h      p(STOP|h,L,non) * p(STOP|h,R,adj)
 &lt;h.  --&gt;  _b_ _a_    p(b|h,L)*p(-STOP|h,L,non) * p(a|h,L)*p(-STOP|h,L,adj)
 &lt;h.  --&gt;  &lt;h  _a_    p(a|h,L)*p(-STOP|h,L,adj)
 &lt;h   --&gt;  _a_ _b_    p(a|h,L)*p(-STOP|h,L,non) * p(b|h,L)*p(-STOP|h,L,non)
 &lt;h   --&gt;  &lt;h  _a_    p(a|h,L)*p(-STOP|h,L,non)

 _h_  --&gt;  &lt;h. _h&gt;_   p(STOP|h,L,non)
 _h_  --&gt;  _a_ _h&gt;_   p(STOP|h,L,non) * p(a|h,L)*p(-STOP|h,L,adj)
 _h&gt;_ --&gt;  h   .h&gt;    p(STOP|h,R,non)
 _h&gt;_ --&gt;  h   _a_    p(STOP|h,R,non) * p(a|h,R)*p(-STOP|h,R,adj)

 ROOT --&gt;  h   _a_    p(STOP|h,L,adj) * p(STOP|h,R,non) * p(a|h,R)*p(-STOP|h,R,adj) * p_ROOT(h)
 ROOT --&gt;  h   .h&gt;    p(STOP|h,L,adj) * p(STOP|h,R,non) * p_ROOT(h)

 ROOT --&gt;  _a_ h      p(STOP|h,L,non) * p(STOP|h,R,adj) * p(a|h,L)*p(-STOP|h,L,adj) * p_ROOT(h)
 ROOT --&gt;  &lt;h. h      p(STOP|h,L,non) * p(STOP|h,R,adj) * p_ROOT(h)

 ROOT --&gt;  &lt;h. _h&gt;_   p(STOP|h,L,non) * p_ROOT(h)
 ROOT --&gt;  _a_ _h&gt;_   p(STOP|h,L,non) * p(a|h,L)*p(-STOP|h,L,adj) * p_ROOT(h)
</pre>
<p>Since we have rules rewriting <code>h</code> to <code>a</code> and <code>b</code>, we have a rule-set
numbering more than n<sub>tags</sub><sup>2</sup>.
</p>
<div id="outline-container-8.2" class="outline-3">
<h3 id="sec-8.2">8.2 <span class="todo">TOGROK</span> [#A] convert L&amp;Y-based reestimation into P_ATTACH and P_STOP values</h3>
<p>Sum over the various rules? Or something? Must think of this.
</p>
<div id="outline-container-8.3" class="outline-3">
<h3 id="sec-8.3">8.3 <span class="todo">TODO</span> [#C] move as much as possible into common_dmv.py</h3>
<p><a href="src/common_dmv.py">common_dmv.py</a>
</p>
<div id="outline-container-8.4" class="outline-3">
<h3 id="sec-8.4">8.4 <span class="done">DONE</span> L&amp;Y-based reestimation for cnf_dmv</h3>
<p><span class="timestamp-kwd">CLOSED: </span> <span class="timestamp">2008-08-21 Thu 16:35</span><br/>
</p>
<div id="outline-container-8.5" class="outline-3">
<h3 id="sec-8.5">8.5 <span class="done">DONE</span> dmv2cnf re-estimation formulas</h3>
<p><span class="timestamp-kwd">CLOSED: </span> <span class="timestamp">2008-08-21 Thu 16:36</span><br/>
</p>
<div id="outline-container-8.6" class="outline-3">
<h3 id="sec-8.6">8.6 <span class="done">DONE</span> inner and outer for cnf_dmv.py, also cnf_harmonic.py</h3>
<div id="outline-container-9" class="outline-2">
<h2 id="sec-9">9 [#C] Deferred</h2>
<p><a href="http://wiki.python.org/moin/PythonSpeed/PerformanceTips">http://wiki.python.org/moin/PythonSpeed/PerformanceTips</a> E.g., use
map/reduce/filter/[i for i in [i's]]/(i for i in [i's]) instead of
for-loops; use local variables for globals (global variables or
</p>
"outline-container-9.1" class=
"outline-3">
472 <h3 id=
"sec-9.1">9.1 <span class=
"todo">TODO
</span> Clean up reestimation code
<span class=
"tag">PRETTIER
</span></h3>
479 <div id=
"outline-container-9.2" class=
"outline-3">
480 <h3 id=
"sec-9.2">9.2 <span class=
"todo">TODO
</span> [#A] compare speed of w_left/right(
…) and w(LEFT/RIGHT,
…)
<span class=
"tag">OPTIMIZE
</span></h3>
487 <div id=
"outline-container-9.3" class=
"outline-3">
488 <h3 id=
"sec-9.3">9.3 <span class=
"todo">TODO
</span> when reestimating P_STOP etc, remove rules with p
< epsilon
<span class=
"tag">OPTIMIZE
</span></h3>
<div id="outline-container-9.4" class="outline-3">
<h3 id="sec-9.4">9.4 <span class="todo">TODO</span> inner_dmv, short ranges and impossible attachment <span class="tag">OPTIMIZE</span></h3>
<p>If s-t &lt;= 2, there can be only one attachment below, so don't recurse
with both Lattach=True and Rattach=True.
</p>
<p>If s-t &lt;= 1, there can be no attachment below, so only recurse with
Lattach=False, Rattach=False.
</p>
<p>Put this in the loop under rewrite rules (could also do it in the STOP
section, but that would only have an effect on very short sentences).
</p>
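<p>The pruning above could be sketched as a small helper that lists which flag combinations are worth recursing with. The helper name and the use of the absolute span width are assumptions for illustration; <code>Lattach</code> and <code>Rattach</code> are the flags named in the note.</p>

```python
def attach_options(s, t):
    """Return the (Lattach, Rattach) pairs worth recursing with for span s..t.

    Hypothetical helper: uses the absolute width so it does not matter
    which of s and t is the larger index.
    """
    width = abs(t - s)
    if width <= 1:
        # No room for any attachment below.
        return [(False, False)]
    options = [(False, False), (True, False), (False, True)]
    if width > 2:
        # Only wider spans can hold an attachment on both sides.
        options.append((True, True))
    return options
```

<p>The rewrite-rule loop would then iterate over <code>attach_options(s, t)</code> instead of over all four combinations.</p>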
<div id="outline-container-9.5" class="outline-3">
<h3 id="sec-9.5">9.5 <span class="todo">TODO</span> clean up the module files <span class="tag">PRETTIER</span></h3>
<p>Is there a better way to divide dmv and harmonic? There's a two-way
dependency between the modules. Guess there could be a third file that
imports both the initialization and the actual EM stuff, while a file
containing constants and classes could be imported by all others:
</p>
<pre class="example">
dmv.py imports dmv_EM.py imports dmv_classes.py
dmv.py imports dmv_inits.py imports dmv_classes.py
</pre>
<div id="outline-container-9.6" class="outline-3">
<h3 id="sec-9.6">9.6 <span class="todo">TOGROK</span> Some (tagged) sentences are bound to come twice <span class="tag">OPTIMIZE</span></h3>
<p>E.g., first sort and count, so that the corpus
[['nn','vbd','det','nn'],
 ['vbd','nn','det','nn'],
 ['nn','vbd','det','nn']]
becomes
[(['nn','vbd','det','nn'],2),
 (['vbd','nn','det','nn'],1)]
and then in each loop through sentences, make sure we handle the
</p>
<p>Is there much to gain here?
</p>
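<p>The sort-and-count step could be sketched with the standard library as follows. The function name is made up for this example; how the counts are then used inside the reestimation loop is left open, as in the note above.</p>

```python
from collections import Counter


def count_sentences(tagged_corpus):
    """Collapse duplicate tag sequences into sorted (sentence, count) pairs."""
    # Lists are not hashable, so count tuples and convert back afterwards.
    counts = Counter(tuple(sent) for sent in tagged_corpus)
    return [(list(sent), n) for sent, n in sorted(counts.items())]


corpus = [['nn', 'vbd', 'det', 'nn'],
          ['vbd', 'nn', 'det', 'nn'],
          ['nn', 'vbd', 'det', 'nn']]
counted = count_sentences(corpus)
```

<p>Each distinct sentence would then be processed once, with its contribution to the expected counts multiplied by its frequency.</p>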
<div id="outline-container-9.7" class="outline-3">
<h3 id="sec-9.7">9.7 <span class="todo">TOGROK</span> tags as numbers or tags as strings? <span class="tag">OPTIMIZE</span></h3>
<p>Need to clean up the representation.
</p>
<p>Stick with tag-strings in initialization then switch to numbers for
IO-algorithm perhaps? Can probably afford more string-matching in
</p>
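<p>One way the strings-then-numbers idea could look: keep tag strings while reading the corpus, then map each tag to an integer once, before the IO-algorithm runs. The names below are hypothetical and not taken from the project.</p>

```python
def number_tags(tagged_corpus):
    """Map tag strings to integers in order of first appearance.

    Returns (tag_num, num_tag, numbered_corpus), where tag_num and num_tag
    are the two lookup directions.
    """
    tag_num = {}
    for sent in tagged_corpus:
        for tag in sent:
            # A new tag gets the next free integer.
            tag_num.setdefault(tag, len(tag_num))
    num_tag = {n: tag for tag, n in tag_num.items()}
    numbered = [[tag_num[tag] for tag in sent] for sent in tagged_corpus]
    return tag_num, num_tag, numbered


tag_num, num_tag, numbered = number_tags([['nn', 'vbd', 'det', 'nn']])
```

<p>Keeping <code>num_tag</code> around makes it cheap to print grammars and parses with readable tags again.</p>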
<div id="outline-container-10" class="outline-2">
<h2 id="sec-10">10 Adjacency and combining it with the inside-outside algorithm</h2>
<p>Each DMV probability (for a certain PCFG node) has both an adjacent
and a non-adjacent probability. inner() and outer() need the correct
</p>
<p>In each inner() call, loc_h is the location of the head of this
dependency structure. In each outer() call, it's the head of the <i>Node</i>,
the structure we're looking outside of.
</p>
<p>We call inner() for each location of a head, and on each terminal,
loc_h must equal <code>i</code> (and <code>loc_h+1</code> equal <code>j</code>). In the recursive attachment
calls, we use the locations (sentence indices) of words to the left or
right of the head in calls to inner(). <i>loc_h lets us check whether we need probN or probA</i>.
</p>
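<p>A sketch of how <code>loc_h</code> could decide between probA and probN. It assumes spans are half-open ranges <code>(i, j)</code>, so that a bare terminal has <code>i == loc_h</code> and <code>j == loc_h+1</code> as stated above, and that the head is adjacent in a direction exactly when the span does not yet extend past it on that side. Both function names are made up for this example.</p>

```python
def is_adjacent(i, j, loc_h, direction):
    """True if the head at loc_h has no argument yet on the given side of (i, j)."""
    if not i <= loc_h < j:
        raise ValueError("head location must lie inside the span")
    if direction == "LEFT":
        return loc_h == i          # nothing to the left of the head yet
    return loc_h + 1 == j          # nothing to the right of the head yet


def pick_prob(probA, probN, i, j, loc_h, direction):
    """Choose the adjacent or non-adjacent probability for this span."""
    return probA if is_adjacent(i, j, loc_h, direction) else probN
```

<p>For example, with the head at position 1 in the span (0, 2), the head already has something to its left, so a further left attachment or left STOP would use probN, while the right side still uses probA.</p>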
<div id="outline-container-10.1" class="outline-3">
<h3 id="sec-10.1">10.1 Possible alternate type of adjacency</h3>
<p>K&amp;M's adjacency is just whether or not an argument has been generated
in the current direction yet. One could also make a stronger type of
adjacency, where h and a are not adjacent if b is in between, e.g. with
the sentence "a b h" and the structure ((h-&gt;a), (a-&gt;b)), h is
K&amp;M-adjacent to a, but not next to a, since b is in between. It's easy
to check this type of adjacency in inner(), but it needs new rules for
</p>
"outline-container-11" class=
"outline-2">
604 <h2 id=
"sec-11">11 Python-stuff
</h2>
607 <p>Make those debug statements steal a bit less attention in emacs:
608 <pre class=
"example">
609 (font-lock-add-keywords
610 'python-mode ; not really regexp, a bit slow
611 '((
"^\\( *\\)\\(\\if +'.+' +in +io.DEBUG. *\\(
612 \\1 .+$\\)+\\)" 2 font-lock-preprocessor-face t)))
613 (font-lock-add-keywords
615 '((
"\\<\\(\\(io\\.\\)?debug(.+)\\)" 1 font-lock-preprocessor-face t)))
620 <a href=
"src/pseudo.py">pseudo.py
</a>
623 <a href=
"http://nltk.org/doc/en/structured-programming.html">http://nltk.org/doc/en/structured-programming.html
</a> recursive dynamic
626 <a href=
"http://nltk.org/doc/en/advanced-parsing.html">http://nltk.org/doc/en/advanced-parsing.html
</a>
629 <a href=
"http://jaynes.colorado.edu/PythonIdioms.html">http://jaynes.colorado.edu/PythonIdioms.html
</a>
<div id="outline-container-12" class="outline-2">
<h2 id="sec-12">12 Git</h2>
<p>Repository web page: <a href="http://repo.or.cz/w/dmvccm.git">http://repo.or.cz/w/dmvccm.git</a>
</p>
<p>Setting up a new project:
</p>
<pre class="example">
git commit -m "first release"
</pre>
<p>Later on: (<code>-a</code> does <code>git rm</code> and <code>git add</code> automatically for files that are already tracked; new files still need an explicit <code>git add</code>)
</p>
<pre class="example">
git commit -a -m "some subsequent release"
</pre>
<p>Then push stuff up to the remote server:
</p>
<pre class="example">
git push git+ssh://username@repo.or.cz/srv/git/dmvccm.git master
</pre>
<p>(<code>eval `ssh-agent`</code> and <code>ssh-add</code> to avoid having to type in keyphrase all
</p>
<p>Make a copy of the (remote) master branch:
</p>
<pre class="example">
git clone git://repo.or.cz/dmvccm.git
</pre>
<p>Make and name a new branch in this folder:
</p>
<pre class="example">
git checkout -b mybranch
</pre>
<p>To save changes in <code>mybranch</code>:
</p>
<pre class="example">
</pre>
<p>Go back to the master branch (uncommitted changes from <code>mybranch</code> are
</p>
<pre class="example">
</pre>
<pre class="example">
git add --interactive
</pre>
<p><a href="http://www-cs-students.stanford.edu/~blynn//gitmagic/">http://www-cs-students.stanford.edu/~blynn//gitmagic/</a>
</p>
<div id="postamble"><p class="author"> Author: Kevin Brubeck Unhammer
<a href="mailto:K.BrubeckUnhammer at student uva nl ">&lt;K.BrubeckUnhammer at student uva nl&gt;</a>
</p>
<p class="date"> Date: 2008-09-21 19:00:58 CEST</p>
<p>HTML generated by <a href='http://orgmode.org/'>org-mode</a> 6.06b in emacs 22</p>
</div><script src="./post-script.js" type="text/JavaScript">
</script>