<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="en" xml:lang="en">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2008/05/25 22:45:46"/>
<meta name="author" content="Kevin Brubeck Unhammer"/>
<link rel="stylesheet" type="text/css" href="http://www.student.uib.no/~kun041/org.css"/>
<h1 class="title">DMV/CCM</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<li><a href="#sec-1">1 dmvccm</a>
<li><a href="#sec-1.1">1.1 [#A] P<sub>STOP</sub></a></li>
<li><a href="#sec-1.2">1.2 P<sub>CHOOSE</sub></a></li>
<li><a href="#sec-1.3">1.3 Initialization </a></li>
<li><a href="#sec-1.4">1.4 Adjacency and combining it with inner()</a></li>
<li><a href="#sec-1.5">1.5 What exactly is the E-step of DMV? Is the M-step just inner on the full sentence?</a></li>
<li><a href="#sec-1.6">1.6 Meet Yoav again about dmvccm</a>
<li><a href="#sec-1.6.1">1.6.1 Initialization</a></li>
<li><a href="#sec-1.6.2">1.6.2 Corpus access?</a></li>
<li><a href="#sec-1.6.3">1.6.3 How do we interpret DMV as an inside/outside process?</a></li>
<li><a href="#sec-1.6.4">1.6.4 The upside-down P<sub>STOP</sub> formula (left-to-right also)?</a></li>
<li><a href="#sec-1.6.5">1.6.5 Technical: sentences or rules as the "outer loop"?</a></li>
<li><a href="#sec-1.6.6">1.6.6 What are the formulas for P<sub>CHOOSE</sub> etc?</a></li>
<li><a href="#sec-1.6.7">1.6.7 How is the P<sub>STOP</sub> formula different given other values for dir and adj?</a></li>
<li><a href="#sec-1.6.8">1.6.8 CCM-questions</a></li>
<li><a href="#sec-1.6.9">1.6.9 (Answered already?) How do we know whether we are 'adjacent' or not? </a></li>
<li><a href="#sec-2">2 Python-stuff</a></li>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">1 dmvccm</h2>
<p><span class="timestamp-kwd">DEADLINE: </span> <span class="timestamp">2008-06-30 Mon</span><br/>
(But absolute, extended, really-quite-dead-now deadline: August 31…)
<a href="src/dmv.py">dmv.py</a>
<a href="src/io.py">io.py</a>
<div id="outline-container-1.1" class="outline-3">
<h3 id="sec-1.1">1.1 <span class="todo">TODO</span> [#A] P<sub>STOP</sub></h3>
<p><a href="src/dmv.py">dmv.py</a>
<div id="outline-container-1.2" class="outline-3">
<h3 id="sec-1.2">1.2 <span class="todo">TOGROK</span> P<sub>CHOOSE</sub></h3>
<div id="outline-container-1.3" class="outline-3">
<h3 id="sec-1.3">1.3 <span class="todo">TOGROK</span> Initialization </h3>
<div id="outline-container-1.4" class="outline-3">
<h3 id="sec-1.4">1.4 <span class="todo">TOGROK</span> Adjacency and combining it with inner()</h3>
<div id="outline-container-1.5" class="outline-3">
<h3 id="sec-1.5">1.5 <span class="todo">TOGROK</span> What exactly is the E-step of DMV? Is the M-step just inner on the full sentence?</h3>
<div id="outline-container-1.6" class="outline-3">
<h3 id="sec-1.6">1.6 Meet Yoav again about dmvccm</h3>
<p><span class="timestamp-kwd">SCHEDULED: </span> <span class="timestamp">2008-05-26 Mon</span><br/>
<div id="outline-container-1.6.1" class="outline-4">
<h4 id="sec-1.6.1">1.6.1 Initialization</h4>
<div id="text-1.6.1">
<p>Do we have to go through the corpus for this, since the initial
probabilities are based on how far arguments are from their heads in the sentence?
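<p>If the answer is yes, a single pass over the corpus might look roughly like the
sketch below. It is only an illustration under assumptions: the function name
<code>harmonic_choose_init</code>, the corpus format (a list of tag lists), and the
weighting 1 / (constant + distance) are all made up here and are not taken from dmv.py.</p>

```python
from collections import defaultdict


def harmonic_choose_init(corpus, constant=1.0):
    """Distance-weighted initial estimate of P_CHOOSE(arg | head, dir).

    corpus: list of sentences, each a list of tags (hypothetical format).
    constant: smoothing term added to the distance (assumed, not from the notes).
    """
    scores = defaultdict(float)
    totals = defaultdict(float)
    for sentence in corpus:
        for h, head in enumerate(sentence):
            # Visit the arguments left of the head, then those to its right.
            for direction, positions in (("L", range(h)),
                                         ("R", range(h + 1, len(sentence)))):
                for a in positions:
                    # Closer arguments get more weight (assumed weighting).
                    weight = 1.0 / (constant + abs(h - a))
                    scores[head, sentence[a], direction] += weight
                    totals[head, direction] += weight
    # Normalise so the arguments of each (head, direction) pair sum to one.
    return {(head, arg, direction): score / totals[head, direction]
            for (head, arg, direction), score in scores.items()}


probs = harmonic_choose_init([["det", "nn", "vbd"]], constant=0.0)
print(probs["vbd", "nn", "L"])
```

<p>With this weighting, the nearer left argument of vbd (nn) ends up with twice
the initial probability of the farther one (det).</p>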
<div id="outline-container-1.6.2" class="outline-4">
<h4 id="sec-1.6.2">1.6.2 Corpus access?</h4>
<div id="text-1.6.2">
<div id="outline-container-1.6.3" class="outline-4">
<h4 id="sec-1.6.3">1.6.3 How do we interpret DMV as an inside/outside process?</h4>
<div id="text-1.6.3">
<p>c<sub>s</sub>(x : i, j) is "the expected fraction of parses of s" with x spanning
i to j. The expectation uses the probabilities obtained from
initialization and from earlier iterations, but these have
the form P<sub>STOP</sub> and P<sub>CHOOSE</sub>. How do we translate them to
inside-outside, which only uses the probabilities of CFG rules?
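<p>For reference, the quantity c<sub>s</sub>(x : i, j) can be sketched for an ordinary
PCFG in Chomsky normal form as inside times outside divided by the sentence
probability. The code below is a toy illustration, not DMV itself: the function
name, the dictionary-based grammar format, and the one-symbol grammar are assumptions
made up for this example. If the DMV were encoded as rules of this kind, each rule
probability would presumably be built from P<sub>STOP</sub> and P<sub>CHOOSE</sub> factors;
that encoding is not shown here.</p>

```python
from collections import defaultdict


def inside_outside_fractions(words, lexical, binary, root="S"):
    """Expected fraction of parses in which x spans words[i:j].

    lexical: {(x, word): prob} for rules x -> word
    binary:  {(x, y, z): prob} for rules x -> y z
    Both formats are hypothetical, chosen only for this sketch.
    """
    n = len(words)
    inside = defaultdict(float)
    outside = defaultdict(float)

    # Inside pass: width-1 spans first, then wider spans bottom-up.
    for i, word in enumerate(words):
        for (x, w), p in lexical.items():
            if w == word:
                inside[x, i, i + 1] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (x, y, z), p in binary.items():
                    inside[x, i, j] += p * inside[y, i, k] * inside[z, k, j]

    sentence_prob = inside[root, 0, n]
    if sentence_prob == 0.0:
        return {}

    # Outside pass: widest spans first, pushing mass down to the children.
    outside[root, 0, n] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (x, y, z), p in binary.items():
                    out = outside[x, i, j]
                    outside[y, i, k] += p * out * inside[z, k, j]
                    outside[z, k, j] += p * out * inside[y, i, k]

    fractions = {}
    for key, in_prob in list(inside.items()):
        out_prob = outside.get(key, 0.0)
        if in_prob * out_prob != 0.0:
            fractions[key] = in_prob * out_prob / sentence_prob
    return fractions


toy_lexical = {("S", "a"): 0.5}
toy_binary = {("S", "S", "S"): 0.5}
fractions = inside_outside_fractions(["a", "a", "a"], toy_lexical, toy_binary)
print(fractions["S", 0, 2], fractions["S", 1, 3])
```

<p>The toy sentence has two equally likely bracketings, so the spans (0, 2) and
(1, 3) each occur in half of the parses, while the full span occurs in all of them.</p>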
<div id="outline-container-1.6.4" class="outline-4">
<h4 id="sec-1.6.4">1.6.4 The upside-down P<sub>STOP</sub> formula (left-to-right also)?</h4>
<div id="text-1.6.4">
<div id="outline-container-1.6.5" class="outline-4">
<h4 id="sec-1.6.5">1.6.5 Technical: sentences or rules as the "outer loop"?</h4>
<div id="text-1.6.5">
<div id="outline-container-1.6.6" class="outline-4">
<h4 id="sec-1.6.6">1.6.6 What are the formulas for P<sub>CHOOSE</sub> etc?</h4>
<div id="text-1.6.6">
<p>Is this the same as the regular E-step (M-step?) summation of
Lari &amp; Young? (Equation 20)
<div id="outline-container-1.6.7" class="outline-4">
<h4 id="sec-1.6.7">1.6.7 How is the P<sub>STOP</sub> formula different given other values for dir and adj?</h4>
<div id="text-1.6.7">
(Presumably, the P<sub>STOP</sub> formula where STOP is True is just the
rule probability of <u>h</u> -&gt; STOP h_ or h_ -&gt; h STOP, but how does
adjacency fit in here?)
(And P<sub>STOP</sub>(¬STOP|…) = 1 - P<sub>STOP</sub>(STOP|…) )
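<p>The complement relation above suggests storing only the STOP case and deriving
the other one. A minimal sketch, assuming a table keyed by (head, dir, adj); the
function name <code>p_stop</code>, the key layout, and the numbers are invented for
illustration and do not come from dmv.py:</p>

```python
def p_stop(stop_probs, stop, head, direction, adjacent):
    """Look up P_STOP(stop | head, dir, adj) from a table of STOP=True values."""
    p = stop_probs[head, direction, adjacent]
    # The not-STOP case is the complement of the stored STOP case.
    return p if stop else 1.0 - p


# Made-up values; one entry per combination of dir and adj for a single head.
toy_stop_probs = {
    ("vbd", "L", True): 0.3,
    ("vbd", "L", False): 0.8,
    ("vbd", "R", True): 0.4,
    ("vbd", "R", False): 0.9,
}

print(p_stop(toy_stop_probs, True, "vbd", "R", True))
print(p_stop(toy_stop_probs, False, "vbd", "R", True))
```

<p>Keeping dir and adj in the key gives four independent STOP values per head,
which is one way the same formula could differ across the other values of dir and adj.</p>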
<div id="outline-container-1.6.8" class="outline-4">
<h4 id="sec-1.6.8">1.6.8 CCM-questions</h4>
<div id="text-1.6.8">
True or false: half of the CCM model probabilities (the ones where
the span is a constituent) are exactly inside and outside
<div id="outline-container-1.6.9" class="outline-4">
<h4 id="sec-1.6.9">1.6.9 (Answered already?) How do we know whether we are 'adjacent' or not? </h4>
<div id="text-1.6.9">
<div id="outline-container-1.6.9.1" class="outline-5">
<h5 id="sec-1.6.9.1">1.6.9.1 One configuration that I'm fairly certain of: right w/CHOOSE</h5>
<div id="text-1.6.9.1">
\Tree [<sub>b</sub> [<sub>b</sub> b <u>c</u> ] <u>d</u> ]
then the lower tree [<sub>b</sub> b <u>c</u> ] is adjacent since, working your way up
the tree, no argument has been created to the right "yet"; while the
outer tree [<sub>b</sub> [<sub>b</sub> … ] <u>d</u> ] is non-adjacent, since there is something in
between… Is it thus always adjacent to the right if the distance
is 2? (That is, in e(s,t,i) for the adjacent rule: t - s == 2; while
in the non_adj rule: t - s == 4)
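<p>One way to phrase the same intuition in code is to ask whether the head's
current subtree already extends past the head in the attachment direction. This is
only a sketch under assumptions: it uses plain word indices and a half-open span,
not the e(s,t,i) indexing above, and the name <code>is_adjacent</code> is invented
for the example.</p>

```python
def is_adjacent(head_index, span_start, span_end, direction):
    """True if the head has not yet taken an argument in this direction.

    The head's current subtree covers the half-open word span
    [span_start, span_end); head_index is the position of the head word.
    """
    if direction == "R":
        # Nothing to the right yet: the span ends right after the head.
        return span_end == head_index + 1
    if direction == "L":
        # Nothing to the left yet: the span starts at the head.
        return span_start == head_index
    raise ValueError("direction must be 'L' or 'R'")


# Words b c d at positions 0, 1, 2, with head b at position 0.
print(is_adjacent(0, 0, 1, "R"))  # attaching c to the bare head b
print(is_adjacent(0, 0, 2, "R"))  # attaching d once c is already attached
```

<p>On the example tree this agrees with the reading above: the attachment of c is
adjacent and the attachment of d is not.</p>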
<li id="sec-1.6.9.1.1">Implementing this:<br/>
Two different DMVRules? Or just two different probability values per rule?
<div id="outline-container-1.6.9.2" class="outline-5">
<h5 id="sec-1.6.9.2">1.6.9.2 left w/CHOOSE</h5>
<div id="text-1.6.9.2">
<div id="outline-container-1.6.9.3" class="outline-5">
<h5 id="sec-1.6.9.3">1.6.9.3 R/L without CHOOSE, the "sealing operations"</h5>
<div id="text-1.6.9.3">
<p><u>h</u> -&gt; STOP h_ and h_ -&gt; h STOP
What is "adjacency" here? That t - s == 1?
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">2 Python-stuff</h2>
<a href="src/pseudo.py">pseudo.py</a>
<a href="http://nltk.org/doc/en/structured-programming.html">http://nltk.org/doc/en/structured-programming.html</a> recursive dynamic
<a href="http://nltk.org/doc/en/advanced-parsing.html">http://nltk.org/doc/en/advanced-parsing.html</a>
<div id="postamble"><p class="author"> Author: Kevin Brubeck Unhammer
<a href="mailto:K.BrubeckUnhammer at student uva nl ">&lt;K.BrubeckUnhammer at student uva nl &gt;</a>
<p class="date"> Date: 2008/05/25 22:45:46</p>
</div><p class="postamble">Written using emacs + <a href='http://orgmode.org/'>org-mode</a></p></body>