org/arxana-redux.org

   1 #+TITLE:     Arxana Redux
   2 #+AUTHOR:    Joe Corneli, Raymond Puzio
   3 #+EMAIL:     contact@planetmath.org
   4 #+DATE:      April 15, 2017 for Friday, 26 May, 2017 deadline
   5 #+DESCRIPTION: Organizer for presentation on arxana and math text analysis at Oxford.
   6 #+KEYWORDS: arxana, hypertext, inference anchoring
   7 #+LANGUAGE: en
   8 #+OPTIONS: H:3 num:t toc:t \n:nil @:t ::t |:t ^:nil -:t f:t *:t <:t
   9 #+OPTIONS: TeX:t LaTeX:nil skip:nil d:nil todo:t pri:nil tags:not-in-toc
  10 #+INFOJS_OPT: view:nil toc:nil ltoc:t mouse:underline buttons:0 path:http://orgmode.org/org-info.js
  11 #+EXPORT_SELECT_TAGS: export
  12 #+EXPORT_EXCLUDE_TAGS: noexport
  13 #+LINK_UP:
  14 #+LINK_HOME:
  15 #+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://metameso.org/~joe/solarized-css/build/solarized-light.css" />
  16 #+BIBLIOGRAPHY: arxana-redux-refs plain
  17 #+LATEX_HEADER: \usepackage[backend=bibtex,sorting=none]{biblatex}
  18 #+LATEX_HEADER: \addbibresource{arxana-redux-refs}
  19 #+STARTUP: showeverything
  20
  21 * Overview (Introduction)
  22
  23 For the Scheme 2017 workshop at the International Conference on
  24 Functional Programming to take place at Oxford this Autumn, we will be
  25 making a presentation on Arxana and math texts.
  26
  27 We plan to get some version of /Arxana/ working, then use it to mark
  28 up some mathematical text along the lines of /IAT/ and then come up
  29 with a demo which shows what such a /scholiumific representation/ is
  30 good for.
  31
  32 ** Glossary of keywords
  33
  34 - IAT :: Inference Anchoring Theory was devised by Budzynska et
  35      al \cite{budzynska2014model} to model the logical structure
  36      of dialogical arguments.  This has since been adapted to
  37      model mathematical arguments \cite{corneli2017qand}.  The
  38      primary features of this adaptation are: (1) to introduce
  39      more explicit relationships between pieces of mathematical
  40      content; and (2) to advance a range of ``predicates'' that
  41      relate mathematical propositions.  These predicates
  42      describe inferential structure, judgements of validity or
  43      usefulness, and reasoning tactics.
  44 - scholium :: A scholium is a word for a marginal annotation or
  45      scholarly, explanatory, remark. Corneli and Krowne proposed
  46      a "scholia-based document model" \cite{corneli2005scholia}
  47      for "commons-based peer production," that is, the kind of
  48      collaborative online knowledge-building that happens on
  49      sites like Wikipedia or in online forums
  50      \cite{benkler2002coase}, or in other digital libraries
  51      broadly construed \cite{krowne2003building}.  The proposal
  52      is that online text resources grow ``by attaching.''  One
  53      familiar class of examples is provided by revision control
  54      systems, e.g., Git, which builds a network history of
  55      changes to documents in a directed acyclic graph (DAG).
  56 - Arxana :: Arxana is the name given to a series of prototype
  57      implementations of the scholium-based document model with
  58      an Emacs front-end.  Arxana is inspired in part by Ted
  59      Nelson's Xanadu™ project, which was the first effort to
  60      work with ``hypertext'' per se
  61      \cite{nelson1981literary}. We have been specifically
  62      interested in applications within mathematics, which has
  63      given Arxana a unique flavour.  In particular, we propose
  64      to use a scholium model to model the logic of proofs, not
  65      just to represent mathematical texts for reading.
  66
  67 ** Statement of the research problem
  68
  69 The kinds of structures that are described by ``IAT+Content''
  70 are still more or less graphs.  However, to work with
  71 mathematics we need to be able to talk about /hypergraphs/.  For
  72 example, if we say ``In September 1993, an error was found in
  73 Wiles's proposed proof of Fermat's Last Theorem'' we would like
  74 to point to the entire proposed proof as an object.  Similarly,
  75 we would like to be able to draw a connection between this 1993
  76 proposal with the corrected version published in 1995.
  77
  78 As we will see (*I should think! -JC*), even a proof itself is
  79 more conveniently represented by hypergraph structures.  In
  80 particular, some of the ``predicates'' are really generative
  81 functions.  Indeed, some of them are ``heuristics'' which take
  82 some work to specify.
  83
  84 Another feature of our representation is that hypergraphs can be
  85 placed inside nodes and links of other hypergraphs.  This allows
  86 one to organize ther representation by levels of generality.  For
  87 instance, at the highest level, there might be the structure of
  88 statements and proofs of theorems.  At thje next level, there
  89 would be the individual propositions and the argument structure.
  90 At the lowest level, there would be the grammatical structure of
  91 the propositions.  Having a way of ``coarse-graining'' the
  92 representation is important in order to have it be managable and
  93 allow users to zoom in and out of the map to find the appropriate
  94 level of detail.   This is related to Fare's categorical theory
  95 of levels of a computer system [add citation].
  96
  97 One effect of this is that the markup we are devising is
  98 ``progressive'' in the sense that we can describe heuristics with
  99 nodes without fully formalising them.  A corresponding user feature is
 100 to be able to fold things at many levels.  This is important because
 101 the diagrams get quite large.
 102
 103 *** Proofs and prologues
 104
 105 In several of our examples, such as the Gowers problem and Artin's
 106 book, on sees a structure in which a proof or calculation is preceded
 107 by a kind of prologue in which the author describes motivation and
 108 strategy.  Unlike the proof proper which consists of mathematical
 109 statements and deductions, the language here is more general and
 110 the reasoning may be abductive, inductive, or heuristic.
 111
 112 We would like to have a representation for these sorts of prologues
 113 and indicate how they connect with the proof which they introduce
 114 as well as the rest of the text.  A challenge here is that the
 115 texts in question can allude and refer to all sorts of referents
 116 including themselves at a high level of abstraction and require
 117 a certain amount of common sense to be understood.
 118
 119 Since the issues that arise here are similar to those which arise in
 120 undestanding and representing natural language stories, we should be
 121 able to adapt techniques such as conceptual dependency relation or
 122 goals and themes to a mathematical context.
 123
 124 ** Suitability of venue
 125
 126 As we will also see, it is not just incidental that our work is
 127 implemented in LISP.  Actually, the kind of hypergraph programs
 128 we use are a ``natural'' extension of the basic LISP data model.
 129
 130 Emacs (and Org mode) are suitable for interacting with other
 131 programming languages and long-running processes, e.g., MAXIMA.
 132
 133 *** What is on offer in this approach
 134
 135 One thing we have here is links that contain code.
 136
 137 Another thing hypergraphs give us is links to links - of course
 138 we can also think about higher-dimensional versions of the same,
 139 but links to link.
 140
 141 * The core language
 142
 143 Here we'll describe the structures coming from several different
 144 places, e.g., Gabi, Lakatos, etc.
 145
 146 ** Content (=structure=)                                               :GRAY:
 147 *** object $O$ is =used_in= proposition $P$
 148 **** Example: ``The Caley graph of group \(G\)'' has sub-object $G$
 149 *** proposition $Q$ is a =sub-proposition= of proposition $P$
 150 **** Example: ``$P$ is difficult to prove'' has sub-proposition $P$
 151 *** proposition $R$ is a =reformed= version of proposition $P$
 152 **** Example: ``Every $x$ is a \(P\)'' might be reformed to ``Some $x$ is a \(P\)''
 153 *** property $p$ is as the =extensional_set= of all objects with property $p$
 154 **** Example: the set $\mathcal{P}$ of primes is the extension of the predicate $\mathit{prime}$.
 155 *** $O$ =instantiates= some class $C$
 156 **** Comment: This is not quite syntactic sugar for =has_property=!
 157 If we have a class of items $C$ we want to be able say that $x$ is a
 158 member of $C$.  In principle, we could write $x$ =has_property= $C$.
 159 But I think there are cases where we would want to
 160 distinguish between, e.g.,
 161 $2$ =has_property= prime, and $prime(2)$ =instantiates=
 162 $prime(x)$.  This =instantiates= operation is a natural extension of
 163 backquote; so e.g., we could write the true statement
 164 $prime(4)$ =instantiates= $prime(x)$, even though $prime(4)$ is false.
 165 **** Comment: Perhaps we need a further distinction between objects and 'schemas'
 166 So we might write =has_property(member_of(x,primes))= to talk about
 167 class membership.  Presumably (except in Russell paradox situations)
 168 this is the same as =has_property(x,prime)=, or whatever.
 169 *** $s$ =expands= $y$
 170 **** Comment: This is something that a computer algebra system like MAXIMA could confirm
 171 *** $s$ =sums= $y$
 172 **** Comment: This is something that a computer algebra system like MAXIMA could confirm
 173 *** $s$ =contains as summand= $t$
 174 **** Comment: This is more specific than the =used in= relation
 175 This relation is interesting, for example, in the case of sums like
 176 $s+t$ where $t$ is, or might be, ``small''.  Note that there
 177 would be similar ways to talk about any algebraic or
 178 syntactic relation (e.g., multiplicand or integrand or whatever).
 179 In my opinion this is better than having one general-purpose 'structural'
 180 relation that loses the distinction between all of these.  However,
 181 generalising in this way does open a can of worms for addressing.
 182 I.e. do we need XPATH or something like that for indexing into
 183 something deeply nested?
 184 *** $M$ =generalises= $m$
 185 **** Comment: Here, content relationships between methods.
 186 Does this work?  Instead of talking about content relationships 'at
 187 the content level' we are now interested in assertions like
 188 "binomial theorem" =generalises= "eliminate cross terms",
 189 where both of the quoted concepts are heuristics, not 'predicates'
 190 *** Proposition vs Object vs Binder vs Inference rule?
 191 Note that there are different kinds of `content'.  If for example I
 192 say `\(0<a<b<1 \Rightarrow a^N < b^N\)', then this is more than just a
 193 proposition, but something that can be used as a ``binder''; i.e., if
 194 I fill in specific values of $a$ and $b$ (and, optionally, $N$) I know
 195 the property holds for them.  At the same time, this sort of "binder"
 196 is presumably also a *theorem* that has been proved.  This is,
 197 additionally, similar to an *inference rule* since it allows us to move
 198 from one statement to another, schematically (see discussion above
 199 about "instantiates").  Anyway, this isn't yet about another kind of
 200 structural relation, but it does suggest that we should be a bit
 201 careful about how we think about content.
 202 ** inferential structure (=rel=)                                       :PINK:
 203 *** =stronger= (s,t)
 204 *** =not= (s)
 205 *** =conjunction= (s,t)
 206 *** =has_property= (o,p)
 207 **** Comment: It seems useful in some case to be able to say that a 'method' or heuristic has a property
 208 E.g. =has_property(strategyx, HAL)=
 209 **** Comment: has_property should probably 'emit' the property as a side-effect
 210 This way we can more easily see which is the object and which is the property.
 211 It seems that thing that has a property may also be a 'proposition',
 212 or, at least, a bit of 'object code' (like: "compute XYZ").
 213 Furthermore, at least in this case, the 'property' might be a heuristic
 214 or method (like "we actually know how to compute this").
 215 *** =instance_of= (o,m)
 216 *** =independent_of= (o,d)
 217 *** =case_split= (s, {si}.i)
 218 *** =wlog= (s,t)
 219 ** reasoning tactics; ``meta''/guiding (=meta=)                        :BLUE:
 220 *** =goal= (s)
 221 *** =strategy= (s,m)
 222 **** Note: use method $m$ to prove $s$
 223 **** Comment: perhaps this should be =strategy= (m [, s ]) so that s is optional
 224 This is because, just for graphical simplicity, we might assume that
 225 the statement that we want to prove is known in advance (e.g., it is
 226 the statement of the problem).
 227
 228 This is the style at the beginning of gowers-example.pdf
 229 **** Comment: how to show that a suggestion 'has been implemented'?
 230 We might need some other form of 'certificate' for that.
 231 E.g., it would say =implements(BLOB OF CODE,strategyx)=.
 232 *** =auxiliary= (s,a)
 233 *** =analogy= (s,t)
 234 **** Comment: analogies between statements are different from auxiliaries to prove
 235 Introducing some set of 'analogies' in a way sets us up for
 236 'inductive' reasoning across a class of examples.  What do they all
 237 have in common?  Which one is the most suitable?  Etc.
 238 ** heuristics and value judgements (=value=)                          :GREEN:
 239 *** =easy= (s,t)
 240 *** =plausible= (s)
 241 *** =beautiful= (s)
 242 *** =useful= (s)
 243 *** =heuristic= (x)
 244 **** Comment: It seems that we will have several layers of 'guidance'
 245 Many of them will be neither value judgements nor metamathematical tags
 246 like =strategy=, =goal=, or =auxiliary=.  Rather, the heuristics may be
 247 used at an even higher level to produce or to implement these.  So,
 248 for example, the heuristic might be what a =strategy= concretely
 249 suggests.  It's also true that a given =auxiliary= might need a
 250 heuristic "header", that explains /why/ this auxiliary problem has
 251 been suggested.  Similar to a =strategy= the heuristic might look
 252 for something that 'satisfies' or 'implements' the heuristics.
 253 In some cases we've looked at, this 'satisficing' object will appear
 254 directly as the argument of the heuristic, hence =heuristic(x)=.
 255 For example: =specialise(THE PROBLEM, (sqrt(2)+sqrt(3))^2)=.
 256 Incidentally, this particular example is only meaningful in connection
 257 with *another* heuristic that has told us how to *generalise* the
 258 problem, i.e., that has turned a static expression into something
 259 with a variable in it.
 260 ** Performatives (=perf=)                                            :YELLOW:
 261 *** =Agree= (s [, a ])
 262 **** Note: optional argument $a$ is a justification for $s$
 263 *** =Assert= (s [, a ])
 264 **** Note: optional argument $a$ is support for $s$
 265 **** Comment: It seems like we might want to be able to assert a 'graph structure'
 266 Here is an example from early in gowers-example.pdf
 267 #+BEGIN_SRC
 268 Assert(exists(strategy(PROBLEM, strategyx))
 269        AND has_property(strategyx,HAL))
 270 #+END_SRC
 271 *** =Challenge= (s [, c ])
 272 **** Note: $c$ is a counter-argument or counter-example to $s$
 273 *** =Define= (o,p)
 274 **** Note: object with name $o$ satisfies property $p$
 275 *** =Judge= (s)
 276 **** Note: takes a =value= argument
 277 *** =Query= (s)
 278 *** =QueryE= ({pi(X)}.i)
 279 **** Note: Asks for the class of objects  for which all of the properties pi hold
 280 *** =Retract= (s [, a ])
 281 **** Note: can only be made by someone who asserted $s$; $a$ is an optional reason
 282 *** =Suggest= (s)
 283 **** Note: Takes a =meta= argument
 284
 285
 286 * Task tree
 287
 288 ** Infrastructure
 289
 290 This file is in versioned storage in the "mob" branch of our arxana
 291 repo; instructions to access are here:
 292 http://repo.or.cz/w/arxana.git/blob_plain/d187e244a8eb7d2208f1fe98db1ef895c6cd4a36:/README.mob
 293
 294 A working copy of this repository is checked out at
 295 =/home/shared-folder/arxana-redux/arxana/= on the Linode.
 296
 297 And a "scratch" copy of the file has been copied to:
 298 =/home/shared-folder/arxana-redux/arxana-redux.org=
 299
 300 This can be edited in live sessions using =togetherly.el=.
 301
 302 ** Progress on paper
 303
 304 *** Corpus/Examples: Hand-drawn diagrams in IAT+C
 305
 306 /We are allowed to submit an appendix without a size limit, so if we don't have room for all of the diagrams in the paper, we could put them in the appendix./
 307
 308 - Minipolymath, treated by Gabi in the Transactions on Social Computing paper, pdf available from https://www.overleaf.com/read/jyhzpqsythzq
 309 - Our randomly selected MathOverflow discussion (from Q&A article)
 310 - Gowers 2012 trick problem, http://metameso.org/ar/gowers-example.pdf
 311 - Also there's Schuam's Abstract Algebra exercise 337, a diagram is shown on arxana.net.
 312  - Redone using IATC: http://metameso.org/ar/schuams-example.pdf
 313 - An example from the Ganesalingam and Gowers paper: http://metameso.org/ar/robotone-example.pdf
 314 - Galois theory. Emil Artin's book.
 315 - Euclid algorithm
 316 - possibly an example from metamath or coq or whatever to show that we can model that stuff as discourse...
 317 - ...
 318
 319 *** Other Background
 320
 321 **** Items we've worked on before
 322
 323 Over the years as we've worked on Arxana, we have generated all sorts
 324 of files.  Here are descriptions and locations of some of them which
 325 are relevant to the current project.
 326
 327 Many of these are stored in =/home/shared-folder/public_html/=, and
 328 accessible on the web at [[http://metameso.org/ar/]].
 329
 330 ***** Research notes
 331
 332 - "nemas" document:  http://metameso.org/ar/metamathematics.pdf
 333 - Joe's thesis Appendix B: http://metameso.org/ar/thesis-appendix-b.pdf
 334 - Metamathematics preprint: http://metameso.org/ar/mathematical-semantics.pdf
 335 - hcode function spec: http://metameso.org/ar/hcode.pdf
 336 - Lakatos game spec: http://metameso.org/ar/lakatos-flowchart.pdf
 337
 338 ***** Wiki
 339
 340 - [[http://metameso.org/cgi-bin/current.pl?action=browse;oldid=Arxana;id=A_scholium-based_document_model][A scholium-based document model]] is the wiki homepage for Arxana and has several relevant discussions.
 341 - [[http://metameso.org/cgi-bin/current.pl/Two_years_later][Two years later]] collects many of the mathematical themes in
 342   overview
 343 - [[http://metameso.org/cgi-bin/current.pl/samples_of_formal_mathematical_writing][samples of formal mathematical writing]] is a small collection
 344   of statements in mathematical language (leaving out ``the part
 345   in between dollar signs'')
 346
 347 ***** Website
 348
 349 - [[http://arxana.net][arxana.net]] has links to source code from various prototypes
 350  - It also has a small gallery of hand-drawn diagrams
 351
 352 **** And other things elsewhere:
 353
 354 - Handbook of artificial intelligence by Feigenbaum (Sam and Pam)
 355 - Winograd, Natural language and stories --- MIT AI Technical Report 235, February 1971 with the title "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language" ([[http://hci.stanford.edu/winograd/shrdlu/AITR-235.pdf][link]])
 356  - Could we reuse this stuff for the "narrative"
 357  - How do we know how to read the text and transform it into IATC?
 358  - An example is Gowers's "total stuckness": how do we turn that into code?
 359 - Conceptual dependency representation - from "Deep Blue" website ([[https://deepblue.lib.umich.edu/bitstream/handle/2027.42/30278/0000679.pdf?sequence=1][link]]).
 360 - AI textbook in wiki.
 361
 362 *** TODO Initial non-working pseudocode prototype
 363
 364 Here it makes sense to at least try and transcribe the main examples
 365 in some "pseudocode", inspired by IATC and hcode, which we can then
 366 refine into a working version later.
 367
 368 *** TODO Basic working version
 369
 370 *** TODO Can the system come up with answers to new, basic, questions?
 371
 372 - Inspired by Nuamah et al's geography examples
 373 - Simple comparisons, like, how does this concept relate to that concept?  We informally talk about this as ``analogy'' between concepts.  But...
 374
 375 *** TODO Foldable views (like in Org mode) so that people can browse proofs
 376
 377 - This may come after the May submission
 378 - Folding and unfolding the definitions of terms in something like an APM context is relevant example.  Just `unpacking' terms.
 379
 380 * Related work
 381
 382 ** In a mathematics context
 383
 384 - ``Lakatos-style Collaborative Mathematics through Dialectical,
 385   Structured and Abstract Argumentation'' shows how it is possible to
 386   ``formalize'' the informal logic proposed by Lakatos
 387   \cite{pease2017lakatos}.
 388 - [[http://www.emis.de/monographs/Trzeciak/biglist.html][Mathematical English Usage - a Dictionary]] (Jerzy Trzeciak)
 389 - [[https://case.edu/artsci/math/wells/pub/html/abouthbk.html][The Handbook of Mathematical Discourse]] (Charles Wells)
 390 - [[http://universaar.uni-saarland.de/monographien/volltexte/2015/111/pdf/wolska_mathematicalProofs_komplett_e.pdf][Students'language in computer-assisted tutoring of mathematical proofs]] (Magdalena A. Wolska)
 391
 392 ** Does what we're doing relate in any way to other formal proof checkers?
 393
 394 - Default answer: no, because people have build large repositories of formal math (Mizar, Coq, metamath, etc.).  These other projects don't make a serious effort to deal with mathematical text.  TeX is in one place, and logic is in another place.
 395
 396  - There's another paper by Josef Urban et al. that is the exception that proves the rule \cite{kaliszyk2014developing} (see also the poster that accompanied this paper \cite{kaliszyk2014developing-misc})
 397
 398 - There's also a generative program by Ganesalingam and Gowers thath meshes natural language and formal mathematical reasoning.
 399
 400 - With these examples in mind, we assert that what we are doing can be combined with proof checkers synergistically.  For instance, one could attach scholia to the statements in an IAT representation which contain formalized representations of the content, and scholia which indicate that certain statements have been formally verified.
 401
 402 ** Further afield
 403
 404 /We're probably better off pruning these out of the final paper, but some topics of interest can be noted here./
 405
 406 - One interesting project within the DAG paradigm is the
 407   [[https://github.com/ipfs/ipfs/blob/master/papers/ipfs-cap2pfs/ipfs-p2p-file-system.pdf][InterPlanetary File System]] (IPFS).
 408
 409 * Future work
 410
 411 - I think it is probably not realistic to get the clojure.spec
 412   type-based work running *in addition* to the scholium-based
 413   stuff, but the topics seem related.  In particular, the
 414   objects that we would describe using clojure.spec are somehow
 415   the "C" in IAT+C.  So, we could have a think about how
 416   structural relationships between objects and propositions (per
 417   Gabi's description of IAT+C) might relate to the clojure.spec
 418   stuff.  I think would be appropriate to have a few pages of
 419   ``future work'' describing these issues.
 420
 421 * References                                                       :noexport:
 422
 423 # bibliography items are rendered
 424 # with the following command
 425
 426 \printbibliography
 427
 428
 429 * Annex
 430
 431 Tasks, timeline, and other things we don't need to include in the
 432 final report.
 433
 434 ** Timeline
 435
 436 Note: this could be more nicely organized using the org agenda.
 437
 438 ** Week 17th April
 439 *** Science March on Saturday
 440 ** Week 24th April
 441 *** Met: Tuesday 25th April
 442 *** DONE Joe get this document under version control
 443 ** Week 1 May
 444 *** Meet Monday 1 May (note: this is a Bank Holiday in the UK)
 445 ** Week 8 May
 446 *** Meet Saturday May 13
 447 ** Week 15 May
 448 *** *outline* of talk May 16th
 449 ** Week 22 May
 450 *** *Present* 23/24th May at [[http://dream.inf.ed.ac.uk/events/ciao-2017/][CIAO 2017]].                                :TALK:
 451 *** *Submit* 26th May                                              :DEADLINE:
 452
 453 ** Completed tasks
 454
 455 *** DONE Figure out what to do and record it in this document.         :meta:
 456 *** DONE Set up collaborative editing for files.
 457 *** WONTFIX Some way to share images (e.g. whiteboard in BBB or screenshare or...)
 458 *** DONE Debug Togetherly
 459 *** DONE make =visible-mode= on by default in this buffer
 460
 461 #+BEGIN_COMMENT
 462 The following is for controlling Togetherly and Org mode.  This might
 463 not be necessary if line 454 of togetherly.el is commented
 464 out/changed.  Also, it isn't actually evaluated.  How to fix?
 465 #+END_COMMENT
 466
 467 #+BEGIN_SRC emacs-lisp
 468 (defun execute-startup-block ()
 469   (interactive)
 470   (show-all))
 471 #+END_SRC
 472
 473 # Local Variables:
 474 # eval: (+ 1 1)
 475 # End: