# Parent 414ff9016349c51367bc5f9c1a815ac14177109b
diff --git a/other-licenses/ply/COPYING b/other-licenses/ply/COPYING
--- a/other-licenses/ply/COPYING
+++ b/other-licenses/ply/COPYING
-                  GNU LESSER GENERAL PUBLIC LICENSE
-                       Version 2.1, February 1999
+Copyright (C) 2001-2009,
+David M. Beazley (Dabeaz LLC)
- Copyright (C) 1991, 1999 Free Software Foundation, Inc.
-     59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
- Everyone is permitted to copy and distribute verbatim copies
- of this license document, but changing it is not allowed.
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
-[This is the first released version of the Lesser GPL. It also counts
- as the successor of the GNU Library Public License, version 2, hence
- the version number 2.1.]
+* Redistributions of source code must retain the above copyright notice,
+  this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+* Neither the name of the David Beazley or Dabeaz LLC may be used to
+  endorse or promote products derived from this software without
+  specific prior written permission.
-  The licenses for most software are designed to take away your
-freedom to share and change it. By contrast, the GNU General Public
-Licenses are intended to guarantee your freedom to share and change
-free software--to make sure the software is free for all its users.
-  This license, the Lesser General Public License, applies to some
-specially designated software packages--typically libraries--of the
-Free Software Foundation and other authors who decide to use it. You
-can use it too, but we suggest you first think carefully about whether
-this license or the ordinary General Public License is the better
-strategy to use in any particular case, based on the explanations below.
-  When we speak of free software, we are referring to freedom of use,
-not price. Our General Public Licenses are designed to make sure that
-you have the freedom to distribute copies of free software (and charge
-for this service if you wish); that you receive source code or can get
-it if you want it; that you can change the software and use pieces of
-it in new free programs; and that you are informed that you can do
-  To protect your rights, we need to make restrictions that forbid
-distributors to deny you these rights or to ask you to surrender these
-rights. These restrictions translate to certain responsibilities for
-you if you distribute copies of the library or if you modify it.
-  For example, if you distribute copies of the library, whether gratis
-or for a fee, you must give the recipients all the rights that we gave
-you. You must make sure that they, too, receive or can get the source
-code. If you link other code with the library, you must provide
-complete object files to the recipients, so that they can relink them
-with the library after making changes to the library and recompiling
-it. And you must show them these terms so they know their rights.
-  We protect your rights with a two-step method: (1) we copyright the
-library, and (2) we offer you this license, which gives you legal
-permission to copy, distribute and/or modify the library.
-  To protect each distributor, we want to make it very clear that
-there is no warranty for the free library. Also, if the library is
-modified by someone else and passed on, the recipients should know
-that what they have is not the original version, so that the original
-author's reputation will not be affected by problems that might be
-introduced by others.
-  Finally, software patents pose a constant threat to the existence of
-any free program. We wish to make sure that a company cannot
-effectively restrict the users of a free program by obtaining a
-restrictive license from a patent holder. Therefore, we insist that
-any patent license obtained for a version of the library must be
-consistent with the full freedom of use specified in this license.
-  Most GNU software, including some libraries, is covered by the
-ordinary GNU General Public License. This license, the GNU Lesser
-General Public License, applies to certain designated libraries, and
-is quite different from the ordinary General Public License. We use
-this license for certain libraries in order to permit linking those
-libraries into non-free programs.
-  When a program is linked with a library, whether statically or using
-a shared library, the combination of the two is legally speaking a
-combined work, a derivative of the original library. The ordinary
-General Public License therefore permits such linking only if the
-entire combination fits its criteria of freedom. The Lesser General
-Public License permits more lax criteria for linking other code with
-  We call this license the "Lesser" General Public License because it
-does Less to protect the user's freedom than the ordinary General
-Public License. It also provides other free software developers Less
-of an advantage over competing non-free programs. These disadvantages
-are the reason we use the ordinary General Public License for many
-libraries. However, the Lesser license provides advantages in certain
-special circumstances.
-  For example, on rare occasions, there may be a special need to
-encourage the widest possible use of a certain library, so that it becomes
-a de-facto standard. To achieve this, non-free programs must be
-allowed to use the library. A more frequent case is that a free
-library does the same job as widely used non-free libraries. In this
-case, there is little to gain by limiting the free library to free
-software only, so we use the Lesser General Public License.
-  In other cases, permission to use a particular library in non-free
-programs enables a greater number of people to use a large body of
-free software. For example, permission to use the GNU C Library in
-non-free programs enables many more people to use the whole GNU
-operating system, as well as its variant, the GNU/Linux operating
-  Although the Lesser General Public License is Less protective of the
-users' freedom, it does ensure that the user of a program that is
-linked with the Library has the freedom and the wherewithal to run
-that program using a modified version of the Library.
-  The precise terms and conditions for copying, distribution and
-modification follow. Pay close attention to the difference between a
-"work based on the library" and a "work that uses the library". The
-former contains code derived from the library, whereas the latter must
-be combined with the library in order to run.
-                  GNU LESSER GENERAL PUBLIC LICENSE
-   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
-  0. This License Agreement applies to any software library or other
-program which contains a notice placed by the copyright holder or
-other authorized party saying it may be distributed under the terms of
-this Lesser General Public License (also called "this License").
-Each licensee is addressed as "you".
-  A "library" means a collection of software functions and/or data
-prepared so as to be conveniently linked with application programs
-(which use some of those functions and data) to form executables.
-  The "Library", below, refers to any such software library or work
-which has been distributed under these terms. A "work based on the
-Library" means either the Library or any derivative work under
-copyright law: that is to say, a work containing the Library or a
-portion of it, either verbatim or with modifications and/or translated
-straightforwardly into another language. (Hereinafter, translation is
-included without limitation in the term "modification".)
-  "Source code" for a work means the preferred form of the work for
-making modifications to it. For a library, complete source code means
-all the source code for all modules it contains, plus any associated
-interface definition files, plus the scripts used to control compilation
-and installation of the library.
-  Activities other than copying, distribution and modification are not
-covered by this License; they are outside its scope. The act of
-running a program using the Library is not restricted, and output from
-such a program is covered only if its contents constitute a work based
-on the Library (independent of the use of the Library in a tool for
-writing it). Whether that is true depends on what the Library does
-and what the program that uses the Library does.
-  1. You may copy and distribute verbatim copies of the Library's
-complete source code as you receive it, in any medium, provided that
-you conspicuously and appropriately publish on each copy an
-appropriate copyright notice and disclaimer of warranty; keep intact
-all the notices that refer to this License and to the absence of any
-warranty; and distribute a copy of this License along with the
-  You may charge a fee for the physical act of transferring a copy,
-and you may at your option offer warranty protection in exchange for a
-  2. You may modify your copy or copies of the Library or any portion
-of it, thus forming a work based on the Library, and copy and
-distribute such modifications or work under the terms of Section 1
-above, provided that you also meet all of these conditions:
-    a) The modified work must itself be a software library.
-    b) You must cause the files modified to carry prominent notices
-    stating that you changed the files and the date of any change.
-    c) You must cause the whole of the work to be licensed at no
-    charge to all third parties under the terms of this License.
-    d) If a facility in the modified Library refers to a function or a
-    table of data to be supplied by an application program that uses
-    the facility, other than as an argument passed when the facility
-    is invoked, then you must make a good faith effort to ensure that,
-    in the event an application does not supply such function or
-    table, the facility still operates, and performs whatever part of
-    its purpose remains meaningful.
-    (For example, a function in a library to compute square roots has
-    a purpose that is entirely well-defined independent of the
-    application.  Therefore, Subsection 2d requires that any
-    application-supplied function or table used by this function must
-    be optional: if the application does not supply it, the square
-    root function must still compute square roots.)
-These requirements apply to the modified work as a whole. If
-identifiable sections of that work are not derived from the Library,
-and can be reasonably considered independent and separate works in
-themselves, then this License, and its terms, do not apply to those
-sections when you distribute them as separate works. But when you
-distribute the same sections as part of a whole which is a work based
-on the Library, the distribution of the whole must be on the terms of
-this License, whose permissions for other licensees extend to the
-entire whole, and thus to each and every part regardless of who wrote
-Thus, it is not the intent of this section to claim rights or contest
-your rights to work written entirely by you; rather, the intent is to
-exercise the right to control the distribution of derivative or
-collective works based on the Library.
-In addition, mere aggregation of another work not based on the Library
-with the Library (or with a work based on the Library) on a volume of
-a storage or distribution medium does not bring the other work under
-the scope of this License.
-  3. You may opt to apply the terms of the ordinary GNU General Public
-License instead of this License to a given copy of the Library. To do
-this, you must alter all the notices that refer to this License, so
-that they refer to the ordinary GNU General Public License, version 2,
-instead of to this License. (If a newer version than version 2 of the
-ordinary GNU General Public License has appeared, then you can specify
-that version instead if you wish.) Do not make any other change in
-  Once this change is made in a given copy, it is irreversible for
-that copy, so the ordinary GNU General Public License applies to all
-subsequent copies and derivative works made from that copy.
-  This option is useful when you wish to copy part of the code of
-the Library into a program that is not a library.
-  4. You may copy and distribute the Library (or a portion or
-derivative of it, under Section 2) in object code or executable form
-under the terms of Sections 1 and 2 above provided that you accompany
-it with the complete corresponding machine-readable source code, which
-must be distributed under the terms of Sections 1 and 2 above on a
-medium customarily used for software interchange.
-  If distribution of object code is made by offering access to copy
-from a designated place, then offering equivalent access to copy the
-source code from the same place satisfies the requirement to
-distribute the source code, even though third parties are not
-compelled to copy the source along with the object code.
-  5. A program that contains no derivative of any portion of the
-Library, but is designed to work with the Library by being compiled or
-linked with it, is called a "work that uses the Library". Such a
-work, in isolation, is not a derivative work of the Library, and
-therefore falls outside the scope of this License.
-  However, linking a "work that uses the Library" with the Library
-creates an executable that is a derivative of the Library (because it
-contains portions of the Library), rather than a "work that uses the
-library". The executable is therefore covered by this License.
-Section 6 states terms for distribution of such executables.
-  When a "work that uses the Library" uses material from a header file
-that is part of the Library, the object code for the work may be a
-derivative work of the Library even though the source code is not.
-Whether this is true is especially significant if the work can be
-linked without the Library, or if the work is itself a library. The
-threshold for this to be true is not precisely defined by law.
-  If such an object file uses only numerical parameters, data
-structure layouts and accessors, and small macros and small inline
-functions (ten lines or less in length), then the use of the object
-file is unrestricted, regardless of whether it is legally a derivative
-work. (Executables containing this object code plus portions of the
-Library will still fall under Section 6.)
-  Otherwise, if the work is a derivative of the Library, you may
-distribute the object code for the work under the terms of Section 6.
-Any executables containing that work also fall under Section 6,
-whether or not they are linked directly with the Library itself.
-  6. As an exception to the Sections above, you may also combine or
-link a "work that uses the Library" with the Library to produce a
-work containing portions of the Library, and distribute that work
-under terms of your choice, provided that the terms permit
-modification of the work for the customer's own use and reverse
-engineering for debugging such modifications.
-  You must give prominent notice with each copy of the work that the
-Library is used in it and that the Library and its use are covered by
-this License. You must supply a copy of this License. If the work
-during execution displays copyright notices, you must include the
-copyright notice for the Library among them, as well as a reference
-directing the user to the copy of this License. Also, you must do one
-    a) Accompany the work with the complete corresponding
-    machine-readable source code for the Library including whatever
-    changes were used in the work (which must be distributed under
-    Sections 1 and 2 above); and, if the work is an executable linked
-    with the Library, with the complete machine-readable "work that
-    uses the Library", as object code and/or source code, so that the
-    user can modify the Library and then relink to produce a modified
-    executable containing the modified Library. (It is understood
-    that the user who changes the contents of definitions files in the
-    Library will not necessarily be able to recompile the application
-    to use the modified definitions.)
-    b) Use a suitable shared library mechanism for linking with the
-    Library. A suitable mechanism is one that (1) uses at run time a
-    copy of the library already present on the user's computer system,
-    rather than copying library functions into the executable, and (2)
-    will operate properly with a modified version of the library, if
-    the user installs one, as long as the modified version is
-    interface-compatible with the version that the work was made with.
-    c) Accompany the work with a written offer, valid for at
-    least three years, to give the same user the materials
-    specified in Subsection 6a, above, for a charge no more
-    than the cost of performing this distribution.
-    d) If distribution of the work is made by offering access to copy
-    from a designated place, offer equivalent access to copy the above
-    specified materials from the same place.
-    e) Verify that the user has already received a copy of these
-    materials or that you have already sent this user a copy.
-  For an executable, the required form of the "work that uses the
-Library" must include any data and utility programs needed for
-reproducing the executable from it. However, as a special exception,
-the materials to be distributed need not include anything that is
-normally distributed (in either source or binary form) with the major
-components (compiler, kernel, and so on) of the operating system on
-which the executable runs, unless that component itself accompanies
-  It may happen that this requirement contradicts the license
-restrictions of other proprietary libraries that do not normally
-accompany the operating system. Such a contradiction means you cannot
-use both them and the Library together in an executable that you
-  7. You may place library facilities that are a work based on the
-Library side-by-side in a single library together with other library
-facilities not covered by this License, and distribute such a combined
-library, provided that the separate distribution of the work based on
-the Library and of the other library facilities is otherwise
-permitted, and provided that you do these two things:
-    a) Accompany the combined library with a copy of the same work
-    based on the Library, uncombined with any other library
-    facilities. This must be distributed under the terms of the
-    b) Give prominent notice with the combined library of the fact
-    that part of it is a work based on the Library, and explaining
-    where to find the accompanying uncombined form of the same work.
-  8. You may not copy, modify, sublicense, link with, or distribute
-the Library except as expressly provided under this License. Any
-attempt otherwise to copy, modify, sublicense, link with, or
-distribute the Library is void, and will automatically terminate your
-rights under this License. However, parties who have received copies,
-or rights, from you under this License will not have their licenses
-terminated so long as such parties remain in full compliance.
-  9. You are not required to accept this License, since you have not
-signed it. However, nothing else grants you permission to modify or
-distribute the Library or its derivative works. These actions are
-prohibited by law if you do not accept this License. Therefore, by
-modifying or distributing the Library (or any work based on the
-Library), you indicate your acceptance of this License to do so, and
-all its terms and conditions for copying, distributing or modifying
-the Library or works based on it.
-  10. Each time you redistribute the Library (or any work based on the
-Library), the recipient automatically receives a license from the
-original licensor to copy, distribute, link with or modify the Library
-subject to these terms and conditions. You may not impose any further
-restrictions on the recipients' exercise of the rights granted herein.
-You are not responsible for enforcing compliance by third parties with
-  11. If, as a consequence of a court judgment or allegation of patent
-infringement or for any other reason (not limited to patent issues),
-conditions are imposed on you (whether by court order, agreement or
-otherwise) that contradict the conditions of this License, they do not
-excuse you from the conditions of this License. If you cannot
-distribute so as to satisfy simultaneously your obligations under this
-License and any other pertinent obligations, then as a consequence you
-may not distribute the Library at all. For example, if a patent
-license would not permit royalty-free redistribution of the Library by
-all those who receive copies directly or indirectly through you, then
-the only way you could satisfy both it and this License would be to
-refrain entirely from distribution of the Library.
-If any portion of this section is held invalid or unenforceable under any
-particular circumstance, the balance of the section is intended to apply,
-and the section as a whole is intended to apply in other circumstances.
-It is not the purpose of this section to induce you to infringe any
-patents or other property right claims or to contest validity of any
-such claims; this section has the sole purpose of protecting the
-integrity of the free software distribution system which is
-implemented by public license practices. Many people have made
-generous contributions to the wide range of software distributed
-through that system in reliance on consistent application of that
-system; it is up to the author/donor to decide if he or she is willing
-to distribute software through any other system and a licensee cannot
-This section is intended to make thoroughly clear what is believed to
-be a consequence of the rest of this License.
-  12. If the distribution and/or use of the Library is restricted in
-certain countries either by patents or by copyrighted interfaces, the
-original copyright holder who places the Library under this License may add
-an explicit geographical distribution limitation excluding those countries,
-so that distribution is permitted only in or among countries not thus
-excluded. In such case, this License incorporates the limitation as if
-written in the body of this License.
-  13. The Free Software Foundation may publish revised and/or new
-versions of the Lesser General Public License from time to time.
-Such new versions will be similar in spirit to the present version,
-but may differ in detail to address new problems or concerns.
-Each version is given a distinguishing version number. If the Library
-specifies a version number of this License which applies to it and
-"any later version", you have the option of following the terms and
-conditions either of that version or of any later version published by
-the Free Software Foundation. If the Library does not specify a
-license version number, you may choose any version ever published by
-the Free Software Foundation.
-  14. If you wish to incorporate parts of the Library into other free
-programs whose distribution conditions are incompatible with these,
-write to the author to ask for permission. For software which is
-copyrighted by the Free Software Foundation, write to the Free
-Software Foundation; we sometimes make exceptions for this. Our
-decision will be guided by the two goals of preserving the free status
-of all derivatives of our free software and of promoting the sharing
-and reuse of software generally.
-  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
-WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
-EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
-OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
-KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
-PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
-LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
-THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
-  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
-WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
-AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
-FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
-CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
-LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
-RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
-FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
-SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
-                     END OF TERMS AND CONDITIONS
-           How to Apply These Terms to Your New Libraries
-  If you develop a new library, and you want it to be of the greatest
-possible use to the public, we recommend making it free software that
-everyone can redistribute and change. You can do so by permitting
-redistribution under these terms (or, alternatively, under the terms of the
-ordinary General Public License).
-  To apply these terms, attach the following notices to the library. It is
-safest to attach them to the start of each source file to most effectively
-convey the exclusion of warranty; and each file should have at least the
-"copyright" line and a pointer to where the full notice is found.
-    <one line to give the library's name and a brief idea of what it does.>
-    Copyright (C) <year> <name of author>
-    This library is free software; you can redistribute it and/or
-    modify it under the terms of the GNU Lesser General Public
-    License as published by the Free Software Foundation; either
-    version 2.1 of the License, or (at your option) any later version.
-    This library is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-    Lesser General Public License for more details.
-    You should have received a copy of the GNU Lesser General Public
-    License along with this library; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-Also add information on how to contact you by electronic and paper mail.
-You should also get your employer (if you work as a programmer) or your
-school, if any, to sign a "copyright disclaimer" for the library, if
-necessary. Here is a sample; alter the names:
-  Yoyodyne, Inc., hereby disclaims all copyright interest in the
-  library `Frob' (a library for tweaking knobs) written by James Random Hacker.
-  <signature of Ty Coon>, 1 April 1990
-  Ty Coon, President of Vice
-That's all there is to it!
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/other-licenses/ply/README b/other-licenses/ply/README
--- a/other-licenses/ply/README
+++ b/other-licenses/ply/README
 David Beazley's PLY (Python Lex-Yacc)
 http://www.dabeaz.com/ply/
-Licensed under the GPL (v2.1 or later).
-This directory contains just the code and license from PLY version 2.5;
+This directory contains just the code and license from PLY version 3.3;
 the full distribution (see the URL) also contains examples, tests,
 documentation, and a longer README.
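
For context on what the patched module provides, a minimal lexer built
against the public ply.lex interface looks like the sketch below. The token
names and the input string are illustrative, not taken from this patch; the
interface itself (a module-level tokens sequence, t_-prefixed rules,
lex.lex(), and iteration over the lexer) is what the rewritten lex.py in the
following diff implements.

    # Minimal ply.lex usage sketch (token names are illustrative only).
    import ply.lex as lex

    tokens = ('NUMBER', 'PLUS')       # every token type must be declared

    t_PLUS   = r'\+'                  # simple tokens can be regex strings
    t_ignore = ' \t'                  # characters skipped between tokens

    def t_NUMBER(t):                  # function rules: the docstring is the regex
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):                   # called on illegal characters
        t.lexer.skip(1)

    lexer = lex.lex()                 # builds the master regex from this module
    lexer.input("1 + 2")
    for tok in lexer:                 # iterator interface added by this patch
        print(tok.type, tok.value)
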
552 diff --git a/other-licenses/ply/ply/lex.py b/other-licenses/ply/ply/lex.py
553 --- a/other-licenses/ply/ply/lex.py
554 +++ b/other-licenses/ply/ply/lex.py
556 # -----------------------------------------------------------------------------
559 -# Author: David M. Beazley (dave@dabeaz.com)
560 +# Copyright (C) 2001-2009,
561 +# David M. Beazley (Dabeaz LLC)
562 +# All rights reserved.
564 -# Copyright (C) 2001-2008, David M. Beazley
565 +# Redistribution and use in source and binary forms, with or without
566 +# modification, are permitted provided that the following conditions are
569 +# * Redistributions of source code must retain the above copyright notice,
570 +# this list of conditions and the following disclaimer.
571 +# * Redistributions in binary form must reproduce the above copyright notice,
572 +# this list of conditions and the following disclaimer in the documentation
573 +# and/or other materials provided with the distribution.
574 +# * Neither the name of the David Beazley or Dabeaz LLC may be used to
575 +# endorse or promote products derived from this software without
576 +# specific prior written permission.
578 -# This library is free software; you can redistribute it and/or
579 -# modify it under the terms of the GNU Lesser General Public
580 -# License as published by the Free Software Foundation; either
581 -# version 2.1 of the License, or (at your option) any later version.
583 -# This library is distributed in the hope that it will be useful,
584 -# but WITHOUT ANY WARRANTY; without even the implied warranty of
585 -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
586 -# Lesser General Public License for more details.
588 -# You should have received a copy of the GNU Lesser General Public
589 -# License along with this library; if not, write to the Free Software
590 -# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
592 -# See the file COPYING for a complete copy of the LGPL.
593 +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
594 +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
595 +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
596 +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
597 +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
598 +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
599 +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
600 +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
601 +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
602 +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
603 +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
604 # -----------------------------------------------------------------------------
607 -__tabversion__ = "2.4" # Version of table file used
609 +__tabversion__ = "3.2" # Version of table file used
611 import re, sys, types, copy, os
613 +# This tuple contains known string types
616 + StringTypes = (types.StringType, types.UnicodeType)
617 +except AttributeError:
619 + StringTypes = (str, bytes)
621 +# Extract the code attribute of a function. Different implementations
622 +# are for Python 2/3 compatibility.
624 +if sys.version_info[0] < 3:
631 # This regular expression is used to match valid token names
632 _is_identifier = re.compile(r'^[a-zA-Z0-9_]+$')
634 -# _INSTANCETYPE sets the valid set of instance types recognized
635 -# by PLY when lexers are defined by a class. In order to maintain
636 -# backwards compatibility with Python-2.0, we have to check for
637 -# the existence of ObjectType.
640 - _INSTANCETYPE = (types.InstanceType, types.ObjectType)
641 -except AttributeError:
642 - _INSTANCETYPE = types.InstanceType
643 - class object: pass # Note: needed if no new-style classes present
645 # Exception thrown when invalid token encountered and no default error
646 # handler is defined.
648 class LexError(Exception):
649 def __init__(self,message,s):
650 self.args = (message,)
653 -# An object used to issue one-time warning messages for various features
655 -class LexWarning(object):
656 - def __init__(self):
658 - def __call__(self,msg):
659 - if not self.warned:
660 - sys.stderr.write("ply.lex: Warning: " + msg+"\n")
663 -_SkipWarning = LexWarning() # Warning for use of t.skip() on tokens
665 # Token class. This class is used to represent the tokens produced.
666 class LexToken(object):
668 return "LexToken(%s,%r,%d,%d)" % (self.type,self.value,self.lineno,self.lexpos)
673 - _SkipWarning("Calling t.skip() on a token is deprecated. Please use t.lexer.skip()")
675 +# This object is a stand-in for a logging object created by the
678 +class PlyLogger(object):
679 + def __init__(self,f):
681 + def critical(self,msg,*args,**kwargs):
682 + self.f.write((msg % args) + "\n")
684 + def warning(self,msg,*args,**kwargs):
685 + self.f.write("WARNING: "+ (msg % args) + "\n")
687 + def error(self,msg,*args,**kwargs):
688 + self.f.write("ERROR: " + (msg % args) + "\n")
693 +# Null logger is used when no output is generated. Does nothing.
694 +class NullLogger(object):
695 + def __getattribute__(self,name):
697 + def __call__(self,*args,**kwargs):
700 # -----------------------------------------------------------------------------
702 +# === Lexing Engine ===
704 -# This class encapsulates all of the methods and data associated with a lexer.
705 +# The following Lexer class implements the lexer runtime. There are only
706 +# a few public methods and attributes:
708 # input() - Store a new string in the lexer
709 # token() - Get the next token
710 +# clone() - Clone the lexer
712 +# lineno - Current line number
713 +# lexpos - Current position in the input string
714 # -----------------------------------------------------------------------------
718 self.lexre = None # Master regular expression. This is a list of
719 # tuples (re,findex) where re is a compiled
720 # regular expression and findex is a list
721 # mapping regex group numbers to rules
722 @@ -100,17 +131,16 @@ class Lexer:
723 self.lexpos = 0 # Current position in input text
724 self.lexlen = 0 # Length of the input text
725 self.lexerrorf = None # Error rule (if any)
726 self.lextokens = None # List of valid tokens
727 self.lexignore = "" # Ignored characters
728 self.lexliterals = "" # Literal characters that can be passed through
729 self.lexmodule = None # Module
730 self.lineno = 1 # Current line number
731 - self.lexdebug = 0 # Debugging mode
732 self.lexoptimize = 0 # Optimized mode
734 def clone(self,object=None):
737 # If the object parameter has been supplied, it means we are attaching the
738 # lexer to a new object. In this case, we have to rebind all methods in
739 # the lexstatere and lexstateerrorf tables.
740 @@ -140,16 +170,17 @@ class Lexer:
741 # ------------------------------------------------------------
742 def writetab(self,tabfile,outputdir=""):
743 if isinstance(tabfile,types.ModuleType):
745 basetabfilename = tabfile.split(".")[-1]
746 filename = os.path.join(outputdir,basetabfilename)+".py"
747 tf = open(filename,"w")
748 tf.write("# %s.py. This file automatically created by PLY (version %s). Don't edit!\n" % (tabfile,__version__))
749 + tf.write("_tabversion = %s\n" % repr(__version__))
750 tf.write("_lextokens = %s\n" % repr(self.lextokens))
751 tf.write("_lexreflags = %s\n" % repr(self.lexreflags))
752 tf.write("_lexliterals = %s\n" % repr(self.lexliterals))
753 tf.write("_lexstateinfo = %s\n" % repr(self.lexstateinfo))
756 # Collect all functions in the initial state
757 initial = self.lexstatere["INITIAL"]
758 @@ -179,55 +210,64 @@ class Lexer:
760 # ------------------------------------------------------------
761 # readtab() - Read lexer information from a tab file
762 # ------------------------------------------------------------
763 def readtab(self,tabfile,fdict):
764 if isinstance(tabfile,types.ModuleType):
767 - exec "import %s as lextab" % tabfile
768 + if sys.version_info[0] < 3:
769 + exec("import %s as lextab" % tabfile)
772 + exec("import %s as lextab" % tabfile, env,env)
773 + lextab = env['lextab']
775 + if getattr(lextab,"_tabversion","0.0") != __version__:
776 + raise ImportError("Inconsistent PLY version")
778 self.lextokens = lextab._lextokens
779 self.lexreflags = lextab._lexreflags
780 self.lexliterals = lextab._lexliterals
781 self.lexstateinfo = lextab._lexstateinfo
782 self.lexstateignore = lextab._lexstateignore
783 self.lexstatere = { }
784 self.lexstateretext = { }
785 for key,lre in lextab._lexstatere.items():
788 for i in range(len(lre)):
789 - titem.append((re.compile(lre[i][0],lextab._lexreflags),_names_to_funcs(lre[i][1],fdict)))
790 + titem.append((re.compile(lre[i][0],lextab._lexreflags | re.VERBOSE),_names_to_funcs(lre[i][1],fdict)))
791 txtitem.append(lre[i][0])
792 self.lexstatere[key] = titem
793 self.lexstateretext[key] = txtitem
794 self.lexstateerrorf = { }
795 for key,ef in lextab._lexstateerrorf.items():
796 self.lexstateerrorf[key] = fdict[ef]
797 self.begin('INITIAL')
799 # ------------------------------------------------------------
800 # input() - Push a new string into the lexer
801 # ------------------------------------------------------------
803 # Pull off the first character to see if s looks like a string
805 - if not (isinstance(c,types.StringType) or isinstance(c,types.UnicodeType)):
806 - raise ValueError, "Expected a string"
807 + if not isinstance(c,StringTypes):
808 + raise ValueError("Expected a string")
813 # ------------------------------------------------------------
814 # begin() - Changes the lexing state
815 # ------------------------------------------------------------
816 def begin(self,state):
817 - if not self.lexstatere.has_key(state):
818 - raise ValueError, "Undefined state"
819 + if not state in self.lexstatere:
820 + raise ValueError("Undefined state")
821 self.lexre = self.lexstatere[state]
822 self.lexretext = self.lexstateretext[state]
823 self.lexignore = self.lexstateignore.get(state,"")
824 self.lexerrorf = self.lexstateerrorf.get(state,None)
825 self.lexstate = state
827 # ------------------------------------------------------------
828 # push_state() - Changes the lexing state and saves old on stack
829 @@ -250,17 +290,17 @@ class Lexer:
831 # ------------------------------------------------------------
832 # skip() - Skip ahead n characters
833 # ------------------------------------------------------------
837 # ------------------------------------------------------------
838 - # token() - Return the next token from the Lexer
839 + # opttoken() - Return the next token from the Lexer
841 # Note: This function has been carefully implemented to be as fast
842 # as possible. Don't make changes unless you really know what
844 # ------------------------------------------------------------
846 # Make local copies of frequently referenced attributes
848 @@ -294,39 +334,35 @@ class Lexer:
849 self.lexpos = m.end()
857 - # if func not callable, it means it's an ignored token
858 - if not callable(func):
861 # If token is processed by a function, call it
863 tok.lexer = self # Set additional attributes useful in token rules
869 # Every function must return a token, if nothing, we just move to next token
871 lexpos = self.lexpos # This is here in case user has updated lexpos.
872 lexignore = self.lexignore # This is here in case there was a state change
875 # Verify type of the token. If not in the token map, raise an error
876 if not self.lexoptimize:
877 - if not self.lextokens.has_key(newtok.type):
878 - raise LexError, ("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
879 - func.func_code.co_filename, func.func_code.co_firstlineno,
880 + if not newtok.type in self.lextokens:
881 + raise LexError("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
882 + func_code(func).co_filename, func_code(func).co_firstlineno,
883 func.__name__, newtok.type),lexdata[lexpos:])
887 # No match, see if in literals
888 if lexdata[lexpos] in self.lexliterals:
890 tok.value = lexdata[lexpos]
891 @@ -343,70 +379,70 @@ class Lexer:
892 tok.lineno = self.lineno
897 newtok = self.lexerrorf(tok)
898 if lexpos == self.lexpos:
899 # Error method didn't change text position at all. This is an error.
900 - raise LexError, ("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
901 + raise LexError("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
903 if not newtok: continue
907 - raise LexError, ("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
908 + raise LexError("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
910 self.lexpos = lexpos + 1
911 if self.lexdata is None:
912 - raise RuntimeError, "No input string given with input()"
913 + raise RuntimeError("No input string given with input()")
916 + # Iterator interface
917 + def __iter__(self):
923 + raise StopIteration
928 # -----------------------------------------------------------------------------
930 +# ==== Lex Builder ===
932 -# This checks to see if there are duplicated t_rulename() functions or strings
933 -# in the parser input file. This is done using a simple regular expression
934 -# match on each line in the given file. If the file can't be located or opened,
935 -# a true result is returned by default.
936 +# The functions and classes below are used to collect lexing information
937 +# and build a Lexer object from it.
938 # -----------------------------------------------------------------------------
940 -def _validate_file(filename):
942 - base,ext = os.path.splitext(filename)
943 - if ext != '.py': return 1 # No idea what the file is. Return OK
944 +# -----------------------------------------------------------------------------
945 +# get_caller_module_dict()
947 +# This function returns a dictionary containing all of the symbols defined within
948 +# a caller further down the call stack. This is used to get the environment
949 +# associated with the yacc() call if none was provided.
950 +# -----------------------------------------------------------------------------
952 +def get_caller_module_dict(levels):
955 - lines = f.readlines()
958 - return 1 # Couldn't find the file. Don't worry about it
960 + except RuntimeError:
961 + e,b,t = sys.exc_info()
966 + ldict = f.f_globals.copy()
967 + if f.f_globals != f.f_locals:
968 + ldict.update(f.f_locals)
970 - fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
971 - sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
982 - prev = counthash.get(name)
984 - counthash[name] = linen
986 - print >>sys.stderr, "%s:%d: Rule %s redefined. Previously defined on line %d" % (filename,linen,name,prev)
992 # -----------------------------------------------------------------------------
995 # Given a list of regular expression functions, this converts it to a list
996 # suitable for output to a table file
997 # -----------------------------------------------------------------------------
999 @@ -461,17 +497,17 @@ def _form_master_re(relist,reflags,ldict
1000 elif handle is not None:
1001 lexindexnames[i] = f
1002 if f.find("ignore_") > 0:
1003 lexindexfunc[i] = (None,None)
1005 lexindexfunc[i] = (None, toknames[f])
1007 return [(lexre,lexindexfunc)],[regex],[lexindexnames]
1008 - except Exception,e:
1010 m = int(len(relist)/2)
1012 llist, lre, lnames = _form_master_re(relist[:m],reflags,ldict,toknames)
1013 rlist, rre, rnames = _form_master_re(relist[m:],reflags,ldict,toknames)
1014 return llist+rlist, lre+rre, lnames+rnames
1016 # -----------------------------------------------------------------------------
1017 # def _statetoken(s,names)
1018 @@ -481,360 +517,487 @@ def _form_master_re(relist,reflags,ldict
1019 # is a tuple of state names and tokenname is the name of the token. For example,
1020 # calling this with s = "t_foo_bar_SPAM" might return (('foo','bar'),'SPAM')
1021 # -----------------------------------------------------------------------------
1023 def _statetoken(s,names):
1025 parts = s.split("_")
1026 for i in range(1,len(parts)):
1027 - if not names.has_key(parts[i]) and parts[i] != 'ANY': break
1028 + if not parts[i] in names and parts[i] != 'ANY': break
1030 states = tuple(parts[1:i])
1032 states = ('INITIAL',)
1035 - states = tuple(names.keys())
1036 + states = tuple(names)
1038 tokenname = "_".join(parts[i:])
1039 return (states,tokenname)
1042 +# -----------------------------------------------------------------------------
1045 +# This class represents information needed to build a lexer as extracted from a
1046 +# user's input file.
1047 +# -----------------------------------------------------------------------------
1048 +class LexerReflect(object):
1049 + def __init__(self,ldict,log=None,reflags=0):
1050 + self.ldict = ldict
1051 + self.error_func = None
1053 + self.reflags = reflags
1054 + self.stateinfo = { 'INITIAL' : 'inclusive'}
1059 + self.log = PlyLogger(sys.stderr)
1063 + # Get all of the basic information
1064 + def get_all(self):
1066 + self.get_literals()
1070 + # Validate all of the information
1071 + def validate_all(self):
1072 + self.validate_tokens()
1073 + self.validate_literals()
1074 + self.validate_rules()
1077 + # Get the tokens map
1078 + def get_tokens(self):
1079 + tokens = self.ldict.get("tokens",None)
1081 + self.log.error("No token list is defined")
1085 + if not isinstance(tokens,(list, tuple)):
1086 + self.log.error("tokens must be a list or tuple")
1091 + self.log.error("tokens is empty")
1095 + self.tokens = tokens
1097 + # Validate the tokens
1098 + def validate_tokens(self):
1100 + for n in self.tokens:
1101 + if not _is_identifier.match(n):
1102 + self.log.error("Bad token name '%s'",n)
1104 + if n in terminals:
1105 + self.log.warning("Token '%s' multiply defined", n)
1108 + # Get the literals specifier
1109 + def get_literals(self):
1110 + self.literals = self.ldict.get("literals","")
1112 + # Validate literals
1113 + def validate_literals(self):
1115 + for c in self.literals:
1116 + if not isinstance(c,StringTypes) or len(c) > 1:
1117 + self.log.error("Invalid literal %s. Must be a single character", repr(c))
1122 + self.log.error("Invalid literals specification. literals must be a sequence of characters")
1125 + def get_states(self):
1126 + self.states = self.ldict.get("states",None)
1129 + if not isinstance(self.states,(tuple,list)):
1130 + self.log.error("states must be defined as a tuple or list")
1133 + for s in self.states:
1134 + if not isinstance(s,tuple) or len(s) != 2:
1135 + self.log.error("Invalid state specifier %s. Must be a tuple (statename,'exclusive|inclusive')",repr(s))
1138 + name, statetype = s
1139 + if not isinstance(name,StringTypes):
1140 + self.log.error("State name %s must be a string", repr(name))
1143 + if not (statetype == 'inclusive' or statetype == 'exclusive'):
1144 + self.log.error("State type for state %s must be 'inclusive' or 'exclusive'",name)
1147 + if name in self.stateinfo:
1148 + self.log.error("State '%s' already defined",name)
1151 + self.stateinfo[name] = statetype
1153 + # Get all of the symbols with a t_ prefix and sort them into various
1154 + # categories (functions, strings, error functions, and ignore characters)
1156 + def get_rules(self):
1157 + tsymbols = [f for f in self.ldict if f[:2] == 't_' ]
1159 + # Now build up a list of functions and a list of strings
1161 + self.toknames = { } # Mapping of symbols to token names
1162 + self.funcsym = { } # Symbols defined as functions
1163 + self.strsym = { } # Symbols defined as strings
1164 + self.ignore = { } # Ignore strings by state
1165 + self.errorf = { } # Error functions by state
1167 + for s in self.stateinfo:
1168 + self.funcsym[s] = []
1169 + self.strsym[s] = []
1171 + if len(tsymbols) == 0:
1172 + self.log.error("No rules of the form t_rulename are defined")
1176 + for f in tsymbols:
1178 + states, tokname = _statetoken(f,self.stateinfo)
1179 + self.toknames[f] = tokname
1181 + if hasattr(t,"__call__"):
1182 + if tokname == 'error':
1184 + self.errorf[s] = t
1185 + elif tokname == 'ignore':
1186 + line = func_code(t).co_firstlineno
1187 + file = func_code(t).co_filename
1188 + self.log.error("%s:%d: Rule '%s' must be defined as a string",file,line,t.__name__)
1192 + self.funcsym[s].append((f,t))
1193 + elif isinstance(t, StringTypes):
1194 + if tokname == 'ignore':
1196 + self.ignore[s] = t
1198 + self.log.warning("%s contains a literal backslash '\\'",f)
1200 + elif tokname == 'error':
1201 + self.log.error("Rule '%s' must be defined as a function", f)
1205 + self.strsym[s].append((f,t))
1207 + self.log.error("%s not defined as a function or string", f)
1210 + # Sort the functions by line number
1211 + for f in self.funcsym.values():
1212 + if sys.version_info[0] < 3:
1213 + f.sort(lambda x,y: cmp(func_code(x[1]).co_firstlineno,func_code(y[1]).co_firstlineno))
1216 + f.sort(key=lambda x: func_code(x[1]).co_firstlineno)
1218 + # Sort the strings by regular expression length
1219 + for s in self.strsym.values():
1220 + if sys.version_info[0] < 3:
1221 + s.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
1224 + s.sort(key=lambda x: len(x[1]),reverse=True)
1226 + # Validate all of the t_rules collected
1227 + def validate_rules(self):
1228 + for state in self.stateinfo:
1229 + # Validate all rules defined by functions
1233 + for fname, f in self.funcsym[state]:
1234 + line = func_code(f).co_firstlineno
1235 + file = func_code(f).co_filename
1236 + self.files[file] = 1
1238 + tokname = self.toknames[fname]
1239 + if isinstance(f, types.MethodType):
1243 + nargs = func_code(f).co_argcount
1244 + if nargs > reqargs:
1245 + self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,f.__name__)
1249 + if nargs < reqargs:
1250 + self.log.error("%s:%d: Rule '%s' requires an argument", file,line,f.__name__)
1255 + self.log.error("%s:%d: No regular expression defined for rule '%s'",file,line,f.__name__)
1260 + c = re.compile("(?P<%s>%s)" % (fname,f.__doc__), re.VERBOSE | self.reflags)
1262 + self.log.error("%s:%d: Regular expression for rule '%s' matches empty string", file,line,f.__name__)
1265 + _etype, e, _etrace = sys.exc_info()
1266 + self.log.error("%s:%d: Invalid regular expression for rule '%s'. %s", file,line,f.__name__,e)
1267 + if '#' in f.__doc__:
1268 + self.log.error("%s:%d. Make sure '#' in rule '%s' is escaped with '\\#'",file,line, f.__name__)
1271 + # Validate all rules defined by strings
1272 + for name,r in self.strsym[state]:
1273 + tokname = self.toknames[name]
1274 + if tokname == 'error':
1275 + self.log.error("Rule '%s' must be defined as a function", name)
1279 + if not tokname in self.tokens and tokname.find("ignore_") < 0:
1280 + self.log.error("Rule '%s' defined for an unspecified token %s",name,tokname)
1285 + c = re.compile("(?P<%s>%s)" % (name,r),re.VERBOSE | self.reflags)
1287 + self.log.error("Regular expression for rule '%s' matches empty string",name)
1290 + _etype, e, _etrace = sys.exc_info()
1291 + self.log.error("Invalid regular expression for rule '%s'. %s",name,e)
1293 + self.log.error("Make sure '#' in rule '%s' is escaped with '\\#'",name)
1296 + if not self.funcsym[state] and not self.strsym[state]:
1297 + self.log.error("No rules defined for state '%s'",state)
1300 + # Validate the error function
1301 + efunc = self.errorf.get(state,None)
1304 + line = func_code(f).co_firstlineno
1305 + file = func_code(f).co_filename
1306 + self.files[file] = 1
1308 + if isinstance(f, types.MethodType):
1312 + nargs = func_code(f).co_argcount
1313 + if nargs > reqargs:
1314 + self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,f.__name__)
1317 + if nargs < reqargs:
1318 + self.log.error("%s:%d: Rule '%s' requires an argument", file,line,f.__name__)
1321 + for f in self.files:
1322 + self.validate_file(f)
1325 + # -----------------------------------------------------------------------------
1328 + # This checks to see if there are duplicated t_rulename() functions or strings
1329 + # in the parser input file. This is done using a simple regular expression
1330 + # match on each line in the given file.
1331 + # -----------------------------------------------------------------------------
1333 + def validate_file(self,filename):
1335 + base,ext = os.path.splitext(filename)
1336 + if ext != '.py': return # No idea what the file is. Return OK
1339 + f = open(filename)
1340 + lines = f.readlines()
1343 + return # Couldn't find the file. Don't worry about it
1345 + fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
1346 + sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
1356 + prev = counthash.get(name)
1358 + counthash[name] = linen
1360 + self.log.error("%s:%d: Rule %s redefined. Previously defined on line %d",filename,linen,name,prev)
1364 # -----------------------------------------------------------------------------
1367 # Build all of the regular expression rules from definitions in the supplied module
1368 # -----------------------------------------------------------------------------
1369 -def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,nowarn=0,outputdir=""):
1370 +def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,nowarn=0,outputdir="", debuglog=None, errorlog=None):
1373 stateinfo = { 'INITIAL' : 'inclusive'}
1377 - lexobj.lexdebug = debug
1378 lexobj.lexoptimize = optimize
1381 - if nowarn: warn = 0
1383 + if errorlog is None:
1384 + errorlog = PlyLogger(sys.stderr)
1387 + if debuglog is None:
1388 + debuglog = PlyLogger(sys.stderr)
1390 + # Get the module dictionary used for the lexer
1391 if object: module = object
1394 - # User supplied a module object.
1395 - if isinstance(module, types.ModuleType):
1396 - ldict = module.__dict__
1397 - elif isinstance(module, _INSTANCETYPE):
1398 - _items = [(k,getattr(module,k)) for k in dir(module)]
1400 - for (i,v) in _items:
1403 - raise ValueError,"Expected a module or instance"
1404 - lexobj.lexmodule = module
1405 + _items = [(k,getattr(module,k)) for k in dir(module)]
1406 + ldict = dict(_items)
1408 + ldict = get_caller_module_dict(2)
1411 - # No module given. We might be able to get information from the caller.
1413 - raise RuntimeError
1414 - except RuntimeError:
1415 - e,b,t = sys.exc_info()
1417 - f = f.f_back # Walk out to our calling function
1418 - if f.f_globals is f.f_locals: # Collect global and local variations from caller
1419 - ldict = f.f_globals
1421 - ldict = f.f_globals.copy()
1422 - ldict.update(f.f_locals)
1423 + # Collect parser information from the dictionary
1424 + linfo = LexerReflect(ldict,log=errorlog,reflags=reflags)
1427 + if linfo.validate_all():
1428 + raise SyntaxError("Can't build lexer")
1430 if optimize and lextab:
1432 lexobj.readtab(lextab,ldict)
1433 token = lexobj.token
1434 input = lexobj.input
1441 - # Get the tokens, states, and literals variables (if any)
1443 - tokens = ldict.get("tokens",None)
1444 - states = ldict.get("states",None)
1445 - literals = ldict.get("literals","")
1448 - raise SyntaxError,"lex: module does not define 'tokens'"
1450 - if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
1451 - raise SyntaxError,"lex: tokens must be a list or tuple."
1452 + # Dump some basic debugging information
1454 + debuglog.info("lex: tokens = %r", linfo.tokens)
1455 + debuglog.info("lex: literals = %r", linfo.literals)
1456 + debuglog.info("lex: states = %r", linfo.stateinfo)
1458 # Build a dictionary of valid token names
1459 lexobj.lextokens = { }
1462 - if not _is_identifier.match(n):
1463 - print >>sys.stderr, "lex: Bad token name '%s'" % n
1465 - if warn and lexobj.lextokens.has_key(n):
1466 - print >>sys.stderr, "lex: Warning. Token '%s' multiply defined." % n
1467 - lexobj.lextokens[n] = None
1468 + for n in linfo.tokens:
1469 + lexobj.lextokens[n] = 1
1471 + # Get literals specification
1472 + if isinstance(linfo.literals,(list,tuple)):
1473 + lexobj.lexliterals = type(linfo.literals[0])().join(linfo.literals)
1475 - for n in tokens: lexobj.lextokens[n] = None
1476 + lexobj.lexliterals = linfo.literals
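The `type(linfo.literals[0])().join(...)` idiom above is worth a note: it builds an empty string of the same type as the first literal before joining, so a list of unicode literals stays unicode on Python 2. The idiom in isolation:

    # Sketch: type(x)() yields an empty string of the same type as x,
    # so the joined result preserves str vs. unicode (relevant on Python 2).
    literals = [u'+', u'-', u'*']
    joined = type(literals[0])().join(literals)   # u'+-*'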
1479 - print "lex: tokens = '%s'" % lexobj.lextokens.keys()
1482 - for c in literals:
1483 - if not (isinstance(c,types.StringType) or isinstance(c,types.UnicodeType)) or len(c) > 1:
1484 - print >>sys.stderr, "lex: Invalid literal %s. Must be a single character" % repr(c)
1489 - print >>sys.stderr, "lex: Invalid literals specification. literals must be a sequence of characters."
1492 - lexobj.lexliterals = literals
1496 - if not (isinstance(states,types.TupleType) or isinstance(states,types.ListType)):
1497 - print >>sys.stderr, "lex: states must be defined as a tuple or list."
1501 - if not isinstance(s,types.TupleType) or len(s) != 2:
1502 - print >>sys.stderr, "lex: invalid state specifier %s. Must be a tuple (statename,'exclusive|inclusive')" % repr(s)
1505 - name, statetype = s
1506 - if not isinstance(name,types.StringType):
1507 - print >>sys.stderr, "lex: state name %s must be a string" % repr(name)
1510 - if not (statetype == 'inclusive' or statetype == 'exclusive'):
1511 - print >>sys.stderr, "lex: state type for state %s must be 'inclusive' or 'exclusive'" % name
1514 - if stateinfo.has_key(name):
1515 - print >>sys.stderr, "lex: state '%s' already defined." % name
1518 - stateinfo[name] = statetype
1520 - # Get a list of symbols with the t_ or s_ prefix
1521 - tsymbols = [f for f in ldict.keys() if f[:2] == 't_' ]
1523 - # Now build up a list of functions and a list of strings
1525 - funcsym = { } # Symbols defined as functions
1526 - strsym = { } # Symbols defined as strings
1527 - toknames = { } # Mapping of symbols to token names
1529 - for s in stateinfo.keys():
1533 - ignore = { } # Ignore strings by state
1534 - errorf = { } # Error functions by state
1536 - if len(tsymbols) == 0:
1537 - raise SyntaxError,"lex: no rules of the form t_rulename are defined."
1539 - for f in tsymbols:
1541 - states, tokname = _statetoken(f,stateinfo)
1542 - toknames[f] = tokname
1545 - for s in states: funcsym[s].append((f,t))
1546 - elif (isinstance(t, types.StringType) or isinstance(t,types.UnicodeType)):
1547 - for s in states: strsym[s].append((f,t))
1549 - print >>sys.stderr, "lex: %s not defined as a function or string" % f
1552 - # Sort the functions by line number
1553 - for f in funcsym.values():
1554 - f.sort(lambda x,y: cmp(x[1].func_code.co_firstlineno,y[1].func_code.co_firstlineno))
1556 - # Sort the strings by regular expression length
1557 - for s in strsym.values():
1558 - s.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
1559 + # Get the stateinfo dictionary
1560 + stateinfo = linfo.stateinfo
1564 # Build the master regular expressions
1565 - for state in stateinfo.keys():
1566 + for state in stateinfo:
1569 # Add rules defined by functions first
1570 - for fname, f in funcsym[state]:
1571 - line = f.func_code.co_firstlineno
1572 - file = f.func_code.co_filename
1573 - files[file] = None
1574 - tokname = toknames[fname]
1576 - ismethod = isinstance(f, types.MethodType)
1579 - nargs = f.func_code.co_argcount
1584 - if nargs > reqargs:
1585 - print >>sys.stderr, "%s:%d: Rule '%s' has too many arguments." % (file,line,f.__name__)
1589 - if nargs < reqargs:
1590 - print >>sys.stderr, "%s:%d: Rule '%s' requires an argument." % (file,line,f.__name__)
1594 - if tokname == 'ignore':
1595 - print >>sys.stderr, "%s:%d: Rule '%s' must be defined as a string." % (file,line,f.__name__)
1599 - if tokname == 'error':
1606 - c = re.compile("(?P<%s>%s)" % (fname,f.__doc__), re.VERBOSE | reflags)
1608 - print >>sys.stderr, "%s:%d: Regular expression for rule '%s' matches empty string." % (file,line,f.__name__)
1611 - except re.error,e:
1612 - print >>sys.stderr, "%s:%d: Invalid regular expression for rule '%s'. %s" % (file,line,f.__name__,e)
1613 - if '#' in f.__doc__:
1614 - print >>sys.stderr, "%s:%d. Make sure '#' in rule '%s' is escaped with '\\#'." % (file,line, f.__name__)
1619 - print "lex: Adding rule %s -> '%s' (state '%s')" % (f.__name__,f.__doc__, state)
1621 - # Okay. The regular expression seemed okay. Let's append it to the master regular
1622 - # expression we're building
1624 - regex_list.append("(?P<%s>%s)" % (fname,f.__doc__))
1626 - print >>sys.stderr, "%s:%d: No regular expression defined for rule '%s'" % (file,line,f.__name__)
1627 + for fname, f in linfo.funcsym[state]:
1628 + line = func_code(f).co_firstlineno
1629 + file = func_code(f).co_filename
1630 + regex_list.append("(?P<%s>%s)" % (fname,f.__doc__))
1632 + debuglog.info("lex: Adding rule %s -> '%s' (state '%s')",fname,f.__doc__, state)
1634 # Now add all of the simple rules
1635 - for name,r in strsym[state]:
1636 - tokname = toknames[name]
1638 - if tokname == 'ignore':
1640 - print >>sys.stderr, "lex: Warning. %s contains a literal backslash '\\'" % name
1645 - if tokname == 'error':
1646 - raise SyntaxError,"lex: Rule '%s' must be defined as a function" % name
1650 - if not lexobj.lextokens.has_key(tokname) and tokname.find("ignore_") < 0:
1651 - print >>sys.stderr, "lex: Rule '%s' defined for an unspecified token %s." % (name,tokname)
1655 - c = re.compile("(?P<%s>%s)" % (name,r),re.VERBOSE | reflags)
1657 - print >>sys.stderr, "lex: Regular expression for rule '%s' matches empty string." % name
1660 - except re.error,e:
1661 - print >>sys.stderr, "lex: Invalid regular expression for rule '%s'. %s" % (name,e)
1663 - print >>sys.stderr, "lex: Make sure '#' in rule '%s' is escaped with '\\#'." % name
1668 - print "lex: Adding rule %s -> '%s' (state '%s')" % (name,r,state)
1670 + for name,r in linfo.strsym[state]:
1671 regex_list.append("(?P<%s>%s)" % (name,r))
1673 - if not regex_list:
1674 - print >>sys.stderr, "lex: No rules defined for state '%s'" % state
1677 + debuglog.info("lex: Adding rule %s -> '%s' (state '%s')",name,r, state)
1679 regexs[state] = regex_list
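Each rule ends up as a named group `(?P<name>pattern)` in one master regular expression per state, and the winning rule is later recovered from the match's group name. A standalone illustration of the idea (not PLY code):

    import re
    # Two rules folded into one master regex; match.lastgroup tells us
    # which rule fired.
    master = re.compile("(?P<t_NUMBER>\\d+)|(?P<t_ID>[A-Za-z_]\\w*)")
    m = master.match("x42")
    print(m.lastgroup)   # 't_ID'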
1683 - for f in files.keys():
1684 - if not _validate_file(f):
1688 - raise SyntaxError,"lex: Unable to build lexer."
1690 - # From this point forward, we're reasonably confident that we can build the lexer.
1691 - # No more errors will be generated, but there might be some warning messages.
1693 # Build the master regular expressions
1695 - for state in regexs.keys():
1696 - lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,toknames)
1698 + debuglog.info("lex: ==== MASTER REGEXS FOLLOW ====")
1700 + for state in regexs:
1701 + lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,linfo.toknames)
1702 lexobj.lexstatere[state] = lexre
1703 lexobj.lexstateretext[state] = re_text
1704 lexobj.lexstaterenames[state] = re_names
1706 for i in range(len(re_text)):
1707 - print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
1708 + debuglog.info("lex: state '%s' : regex[%d] = '%s'",state, i, re_text[i])
1710 - # For inclusive states, we need to add the INITIAL state
1711 - for state,type in stateinfo.items():
1712 - if state != "INITIAL" and type == 'inclusive':
1713 + # For inclusive states, we need to add the regular expressions from the INITIAL state
1714 + for state,stype in stateinfo.items():
1715 + if state != "INITIAL" and stype == 'inclusive':
1716 lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
1717 lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
1718 lexobj.lexstaterenames[state].extend(lexobj.lexstaterenames['INITIAL'])
1720 lexobj.lexstateinfo = stateinfo
1721 lexobj.lexre = lexobj.lexstatere["INITIAL"]
1722 lexobj.lexretext = lexobj.lexstateretext["INITIAL"]
1723 + lexobj.lexreflags = reflags
1725 # Set up ignore variables
1726 - lexobj.lexstateignore = ignore
1727 + lexobj.lexstateignore = linfo.ignore
1728 lexobj.lexignore = lexobj.lexstateignore.get("INITIAL","")
1730 # Set up error functions
1731 - lexobj.lexstateerrorf = errorf
1732 - lexobj.lexerrorf = errorf.get("INITIAL",None)
1733 - if warn and not lexobj.lexerrorf:
1734 - print >>sys.stderr, "lex: Warning. no t_error rule is defined."
1735 + lexobj.lexstateerrorf = linfo.errorf
1736 + lexobj.lexerrorf = linfo.errorf.get("INITIAL",None)
1737 + if not lexobj.lexerrorf:
1738 + errorlog.warning("No t_error rule is defined")
1740 # Check state information for ignore and error rules
1741 for s,stype in stateinfo.items():
1742 if stype == 'exclusive':
1743 - if warn and not errorf.has_key(s):
1744 - print >>sys.stderr, "lex: Warning. no error rule is defined for exclusive state '%s'" % s
1745 - if warn and not ignore.has_key(s) and lexobj.lexignore:
1746 - print >>sys.stderr, "lex: Warning. no ignore rule is defined for exclusive state '%s'" % s
1747 + if not s in linfo.errorf:
1748 + errorlog.warning("No error rule is defined for exclusive state '%s'", s)
1749 + if not s in linfo.ignore and lexobj.lexignore:
1750 + errorlog.warning("No ignore rule is defined for exclusive state '%s'", s)
1751 elif stype == 'inclusive':
1752 - if not errorf.has_key(s):
1753 - errorf[s] = errorf.get("INITIAL",None)
1754 - if not ignore.has_key(s):
1755 - ignore[s] = ignore.get("INITIAL","")
1757 + if not s in linfo.errorf:
1758 + linfo.errorf[s] = linfo.errorf.get("INITIAL",None)
1759 + if not s in linfo.ignore:
1760 + linfo.ignore[s] = linfo.ignore.get("INITIAL","")
1762 # Create global versions of the token() and input() functions
1763 token = lexobj.token
1764 input = lexobj.input
1767 # If in optimize mode, we write the lextab
1768 if lextab and optimize:
1769 @@ -851,45 +1014,44 @@ def lex(module=None,object=None,debug=0,
1770 def runmain(lexer=None,data=None):
1773 filename = sys.argv[1]
1778 - print "Reading from standard input (type EOF to end):"
1779 + sys.stdout.write("Reading from standard input (type EOF to end):\n")
1780 data = sys.stdin.read()
1783 _input = lexer.input
1788 _token = lexer.token
1795 - print "(%s,%r,%d,%d)" % (tok.type, tok.value, tok.lineno,tok.lexpos)
1797 + sys.stdout.write("(%s,%r,%d,%d)\n" % (tok.type, tok.value, tok.lineno,tok.lexpos))
1799 # -----------------------------------------------------------------------------
1802 # This decorator function can be used to set the regular expression on a function
1803 # when its docstring might need to be set in an alternative way
1804 # -----------------------------------------------------------------------------
1809 + if hasattr(r,"__call__"):
1810 f.__doc__ = r.__doc__
1816 # Alternative spelling of the TOKEN decorator
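For reference, the TOKEN decorator that this hunk generalizes is typically used to attach a computed pattern to a rule function, which also keeps lexers working under `python -OO` where docstrings are stripped. A usage sketch along the lines of the PLY documentation:

    from ply.lex import TOKEN

    digit     = r'([0-9])'
    nondigit  = r'([_A-Za-z])'
    identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

    @TOKEN(identifier)
    def t_ID(t):
        # t.value holds the matched identifier text
        return t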
1818 diff --git a/other-licenses/ply/ply/yacc.py b/other-licenses/ply/ply/yacc.py
1819 --- a/other-licenses/ply/ply/yacc.py
1820 +++ b/other-licenses/ply/ply/yacc.py
1822 -#-----------------------------------------------------------------------------
1823 +# -----------------------------------------------------------------------------
1826 -# Author(s): David M. Beazley (dave@dabeaz.com)
1827 +# Copyright (C) 2001-2009,
1828 +# David M. Beazley (Dabeaz LLC)
1829 +# All rights reserved.
1831 -# Copyright (C) 2001-2008, David M. Beazley
1832 +# Redistribution and use in source and binary forms, with or without
1833 +# modification, are permitted provided that the following conditions are
1836 +# * Redistributions of source code must retain the above copyright notice,
1837 +# this list of conditions and the following disclaimer.
1838 +# * Redistributions in binary form must reproduce the above copyright notice,
1839 +# this list of conditions and the following disclaimer in the documentation
1840 +# and/or other materials provided with the distribution.
1841 +# * Neither the name of the David Beazley or Dabeaz LLC may be used to
1842 +# endorse or promote products derived from this software without
1843 +# specific prior written permission.
1845 -# This library is free software; you can redistribute it and/or
1846 -# modify it under the terms of the GNU Lesser General Public
1847 -# License as published by the Free Software Foundation; either
1848 -# version 2.1 of the License, or (at your option) any later version.
1850 -# This library is distributed in the hope that it will be useful,
1851 -# but WITHOUT ANY WARRANTY; without even the implied warranty of
1852 -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1853 -# Lesser General Public License for more details.
1855 -# You should have received a copy of the GNU Lesser General Public
1856 -# License along with this library; if not, write to the Free Software
1857 -# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
1859 -# See the file COPYING for a complete copy of the LGPL.
1861 +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
1862 +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
1863 +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
1864 +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
1865 +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
1866 +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
1867 +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
1868 +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
1869 +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
1870 +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
1871 +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1872 +# -----------------------------------------------------------------------------
1874 # This implements an LR parser that is constructed from grammar rules defined
1875 # as Python functions. The grammar is specified by supplying the BNF inside
1876 # Python documentation strings. The inspiration for this technique was borrowed
1877 # from John Aycock's Spark parsing system. PLY might be viewed as a cross between
1878 # Spark and the GNU bison utility.
1880 # The current implementation is only somewhat object-oriented. The
1883 # Construction of LR parsing tables is fairly complicated and expensive.
1884 # To make this module run fast, a *LOT* of work has been put into
1885 # optimization---often at the expense of readability and what one might
1886 # consider to be good Python "coding style." Modify the code at your
1888 # ----------------------------------------------------------------------------
1890 -__version__ = "2.5"
1891 -__tabversion__ = "2.4" # Table version
1892 +__version__ = "3.3"
1893 +__tabversion__ = "3.2" # Table version
1895 #-----------------------------------------------------------------------------
1896 # === User configurable parameters ===
1898 # Change these to modify the default behavior of yacc (if you wish)
1899 #-----------------------------------------------------------------------------
1901 yaccdebug = 1 # Debugging mode. If set, yacc generates a
1902 @@ -66,34 +75,93 @@ debug_file = 'parser.out' # Default
1903 tab_module = 'parsetab' # Default name of the table module
1904 default_lr = 'LALR' # Default LR table generation method
1906 error_count = 3 # Number of symbols that must be shifted to leave recovery mode
1908 yaccdevel = 0 # Set to True if developing yacc. This turns off optimized
1909 # implementations of certain functions.
1911 -import re, types, sys, cStringIO, md5, os.path
1913 +resultlimit = 40 # Size limit of results when running in debug mode.
1915 +pickle_protocol = 0 # Protocol to use when writing pickle files
1917 +import re, types, sys, os.path
1919 +# Compatibility function for python 2.6/3.0
1920 +if sys.version_info[0] < 3:
1922 + return f.func_code
1929 + MAXINT = sys.maxint
1930 +except AttributeError:
1931 + MAXINT = sys.maxsize
1933 +# Python 2.x/3.0 compatibility.
1934 +def load_ply_lex():
1935 + if sys.version_info[0] < 3:
1938 + import ply.lex as lex
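The compatibility block is partially elided in this hunk; reconstructed for reference (a sketch, inferred from how `func_code()` is used throughout the patch), the shims look roughly like this:

    import sys

    # On Python 3 the attribute is __code__; on Python 2 it is func_code.
    if sys.version_info[0] < 3:
        def func_code(f):
            return f.func_code
    else:
        def func_code(f):
            return f.__code__

    # sys.maxint is gone in Python 3; sys.maxsize is the replacement.
    try:
        MAXINT = sys.maxint
    except AttributeError:
        MAXINT = sys.maxsize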
1941 +# This object is a stand-in for a logging object created by the
1942 +# logging module. PLY will use this by default to create things
1943 +# such as the parser.out file. If a user wants more detailed
1944 +# information, they can create their own logging object and pass
1947 +class PlyLogger(object):
1948 + def __init__(self,f):
1950 + def debug(self,msg,*args,**kwargs):
1951 + self.f.write((msg % args) + "\n")
1954 + def warning(self,msg,*args,**kwargs):
1955 + self.f.write("WARNING: "+ (msg % args) + "\n")
1957 + def error(self,msg,*args,**kwargs):
1958 + self.f.write("ERROR: " + (msg % args) + "\n")
1962 +# Null logger is used when no output is generated. Does nothing.
1963 +class NullLogger(object):
1964 + def __getattribute__(self,name):
1966 + def __call__(self,*args,**kwargs):
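PlyLogger duck-types the logging module (it only needs a file-like object with write()), and NullLogger swallows every attribute access and call. Anything exposing debug/info/warning/error can therefore stand in, e.g. a message collector for tests (illustrative sketch):

    class ListLogger(object):
        """Collects formatted messages instead of writing them out."""
        def __init__(self):
            self.messages = []
        def debug(self, msg, *args, **kwargs):
            self.messages.append(msg % args)
        info = warning = error = critical = debug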
1969 # Exception raised for yacc-related errors
1970 class YaccError(Exception): pass
1972 -# Exception raised for errors raised in production rules
1973 -class SyntaxError(Exception): pass
1976 -# Available instance types. This is used when parsers are defined by a class.
1977 -# it's a little funky because I want to preserve backwards compatibility
1978 -# with Python 2.0 where types.ObjectType is undefined.
1981 - _INSTANCETYPE = (types.InstanceType, types.ObjectType)
1982 -except AttributeError:
1983 - _INSTANCETYPE = types.InstanceType
1984 - class object: pass # Note: needed if no new-style classes present
1985 +# Format the result message that the parser produces when running in debug mode.
1986 +def format_result(r):
1987 + repr_str = repr(r)
1988 + if '\n' in repr_str: repr_str = repr(repr_str)
1989 + if len(repr_str) > resultlimit:
1990 + repr_str = repr_str[:resultlimit]+" ..."
1991 + result = "<%s @ 0x%x> (%s)" % (type(r).__name__,id(r),repr_str)
1995 +# Format stack entries when the parser is running in debug mode
1996 +def format_stack_entry(r):
1997 + repr_str = repr(r)
1998 + if '\n' in repr_str: repr_str = repr(repr_str)
1999 + if len(repr_str) < 16:
2002 + return "<%s @ 0x%x>" % (type(r).__name__,id(r))
2004 #-----------------------------------------------------------------------------
2005 # === LR Parsing Engine ===
2007 # The following classes are used for the LR parser itself. These are not
2008 # used during table construction and are independent of the actual LR
2009 # table generation algorithm
2010 #-----------------------------------------------------------------------------
2011 @@ -137,16 +205,19 @@ class YaccProduction:
2012 return [s.value for s in self.slice[i:j]]
2015 return len(self.slice)
2018 return getattr(self.slice[n],"lineno",0)
2020 + def set_lineno(self,n,lineno):
2021 + self.slice[n].lineno = lineno
2023 def linespan(self,n):
2024 startline = getattr(self.slice[n],"lineno",0)
2025 endline = getattr(self.slice[n],"endlineno",startline)
2026 return startline,endline
2029 return getattr(self.slice[n],"lexpos",0)
2031 @@ -154,51 +225,44 @@ class YaccProduction:
2032 startpos = getattr(self.slice[n],"lexpos",0)
2033 endpos = getattr(self.slice[n],"endlexpos",startpos)
2034 return startpos,endpos
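The new set_lineno() rounds out the position API (lineno/linespan/lexpos/lexspan). Inside a rule, positions are read from the right-hand-side symbols and can be propagated to the result; a sketch with a made-up rule:

    def p_assignment(p):
        'assignment : ID EQUALS expression'
        p[0] = ('assign', p[1], p[3])
        # Propagate the position of the ID token to the result symbol
        p.set_lineno(0, p.lineno(1))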
2040 -# The LR Parsing engine. This is defined as a class so that multiple parsers
2041 -# can exist in the same process. A user never instantiates this directly.
2042 -# Instead, the global yacc() function should be used to create a suitable Parser
2046 - def __init__(self,magic=None):
2048 - # This is a hack to keep users from trying to instantiate a Parser
2049 - # object directly.
2051 - if magic != "xyzzy":
2052 - raise YaccError, "Can't directly instantiate Parser. Use yacc() instead."
2054 - # Reset internal state
2055 - self.productions = None # List of productions
2056 - self.errorfunc = None # Error handling function
2057 - self.action = { } # LR Action table
2058 - self.goto = { } # LR goto table
2059 - self.require = { } # Attribute require table
2060 - self.method = "Unknown LR" # Table construction method used
2061 +# -----------------------------------------------------------------------------
2064 +# The LR Parsing engine.
2065 +# -----------------------------------------------------------------------------
2068 + def __init__(self,lrtab,errorf):
2069 + self.productions = lrtab.lr_productions
2070 + self.action = lrtab.lr_action
2071 + self.goto = lrtab.lr_goto
2072 + self.errorfunc = errorf
2078 del self.statestack[:]
2079 del self.symstack[:]
2082 self.symstack.append(sym)
2083 self.statestack.append(0)
2085 def parse(self,input=None,lexer=None,debug=0,tracking=0,tokenfunc=None):
2086 if debug or yaccdevel:
2087 + if isinstance(debug,int):
2088 + debug = PlyLogger(sys.stderr)
2089 return self.parsedebug(input,lexer,debug,tracking,tokenfunc)
2091 return self.parseopt(input,lexer,debug,tracking,tokenfunc)
2093 return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
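Note the promotion above: an integer debug flag becomes a PlyLogger on stderr, but any logger-like object can be passed directly. A usage sketch (assuming `parser` was built with yacc.yacc() and `data` holds the input text):

    import logging
    logging.basicConfig(filename="parserlog.txt", level=logging.DEBUG)

    # Any object with debug/info/error methods works as the debug argument
    result = parser.parse(data, debug=logging.getLogger("ply.parse"))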
2096 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2097 @@ -210,30 +274,34 @@ class Parser:
2104 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2106 - def parsedebug(self,input=None,lexer=None,debug=0,tracking=0,tokenfunc=None):
2107 + def parsedebug(self,input=None,lexer=None,debug=None,tracking=0,tokenfunc=None):
2108 lookahead = None # Current lookahead symbol
2109 lookaheadstack = [ ] # Stack of lookahead symbols
2110 actions = self.action # Local reference to action table (to avoid lookup on self.)
2111 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2112 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2113 pslice = YaccProduction(None) # Production object passed to grammar rules
2114 errorcount = 0 # Used during error recovery
2115 - endsym = "$end" # End symbol
2118 + debug.info("PLY: PARSE DEBUG START")
2121 # If no lexer was given, we will try to use the lex module
2124 + lex = load_ply_lex()
2128 # Set up the lexer and parser objects on pslice
2129 pslice.lexer = lexer
2130 pslice.parser = self
2132 # If input was supplied, pass to lexer
2133 if input is not None:
2136 @@ -252,65 +320,55 @@ class Parser:
2138 pslice.stack = symstack # Put in the production
2139 errtoken = None # Err token
2141 # The start state is assumed to be (0,$end)
2143 statestack.append(0)
2147 symstack.append(sym)
2150 # Get the next symbol on the input. If a lookahead symbol
2151 # is already set, we just use that. Otherwise, we'll pull
2152 # the next token off of the lookaheadstack or from the lexer
2156 - print 'state', state
2158 + debug.debug('State : %s', state)
2162 if not lookaheadstack:
2163 lookahead = get_token() # Get the next token
2165 lookahead = lookaheadstack.pop()
2167 lookahead = YaccSymbol()
2168 - lookahead.type = endsym
2169 + lookahead.type = "$end"
2173 - errorlead = ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip()
2174 + debug.debug('Stack : %s',
2175 + ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip())
2178 # Check the action table
2179 ltype = lookahead.type
2180 t = actions[state].get(ltype)
2189 # shift a symbol on the stack
2190 - if ltype is endsym:
2191 - # Error, end of input
2192 - sys.stderr.write("yacc: Parse error. EOF\n")
2194 statestack.append(t)
2199 - sys.stderr.write("%-60s shift state %s\n" % (errorlead, t))
2200 + debug.debug("Action : Shift and goto state %s", t)
2203 symstack.append(lookahead)
2206 # Decrease error count on successful shift
2207 if errorcount: errorcount -=1
2209 @@ -322,18 +380,21 @@ class Parser:
2212 # Get production function
2214 sym.type = pname # Production name
2219 - sys.stderr.write("%-60s reduce %d\n" % (errorlead, -t))
2221 + debug.info("Action : Reduce rule [%s] with %s and goto state %d", p.str, "["+",".join([format_stack_entry(_v.value) for _v in symstack[-plen:]])+"]",-t)
2223 + debug.info("Action : Reduce rule [%s] with %s and goto state %d", p.str, [],-t)
2228 targ = symstack[-plen-1:]
2233 @@ -350,19 +411,22 @@ class Parser:
2234 # The code enclosed in this section is duplicated
2235 # below as a performance optimization. Make sure
2236 # changes get made in both locations.
2241 # Call the grammar rule with our special slice object
2243 del symstack[-plen:]
2244 del statestack[-plen:]
2245 + p.callable(pslice)
2247 + debug.info("Result : %s", format_result(pslice[0]))
2249 symstack.append(sym)
2250 state = goto[statestack[-1]][pname]
2251 statestack.append(state)
2253 # If an error was set. Enter error recovery state
2254 lookaheadstack.append(lookahead)
2257 @@ -388,17 +452,20 @@ class Parser:
2258 # The code enclosed in this section is duplicated
2259 # above as a performance optimization. Make sure
2260 # changes get made in both locations.
2265 # Call the grammar rule with our special slice object
2267 + p.callable(pslice)
2269 + debug.info("Result : %s", format_result(pslice[0]))
2271 symstack.append(sym)
2272 state = goto[statestack[-1]][pname]
2273 statestack.append(state)
2275 # If an error was set. Enter error recovery state
2276 lookaheadstack.append(lookahead)
2279 @@ -407,46 +474,53 @@ class Parser:
2281 errorcount = error_count
2284 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2288 - return getattr(n,"value",None)
2289 + result = getattr(n,"value",None)
2291 + debug.info("Done : Returning %s", format_result(result))
2292 + debug.info("PLY: PARSE DEBUG END")
2300 - sys.stderr.write(errorlead + "\n")
2301 + debug.error('Error : %s',
2302 + ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip())
2305 # We have some kind of parsing error here. To handle
2306 # this, we are going to push the current token onto
2307 # the tokenstack and replace it with an 'error' token.
2308 # If there are any synchronization rules, they may
2311 # In addition to pushing the error token, we call
2312 # the user defined p_error() function if this is the
2313 # first syntax error. This function is only called if
2315 if errorcount == 0 or self.errorok:
2316 errorcount = error_count
2318 errtoken = lookahead
2319 - if errtoken.type is endsym:
2320 + if errtoken.type == "$end":
2321 errtoken = None # End of file!
2323 global errok,token,restart
2324 errok = self.errok # Set some special functions available in error recovery
2326 restart = self.restart
2327 + if errtoken and not hasattr(errtoken,'lexer'):
2328 + errtoken.lexer = lexer
2329 tok = self.errorfunc(errtoken)
2330 del errok, token, restart # Delete special functions
2333 # User must have done some kind of panic
2334 # mode recovery on their own. The
2335 # returned token is the next lookahead
2337 @@ -466,29 +540,29 @@ class Parser:
2340 errorcount = error_count
2342 # case 1: the statestack only has 1 entry on it. If we're in this state, the
2343 # entire parse has been rolled back and we're completely hosed. The token is
2344 # discarded and we just keep going.
2346 - if len(statestack) <= 1 and lookahead.type is not endsym:
2347 + if len(statestack) <= 1 and lookahead.type != "$end":
2351 # Nuke the pushback stack
2352 del lookaheadstack[:]
2355 # case 2: the statestack has a couple of entries on it, but we're
2356 # at the end of the file. nuke the top entry and generate an error token
2358 # Start nuking entries on the stack
2359 - if lookahead.type is endsym:
2360 + if lookahead.type == "$end":
2361 # Whoa. We're really hosed here. Bail out
2364 if lookahead.type != 'error':
2366 if sym.type == 'error':
2367 # Hmmm. Error is on top of stack, we'll just nuke input
2368 # symbol and continue
2369 @@ -504,17 +578,17 @@ class Parser:
2373 state = statestack[-1] # Potential bug fix
2377 # Call an error function here
2378 - raise RuntimeError, "yacc: internal parser error!!!\n"
2379 + raise RuntimeError("yacc: internal parser error!!!\n")
2381 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2384 # Optimized version of parse() method. DO NOT EDIT THIS CODE DIRECTLY.
2385 # Edit the debug version above, then copy any modifications to the method
2386 # below while removing #--! DEBUG sections.
2387 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2388 @@ -526,17 +600,17 @@ class Parser:
2389 actions = self.action # Local reference to action table (to avoid lookup on self.)
2390 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2391 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2392 pslice = YaccProduction(None) # Production object passed to grammar rules
2393 errorcount = 0 # Used during error recovery
2395 # If no lexer was given, we will try to use the lex module
2398 + lex = load_ply_lex()
2401 # Set up the lexer and parser objects on pslice
2402 pslice.lexer = lexer
2403 pslice.parser = self
2405 # If input was supplied, pass to lexer
2406 if input is not None:
2407 @@ -581,20 +655,16 @@ class Parser:
2409 # Check the action table
2410 ltype = lookahead.type
2411 t = actions[state].get(ltype)
2415 # shift a symbol on the stack
2416 - if ltype == '$end':
2417 - # Error, end of input
2418 - sys.stderr.write("yacc: Parse error. EOF\n")
2420 statestack.append(t)
2423 symstack.append(lookahead)
2426 # Decrease error count on successful shift
2427 if errorcount: errorcount -=1
2428 @@ -630,19 +700,19 @@ class Parser:
2429 # The code enclosed in this section is duplicated
2430 # below as a performance optimization. Make sure
2431 # changes get made in both locations.
2436 # Call the grammar rule with our special slice object
2438 del symstack[-plen:]
2439 del statestack[-plen:]
2440 + p.callable(pslice)
2441 symstack.append(sym)
2442 state = goto[statestack[-1]][pname]
2443 statestack.append(state)
2445 # If an error was set. Enter error recovery state
2446 lookaheadstack.append(lookahead)
2449 @@ -668,17 +738,17 @@ class Parser:
2450 # The code enclosed in this section is duplicated
2451 # above as a performance optimization. Make sure
2452 # changes get made in both locations.
2457 # Call the grammar rule with our special slice object
2459 + p.callable(pslice)
2460 symstack.append(sym)
2461 state = goto[statestack[-1]][pname]
2462 statestack.append(state)
2464 # If an error was set. Enter error recovery state
2465 lookaheadstack.append(lookahead)
2468 @@ -712,16 +782,18 @@ class Parser:
2469 errtoken = lookahead
2470 if errtoken.type == '$end':
2471 errtoken = None # End of file!
2473 global errok,token,restart
2474 errok = self.errok # Set some special functions available in error recovery
2476 restart = self.restart
2477 + if errtoken and not hasattr(errtoken,'lexer'):
2478 + errtoken.lexer = lexer
2479 tok = self.errorfunc(errtoken)
2480 del errok, token, restart # Delete special functions
2483 # User must have done some kind of panic
2484 # mode recovery on their own. The
2485 # returned token is the next lookahead
2487 @@ -779,17 +851,17 @@ class Parser:
2491 state = statestack[-1] # Potential bug fix
2495 # Call an error function here
2496 - raise RuntimeError, "yacc: internal parser error!!!\n"
2497 + raise RuntimeError("yacc: internal parser error!!!\n")
2499 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2500 # parseopt_notrack().
2502 # Optimized version of parseopt() with line number tracking removed.
2503 # DO NOT EDIT THIS CODE DIRECTLY. Copy the optimized version and remove
2504 # code in the #--! TRACKING sections
2505 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2506 @@ -800,17 +872,17 @@ class Parser:
2507 actions = self.action # Local reference to action table (to avoid lookup on self.)
2508 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2509 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2510 pslice = YaccProduction(None) # Production object passed to grammar rules
2511 errorcount = 0 # Used during error recovery
2513 # If no lexer was given, we will try to use the lex module
2516 + lex = load_ply_lex()
2519 # Set up the lexer and parser objects on pslice
2520 pslice.lexer = lexer
2521 pslice.parser = self
2523 # If input was supplied, pass to lexer
2524 if input is not None:
2525 @@ -855,20 +927,16 @@ class Parser:
2527 # Check the action table
2528 ltype = lookahead.type
2529 t = actions[state].get(ltype)
2533 # shift a symbol on the stack
2534 - if ltype == '$end':
2535 - # Error, end of input
2536 - sys.stderr.write("yacc: Parse error. EOF\n")
2538 statestack.append(t)
2541 symstack.append(lookahead)
2544 # Decrease error count on successful shift
2545 if errorcount: errorcount -=1
2546 @@ -893,19 +961,19 @@ class Parser:
2547 # The code enclosed in this section is duplicated
2548 # below as a performance optimization. Make sure
2549 # changes get made in both locations.
2554 # Call the grammar rule with our special slice object
2556 del symstack[-plen:]
2557 del statestack[-plen:]
2558 + p.callable(pslice)
2559 symstack.append(sym)
2560 state = goto[statestack[-1]][pname]
2561 statestack.append(state)
2563 # If an error was set. Enter error recovery state
2564 lookaheadstack.append(lookahead)
2567 @@ -925,17 +993,17 @@ class Parser:
2568 # The code enclosed in this section is duplicated
2569 # above as a performance optimization. Make sure
2570 # changes get made in both locations.
2575 # Call the grammar rule with our special slice object
2577 + p.callable(pslice)
2578 symstack.append(sym)
2579 state = goto[statestack[-1]][pname]
2580 statestack.append(state)
2582 # If an error was set. Enter error recovery state
2583 lookaheadstack.append(lookahead)
2586 @@ -969,16 +1037,18 @@ class Parser:
2587 errtoken = lookahead
2588 if errtoken.type == '$end':
2589 errtoken = None # End of file!
2591 global errok,token,restart
2592 errok = self.errok # Set some special functions available in error recovery
2594 restart = self.restart
2595 + if errtoken and not hasattr(errtoken,'lexer'):
2596 + errtoken.lexer = lexer
2597 tok = self.errorfunc(errtoken)
2598 del errok, token, restart # Delete special functions
2601 # User must have done some kind of panic
2602 # mode recovery on their own. The
2603 # returned token is the next lookahead
2605 @@ -1036,1115 +1106,783 @@ class Parser:
2609 state = statestack[-1] # Potential bug fix
2613 # Call an error function here
2614 - raise RuntimeError, "yacc: internal parser error!!!\n"
2616 + raise RuntimeError("yacc: internal parser error!!!\n")
2618 # -----------------------------------------------------------------------------
2619 -# === Parser Construction ===
2620 +# === Grammar Representation ===
2622 -# The following functions and variables are used to implement the yacc() function
2623 -# itself. This is pretty hairy stuff involving lots of error checking,
2624 -# construction of LR items, kernels, and so forth. Although a lot of
2625 -# this work is done using global variables, the resulting Parser object
2626 -# is completely self contained--meaning that it is safe to repeatedly
2627 -# call yacc() with different grammars in the same application.
2628 +# The following functions, classes, and variables are used to represent and
2629 +# manipulate the rules that make up a grammar.
2630 # -----------------------------------------------------------------------------
2632 -# -----------------------------------------------------------------------------
2635 -# This function checks to see if there are duplicated p_rulename() functions
2636 -# in the parser module file. Without this function, it is really easy for
2637 -# users to make mistakes by cutting and pasting code fragments (and it's a real
2638 -# bugger to try and figure out why the resulting parser doesn't work). Therefore,
2639 -# we just do a little regular expression pattern matching of def statements
2640 -# to try and detect duplicates.
2641 -# -----------------------------------------------------------------------------
2643 -def validate_file(filename):
2644 - base,ext = os.path.splitext(filename)
2645 - if ext != '.py': return 1 # No idea. Assume it's okay.
2648 - f = open(filename)
2649 - lines = f.readlines()
2652 - return 1 # Oh well
2654 - # Match def p_funcname(
2655 - fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
2663 - prev = counthash.get(name)
2665 - counthash[name] = linen
2667 - sys.stderr.write("%s:%d: Function %s redefined. Previously defined on line %d\n" % (filename,linen,name,prev))
2672 -# This function looks for functions that might be grammar rules, but which don't have the proper p_suffix.
2673 -def validate_dict(d):
2674 - for n,v in d.items():
2675 - if n[0:2] == 'p_' and type(v) in (types.FunctionType, types.MethodType): continue
2676 - if n[0:2] == 't_': continue
2678 - if n[0:2] == 'p_':
2679 - sys.stderr.write("yacc: Warning. '%s' not defined as a function\n" % n)
2680 - if 1 and isinstance(v,types.FunctionType) and v.func_code.co_argcount == 1:
2682 - doc = v.__doc__.split(" ")
2684 - sys.stderr.write("%s:%d: Warning. Possible grammar rule '%s' defined without p_ prefix.\n" % (v.func_code.co_filename, v.func_code.co_firstlineno,n))
2685 - except StandardError:
2688 -# -----------------------------------------------------------------------------
2689 -# === GRAMMAR FUNCTIONS ===
2691 -# The following global variables and functions are used to store, manipulate,
2692 -# and verify the grammar rules specified by the user.
2693 -# -----------------------------------------------------------------------------
2695 -# Initialize all of the global variables used during grammar construction
2696 -def initialize_vars():
2697 - global Productions, Prodnames, Prodmap, Terminals
2698 - global Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
2699 - global Errorfunc, Signature, Requires
2701 - Productions = [None] # A list of all of the productions. The first
2702 - # entry is always reserved for the purpose of
2703 - # building an augmented grammar
2705 - Prodnames = { } # A dictionary mapping the names of nonterminals to a list of all
2706 - # productions of that nonterminal.
2708 - Prodmap = { } # A dictionary that is only used to detect duplicate
2711 - Terminals = { } # A dictionary mapping the names of terminal symbols to a
2712 - # list of the rules where they are used.
2714 - Nonterminals = { } # A dictionary mapping names of nonterminals to a list
2715 - # of rule numbers where they are used.
2717 - First = { } # A dictionary of precomputed FIRST(x) symbols
2719 - Follow = { } # A dictionary of precomputed FOLLOW(x) symbols
2721 - Precedence = { } # Precedence rules for each terminal. Contains tuples of the
2722 - # form ('right',level) or ('nonassoc', level) or ('left',level)
2724 - UsedPrecedence = { } # Precedence rules that were actually used by the grammer.
2725 - # This is only used to provide error checking and to generate
2726 - # a warning about unused precedence rules.
2728 - LRitems = [ ] # A list of all LR items for the grammar. These are the
2729 - # productions with the "dot" like E -> E . PLUS E
2731 - Errorfunc = None # User defined error handler
2733 - Signature = md5.new() # Digital signature of the grammar rules, precedence
2734 - # and other information. Used to determined when a
2735 - # parsing table needs to be regenerated.
2737 - Signature.update(__tabversion__)
2739 - Requires = { } # Requires list
2741 - # File objects used when creating the parser.out debugging file
2743 - _vf = cStringIO.StringIO()
2744 - _vfc = cStringIO.StringIO()
2747 +# regex matching identifiers
2748 +_is_identifier = re.compile(r'^[a-zA-Z0-9_-]+$')
2750 # -----------------------------------------------------------------------------
2753 # This class stores the raw information about a single production or grammar rule.
2754 -# It has a few required attributes:
2755 +# A grammar rule refers to a specification such as this:
2757 -# name - Name of the production (nonterminal)
2758 -# prod - A list of symbols making up its production
2759 +# expr : expr PLUS term
2761 +# Here are the basic attributes defined on all productions
2763 +# name - Name of the production. For example 'expr'
2764 +# prod - A list of symbols on the right side ['expr','PLUS','term']
2765 +# prec - Production precedence level
2766 # number - Production number.
2767 +# func - Function that executes on reduce
2768 +# file - File where production function is defined
2769 +# lineno - Line number where production function is defined
2771 -# In addition, a few additional attributes are used to help with debugging or
2772 -# optimization of table generation.
2773 +# The following additional attributes are also defined:
2775 -# file - File where production action is defined.
2776 -# lineno - Line number where action is defined
2777 -# func - Action function
2778 -# prec - Precedence level
2779 -# lr_next - Next LR item. Example, if we are ' E -> E . PLUS E'
2780 -# then lr_next refers to 'E -> E PLUS . E'
2781 -# lr_index - LR item index (location of the ".") in the prod list.
2782 +# len - Length of the production (number of symbols on right hand side)
2783 +# usyms - Set of unique symbols found in the production
2784 +# -----------------------------------------------------------------------------
2786 +class Production(object):
2788 + def __init__(self,number,name,prod,precedence=('right',0),func=None,file='',line=0):
2790 + self.prod = tuple(prod)
2791 + self.number = number
2793 + self.callable = None
2796 + self.prec = precedence
2798 + # Internal settings used during table construction
2800 + self.len = len(self.prod) # Length of the production
2802 + # Create a list of unique production symbols used in the production
2804 + for s in self.prod:
2805 + if s not in self.usyms:
2806 + self.usyms.append(s)
2808 + # List of all LR items for the production
2809 + self.lr_items = []
2810 + self.lr_next = None
2812 + # Create a string representation
2814 + self.str = "%s -> %s" % (self.name," ".join(self.prod))
2816 + self.str = "%s -> <empty>" % self.name
2818 + def __str__(self):
2821 + def __repr__(self):
2822 + return "Production("+str(self)+")"
2824 + def __len__(self):
2825 + return len(self.prod)
2827 + def __nonzero__(self):
2830 + def __getitem__(self,index):
2831 + return self.prod[index]
2833 + # Return the nth lr_item from the production (or None if at the end)
2834 + def lr_item(self,n):
2835 + if n > len(self.prod): return None
2836 + p = LRItem(self,n)
2838 + # Precompute the list of productions immediately following. Hack. Remove later
2840 + p.lr_after = Prodnames[p.prod[n+1]]
2841 + except (IndexError,KeyError):
2844 + p.lr_before = p.prod[n-1]
2845 + except IndexError:
2846 + p.lr_before = None
2850 + # Bind the production function name to a callable
2851 + def bind(self,pdict):
2853 + self.callable = pdict[self.func]
2855 +# This class serves as a minimal standin for Production objects when
2856 +# reading table data from files. It only contains information
2857 +# actually used by the LR parsing engine, plus some additional
2858 +# debugging information.
2859 +class MiniProduction(object):
2860 + def __init__(self,str,name,len,func,file,line):
2864 + self.callable = None
2868 + def __str__(self):
2870 + def __repr__(self):
2871 + return "MiniProduction(%s)" % self.str
2873 + # Bind the production function name to a callable
2874 + def bind(self,pdict):
2876 + self.callable = pdict[self.func]
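Both Production and MiniProduction store the action function by *name* and bind it to a real callable later via bind(); that indirection is what allows table data to be written out and reloaded. The pattern in isolation (illustrative names, not the patch's classes):

    class Rule(object):
        def __init__(self, funcname):
            self.func = funcname      # name only; survives serialization
            self.callable = None
        def bind(self, pdict):
            if self.func:
                self.callable = pdict[self.func]

    def p_expr(p): pass

    r = Rule('p_expr')
    r.bind(globals())      # r.callable is now the p_expr function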
2879 +# -----------------------------------------------------------------------------
2882 +# This class represents a specific stage of parsing a production rule. For
2885 +# expr : expr . PLUS term
2887 +# In the above, the "." represents the current location of the parse. Here
2888 +# are the basic attributes:
2890 +# name - Name of the production. For example 'expr'
2891 +# prod - A list of symbols on the right side ['expr','.', 'PLUS','term']
2892 +# number - Production number.
2894 +# lr_next - Next LR item. For example, if we are 'expr -> expr . PLUS term'
2895 +# then lr_next refers to 'expr -> expr PLUS . term'
2896 +# lr_index - LR item index (location of the ".") in the prod list.
2897 # lookaheads - LALR lookahead symbols for this item
2898 -# len - Length of the production (number of symbols on right hand side)
2899 +# len - Length of the production (number of symbols on right hand side)
2900 +# lr_after - List of all productions that immediately follow
2901 +# lr_before - Grammar symbol immediately before
2902 # -----------------------------------------------------------------------------
2905 - def __init__(self,**kw):
2906 - for k,v in kw.items():
2908 - self.lr_index = -1
2909 - self.lr0_added = 0 # Flag indicating whether or not added to LR0 closure
2910 - self.lr1_added = 0 # Flag indicating whether or not added to LR1
2912 +class LRItem(object):
2913 + def __init__(self,p,n):
2914 + self.name = p.name
2915 + self.prod = list(p.prod)
2916 + self.number = p.number
2918 self.lookaheads = { }
2919 - self.lk_added = { }
2920 - self.setnumbers = [ ]
2921 + self.prod.insert(n,".")
2922 + self.prod = tuple(self.prod)
2923 + self.len = len(self.prod)
2924 + self.usyms = p.usyms
2928 s = "%s -> %s" % (self.name," ".join(self.prod))
2930 s = "%s -> <empty>" % self.name
2936 - # Compute lr_items from the production
2937 - def lr_item(self,n):
2938 - if n > len(self.prod): return None
2940 - p.name = self.name
2941 - p.prod = list(self.prod)
2942 - p.number = self.number
2944 - p.lookaheads = { }
2945 - p.setnumbers = self.setnumbers
2946 - p.prod.insert(n,".")
2947 - p.prod = tuple(p.prod)
2948 - p.len = len(p.prod)
2949 - p.usyms = self.usyms
2951 - # Precompute list of productions immediately following
2952 + return "LRItem("+str(self)+")"
2954 +# -----------------------------------------------------------------------------
2955 +# rightmost_terminal()
2957 +# Return the rightmost terminal from a list of symbols. Used in add_production()
2958 +# -----------------------------------------------------------------------------
2959 +def rightmost_terminal(symbols, terminals):
2960 + i = len(symbols) - 1
2962 + if symbols[i] in terminals:
2967 +# -----------------------------------------------------------------------------
2968 +# === GRAMMAR CLASS ===
2970 +# The following class represents the contents of the specified grammar along
2971 +# with various computed properties such as first sets, follow sets, LR items, etc.
2972 +# This data is used for critical parts of the table generation process later.
2973 +# -----------------------------------------------------------------------------
2975 +class GrammarError(YaccError): pass
2977 +class Grammar(object):
2978 + def __init__(self,terminals):
2979 + self.Productions = [None] # A list of all of the productions. The first
2980 + # entry is always reserved for the purpose of
2981 + # building an augmented grammar
2983 + self.Prodnames = { } # A dictionary mapping the names of nonterminals to a list of all
2984 + # productions of that nonterminal.
2986 + self.Prodmap = { } # A dictionary that is only used to detect duplicate
2989 + self.Terminals = { } # A dictionary mapping the names of terminal symbols to a
2990 + # list of the rules where they are used.
2992 + for term in terminals:
2993 + self.Terminals[term] = []
2995 + self.Terminals['error'] = []
2997 + self.Nonterminals = { } # A dictionary mapping names of nonterminals to a list
2998 + # of rule numbers where they are used.
3000 + self.First = { } # A dictionary of precomputed FIRST(x) symbols
3002 + self.Follow = { } # A dictionary of precomputed FOLLOW(x) symbols
3004 + self.Precedence = { } # Precedence rules for each terminal. Contains tuples of the
3005 + # form ('right',level) or ('nonassoc', level) or ('left',level)
3007 + self.UsedPrecedence = { } # Precedence rules that were actually used by the grammar.
3008 + # This is only used to provide error checking and to generate
3009 + # a warning about unused precedence rules.
3011 + self.Start = None # Starting symbol for the grammar
3014 + def __len__(self):
3015 + return len(self.Productions)
3017 + def __getitem__(self,index):
3018 + return self.Productions[index]
3020 + # -----------------------------------------------------------------------------
3021 + # set_precedence()
3023 + # Sets the precedence for a given terminal. assoc is the associativity such as
3024 + # 'left','right', or 'nonassoc'. level is a numeric level.
3026 + # -----------------------------------------------------------------------------
3028 + def set_precedence(self,term,assoc,level):
3029 + assert self.Productions == [None],"Must call set_precedence() before add_production()"
3030 + if term in self.Precedence:
3031 + raise GrammarError("Precedence already specified for terminal '%s'" % term)
3032 + if assoc not in ['left','right','nonassoc']:
3033 + raise GrammarError("Associativity must be one of 'left','right', or 'nonassoc'")
3034 + self.Precedence[term] = (assoc,level)
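set_precedence() is the programmatic face of the familiar user-level `precedence` table; levels grow with binding tightness. A plausible mapping (sketch; `g` is assumed to be a Grammar instance):

    precedence = (
        ('left',  'PLUS', 'MINUS'),
        ('left',  'TIMES', 'DIVIDE'),
        ('right', 'UMINUS'),
    )
    # Level 1 binds loosest, matching the table order
    for level, spec in enumerate(precedence, start=1):
        assoc, terms = spec[0], spec[1:]
        for term in terms:
            g.set_precedence(term, assoc, level)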
3036 + # -----------------------------------------------------------------------------
3037 + # add_production()
3039 + # Given an action function, this function assembles a production rule and
3040 + # computes its precedence level.
3042 + # The production rule is supplied as a list of symbols. For example,
3043 + # a rule such as 'expr : expr PLUS term' has a production name of 'expr' and
3044 + # symbols ['expr','PLUS','term'].
3046 + # Precedence is determined by the precedence of the right-most terminal
3047 + # symbol, or by the precedence of a terminal specified by %prec.
3049 + # A variety of error checks are performed to make sure production symbols
3050 + # are valid and that %prec is used correctly.
3051 + # -----------------------------------------------------------------------------
3053 + def add_production(self,prodname,syms,func=None,file='',line=0):
3055 + if prodname in self.Terminals:
3056 + raise GrammarError("%s:%d: Illegal rule name '%s'. Already defined as a token" % (file,line,prodname))
3057 + if prodname == 'error':
3058 + raise GrammarError("%s:%d: Illegal rule name '%s'. error is a reserved word" % (file,line,prodname))
3059 + if not _is_identifier.match(prodname):
3060 + raise GrammarError("%s:%d: Illegal rule name '%s'" % (file,line,prodname))
3062 + # Look for literal tokens
3063 + for n,s in enumerate(syms):
3068 + raise GrammarError("%s:%d: Literal token %s in rule '%s' may only be a single character" % (file,line,s, prodname))
3069 + if not c in self.Terminals:
3070 + self.Terminals[c] = []
3073 + except SyntaxError:
3075 + if not _is_identifier.match(s) and s != '%prec':
3076 + raise GrammarError("%s:%d: Illegal name '%s' in rule '%s'" % (file,line,s, prodname))
3078 + # Determine the precedence level
3079 + if '%prec' in syms:
3080 + if syms[-1] == '%prec':
3081 + raise GrammarError("%s:%d: Syntax error. Nothing follows %%prec" % (file,line))
3082 + if syms[-2] != '%prec':
3083 + raise GrammarError("%s:%d: Syntax error. %%prec can only appear at the end of a grammar rule" % (file,line))
3084 + precname = syms[-1]
3085 + prodprec = self.Precedence.get(precname,None)
3087 + raise GrammarError("%s:%d: Nothing known about the precedence of '%s'" % (file,line,precname))
3089 + self.UsedPrecedence[precname] = 1
3090 + del syms[-2:] # Drop %prec from the rule
3092 + # If no %prec, precedence is determined by the rightmost terminal symbol
3093 + precname = rightmost_terminal(syms,self.Terminals)
3094 + prodprec = self.Precedence.get(precname,('right',0))
3096 + # See if the rule is already in the rulemap
3097 + map = "%s -> %s" % (prodname,syms)
3098 + if map in self.Prodmap:
3099 + m = self.Prodmap[map]
3100 + raise GrammarError("%s:%d: Duplicate rule %s. " % (file,line, m) +
3101 + "Previous definition at %s:%d" % (m.file, m.line))
3103 + # From this point on, everything is valid. Create a new Production instance
3104 + pnumber = len(self.Productions)
3105 + if not prodname in self.Nonterminals:
3106 + self.Nonterminals[prodname] = [ ]
3108 + # Add the production number to Terminals and Nonterminals
3110 + if t in self.Terminals:
3111 + self.Terminals[t].append(pnumber)
3113 + if not t in self.Nonterminals:
3114 + self.Nonterminals[t] = [ ]
3115 + self.Nonterminals[t].append(pnumber)
3117 + # Create a production and add it to the list of productions
3118 + p = Production(pnumber,prodname,syms,prodprec,func,file,line)
3119 + self.Productions.append(p)
3120 + self.Prodmap[map] = p
3122 + # Add to the global productions list
3124 - p.lrafter = Prodnames[p.prod[n+1]]
3125 - except (IndexError,KeyError),e:
3128 - p.lrbefore = p.prod[n-1]
3129 - except IndexError:
3134 -class MiniProduction:
3137 -# regex matching identifiers
3138 -_is_identifier = re.compile(r'^[a-zA-Z0-9_-]+$')
3140 -# -----------------------------------------------------------------------------
3143 -# Given an action function, this function assembles a production rule.
3144 -# The production rule is assumed to be found in the function's docstring.
3145 -# This rule has the general syntax:
3147 -# name1 ::= production1
3152 -# name2 ::= production1
3155 -# -----------------------------------------------------------------------------
3157 -def add_production(f,file,line,prodname,syms):
3159 - if Terminals.has_key(prodname):
3160 - sys.stderr.write("%s:%d: Illegal rule name '%s'. Already defined as a token.\n" % (file,line,prodname))
3162 - if prodname == 'error':
3163 - sys.stderr.write("%s:%d: Illegal rule name '%s'. error is a reserved word.\n" % (file,line,prodname))
3166 - if not _is_identifier.match(prodname):
3167 - sys.stderr.write("%s:%d: Illegal rule name '%s'\n" % (file,line,prodname))
3170 - for x in range(len(syms)):
3176 - sys.stderr.write("%s:%d: Literal token %s in rule '%s' may only be a single character\n" % (file,line,s, prodname))
3178 - if not Terminals.has_key(c):
3182 - except SyntaxError:
3184 - if not _is_identifier.match(s) and s != '%prec':
3185 - sys.stderr.write("%s:%d: Illegal name '%s' in rule '%s'\n" % (file,line,s, prodname))
3188 - # See if the rule is already in the rulemap
3189 - map = "%s -> %s" % (prodname,syms)
3190 - if Prodmap.has_key(map):
3192 - sys.stderr.write("%s:%d: Duplicate rule %s.\n" % (file,line, m))
3193 - sys.stderr.write("%s:%d: Previous definition at %s:%d\n" % (file,line, m.file, m.line))
3202 - p.number = len(Productions)
3205 - Productions.append(p)
3207 - if not Nonterminals.has_key(prodname):
3208 - Nonterminals[prodname] = [ ]
3210 - # Add all terminals to Terminals
3212 - while i < len(p.prod):
3216 - precname = p.prod[i+1]
3217 - except IndexError:
3218 - sys.stderr.write("%s:%d: Syntax error. Nothing follows %%prec.\n" % (p.file,p.line))
3221 - prec = Precedence.get(precname,None)
3223 - sys.stderr.write("%s:%d: Nothing known about the precedence of '%s'\n" % (p.file,p.line,precname))
3227 - UsedPrecedence[precname] = 1
3232 - if Terminals.has_key(t):
3233 - Terminals[t].append(p.number)
3234 - # Is a terminal. We'll assign a precedence to p based on this
3235 - if not hasattr(p,"prec"):
3236 - p.prec = Precedence.get(t,('right',0))
3238 - if not Nonterminals.has_key(t):
3239 - Nonterminals[t] = [ ]
3240 - Nonterminals[t].append(p.number)
3243 - if not hasattr(p,"prec"):
3244 - p.prec = ('right',0)
3246 - # Set final length of productions
3247 - p.len = len(p.prod)
3248 - p.prod = tuple(p.prod)
3250 - # Calculate unique syms in the production
3253 - if s not in p.usyms:
3256 - # Add to the global productions list
3258 - Prodnames[p.name].append(p)
3260 - Prodnames[p.name] = [ p ]
3263 -# Given a raw rule function, this function rips out its doc string
3264 -# and adds rules to the grammar
3266 -def add_function(f):
3267 - line = f.func_code.co_firstlineno
3268 - file = f.func_code.co_filename
3271 - if isinstance(f,types.MethodType):
3276 - if f.func_code.co_argcount > reqdargs:
3277 - sys.stderr.write("%s:%d: Rule '%s' has too many arguments.\n" % (file,line,f.__name__))
3280 - if f.func_code.co_argcount < reqdargs:
3281 - sys.stderr.write("%s:%d: Rule '%s' requires an argument.\n" % (file,line,f.__name__))
3285 - # Split the doc string into lines
3286 - pstrings = f.__doc__.splitlines()
3289 - for ps in pstrings:
3292 + self.Prodnames[prodname].append(p)
3294 + self.Prodnames[prodname] = [ p ]
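Taken together, the new Grammar methods allow a grammar to be assembled programmatically rather than reflected from docstrings. A minimal sketch using only names defined in this patch (set_start() appears just below):

    # Note: set_precedence() must be called before add_production()
    g = Grammar(['NUMBER', 'PLUS'])
    g.set_precedence('PLUS', 'left', 1)
    g.add_production('expr', ['expr', 'PLUS', 'expr'])
    g.add_production('expr', ['NUMBER'])
    g.set_start('expr')    # creates the augmented rule S' -> expr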
3297 + # -----------------------------------------------------------------------------
3300 + # Sets the starting symbol and creates the augmented grammar. Production
3301 + # rule 0 is S' -> start where start is the start symbol.
3302 + # -----------------------------------------------------------------------------
3304 + def set_start(self,start=None):
3306 + start = self.Productions[1].name
3307 + if start not in self.Nonterminals:
3308 + raise GrammarError("start symbol %s undefined" % start)
3309 + self.Productions[0] = Production(0,"S'",[start])
3310 + self.Nonterminals[start].append(0)
3311 + self.Start = start
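# A standalone sketch (not part of the patch) of the augmentation step above:
# rule 0 becomes S' -> start, with start defaulting to the LHS of rule 1.
# Toy data and hypothetical names, purely illustrative:
productions = [None, ('expr', ['expr', 'PLUS', 'term'])]
start = productions[1][0]               # default start symbol
productions[0] = ("S'", [start])        # augmented rule 0: S' -> expr
assert productions[0] == ("S'", ['expr'])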
3313 + # -----------------------------------------------------------------------------
3314 + # find_unreachable()
3316 + # Find all of the nonterminal symbols that can't be reached from the starting
3317 + # symbol. Returns a list of nonterminals that can't be reached.
3318 + # -----------------------------------------------------------------------------
3320 + def find_unreachable(self):
3322 + # Mark all symbols that are reachable from a symbol s
3323 + def mark_reachable_from(s):
3325 + # We've already reached symbol s.
3328 + for p in self.Prodnames.get(s,[]):
3330 + mark_reachable_from(r)
3333 + for s in list(self.Terminals) + list(self.Nonterminals):
3336 + mark_reachable_from( self.Productions[0].prod[0] )
3338 + return [s for s in list(self.Nonterminals)
3339 + if not reachable[s]]
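# A standalone sketch (not part of the patch) of the reachability walk above,
# reduced to a depth-first marking over a toy Prodnames-style mapping
# (hypothetical grammar; 'B' is deliberately unreachable):
prodnames = {'S': [['A', 'x']], 'A': [['y']], 'B': [['z']]}
reachable = set()
def mark_reachable_from(sym):
    if sym in reachable:
        return                          # already visited
    reachable.add(sym)
    for rhs in prodnames.get(sym, []):  # terminals have no productions
        for r in rhs:
            mark_reachable_from(r)
mark_reachable_from('S')
assert [n for n in prodnames if n not in reachable] == ['B']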
3341 + # -----------------------------------------------------------------------------
3342 + # infinite_cycles()
3344 + # This function looks at the various parsing rules and tries to detect
3345 + # infinite recursion cycles (grammar rules where there is no possible way
3346 + # to derive a string of only terminals).
3347 + # -----------------------------------------------------------------------------
3349 + def infinite_cycles(self):
3353 + for t in self.Terminals:
3356 + terminates['$end'] = 1
3360 + # Initialize to false:
3361 + for n in self.Nonterminals:
3364 + # Then propagate termination until no change:
3367 + for (n,pl) in self.Prodnames.items():
3368 + # Nonterminal n terminates iff any of its productions terminates.
3370 + # Production p terminates iff all of its rhs symbols terminate.
3372 + if not terminates[s]:
3373 + # The symbol s does not terminate,
3374 + # so production p does not terminate.
3378 + # didn't break from the loop,
3379 + # so every symbol s terminates
3380 + # so production p terminates.
3384 + # symbol n terminates!
3385 + if not terminates[n]:
3388 + # Don't need to consider any more productions for this n.
3391 + if not some_change:
3395 + for (s,term) in terminates.items():
3397 + if not s in self.Prodnames and not s in self.Terminals and s != 'error':
3398 + # s is used-but-not-defined, and we've already warned of that,
3399 + # so it would be overkill to say that it's also non-terminating.
3402 + infinite.append(s)
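# A standalone sketch (not part of the patch) of the termination fixpoint: a
# nonterminal terminates iff some production terminates, and a production
# terminates iff every RHS symbol does.  Toy grammar; 'L' only derives itself:
terminals = {'a'}
prods = {'S': [['a']], 'L': [['L', 'a']]}
terminates = {t: True for t in terminals}
terminates.update((n, False) for n in prods)
changed = True
while changed:
    changed = False
    for n, alts in prods.items():
        if not terminates[n] and any(all(terminates[s] for s in rhs)
                                     for rhs in alts):
            terminates[n] = changed = True
assert [n for n in prods if not terminates[n]] == ['L']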
3407 + # -----------------------------------------------------------------------------
3408 + # undefined_symbols()
3410 + # Find all symbols that were used in the grammar, but not defined as tokens or
3411 + # grammar rules. Returns a list of tuples (sym, prod) where sym is the symbol
3412 + # and prod is the production where the symbol was used.
3413 + # -----------------------------------------------------------------------------
3414 + def undefined_symbols(self):
3416 + for p in self.Productions:
3420 - # This is a continuation of a previous rule
3422 - sys.stderr.write("%s:%d: Misplaced '|'.\n" % (file,dline))
3431 + if not s in self.Prodnames and not s in self.Terminals and s != 'error':
3432 + result.append((s,p))
3435 + # -----------------------------------------------------------------------------
3436 + # unused_terminals()
3438 + # Find all terminals that were defined, but not used by the grammar. Returns
3439 + # a list of the unused terminal symbols.
3440 + # -----------------------------------------------------------------------------
3441 + def unused_terminals(self):
3443 + for s,v in self.Terminals.items():
3444 + if s != 'error' and not v:
3445 + unused_tok.append(s)
3449 + # ------------------------------------------------------------------------------
3452 + # Find all grammar rules that were defined, but not used (maybe not reachable)
3453 + # Returns a list of productions.
3454 + # ------------------------------------------------------------------------------
3456 + def unused_rules(self):
3458 + for s,v in self.Nonterminals.items():
3460 + p = self.Prodnames[s][0]
3461 + unused_prod.append(p)
3462 + return unused_prod
3464 + # -----------------------------------------------------------------------------
3465 + # unused_precedence()
3467 + # Returns a list of tuples (term,precedence) corresponding to precedence
3468 + # rules that were never used by the grammar. term is the name of the terminal
3469 + # on which precedence was applied and precedence is a string such as 'left' or
3470 + # 'right' corresponding to the type of precedence.
3471 + # -----------------------------------------------------------------------------
3473 + def unused_precedence(self):
3475 + for termname in self.Precedence:
3476 + if not (termname in self.Terminals or termname in self.UsedPrecedence):
3477 + unused.append((termname,self.Precedence[termname][0]))
3481 + # -------------------------------------------------------------------------
3484 + # Compute the value of FIRST1(beta) where beta is a tuple of symbols.
3486 + # During execution of compute_first(), the result may be incomplete.
3487 + # Afterward (e.g., when called from compute_follow()), it will be complete.
3488 + # -------------------------------------------------------------------------
3489 + def _first(self,beta):
3491 + # We are computing First(x1,x2,x3,...,xn)
3494 + x_produces_empty = 0
3496 + # Add all the non-<empty> symbols of First[x] to the result.
3497 + for f in self.First[x]:
3498 + if f == '<empty>':
3499 + x_produces_empty = 1
3508 - if assign != ':' and assign != '::=':
3509 - sys.stderr.write("%s:%d: Syntax error. Expected ':'\n" % (file,dline))
3513 - e = add_production(f,file,dline,prodname,syms)
3517 - except StandardError:
3518 - sys.stderr.write("%s:%d: Syntax error in rule '%s'\n" % (file,dline,ps))
3521 - sys.stderr.write("%s:%d: No documentation string specified in function '%s'\n" % (file,line,f.__name__))
3525 -# Cycle checking code (Michael Dyck)
3527 -def compute_reachable():
3529 - Find each symbol that can be reached from the start symbol.
3530 - Print a warning for any nonterminals that can't be reached.
3531 - (Unused terminals have already had their warning.)
3534 - for s in Terminals.keys() + Nonterminals.keys():
3537 - mark_reachable_from( Productions[0].prod[0], Reachable )
3539 - for s in Nonterminals.keys():
3540 - if not Reachable[s]:
3541 - sys.stderr.write("yacc: Symbol '%s' is unreachable.\n" % s)
3543 -def mark_reachable_from(s, Reachable):
3545 - Mark all symbols that are reachable from symbol s.
3548 - # We've already reached symbol s.
3551 - for p in Prodnames.get(s,[]):
3553 - mark_reachable_from(r, Reachable)
3555 -# -----------------------------------------------------------------------------
3556 -# compute_terminates()
3558 -# This function looks at the various parsing rules and tries to detect
3559 -# infinite recursion cycles (grammar rules where there is no possible way
3560 -# to derive a string of only terminals).
3561 -# -----------------------------------------------------------------------------
3562 -def compute_terminates():
3564 - Raise an error for any symbols that don't terminate.
3569 - for t in Terminals.keys():
3572 - Terminates['$end'] = 1
3576 - # Initialize to false:
3577 - for n in Nonterminals.keys():
3580 - # Then propagate termination until no change:
3583 - for (n,pl) in Prodnames.items():
3584 - # Nonterminal n terminates iff any of its productions terminates.
3586 - # Production p terminates iff all of its rhs symbols terminate.
3588 - if not Terminates[s]:
3589 - # The symbol s does not terminate,
3590 - # so production p does not terminate.
3594 - # didn't break from the loop,
3595 - # so every symbol s terminates
3596 - # so production p terminates.
3600 - # symbol n terminates!
3601 - if not Terminates[n]:
3604 - # Don't need to consider any more productions for this n.
3607 - if not some_change:
3611 - for (s,terminates) in Terminates.items():
3612 - if not terminates:
3613 - if not Prodnames.has_key(s) and not Terminals.has_key(s) and s != 'error':
3614 - # s is used-but-not-defined, and we've already warned of that,
3615 - # so it would be overkill to say that it's also non-terminating.
3616 + if f not in result: result.append(f)
3618 + if x_produces_empty:
3619 + # We have to consider the next x in beta,
3620 + # i.e. stay in the loop.
3623 - sys.stderr.write("yacc: Infinite recursion detected for symbol '%s'.\n" % s)
3627 + # We don't have to consider any further symbols in beta.
3630 + # There was no 'break' from the loop,
3631 + # so x_produces_empty was true for all x in beta,
3632 + # so beta produces empty as well.
3633 + result.append('<empty>')
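# A standalone sketch (not part of the patch) of FIRST over a symbol string.
# With the hypothetical sets First(A) = {a, <empty>} and First(b) = {b},
# FIRST(A b) = {a, b}: A can vanish, so b contributes, but b cannot vanish,
# so '<empty>' does not survive:
first_sets = {'A': ['a', '<empty>'], 'b': ['b']}
def first(beta):
    result = []
    for x in beta:
        x_can_vanish = False
        for f in first_sets[x]:
            if f == '<empty>':
                x_can_vanish = True
            elif f not in result:
                result.append(f)
        if not x_can_vanish:
            break                       # later symbols are shadowed
    else:
        result.append('<empty>')        # every symbol could vanish
    return result
assert first(['A', 'b']) == ['a', 'b']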
3637 + # -------------------------------------------------------------------------
3640 + # Compute the value of FIRST1(X) for all symbols
3641 + # -------------------------------------------------------------------------
3642 + def compute_first(self):
3647 + for t in self.Terminals:
3648 + self.First[t] = [t]
3650 + self.First['$end'] = ['$end']
3654 + # Initialize to the empty set:
3655 + for n in self.Nonterminals:
3656 + self.First[n] = []
3658 + # Then propagate symbols until no change:
3661 + for n in self.Nonterminals:
3662 + for p in self.Prodnames[n]:
3663 + for f in self._first(p.prod):
3664 + if f not in self.First[n]:
3665 + self.First[n].append( f )
3667 + if not some_change:
3672 + # ---------------------------------------------------------------------
3673 + # compute_follow()
3675 + # Computes all of the follow sets for every non-terminal symbol. The
3676 + # follow set is the set of all symbols that might follow a given
3677 + # non-terminal. See the Dragon book, 2nd Ed. p. 189.
3678 + # ---------------------------------------------------------------------
3679 + def compute_follow(self,start=None):
3680 + # If already computed, return the result
3682 + return self.Follow
3684 + # If first sets not computed yet, do that first.
3685 + if not self.First:
3686 + self.compute_first()
3688 + # Add '$end' to the follow list of the start symbol
3689 + for k in self.Nonterminals:
3690 + self.Follow[k] = [ ]
3693 + start = self.Productions[1].name
3695 + self.Follow[start] = [ '$end' ]
3699 + for p in self.Productions[1:]:
3700 + # Here is the production set
3701 + for i in range(len(p.prod)):
3703 + if B in self.Nonterminals:
3704 + # Okay. We got a non-terminal in a production
3705 + fst = self._first(p.prod[i+1:])
3708 + if f != '<empty>' and f not in self.Follow[B]:
3709 + self.Follow[B].append(f)
3711 + if f == '<empty>':
3713 + if hasempty or i == (len(p.prod)-1):
3714 + # Add elements of follow(a) to follow(b)
3715 + for f in self.Follow[p.name]:
3716 + if f not in self.Follow[B]:
3717 + self.Follow[B].append(f)
3719 + if not didadd: break
3720 + return self.Follow
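# A standalone sketch (not part of the patch) of the core FOLLOW update on
# the toy grammar S -> A b, A -> a: any terminal that can appear right after
# a nonterminal in some production body lands in its follow set.  This
# miniature skips the '<empty>' and end-of-production propagation that the
# full compute_follow() performs:
follow = {'S': ['$end'], 'A': []}
for name, rhs in [('S', ['A', 'b']), ('A', ['a'])]:
    for i, sym in enumerate(rhs[:-1]):
        if sym in follow and rhs[i + 1] not in follow[sym]:
            follow[sym].append(rhs[i + 1])
assert follow == {'S': ['$end'], 'A': ['b']}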
3723 + # -----------------------------------------------------------------------------
3726 + # This function walks the list of productions and builds a complete set of the
3727 + # LR items. The LR items are stored in two ways: First, they are uniquely
3728 + # numbered and placed in the list _lritems. Second, a linked list of LR items
3729 + # is built for each production. For example:
3733 + # Creates the list
3735 + # [E -> . E PLUS E, E -> E . PLUS E, E -> E PLUS . E, E -> E PLUS E . ]
3736 + # -----------------------------------------------------------------------------
3738 + def build_lritems(self):
3739 + for p in self.Productions:
3748 + # Precompute the list of productions immediately following
3750 + lri.lr_after = self.Prodnames[lri.prod[i+1]]
3751 + except (IndexError,KeyError):
3754 + lri.lr_before = lri.prod[i-1]
3755 + except IndexError:
3756 + lri.lr_before = None
3758 + lastlri.lr_next = lri
3760 + lr_items.append(lri)
3763 + p.lr_items = lr_items
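# A standalone sketch (not part of the patch) of the dotted-item chain for a
# single production, using plain tuples in place of Production/LRItem objects:
prod = ('E', ['E', 'PLUS', 'E'])
items = [(prod[0], prod[1][:i] + ['.'] + prod[1][i:])
         for i in range(len(prod[1]) + 1)]
# items[0] is E -> . E PLUS E  ...  items[-1] is E -> E PLUS E .
assert items[1] == ('E', ['E', '.', 'PLUS', 'E'])
assert len(items) == len(prod[1]) + 1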
3765 # -----------------------------------------------------------------------------
3766 -# verify_productions()
3767 +# == Class LRTable ==
3769 -# This function examines all of the supplied rules to see if they seem valid.
3770 +# This class represents a basic table of LR parsing information.
3771 +# Methods for generating the tables are not defined here. They are defined
3772 +# in the derived class LRGeneratedTable.
3773 # -----------------------------------------------------------------------------
3774 -def verify_productions(cycle_check=1):
3776 - for p in Productions:
3777 - if not p: continue
3780 - if not Prodnames.has_key(s) and not Terminals.has_key(s) and s != 'error':
3781 - sys.stderr.write("%s:%d: Symbol '%s' used, but not defined as a token or a rule.\n" % (p.file,p.line,s))
3786 - # Now verify all of the tokens
3788 - _vf.write("Unused terminals:\n\n")
3789 - for s,v in Terminals.items():
3790 - if s != 'error' and not v:
3791 - sys.stderr.write("yacc: Warning. Token '%s' defined, but not used.\n" % s)
3792 - if yaccdebug: _vf.write(" %s\n"% s)
3795 - # Print out all of the productions
3797 - _vf.write("\nGrammar\n\n")
3798 - for i in range(1,len(Productions)):
3799 - _vf.write("Rule %-5d %s\n" % (i, Productions[i]))
3802 - # Verify the use of all productions
3803 - for s,v in Nonterminals.items():
3805 - p = Prodnames[s][0]
3806 - sys.stderr.write("%s:%d: Warning. Rule '%s' defined, but not used.\n" % (p.file,p.line, s))
3810 - if unused_tok == 1:
3811 - sys.stderr.write("yacc: Warning. There is 1 unused token.\n")
3812 - if unused_tok > 1:
3813 - sys.stderr.write("yacc: Warning. There are %d unused tokens.\n" % unused_tok)
3815 - if unused_prod == 1:
3816 - sys.stderr.write("yacc: Warning. There is 1 unused rule.\n")
3817 - if unused_prod > 1:
3818 - sys.stderr.write("yacc: Warning. There are %d unused rules.\n" % unused_prod)
3821 - _vf.write("\nTerminals, with rules where they appear\n\n")
3822 - ks = Terminals.keys()
3825 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Terminals[k]])))
3826 - _vf.write("\nNonterminals, with rules where they appear\n\n")
3827 - ks = Nonterminals.keys()
3830 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Nonterminals[k]])))
3833 - compute_reachable()
3834 - error += compute_terminates()
3835 -# error += check_cycles()
3839 +class VersionError(YaccError): pass
3841 +class LRTable(object):
3842 + def __init__(self):
3843 + self.lr_action = None
3844 + self.lr_goto = None
3845 + self.lr_productions = None
3846 + self.lr_method = None
3848 + def read_table(self,module):
3849 + if isinstance(module,types.ModuleType):
3852 + if sys.version_info[0] < 3:
3853 + exec("import %s as parsetab" % module)
3856 + exec("import %s as parsetab" % module, env, env)
3857 + parsetab = env['parsetab']
3859 + if parsetab._tabversion != __tabversion__:
3860 + raise VersionError("yacc table file version is out of date")
3862 + self.lr_action = parsetab._lr_action
3863 + self.lr_goto = parsetab._lr_goto
3865 + self.lr_productions = []
3866 + for p in parsetab._lr_productions:
3867 + self.lr_productions.append(MiniProduction(*p))
3869 + self.lr_method = parsetab._lr_method
3870 + return parsetab._lr_signature
3872 + def read_pickle(self,filename):
3874 + import cPickle as pickle
3875 + except ImportError:
3878 + in_f = open(filename,"rb")
3880 + tabversion = pickle.load(in_f)
3881 + if tabversion != __tabversion__:
3882 + raise VersionError("yacc table file version is out of date")
3883 + self.lr_method = pickle.load(in_f)
3884 + signature = pickle.load(in_f)
3885 + self.lr_action = pickle.load(in_f)
3886 + self.lr_goto = pickle.load(in_f)
3887 + productions = pickle.load(in_f)
3889 + self.lr_productions = []
3890 + for p in productions:
3891 + self.lr_productions.append(MiniProduction(*p))
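# A standalone sketch (not part of the patch) of the stream layout that
# read_pickle() consumes: version, method, signature, action table, goto
# table, and productions are pickled back to back.  Toy values only; the
# real fields are emitted by the table writer:
import io, pickle
buf = io.BytesIO()
for field in ('0.0', 'LALR', 'sig', {}, {}, []):
    pickle.dump(field, buf)
buf.seek(0)
assert pickle.load(buf) == '0.0'        # the version check reads this first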
3896 + # Bind all production function names to callable objects in pdict
3897 + def bind_callables(self,pdict):
3898 + for p in self.lr_productions:
3901 # -----------------------------------------------------------------------------
3903 +# === LR Generator ===
3905 -# This function walks the list of productions and builds a complete set of the
3906 -# LR items. The LR items are stored in two ways: First, they are uniquely
3907 -# numbered and placed in the list _lritems. Second, a linked list of LR items
3908 -# is built for each production. For example:
3914 -# [E -> . E PLUS E, E -> E . PLUS E, E -> E PLUS . E, E -> E PLUS E . ]
3915 +# The following classes and functions are used to generate LR parsing tables on
3917 # -----------------------------------------------------------------------------
3919 -def build_lritems():
3920 - for p in Productions:
3922 - lri = p.lr_item(0)
3925 - lri = p.lr_item(i)
3926 - lastlri.lr_next = lri
3928 - lri.lr_num = len(LRitems)
3929 - LRitems.append(lri)
3933 - # In order for the rest of the parser generator to work, we need to
3934 - # guarantee that no more lritems are generated. Therefore, we nuke
3935 - # the p.lr_item method. (Only used in debugging)
3936 - # Production.lr_item = None
3938 -# -----------------------------------------------------------------------------
3941 -# Given a list of precedence rules, add to the precedence table.
3942 -# -----------------------------------------------------------------------------
3944 -def add_precedence(plist):
3952 - if prec != 'left' and prec != 'right' and prec != 'nonassoc':
3953 - sys.stderr.write("yacc: Invalid precedence '%s'\n" % prec)
3956 - if Precedence.has_key(t):
3957 - sys.stderr.write("yacc: Precedence already specified for terminal '%s'\n" % t)
3960 - Precedence[t] = (prec,plevel)
3962 - sys.stderr.write("yacc: Invalid precedence table.\n")
3967 -# -----------------------------------------------------------------------------
3968 -# check_precedence()
3970 -# Checks the use of the Precedence tables. This makes sure all of the symbols
3971 -# are terminals or were used with %prec
3972 -# -----------------------------------------------------------------------------
3974 -def check_precedence():
3976 - for precname in Precedence.keys():
3977 - if not (Terminals.has_key(precname) or UsedPrecedence.has_key(precname)):
3978 - sys.stderr.write("yacc: Precedence rule '%s' defined for unknown symbol '%s'\n" % (Precedence[precname][0],precname))
3982 -# -----------------------------------------------------------------------------
3983 -# augment_grammar()
3985 -# Compute the augmented grammar. This is just a rule S' -> start where start
3986 -# is the starting symbol.
3987 -# -----------------------------------------------------------------------------
3989 -def augment_grammar(start=None):
3991 - start = Productions[1].name
3992 - Productions[0] = Production(name="S'",prod=[start],number=0,len=1,prec=('right',0),func=None)
3993 - Productions[0].usyms = [ start ]
3994 - Nonterminals[start].append(0)
3997 -# -------------------------------------------------------------------------
4000 -# Compute the value of FIRST1(beta) where beta is a tuple of symbols.
4002 -# During execution of compute_first1, the result may be incomplete.
4003 -# Afterward (e.g., when called from compute_follow()), it will be complete.
4004 -# -------------------------------------------------------------------------
4007 - # We are computing First(x1,x2,x3,...,xn)
4010 - x_produces_empty = 0
4012 - # Add all the non-<empty> symbols of First[x] to the result.
4013 - for f in First[x]:
4014 - if f == '<empty>':
4015 - x_produces_empty = 1
4017 - if f not in result: result.append(f)
4019 - if x_produces_empty:
4020 - # We have to consider the next x in beta,
4021 - # i.e. stay in the loop.
4024 - # We don't have to consider any further symbols in beta.
4027 - # There was no 'break' from the loop,
4028 - # so x_produces_empty was true for all x in beta,
4029 - # so beta produces empty as well.
4030 - result.append('<empty>')
4036 -# Given a non-terminal. This function computes the set of all symbols
4037 -# that might follow it. Dragon book, p. 189.
4039 -def compute_follow(start=None):
4040 - # Add '$end' to the follow list of the start symbol
4041 - for k in Nonterminals.keys():
4045 - start = Productions[1].name
4047 - Follow[start] = [ '$end' ]
4051 - for p in Productions[1:]:
4052 - # Here is the production set
4053 - for i in range(len(p.prod)):
4055 - if Nonterminals.has_key(B):
4056 - # Okay. We got a non-terminal in a production
4057 - fst = first(p.prod[i+1:])
4060 - if f != '<empty>' and f not in Follow[B]:
4061 - Follow[B].append(f)
4063 - if f == '<empty>':
4065 - if hasempty or i == (len(p.prod)-1):
4066 - # Add elements of follow(a) to follow(b)
4067 - for f in Follow[p.name]:
4068 - if f not in Follow[B]:
4069 - Follow[B].append(f)
4071 - if not didadd: break
4073 - if 0 and yaccdebug:
4074 - _vf.write('\nFollow:\n')
4075 - for k in Nonterminals.keys():
4076 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Follow[k]])))
4078 -# -------------------------------------------------------------------------
4081 -# Compute the value of FIRST1(X) for all symbols
4082 -# -------------------------------------------------------------------------
4083 -def compute_first1():
4086 - for t in Terminals.keys():
4089 - First['$end'] = ['$end']
4090 - First['#'] = ['#'] # what's this for?
4094 - # Initialize to the empty set:
4095 - for n in Nonterminals.keys():
4098 - # Then propagate symbols until no change:
4101 - for n in Nonterminals.keys():
4102 - for p in Prodnames[n]:
4103 - for f in first(p.prod):
4104 - if f not in First[n]:
4105 - First[n].append( f )
4107 - if not some_change:
4110 - if 0 and yaccdebug:
4111 - _vf.write('\nFirst:\n')
4112 - for k in Nonterminals.keys():
4113 - _vf.write("%-20s : %s\n" %
4114 - (k, " ".join([str(s) for s in First[k]])))
4116 -# -----------------------------------------------------------------------------
4117 -# === SLR Generation ===
4119 -# The following functions are used to construct SLR (Simple LR) parsing tables
4120 -# as described on p.221-229 of the dragon book.
4121 -# -----------------------------------------------------------------------------
4123 -# Global variables for the LR parsing engine
4124 -def lr_init_vars():
4125 - global _lr_action, _lr_goto, _lr_method
4126 - global _lr_goto_cache, _lr0_cidhash
4128 - _lr_action = { } # Action table
4129 - _lr_goto = { } # Goto table
4130 - _lr_method = "Unknown" # LR method used
4131 - _lr_goto_cache = { }
4132 - _lr0_cidhash = { }
4135 -# Compute the LR(0) closure operation on I, where I is a set of LR(0) items.
4136 -# prodlist is a list of productions.
4138 -_add_count = 0 # Counter used to detect cycles
4140 -def lr0_closure(I):
4144 - prodlist = Productions
4146 - # Add everything in I to J
4152 - for x in j.lrafter:
4153 - if x.lr0_added == _add_count: continue
4154 - # Add B --> .G to J
4155 - J.append(x.lr_next)
4156 - x.lr0_added = _add_count
4161 -# Compute the LR(0) goto function goto(I,X) where I is a set
4162 -# of LR(0) items and X is a grammar symbol. This function is written
4163 -# in a way that guarantees uniqueness of the generated goto sets
4164 -# (i.e. the same goto set will never be returned as two different Python
4165 -# objects). With uniqueness, we can later do fast set comparisons using
4166 -# id(obj) instead of element-wise comparison.
4169 - # First we look for a previously cached entry
4170 - g = _lr_goto_cache.get((id(I),x),None)
4173 - # Now we generate the goto set in a way that guarantees uniqueness
4176 - s = _lr_goto_cache.get(x,None)
4179 - _lr_goto_cache[x] = s
4184 - if n and n.lrbefore == x:
4185 - s1 = s.get(id(n),None)
4191 - g = s.get('$end',None)
4194 - g = lr0_closure(gs)
4198 - _lr_goto_cache[(id(I),x)] = g
4203 -# Compute the LR(0) sets of item function
4206 - C = [ lr0_closure([Productions[0].lr_next]) ]
4209 - _lr0_cidhash[id(I)] = i
4212 - # Loop over the items in C and each grammar symbols
4218 - # Collect all of the symbols that could possibly be in the goto(I,X) sets
4221 - for s in ii.usyms:
4224 - for x in asyms.keys():
4226 - if not g: continue
4227 - if _lr0_cidhash.has_key(id(g)): continue
4228 - _lr0_cidhash[id(g)] = len(C)
4233 -# -----------------------------------------------------------------------------
4234 -# ==== LALR(1) Parsing ====
4236 -# LALR(1) parsing is almost exactly the same as SLR except that instead of
4237 -# relying upon Follow() sets when performing reductions, a more selective
4238 -# lookahead set that incorporates the state of the LR(0) machine is utilized.
4239 -# Thus, we mainly just have to focus on calculating the lookahead sets.
4241 -# The method used here is due to DeRemer and Pennelo (1982).
4243 -# DeRemer, F. L., and T. J. Pennelo: "Efficient Computation of LALR(1)
4244 -# Lookahead Sets", ACM Transactions on Programming Languages and Systems,
4245 -# Vol. 4, No. 4, Oct. 1982, pp. 615-649
4247 -# Further details can also be found in:
4249 -# J. Tremblay and P. Sorenson, "The Theory and Practice of Compiler Writing",
4250 -# McGraw-Hill Book Company, (1985).
4252 -# Note: This implementation is a complete replacement of the LALR(1)
4253 -# implementation in PLY-1.x releases. That version was based on
4254 -# a less efficient algorithm and it had bugs in its implementation.
4255 -# -----------------------------------------------------------------------------
4257 -# -----------------------------------------------------------------------------
4258 -# compute_nullable_nonterminals()
4260 -# Creates a dictionary containing all of the non-terminals that might produce
4261 -# an empty production.
4262 -# -----------------------------------------------------------------------------
4264 -def compute_nullable_nonterminals():
4268 - for p in Productions[1:]:
4270 - nullable[p.name] = 1
4273 - if not nullable.has_key(t): break
4275 - nullable[p.name] = 1
4276 - if len(nullable) == num_nullable: break
4277 - num_nullable = len(nullable)
4280 -# -----------------------------------------------------------------------------
4281 -# find_nonterminal_trans(C)
4283 -# Given a set of LR(0) items, this functions finds all of the non-terminal
4284 -# transitions. These are transitions in which a dot appears immediately before
4285 -# a non-terminal. Returns a list of tuples of the form (state,N) where state
4286 -# is the state number and N is the nonterminal symbol.
4288 -# The input C is the set of LR(0) items.
4289 -# -----------------------------------------------------------------------------
4291 -def find_nonterminal_transitions(C):
4293 - for state in range(len(C)):
4294 - for p in C[state]:
4295 - if p.lr_index < p.len - 1:
4296 - t = (state,p.prod[p.lr_index+1])
4297 - if Nonterminals.has_key(t[1]):
4298 - if t not in trans: trans.append(t)
4302 -# -----------------------------------------------------------------------------
4305 -# Computes the DR(p,A) relationships for non-terminal transitions. The input
4306 -# is a tuple (state,N) where state is a number and N is a nonterminal symbol.
4308 -# Returns a list of terminals.
4309 -# -----------------------------------------------------------------------------
4311 -def dr_relation(C,trans,nullable):
4316 - g = lr0_goto(C[state],N)
4318 - if p.lr_index < p.len - 1:
4319 - a = p.prod[p.lr_index+1]
4320 - if Terminals.has_key(a):
4321 - if a not in terms: terms.append(a)
4323 - # This extra bit is to handle the start state
4324 - if state == 0 and N == Productions[0].prod[0]:
4325 - terms.append('$end')
4329 -# -----------------------------------------------------------------------------
4332 -# Computes the READS() relation (p,A) READS (t,C).
4333 -# -----------------------------------------------------------------------------
4335 -def reads_relation(C, trans, empty):
4336 - # Look for empty transitions
4340 - g = lr0_goto(C[state],N)
4341 - j = _lr0_cidhash.get(id(g),-1)
4343 - if p.lr_index < p.len - 1:
4344 - a = p.prod[p.lr_index + 1]
4345 - if empty.has_key(a):
4350 -# -----------------------------------------------------------------------------
4351 -# compute_lookback_includes()
4353 -# Determines the lookback and includes relations
4357 -# This relation is determined by running the LR(0) state machine forward.
4358 -# For example, starting with a production "N : . A B C", we run it forward
4359 -# to obtain "N : A B C ." We then build a relationship between this final
4360 -# state and the starting state. These relationships are stored in a dictionary
4365 -# Computes the INCLUDE() relation (p,A) INCLUDES (p',B).
4367 -# This relation is used to determine non-terminal transitions that occur
4368 -# inside of other non-terminal transition states. (p,A) INCLUDES (p', B)
4369 -# if the following holds:
4371 -# B -> LAT, where T -> epsilon and p' -L-> p
4373 -# L is essentially a prefix (which may be empty), T is a suffix that must be
4374 -# able to derive an empty string. State p' must lead to state p with the string L.
4376 -# -----------------------------------------------------------------------------
4378 -def compute_lookback_includes(C,trans,nullable):
4380 - lookdict = {} # Dictionary of lookback relations
4381 - includedict = {} # Dictionary of include relations
4383 - # Make a dictionary of non-terminal transitions
4388 - # Loop over all transitions and compute lookbacks and includes
4389 - for state,N in trans:
4392 - for p in C[state]:
4393 - if p.name != N: continue
4395 - # Okay, we have a name match. We now follow the production all the way
4396 - # through the state machine until we get the . on the right hand side
4398 - lr_index = p.lr_index
4400 - while lr_index < p.len - 1:
4401 - lr_index = lr_index + 1
4402 - t = p.prod[lr_index]
4404 - # Check to see if this symbol and state are a non-terminal transition
4405 - if dtrans.has_key((j,t)):
4406 - # Yes. Okay, there is some chance that this is an includes relation
4407 - # the only way to know for certain is whether the rest of the
4408 - # production derives empty
4412 - if Terminals.has_key(p.prod[li]): break # No forget it
4413 - if not nullable.has_key(p.prod[li]): break
4416 - # Appears to be a relation between (j,t) and (state,N)
4417 - includes.append((j,t))
4419 - g = lr0_goto(C[j],t) # Go to next set
4420 - j = _lr0_cidhash.get(id(g),-1) # Go to next state
4422 - # When we get here, j is the final state, now we have to locate the production
4424 - if r.name != p.name: continue
4425 - if r.len != p.len: continue
4427 - # This look is comparing a production ". A B C" with "A B C ."
4428 - while i < r.lr_index:
4429 - if r.prod[i] != p.prod[i+1]: break
4432 - lookb.append((j,r))
4433 - for i in includes:
4434 - if not includedict.has_key(i): includedict[i] = []
4435 - includedict[i].append((state,N))
4436 - lookdict[(state,N)] = lookb
4438 - return lookdict,includedict
4440 # -----------------------------------------------------------------------------
4444 # The following two functions are used to compute set valued functions
4447 # F(x) = F'(x) U U{F(y) | x R y}
4448 @@ -2176,720 +1914,1363 @@ def traverse(x,N,stack,F,X,R,FP):
4449 rel = R(x) # Get y's related to x
4452 traverse(y,N,stack,F,X,R,FP)
4453 N[x] = min(N[x],N[y])
4454 for a in F.get(y,[]):
4455 if a not in F[x]: F[x].append(a)
4457 - N[stack[-1]] = sys.maxint
4458 + N[stack[-1]] = MAXINT
4460 element = stack.pop()
4462 - N[stack[-1]] = sys.maxint
4463 + N[stack[-1]] = MAXINT
4465 element = stack.pop()
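# A standalone sketch (not part of the patch) of the digraph()/traverse()
# pair: it computes the smallest set-valued F with
#     F(x) = FP(x)  U  union{ F(y) | x R y }
# in one depth-first pass, collapsing strongly connected components so every
# member of a cycle shares the root's set (hypothetical relation below):
def digraph(X, R, FP):
    N = {x: 0 for x in X}
    stack, F = [], {}
    def traverse(x):
        stack.append(x)
        d = len(stack)
        N[x] = d
        F[x] = list(FP(x))
        for y in R(x):
            if N[y] == 0:
                traverse(y)
            N[x] = min(N[x], N[y])
            for a in F.get(y, []):
                if a not in F[x]:
                    F[x].append(a)
        if N[x] == d:                   # x is the root of its SCC
            while True:
                top = stack.pop()
                N[top] = float('inf')   # the MAXINT marker above
                F[top] = list(F[x])
                if top == x:
                    break
    for x in X:
        if N[x] == 0:
            traverse(x)
    return F
F = digraph(['a', 'b'], lambda x: ['b'] if x == 'a' else [],
            lambda x: [x.upper()])
assert F == {'a': ['A', 'B'], 'b': ['B']}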
4467 +class LALRError(YaccError): pass
4469 # -----------------------------------------------------------------------------
4470 -# compute_read_sets()
4471 +# == LRGeneratedTable ==
4473 -# Given a set of LR(0) items, this function computes the read sets.
4475 -# Inputs: C = Set of LR(0) items
4476 -# ntrans = Set of nonterminal transitions
4477 -# nullable = Set of empty transitions
4479 -# Returns a set containing the read sets
4480 +# This class implements the LR table generation algorithm. There are no
4481 +# public methods except for write()
4482 # -----------------------------------------------------------------------------
4484 -def compute_read_sets(C, ntrans, nullable):
4485 - FP = lambda x: dr_relation(C,x,nullable)
4486 - R = lambda x: reads_relation(C,x,nullable)
4487 - F = digraph(ntrans,R,FP)
4490 -# -----------------------------------------------------------------------------
4491 -# compute_follow_sets()
4493 -# Given a set of LR(0) items, a set of non-terminal transitions, a readset,
4494 -# and an include set, this function computes the follow sets
4496 -# Follow(p,A) = Read(p,A) U U {Follow(p',B) | (p,A) INCLUDES (p',B)}
4499 -# ntrans = Set of nonterminal transitions
4500 -# readsets = Readset (previously computed)
4501 -# inclsets = Include sets (previously computed)
4503 -# Returns a set containing the follow sets
4504 -# -----------------------------------------------------------------------------
4506 -def compute_follow_sets(ntrans,readsets,inclsets):
4507 - FP = lambda x: readsets[x]
4508 - R = lambda x: inclsets.get(x,[])
4509 - F = digraph(ntrans,R,FP)
4512 -# -----------------------------------------------------------------------------
4515 -# Attaches the lookahead symbols to grammar rules.
4517 -# Inputs: lookbacks - Set of lookback relations
4518 -# followset - Computed follow set
4520 -# This function directly attaches the lookaheads to productions contained
4521 -# in the lookbacks set
4522 -# -----------------------------------------------------------------------------
4524 -def add_lookaheads(lookbacks,followset):
4525 - for trans,lb in lookbacks.items():
4526 - # Loop over productions in lookback
4527 - for state,p in lb:
4528 - if not p.lookaheads.has_key(state):
4529 - p.lookaheads[state] = []
4530 - f = followset.get(trans,[])
4532 - if a not in p.lookaheads[state]: p.lookaheads[state].append(a)
4534 -# -----------------------------------------------------------------------------
4535 -# add_lalr_lookaheads()
4537 -# This function does all of the work of adding lookahead information for use
4538 -# with LALR parsing
4539 -# -----------------------------------------------------------------------------
4541 -def add_lalr_lookaheads(C):
4542 - # Determine all of the nullable nonterminals
4543 - nullable = compute_nullable_nonterminals()
4545 - # Find all non-terminal transitions
4546 - trans = find_nonterminal_transitions(C)
4548 - # Compute read sets
4549 - readsets = compute_read_sets(C,trans,nullable)
4551 - # Compute lookback/includes relations
4552 - lookd, included = compute_lookback_includes(C,trans,nullable)
4554 - # Compute LALR FOLLOW sets
4555 - followsets = compute_follow_sets(trans,readsets,included)
4557 - # Add all of the lookaheads
4558 - add_lookaheads(lookd,followsets)
4560 -# -----------------------------------------------------------------------------
4563 -# This function constructs the parse tables for SLR or LALR
4564 -# -----------------------------------------------------------------------------
4565 -def lr_parse_table(method):
4567 - goto = _lr_goto # Goto array
4568 - action = _lr_action # Action array
4569 - actionp = { } # Action production array (temporary)
4571 - _lr_method = method
4577 - sys.stderr.write("yacc: Generating %s parsing table...\n" % method)
4578 - _vf.write("\n\nParsing method: %s\n\n" % method)
4580 - # Step 1: Construct C = { I0, I1, ... IN}, collection of LR(0) items
4581 - # This determines the number of states
4585 - if method == 'LALR':
4586 - add_lalr_lookaheads(C)
4589 - # Build the parser table, state by state
4592 - # Loop over each production in I
4593 - actlist = [ ] # List of actions
4598 - _vf.write("\nstate %d\n\n" % st)
4599 +class LRGeneratedTable(LRTable):
4600 + def __init__(self,grammar,method='LALR',log=None):
4601 + if method not in ['SLR','LALR']:
4602 + raise LALRError("Unsupported method %s" % method)
4604 + self.grammar = grammar
4605 + self.lr_method = method
4607 + # Set up the logger
4609 + log = NullLogger()
4612 + # Internal attributes
4613 + self.lr_action = {} # Action table
4614 + self.lr_goto = {} # Goto table
4615 + self.lr_productions = grammar.Productions # Copy of grammar Production array
4616 + self.lr_goto_cache = {} # Cache of computed gotos
4617 + self.lr0_cidhash = {} # Cache of closures
4619 + self._add_count = 0 # Internal counter used to detect cycles
4621 + # Diagnostic information filled in by the table generator
4622 + self.sr_conflict = 0
4623 + self.rr_conflict = 0
4624 + self.conflicts = [] # List of conflicts
4626 + self.sr_conflicts = []
4627 + self.rr_conflicts = []
4629 + # Build the tables
4630 + self.grammar.build_lritems()
4631 + self.grammar.compute_first()
4632 + self.grammar.compute_follow()
4633 + self.lr_parse_table()
4635 + # Compute the LR(0) closure operation on I, where I is a set of LR(0) items.
4637 + def lr0_closure(self,I):
4638 + self._add_count += 1
4640 + # Add everything in I to J
4646 + for x in j.lr_after:
4647 + if getattr(x,"lr0_added",0) == self._add_count: continue
4648 + # Add B --> .G to J
4649 + J.append(x.lr_next)
4650 + x.lr0_added = self._add_count
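# A standalone sketch (not part of the patch) of LR(0) closure with items as
# plain (lhs, rhs, dot) tuples: whenever the dot sits before a nonterminal,
# pull in that nonterminal's fresh items (toy grammar, hypothetical names):
prods = {'E': [('E', ('E', '+', 'n'), 0), ('E', ('n',), 0)]}
def lr0_closure(items):
    J = list(items)
    for lhs, rhs, dot in J:             # J grows while being scanned
        if dot < len(rhs):
            for item in prods.get(rhs[dot], []):
                if item not in J:
                    J.append(item)
    return J
J = lr0_closure([("S'", ('E',), 0)])
assert len(J) == 3                      # S'->.E, E->.E+n, E->.n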
4655 + # Compute the LR(0) goto function goto(I,X) where I is a set
4656 + # of LR(0) items and X is a grammar symbol. This function is written
4657 + # in a way that guarantees uniqueness of the generated goto sets
4658 + # (i.e. the same goto set will never be returned as two different Python
4659 + # objects). With uniqueness, we can later do fast set comparisons using
4660 + # id(obj) instead of element-wise comparison.
4662 + def lr0_goto(self,I,x):
4663 + # First we look for a previously cached entry
4664 + g = self.lr_goto_cache.get((id(I),x),None)
4667 + # Now we generate the goto set in a way that guarantees uniqueness
4670 + s = self.lr_goto_cache.get(x,None)
4673 + self.lr_goto_cache[x] = s
4678 + if n and n.lr_before == x:
4679 + s1 = s.get(id(n),None)
4685 + g = s.get('$end',None)
4688 + g = self.lr0_closure(gs)
4692 + self.lr_goto_cache[(id(I),x)] = g
4695 + # Compute the LR(0) sets of item function
4696 + def lr0_items(self):
4698 + C = [ self.lr0_closure([self.grammar.Productions[0].lr_next]) ]
4701 + self.lr0_cidhash[id(I)] = i
4704 + # Loop over the items in C and each grammar symbols
4710 + # Collect all of the symbols that could possibly be in the goto(I,X) sets
4713 + for s in ii.usyms:
4717 + g = self.lr0_goto(I,x)
4718 + if not g: continue
4719 + if id(g) in self.lr0_cidhash: continue
4720 + self.lr0_cidhash[id(g)] = len(C)
4725 + # -----------------------------------------------------------------------------
4726 + # ==== LALR(1) Parsing ====
4728 + # LALR(1) parsing is almost exactly the same as SLR except that instead of
4729 + # relying upon Follow() sets when performing reductions, a more selective
4730 + # lookahead set that incorporates the state of the LR(0) machine is utilized.
4731 + # Thus, we mainly just have to focus on calculating the lookahead sets.
4733 + # The method used here is due to DeRemer and Pennello (1982).
4735 + # DeRemer, F. L., and T. J. Pennello: "Efficient Computation of LALR(1)
4736 + # Lookahead Sets", ACM Transactions on Programming Languages and Systems,
4737 + # Vol. 4, No. 4, Oct. 1982, pp. 615-649
4739 + # Further details can also be found in:
4741 + # J. Tremblay and P. Sorenson, "The Theory and Practice of Compiler Writing",
4742 + # McGraw-Hill Book Company, (1985).
4744 + # -----------------------------------------------------------------------------
4746 + # -----------------------------------------------------------------------------
4747 + # compute_nullable_nonterminals()
4749 + # Creates a dictionary containing all of the non-terminals that might produce
4750 + # an empty production.
4751 + # -----------------------------------------------------------------------------
4753 + def compute_nullable_nonterminals(self):
4757 + for p in self.grammar.Productions[1:]:
4759 + nullable[p.name] = 1
4762 + if not t in nullable: break
4764 + nullable[p.name] = 1
4765 + if len(nullable) == num_nullable: break
4766 + num_nullable = len(nullable)
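# A standalone sketch (not part of the patch) of the nullable fixpoint: a
# nonterminal is nullable if some production has an empty RHS or an RHS made
# only of already-nullable symbols (toy grammar, hypothetical names):
prods = [('A', []), ('B', ['A', 'A']), ('C', ['x'])]
nullable = set()
while True:
    before = len(nullable)
    for name, rhs in prods:
        if all(s in nullable for s in rhs):
            nullable.add(name)
    if len(nullable) == before:
        break
assert nullable == {'A', 'B'}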
4769 + # -----------------------------------------------------------------------------
4770 + # find_nonterminal_trans(C)
4772 + # Given a set of LR(0) items, this function finds all of the non-terminal
4773 + # transitions. These are transitions in which a dot appears immediately before
4774 + # a non-terminal. Returns a list of tuples of the form (state,N) where state
4775 + # is the state number and N is the nonterminal symbol.
4777 + # The input C is the set of LR(0) items.
4778 + # -----------------------------------------------------------------------------
4780 + def find_nonterminal_transitions(self,C):
4782 + for state in range(len(C)):
4783 + for p in C[state]:
4784 + if p.lr_index < p.len - 1:
4785 + t = (state,p.prod[p.lr_index+1])
4786 + if t[1] in self.grammar.Nonterminals:
4787 + if t not in trans: trans.append(t)
4791 + # -----------------------------------------------------------------------------
4794 + # Computes the DR(p,A) relationships for non-terminal transitions. The input
4795 + # is a tuple (state,N) where state is a number and N is a nonterminal symbol.
4797 + # Returns a list of terminals.
4798 + # -----------------------------------------------------------------------------
4800 + def dr_relation(self,C,trans,nullable):
4805 + g = self.lr0_goto(C[state],N)
4807 + if p.lr_index < p.len - 1:
4808 + a = p.prod[p.lr_index+1]
4809 + if a in self.grammar.Terminals:
4810 + if a not in terms: terms.append(a)
4812 + # This extra bit is to handle the start state
4813 + if state == 0 and N == self.grammar.Productions[0].prod[0]:
4814 + terms.append('$end')
4818 + # -----------------------------------------------------------------------------
4819 + # reads_relation()
4821 + # Computes the READS() relation (p,A) READS (t,C).
4822 + # -----------------------------------------------------------------------------
4824 + def reads_relation(self,C, trans, empty):
4825 + # Look for empty transitions
4829 + g = self.lr0_goto(C[state],N)
4830 + j = self.lr0_cidhash.get(id(g),-1)
4832 + if p.lr_index < p.len - 1:
4833 + a = p.prod[p.lr_index + 1]
4839 + # -----------------------------------------------------------------------------
4840 + # compute_lookback_includes()
4842 + # Determines the lookback and includes relations
4846 + # This relation is determined by running the LR(0) state machine forward.
4847 + # For example, starting with a production "N : . A B C", we run it forward
4848 + # to obtain "N : A B C ." We then build a relationship between this final
4849 + # state and the starting state. These relationships are stored in a dictionary
4854 + # Computes the INCLUDE() relation (p,A) INCLUDES (p',B).
4856 + # This relation is used to determine non-terminal transitions that occur
4857 + # inside of other non-terminal transition states. (p,A) INCLUDES (p', B)
4858 + # if the following holds:
4860 + # B -> LAT, where T -> epsilon and p' -L-> p
4862 + # L is essentially a prefix (which may be empty), T is a suffix that must be
4863 + # able to derive an empty string. State p' must lead to state p with the string L.
4865 + # -----------------------------------------------------------------------------
4867 + def compute_lookback_includes(self,C,trans,nullable):
4869 + lookdict = {} # Dictionary of lookback relations
4870 + includedict = {} # Dictionary of include relations
4872 + # Make a dictionary of non-terminal transitions
4877 + # Loop over all transitions and compute lookbacks and includes
4878 + for state,N in trans:
4881 + for p in C[state]:
4882 + if p.name != N: continue
4884 + # Okay, we have a name match. We now follow the production all the way
4885 + # through the state machine until we get the . on the right hand side
4887 + lr_index = p.lr_index
4889 + while lr_index < p.len - 1:
4890 + lr_index = lr_index + 1
4891 + t = p.prod[lr_index]
4893 + # Check to see if this symbol and state are a non-terminal transition
4894 + if (j,t) in dtrans:
4895 + # Yes. Okay, there is some chance that this is an includes relation
4896 + # the only way to know for certain is whether the rest of the
4897 + # production derives empty
4901 + if p.prod[li] in self.grammar.Terminals: break # No, forget it
4902 + if not p.prod[li] in nullable: break
4905 + # Appears to be a relation between (j,t) and (state,N)
4906 + includes.append((j,t))
4908 + g = self.lr0_goto(C[j],t) # Go to next set
4909 + j = self.lr0_cidhash.get(id(g),-1) # Go to next state
4911 + # When we get here, j is the final state, now we have to locate the production
4913 + if r.name != p.name: continue
4914 + if r.len != p.len: continue
4916 + # This loop is comparing a production ". A B C" with "A B C ."
4917 + while i < r.lr_index:
4918 + if r.prod[i] != p.prod[i+1]: break
4921 + lookb.append((j,r))
4922 + for i in includes:
4923 + if not i in includedict: includedict[i] = []
4924 + includedict[i].append((state,N))
4925 + lookdict[(state,N)] = lookb
4927 + return lookdict,includedict
4929 + # -----------------------------------------------------------------------------
4930 + # compute_read_sets()
4932 + # Given a set of LR(0) items, this function computes the read sets.
4934 + # Inputs: C = Set of LR(0) items
4935 + # ntrans = Set of nonterminal transitions
4936 + # nullable = Set of empty transitions
4938 + # Returns a set containing the read sets
4939 + # -----------------------------------------------------------------------------
4941 + def compute_read_sets(self,C, ntrans, nullable):
4942 + FP = lambda x: self.dr_relation(C,x,nullable)
4943 + R = lambda x: self.reads_relation(C,x,nullable)
4944 + F = digraph(ntrans,R,FP)
4947 + # -----------------------------------------------------------------------------
4948 + # compute_follow_sets()
4950 + # Given a set of LR(0) items, a set of non-terminal transitions, a readset,
4951 + # and an include set, this function computes the follow sets
4953 + # Follow(p,A) = Read(p,A) U U {Follow(p',B) | (p,A) INCLUDES (p',B)}
4956 + # ntrans = Set of nonterminal transitions
4957 + # readsets = Readset (previously computed)
4958 + # inclsets = Include sets (previously computed)
4960 + # Returns a set containing the follow sets
4961 + # -----------------------------------------------------------------------------
4963 + def compute_follow_sets(self,ntrans,readsets,inclsets):
4964 + FP = lambda x: readsets[x]
4965 + R = lambda x: inclsets.get(x,[])
4966 + F = digraph(ntrans,R,FP)
4969 + # -----------------------------------------------------------------------------
4970 + # add_lookaheads()
4972 + # Attaches the lookahead symbols to grammar rules.
4974 + # Inputs: lookbacks - Set of lookback relations
4975 + # followset - Computed follow set
4977 + # This function directly attaches the lookaheads to productions contained
4978 + # in the lookbacks set
4979 + # -----------------------------------------------------------------------------
4981 + def add_lookaheads(self,lookbacks,followset):
4982 + for trans,lb in lookbacks.items():
4983 + # Loop over productions in lookback
4984 + for state,p in lb:
4985 + if not state in p.lookaheads:
4986 + p.lookaheads[state] = []
4987 + f = followset.get(trans,[])
4989 + if a not in p.lookaheads[state]: p.lookaheads[state].append(a)
4991 + # -----------------------------------------------------------------------------
4992 + # add_lalr_lookaheads()
4994 + # This function does all of the work of adding lookahead information for use
4995 + # with LALR parsing
4996 + # -----------------------------------------------------------------------------
4998 + def add_lalr_lookaheads(self,C):
4999 + # Determine all of the nullable nonterminals
5000 + nullable = self.compute_nullable_nonterminals()
5002 + # Find all non-terminal transitions
5003 + trans = self.find_nonterminal_transitions(C)
5005 + # Compute read sets
5006 + readsets = self.compute_read_sets(C,trans,nullable)
5008 + # Compute lookback/includes relations
5009 + lookd, included = self.compute_lookback_includes(C,trans,nullable)
5011 + # Compute LALR FOLLOW sets
5012 + followsets = self.compute_follow_sets(trans,readsets,included)
5014 + # Add all of the lookaheads
5015 + self.add_lookaheads(lookd,followsets)
5017 + # -----------------------------------------------------------------------------
5018 + # lr_parse_table()
5020 + # This function constructs the parse tables for SLR or LALR
5021 + # -----------------------------------------------------------------------------
5022 + def lr_parse_table(self):
5023 + Productions = self.grammar.Productions
5024 + Precedence = self.grammar.Precedence
5025 + goto = self.lr_goto # Goto array
5026 + action = self.lr_action # Action array
5027 + log = self.log # Logger for output
5029 + actionp = { } # Action production array (temporary)
5031 + log.info("Parsing method: %s", self.lr_method)
5033 + # Step 1: Construct C = { I0, I1, ... IN}, collection of LR(0) items
5034 + # This determines the number of states
5036 + C = self.lr0_items()
5038 + if self.lr_method == 'LALR':
5039 + self.add_lalr_lookaheads(C)
5041 + # Build the parser table, state by state
5044 + # Loop over each production in I
5045 + actlist = [ ] # List of actions
5050 + log.info("state %d", st)
5053 - _vf.write(" (%d) %s\n" % (p.number, str(p)))
5058 - if p.len == p.lr_index + 1:
5059 - if p.name == "S'":
5060 - # Start symbol. Accept!
5061 - st_action["$end"] = 0
5062 - st_actionp["$end"] = p
5063 + log.info(" (%d) %s", p.number, str(p))
5067 + if p.len == p.lr_index + 1:
5068 + if p.name == "S'":
5069 + # Start symbol. Accept!
5070 + st_action["$end"] = 0
5071 + st_actionp["$end"] = p
5073 + # We are at the end of a production. Reduce!
5074 + if self.lr_method == 'LALR':
5075 + laheads = p.lookaheads[st]
5077 + laheads = self.grammar.Follow[p.name]
5079 + actlist.append((a,p,"reduce using rule %d (%s)" % (p.number,p)))
5080 + r = st_action.get(a,None)
5082 + # Whoa. Have a shift/reduce or reduce/reduce conflict
5084 + # Need to decide on shift or reduce here
5085 + # By default we favor shifting. Need to add
5086 + # some precedence rules here.
5087 + sprec,slevel = Productions[st_actionp[a].number].prec
5088 + rprec,rlevel = Precedence.get(a,('right',0))
5089 + if (slevel < rlevel) or ((slevel == rlevel) and (rprec == 'left')):
5090 + # We really need to reduce here.
5091 + st_action[a] = -p.number
5093 + if not slevel and not rlevel:
5094 + log.info(" ! shift/reduce conflict for %s resolved as reduce",a)
5095 + self.sr_conflicts.append((st,a,'reduce'))
5096 + Productions[p.number].reduced += 1
5097 + elif (slevel == rlevel) and (rprec == 'nonassoc'):
5098 + st_action[a] = None
5100 + # Hmmm. Guess we'll keep the shift
5102 + log.info(" ! shift/reduce conflict for %s resolved as shift",a)
5103 + self.sr_conflicts.append((st,a,'shift'))
5105 + # Reduce/reduce conflict. In this case, we favor the rule
5106 + # that was defined first in the grammar file
5107 + oldp = Productions[-r]
5108 + pp = Productions[p.number]
5109 + if oldp.line > pp.line:
5110 + st_action[a] = -p.number
5112 + chosenp,rejectp = pp,oldp
5113 + Productions[p.number].reduced += 1
5114 + Productions[oldp.number].reduced -= 1
5116 + chosenp,rejectp = oldp,pp
5117 + self.rr_conflicts.append((st,chosenp,rejectp))
5118 + log.info(" ! reduce/reduce conflict for %s resolved using rule %d (%s)", a,st_actionp[a].number, st_actionp[a])
5120 + raise LALRError("Unknown conflict in state %d" % st)
5122 + st_action[a] = -p.number
5124 + Productions[p.number].reduced += 1
5126 - # We are at the end of a production. Reduce!
5127 - if method == 'LALR':
5128 - laheads = p.lookaheads[st]
5130 - laheads = Follow[p.name]
5132 - actlist.append((a,p,"reduce using rule %d (%s)" % (p.number,p)))
5133 - r = st_action.get(a,None)
5135 - # Whoa. Have a shift/reduce or reduce/reduce conflict
5137 - # Need to decide on shift or reduce here
5138 - # By default we favor shifting. Need to add
5139 - # some precedence rules here.
5140 - sprec,slevel = Productions[st_actionp[a].number].prec
5141 - rprec,rlevel = Precedence.get(a,('right',0))
5142 - if (slevel < rlevel) or ((slevel == rlevel) and (rprec == 'left')):
5143 - # We really need to reduce here.
5144 - st_action[a] = -p.number
5146 - if not slevel and not rlevel:
5147 - _vfc.write("shift/reduce conflict in state %d resolved as reduce.\n" % st)
5148 - _vf.write(" ! shift/reduce conflict for %s resolved as reduce.\n" % a)
5150 - elif (slevel == rlevel) and (rprec == 'nonassoc'):
5151 - st_action[a] = None
5153 + a = p.prod[i+1] # Get symbol right after the "."
5154 + if a in self.grammar.Terminals:
5155 + g = self.lr0_goto(I,a)
5156 + j = self.lr0_cidhash.get(id(g),-1)
5158 + # We are in a shift state
5159 + actlist.append((a,p,"shift and go to state %d" % j))
5160 + r = st_action.get(a,None)
5162 + # Whoa. Have a shift/reduce or shift/shift conflict
5165 + raise LALRError("Shift/shift conflict in state %d" % st)
5167 + # Do a precedence check.
5168 + # - if precedence of reduce rule is higher, we reduce.
5169 + # - if precedence of reduce is same and left assoc, we reduce.
5170 + # - otherwise we shift
5171 + rprec,rlevel = Productions[st_actionp[a].number].prec
5172 + sprec,slevel = Precedence.get(a,('right',0))
5173 + if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):
5174 + # We decide to shift here... highest precedence to shift
5175 + Productions[st_actionp[a].number].reduced -= 1
5179 + log.info(" ! shift/reduce conflict for %s resolved as shift",a)
5180 + self.sr_conflicts.append((st,a,'shift'))
5181 + elif (slevel == rlevel) and (rprec == 'nonassoc'):
5182 + st_action[a] = None
5184 + # Hmmm. Guess we'll keep the reduce
5185 + if not slevel and not rlevel:
5186 + log.info(" ! shift/reduce conflict for %s resolved as reduce",a)
5187 + self.sr_conflicts.append((st,a,'reduce'))
5190 - # Hmmm. Guess we'll keep the shift
5192 - _vfc.write("shift/reduce conflict in state %d resolved as shift.\n" % st)
5193 - _vf.write(" ! shift/reduce conflict for %s resolved as shift.\n" % a)
5196 - # Reduce/reduce conflict. In this case, we favor the rule
5197 - # that was defined first in the grammar file
5198 - oldp = Productions[-r]
5199 - pp = Productions[p.number]
5200 - if oldp.line > pp.line:
5201 - st_action[a] = -p.number
5203 - # sys.stderr.write("Reduce/reduce conflict in state %d\n" % st)
5205 - _vfc.write("reduce/reduce conflict in state %d resolved using rule %d (%s).\n" % (st, st_actionp[a].number, st_actionp[a]))
5206 - _vf.write(" ! reduce/reduce conflict for %s resolved using rule %d (%s).\n" % (a,st_actionp[a].number, st_actionp[a]))
5207 + raise LALRError("Unknown conflict in state %d" % st)
- sys.stderr.write("Unknown conflict in state %d\n" % st)
- st_action[a] = -p.number
- a = p.prod[i+1] # Get symbol right after the "."
- if Terminals.has_key(a):
- j = _lr0_cidhash.get(id(g),-1)
- # We are in a shift state
- actlist.append((a,p,"shift and go to state %d" % j))
- r = st_action.get(a,None)
- # Whoa have a shift/reduce or shift/shift conflict
- sys.stderr.write("Shift/shift conflict in state %d\n" % st)
- # Do a precedence check.
- # - if precedence of reduce rule is higher, we reduce.
- # - if precedence of reduce is same and left assoc, we reduce.
- # - otherwise we shift
- rprec,rlevel = Productions[st_actionp[a].number].prec
- sprec,slevel = Precedence.get(a,('right',0))
- if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):
- # We decide to shift here... highest precedence to shift
- _vfc.write("shift/reduce conflict in state %d resolved as shift.\n" % st)
- _vf.write(" ! shift/reduce conflict for %s resolved as shift.\n" % a)
- elif (slevel == rlevel) and (rprec == 'nonassoc'):
- st_action[a] = None
- # Hmmm. Guess we'll keep the reduce
- if not slevel and not rlevel:
- _vfc.write("shift/reduce conflict in state %d resolved as reduce.\n" % st)
- _vf.write(" ! shift/reduce conflict for %s resolved as reduce.\n" % a)
- sys.stderr.write("Unknown conflict in state %d\n" % st)
- except StandardError,e:
- print sys.exc_info()
- raise YaccError, "Hosed in lr_parse_table"
- # Print the actions associated with each terminal
- for a,p,m in actlist:
- if st_action.has_key(a):
- if p is st_actionp[a]:
- _vf.write(" %-15s %s\n" % (a,m))
- _actprint[(a,m)] = 1
- for a,p,m in actlist:
- if st_action.has_key(a):
- if p is not st_actionp[a]:
- if not _actprint.has_key((a,m)):
- _vf.write(" ! %-15s [ %s ]\n" % (a,m))
+ # Print the actions associated with each terminal
+ for a,p,m in actlist:
+ if a in st_action:
+ if p is st_actionp[a]:
+ log.info(" %-15s %s",a,m)
 _actprint[(a,m)] = 1
- # Construct the goto table for this state
- for s in ii.usyms:
- if Nonterminals.has_key(s):
- for n in nkeys.keys():
- j = _lr0_cidhash.get(id(g),-1)
- _vf.write(" %-30s shift and go to state %d\n" % (n,j))
- action[st] = st_action
- actionp[st] = st_actionp
- goto[st] = st_goto
- if n_srconflict == 1:
- sys.stderr.write("yacc: %d shift/reduce conflict\n" % n_srconflict)
- if n_srconflict > 1:
- sys.stderr.write("yacc: %d shift/reduce conflicts\n" % n_srconflict)
- if n_rrconflict == 1:
- sys.stderr.write("yacc: %d reduce/reduce conflict\n" % n_rrconflict)
- if n_rrconflict > 1:
- sys.stderr.write("yacc: %d reduce/reduce conflicts\n" % n_rrconflict)
-# -----------------------------------------------------------------------------
-# ==== LR Utility functions ====
-# -----------------------------------------------------------------------------
-# -----------------------------------------------------------------------------
-# _lr_write_tables()
-# This function writes the LR parsing tables to a file
-# -----------------------------------------------------------------------------
-def lr_write_tables(modulename=tab_module,outputdir=''):
- if isinstance(modulename, types.ModuleType):
- print >>sys.stderr, "Warning module %s is inconsistent with the grammar (ignored)" % modulename
- basemodulename = modulename.split(".")[-1]
- filename = os.path.join(outputdir,basemodulename) + ".py"
- f = open(filename,"w")
+ # Print the actions that were not used. (debugging)
+ for a,p,m in actlist:
+ if a in st_action:
+ if p is not st_actionp[a]:
+ if not (a,m) in _actprint:
+ log.debug(" ! %-15s [ %s ]",a,m)
+ _actprint[(a,m)] = 1
+ # Construct the goto table for this state
+ for s in ii.usyms:
+ if s in self.grammar.Nonterminals:
+ g = self.lr0_goto(I,n)
+ j = self.lr0_cidhash.get(id(g),-1)
+ log.info(" %-30s shift and go to state %d",n,j)
+ action[st] = st_action
+ actionp[st] = st_actionp
+ goto[st] = st_goto
+ # -----------------------------------------------------------------------------
+ # This function writes the LR parsing tables to a file
+ # -----------------------------------------------------------------------------
+ def write_table(self,modulename,outputdir='',signature=""):
+ basemodulename = modulename.split(".")[-1]
+ filename = os.path.join(outputdir,basemodulename) + ".py"
+ f = open(filename,"w")
 # This file is automatically generated. Do not edit.
-""" % (filename, repr(_lr_method), repr(Signature.digest())))
- # Change smaller to 0 to go back to original tables
- # Factor out names to try and make smaller
- for s,nd in _lr_action.items():
- for name,v in nd.items():
- i = items.get(name)
- f.write("\n_lr_action_items = {")
- for k,v in items.items():
- f.write("%r:([" % k)
- f.write("%r," % i)
- f.write("%r," % i)
+ """ % (filename, __tabversion__, self.lr_method, signature))
+ # Change smaller to 0 to go back to original tables
+ # Factor out names to try and make smaller
+ for s,nd in self.lr_action.items():
+ for name,v in nd.items():
+ i = items.get(name)
+ f.write("\n_lr_action_items = {")
+ for k,v in items.items():
+ f.write("%r:([" % k)
+ f.write("%r," % i)
+ f.write("%r," % i)
 for _k, _v in _lr_action_items.items():
 for _x,_y in zip(_v[0],_v[1]):
- if not _lr_action.has_key(_x): _lr_action[_x] = { }
+ if not _x in _lr_action: _lr_action[_x] = { }
 _lr_action[_x][_k] = _y
 del _lr_action_items
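(A self-contained illustration of the "factor out names" encoding used in the
generated table module: each token maps to parallel lists of states and
actions, and the loader loop shown above inverts that back into per-state
dictionaries. The table values below are invented for the demo.)

# What write_table() emits, conceptually:
_lr_action_items = {'PLUS': ([0, 2], [5, -1]), 'NUMBER': ([0], [3])}

# The rehydration loop from the generated module:
_lr_action = {}
for _k, _v in _lr_action_items.items():
    for _x, _y in zip(_v[0], _v[1]):
        if _x not in _lr_action:
            _lr_action[_x] = {}
        _lr_action[_x][_k] = _y
del _lr_action_items

# _lr_action == {0: {'PLUS': 5, 'NUMBER': 3}, 2: {'PLUS': -1}}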
- f.write("\n_lr_action = { ");
- for k,v in _lr_action.items():
- f.write("(%r,%r):%r," % (k[0],k[1],v))
- # Factor out names to try and make smaller
- for s,nd in _lr_goto.items():
- for name,v in nd.items():
- i = items.get(name)
- f.write("\n_lr_goto_items = {")
- for k,v in items.items():
- f.write("%r:([" % k)
- f.write("%r," % i)
- f.write("%r," % i)
+ f.write("\n_lr_action = { ");
+ for k,v in self.lr_action.items():
+ f.write("(%r,%r):%r," % (k[0],k[1],v))
+ # Factor out names to try and make smaller
+ for s,nd in self.lr_goto.items():
+ for name,v in nd.items():
+ i = items.get(name)
+ f.write("\n_lr_goto_items = {")
+ for k,v in items.items():
+ f.write("%r:([" % k)
+ f.write("%r," % i)
+ f.write("%r," % i)
 for _k, _v in _lr_goto_items.items():
 for _x,_y in zip(_v[0],_v[1]):
- if not _lr_goto.has_key(_x): _lr_goto[_x] = { }
+ if not _x in _lr_goto: _lr_goto[_x] = { }
 _lr_goto[_x][_k] = _y
+ f.write("\n_lr_goto = { ");
+ for k,v in self.lr_goto.items():
+ f.write("(%r,%r):%r," % (k[0],k[1],v))
+ # Write production table
+ f.write("_lr_productions = [\n")
+ for p in self.lr_productions:
+ f.write(" (%r,%r,%d,%r,%r,%d),\n" % (p.str,p.name, p.len, p.func,p.file,p.line))
+ f.write(" (%r,%r,%d,None,None,None),\n" % (str(p),p.name, p.len))
+ e = sys.exc_info()[1]
+ sys.stderr.write("Unable to create '%s'\n" % filename)
+ sys.stderr.write(str(e)+"\n")
+ # -----------------------------------------------------------------------------
+ # This function pickles the LR parsing tables to a supplied file object
+ # -----------------------------------------------------------------------------
+ def pickle_table(self,filename,signature=""):
+ import cPickle as pickle
+ except ImportError:
+ outf = open(filename,"wb")
+ pickle.dump(__tabversion__,outf,pickle_protocol)
+ pickle.dump(self.lr_method,outf,pickle_protocol)
+ pickle.dump(signature,outf,pickle_protocol)
+ pickle.dump(self.lr_action,outf,pickle_protocol)
+ pickle.dump(self.lr_goto,outf,pickle_protocol)
+ for p in self.lr_productions:
+ outp.append((p.str,p.name, p.len, p.func,p.file,p.line))
+ outp.append((str(p),p.name,p.len,None,None,None))
+ pickle.dump(outp,outf,pickle_protocol)
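(For reference, a sketch of the reader side of this pickle stream, loading
values in the same order they are dumped above. "parser.pkl" is an
illustrative filename; in the patch itself the loading is done by the
read_pickle() method used later in yacc():)

import pickle

with open("parser.pkl", "rb") as inf:
    tabversion  = pickle.load(inf)   # __tabversion__, checked first
    lr_method   = pickle.load(inf)   # 'LALR' or 'SLR'
    signature   = pickle.load(inf)   # grammar signature for staleness checks
    lr_action   = pickle.load(inf)   # action table
    lr_goto     = pickle.load(inf)   # goto table
    productions = pickle.load(inf)   # (str, name, len, func, file, line) tuples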
+# -----------------------------------------------------------------------------
+# === INTROSPECTION ===
+# The following functions and classes are used to implement the PLY
+# introspection features followed by the yacc() function itself.
+# -----------------------------------------------------------------------------
+# -----------------------------------------------------------------------------
+# get_caller_module_dict()
+# This function returns a dictionary containing all of the symbols defined within
+# a caller further down the call stack. This is used to get the environment
+# associated with the yacc() call if none was provided.
+# -----------------------------------------------------------------------------
+def get_caller_module_dict(levels):
+ raise RuntimeError
+ except RuntimeError:
+ e,b,t = sys.exc_info()
+ ldict = f.f_globals.copy()
+ if f.f_globals != f.f_locals:
+ ldict.update(f.f_locals)
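(The partially elided body relies on a well-known CPython idiom: raise and
immediately catch an exception to obtain a traceback whose frame can be walked
back to the caller. A standalone sketch of that idiom, not the patched
function verbatim:)

import sys

def caller_dict(levels):
    try:
        raise RuntimeError
    except RuntimeError:
        f = sys.exc_info()[2].tb_frame  # the frame where we raised
    while levels > 0:
        f = f.f_back                    # walk out toward the caller
        levels -= 1
    ldict = f.f_globals.copy()
    if f.f_globals != f.f_locals:       # inside a function: overlay locals
        ldict.update(f.f_locals)
    return ldict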
+# -----------------------------------------------------------------------------
+# This takes a raw grammar rule string and parses it into production data
+# -----------------------------------------------------------------------------
+def parse_grammar(doc,file,line):
+ # Split the doc string into lines
+ pstrings = doc.splitlines()
+ for ps in pstrings:
+ if not p: continue
+ # This is a continuation of a previous rule
+ raise SyntaxError("%s:%d: Misplaced '|'" % (file,dline))
+ if assign != ':' and assign != '::=':
+ raise SyntaxError("%s:%d: Syntax error. Expected ':'" % (file,dline))
+ grammar.append((file,dline,prodname,syms))
+ except SyntaxError:
+ raise SyntaxError("%s:%d: Syntax error in rule '%s'" % (file,dline,ps.strip()))
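(Illustration of the contract: a rule docstring such as

    def p_statement(p):
        '''statement : expr PLUS expr
                     | expr'''

is split into one (file, dline, prodname, syms) tuple per alternative, with a
leading '|' continuing the previous rule name, e.g.

    [('calc.py', 12, 'statement', ['expr', 'PLUS', 'expr']),
     ('calc.py', 13, 'statement', ['expr'])]

where 'calc.py' and the line numbers are made-up values for the example.)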
+# -----------------------------------------------------------------------------
+# This class represents information extracted for building a parser including
+# start symbol, error function, tokens, precedence list, action functions,
+# -----------------------------------------------------------------------------
+class ParserReflect(object):
+ def __init__(self,pdict,log=None):
+ self.pdict = pdict
+ self.error_func = None
+ self.tokens = None
+ self.log = PlyLogger(sys.stderr)
- f.write("\n_lr_goto = { ");
- for k,v in _lr_goto.items():
- f.write("(%r,%r):%r," % (k[0],k[1],v))
- # Write production table
- f.write("_lr_productions = [\n")
- for p in Productions:
- f.write(" (%r,%d,%r,%r,%d),\n" % (p.name, p.len, p.func.__name__,p.file,p.line))
- f.write(" (%r,%d,None,None,None),\n" % (p.name, p.len))
+ # Get all of the basic information
+ def get_all(self):
+ self.get_error_func()
+ self.get_precedence()
+ self.get_pfunctions()
+ # Validate all of the information
+ def validate_all(self):
+ self.validate_start()
+ self.validate_error_func()
+ self.validate_tokens()
+ self.validate_precedence()
+ self.validate_pfunctions()
+ self.validate_files()
+ # Compute a signature over the grammar
+ def signature(self):
+ from hashlib import md5
+ except ImportError:
+ from md5 import md5
+ sig.update(self.start.encode('latin-1'))
+ sig.update("".join(["".join(p) for p in self.prec]).encode('latin-1'))
+ sig.update(" ".join(self.tokens).encode('latin-1'))
+ for f in self.pfuncs:
+ sig.update(f[3].encode('latin-1'))
+ except (TypeError,ValueError):
+ return sig.digest()
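(A standalone sketch of the same signature computation: hash every grammar
input that affects table generation, so cached tables can be rejected when any
of them changes. All field values below are invented for illustration.)

try:
    from hashlib import md5     # modern Python
except ImportError:
    from md5 import md5         # pre-2.5 fallback, as above

sig = md5()
start = 'statement'
prec = [('left', 'PLUS'), ('left', 'TIMES')]
tokens = ['NUMBER', 'PLUS', 'TIMES']
docstrings = ['statement : expr PLUS expr', 'expr : NUMBER']

sig.update(start.encode('latin-1'))
sig.update("".join(["".join(p) for p in prec]).encode('latin-1'))
sig.update(" ".join(tokens).encode('latin-1'))
for d in docstrings:
    sig.update(d.encode('latin-1'))

digest = sig.digest()   # compared against the signature stored with the tables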
+ # -----------------------------------------------------------------------------
+ # This method checks to see if there are duplicated p_rulename() functions
+ # in the parser module file. Without this function, it is really easy for
+ # users to make mistakes by cutting and pasting code fragments (and it's a real
+ # bugger to try and figure out why the resulting parser doesn't work). Therefore,
+ # we just do a little regular expression pattern matching of def statements
+ # to try and detect duplicates.
+ # -----------------------------------------------------------------------------
+ def validate_files(self):
+ # Match def p_funcname(
+ fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
+ for filename in self.files.keys():
+ base,ext = os.path.splitext(filename)
+ if ext != '.py': return 1 # No idea. Assume it's okay.
+ f = open(filename)
+ lines = f.readlines()
+ for linen,l in enumerate(lines):
+ prev = counthash.get(name)
+ counthash[name] = linen
+ self.log.warning("%s:%d: Function %s redefined. Previously defined on line %d", filename,linen,name,prev)
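(The same scan, reduced to a runnable fragment operating on an in-memory
source listing; the pasted-twice rule is a fabricated example of the mistake
this check exists to catch:)

import re

fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
lines = [
    "def p_expr(p):",
    "    'expr : NUMBER'",
    "def p_expr(p):",              # cut-and-paste duplicate
    "    'expr : expr PLUS expr'",
]

counthash = {}
for linen, l in enumerate(lines):
    m = fre.match(l)
    if not m:
        continue
    name = m.group(1)
    prev = counthash.get(name)
    if prev is None:
        counthash[name] = linen
    else:
        print("line %d: %s redefined. Previously defined on line %d"
              % (linen, name, prev))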
+ # Get the start symbol
+ def get_start(self):
+ self.start = self.pdict.get('start')
+ # Validate the start symbol
+ def validate_start(self):
+ if self.start is not None:
+ if not isinstance(self.start,str):
+ self.log.error("'start' must be a string")
+ # Look for error handler
+ def get_error_func(self):
+ self.error_func = self.pdict.get('p_error')
+ # Validate the error function
+ def validate_error_func(self):
+ if self.error_func:
+ if isinstance(self.error_func,types.FunctionType):
+ elif isinstance(self.error_func, types.MethodType):
- f.write(" None,\n")
- print >>sys.stderr, "Unable to create '%s'" % filename
- print >>sys.stderr, e
-def lr_read_tables(module=tab_module,optimize=0):
- global _lr_action, _lr_goto, _lr_productions, _lr_method
- if isinstance(module,types.ModuleType):
- exec "import %s as parsetab" % module
- if (optimize) or (Signature.digest() == parsetab._lr_signature):
- _lr_action = parsetab._lr_action
- _lr_goto = parsetab._lr_goto
- _lr_productions = parsetab._lr_productions
- _lr_method = parsetab._lr_method
- except (ImportError,AttributeError):
+ self.log.error("'p_error' defined, but is not a function or method")
+ eline = func_code(self.error_func).co_firstlineno
+ efile = func_code(self.error_func).co_filename
+ self.files[efile] = 1
+ if (func_code(self.error_func).co_argcount != 1+ismethod):
+ self.log.error("%s:%d: p_error() requires 1 argument",efile,eline)
+ # Get the tokens map
+ def get_tokens(self):
+ tokens = self.pdict.get("tokens",None)
+ self.log.error("No token list is defined")
+ if not isinstance(tokens,(list, tuple)):
+ self.log.error("tokens must be a list or tuple")
+ self.log.error("tokens is empty")
+ self.tokens = tokens
+ # Validate the tokens
+ def validate_tokens(self):
+ # Validate the tokens.
+ if 'error' in self.tokens:
+ self.log.error("Illegal token name 'error'. Is a reserved word")
+ for n in self.tokens:
+ if n in terminals:
+ self.log.warning("Token '%s' multiply defined", n)
+ # Get the precedence map (if any)
+ def get_precedence(self):
+ self.prec = self.pdict.get("precedence",None)
+ # Validate and parse the precedence map
+ def validate_precedence(self):
+ if not isinstance(self.prec,(list,tuple)):
+ self.log.error("precedence must be a list or tuple")
+ for level,p in enumerate(self.prec):
+ if not isinstance(p,(list,tuple)):
+ self.log.error("Bad precedence table")
+ self.log.error("Malformed precedence entry %s. Must be (assoc, term, ..., term)",p)
+ if not isinstance(assoc,str):
+ self.log.error("precedence associativity must be a string")
+ for term in p[1:]:
+ if not isinstance(term,str):
+ self.log.error("precedence items must be strings")
+ preclist.append((term,assoc,level+1))
+ self.preclist = preclist
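(Illustration of the flattening performed above: the user-facing precedence
table becomes (term, assoc, level) triples, with level 1 the weakest binding.
The table contents are invented for the demo.)

precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE'),
)

preclist = []
for level, p in enumerate(precedence):
    assoc = p[0]
    for term in p[1:]:
        preclist.append((term, assoc, level + 1))

# preclist == [('PLUS', 'left', 1), ('MINUS', 'left', 1),
#              ('TIMES', 'left', 2), ('DIVIDE', 'left', 2)]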
+ # Get all p_functions from the grammar
+ def get_pfunctions(self):
+ for name, item in self.pdict.items():
+ if name[:2] != 'p_': continue
+ if name == 'p_error': continue
+ if isinstance(item,(types.FunctionType,types.MethodType)):
+ line = func_code(item).co_firstlineno
+ file = func_code(item).co_filename
+ p_functions.append((line,file,name,item.__doc__))
+ # Sort all of the actions by line number
+ p_functions.sort()
+ self.pfuncs = p_functions
+ # Validate all of the p_functions
+ def validate_pfunctions(self):
+ # Check for non-empty symbols
+ if len(self.pfuncs) == 0:
+ self.log.error("no rules of the form p_rulename are defined")
+ for line, file, name, doc in self.pfuncs:
+ func = self.pdict[name]
+ if isinstance(func, types.MethodType):
+ if func_code(func).co_argcount > reqargs:
+ self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,func.__name__)
+ elif func_code(func).co_argcount < reqargs:
+ self.log.error("%s:%d: Rule '%s' requires an argument",file,line,func.__name__)
+ elif not func.__doc__:
+ self.log.warning("%s:%d: No documentation string specified in function '%s' (ignored)",file,line,func.__name__)
+ parsed_g = parse_grammar(doc,file,line)
+ for g in parsed_g:
+ grammar.append((name, g))
+ except SyntaxError:
+ e = sys.exc_info()[1]
+ self.log.error(str(e))
+ # Looks like a valid grammar rule
+ # Mark the file in which defined.
+ self.files[file] = 1
+ # Secondary validation step that looks for p_ definitions that are not functions
+ # or functions that look like they might be grammar rules.
+ for n,v in self.pdict.items():
+ if n[0:2] == 'p_' and isinstance(v, (types.FunctionType, types.MethodType)): continue
+ if n[0:2] == 't_': continue
+ if n[0:2] == 'p_' and n != 'p_error':
+ self.log.warning("'%s' not defined as a function", n)
+ if ((isinstance(v,types.FunctionType) and func_code(v).co_argcount == 1) or
+ (isinstance(v,types.MethodType) and func_code(v).co_argcount == 2)):
+ doc = v.__doc__.split(" ")
+ self.log.warning("%s:%d: Possible grammar rule '%s' defined without p_ prefix",
+ func_code(v).co_filename, func_code(v).co_firstlineno,n)
+ self.grammar = grammar
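(Hypothetical user definitions that the checks above would flag; none of these
appear in the patch itself:)

def p_expr():                  # error: rule function requires an argument
    'expr : NUMBER'

def p_term(p, extra):          # error: rule has too many arguments
    'term : NUMBER'

def statement(p):              # warning: docstring looks like a grammar rule,
    'statement : expr'         # but the name lacks the p_ prefix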
 # -----------------------------------------------------------------------------
-# Build the parser module
 # -----------------------------------------------------------------------------
-def yacc(method=default_lr, debug=yaccdebug, module=None, tabmodule=tab_module, start=None, check_recursion=1, optimize=0,write_tables=1,debugfile=debug_file,outputdir=''):
- # Add parsing method to signature
- Signature.update(method)
- # If a "module" parameter was supplied, extract its dictionary.
- # Note: a module may in fact be an instance as well.
+def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, start=None,
+ check_recursion=1, optimize=0, write_tables=1, debugfile=debug_file,outputdir='',
+ debuglog=None, errorlog = None, picklefile=None):
+ global parse # Reference to the parsing method of the last built parser
+ # If pickling is enabled, table files are not created
+ if errorlog is None:
+ errorlog = PlyLogger(sys.stderr)
+ # Get the module dictionary used for the parser
- # User supplied a module object.
- if isinstance(module, types.ModuleType):
- ldict = module.__dict__
- elif isinstance(module, _INSTANCETYPE):
- _items = [(k,getattr(module,k)) for k in dir(module)]
- ldict[i[0]] = i[1]
+ _items = [(k,getattr(module,k)) for k in dir(module)]
+ pdict = dict(_items)
+ pdict = get_caller_module_dict(2)
+ # Collect parser information from the dictionary
+ pinfo = ParserReflect(pdict,log=errorlog)
+ raise YaccError("Unable to build parser")
+ # Check signature against table files (if any)
+ signature = pinfo.signature()
+ read_signature = lr.read_pickle(picklefile)
- raise ValueError,"Expected a module"
- # No module given. We might be able to get information from the caller.
- # Throw an exception and unwind the traceback to get the globals
+ read_signature = lr.read_table(tabmodule)
+ if optimize or (read_signature == signature):
+ lr.bind_callables(pinfo.pdict)
+ parser = LRParser(lr,pinfo.error_func)
+ parse = parser.parse
+ e = sys.exc_info()[1]
+ errorlog.warning("There was a problem loading the table file: %s", repr(e))
+ except VersionError:
+ e = sys.exc_info()
+ errorlog.warning(str(e))
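(Typical invocations of the new signature; module and file names are
illustrative. Cached tables are reused only when the stored signature matches
the current grammar's, unless optimize is set:)

import ply.yacc as yacc

parser = yacc.yacc()                              # LALR (the default)
parser = yacc.yacc(method='SLR', debug=False)     # SLR tables, no debug log
parser = yacc.yacc(tabmodule='myparsetab',
                   outputdir='gen')               # writes gen/myparsetab.py
parser = yacc.yacc(picklefile='parser.pkl')       # pickled tables, no module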
+ if debuglog is None:
+ debuglog = PlyLogger(open(debugfile,"w"))
+ debuglog = NullLogger()
+ debuglog.info("Created by PLY version %s (http://www.dabeaz.com/ply)", __version__)
+ # Validate the parser information
+ if pinfo.validate_all():
+ raise YaccError("Unable to build parser")
+ if not pinfo.error_func:
+ errorlog.warning("no p_error() function is defined")
+ # Create a grammar object
+ grammar = Grammar(pinfo.tokens)
+ # Set precedence level for terminals
+ for term, assoc, level in pinfo.preclist:
- raise RuntimeError
- except RuntimeError:
- e,b,t = sys.exc_info()
- f = f.f_back # Walk out to our calling function
- if f.f_globals is f.f_locals: # Collect global and local variations from caller
- ldict = f.f_globals
- ldict = f.f_globals.copy()
- ldict.update(f.f_locals)
- # Add starting symbol to signature
- start = ldict.get("start",None)
- Signature.update(start)
- # Look for error handler
- ef = ldict.get('p_error',None)
- if isinstance(ef,types.FunctionType):
- elif isinstance(ef, types.MethodType):
+ grammar.set_precedence(term,assoc,level)
+ except GrammarError:
+ e = sys.exc_info()[1]
+ errorlog.warning("%s",str(e))
+ # Add productions to the grammar
+ for funcname, gram in pinfo.grammar:
+ file, line, prodname, syms = gram
+ grammar.add_production(prodname,syms,funcname,file,line)
+ except GrammarError:
+ e = sys.exc_info()[1]
+ errorlog.error("%s",str(e))
+ # Set the grammar start symbols
+ grammar.set_start(pinfo.start)
- raise YaccError,"'p_error' defined, but is not a function or method."
- eline = ef.func_code.co_firstlineno
- efile = ef.func_code.co_filename
- files[efile] = None
- if (ef.func_code.co_argcount != 1+ismethod):
- raise YaccError,"%s:%d: p_error() requires 1 argument." % (efile,eline)
- print >>sys.stderr, "yacc: Warning. no p_error() function is defined."
- # If running in optimized mode. We're going to read tables instead
- if (optimize and lr_read_tables(tabmodule,1)):
- # Read parse table
- del Productions[:]
- for p in _lr_productions:
- Productions.append(None)
- m = MiniProduction()
- m.func = ldict[p[2]]
- Productions.append(m)
- # Get the tokens map
- if (module and isinstance(module,_INSTANCETYPE)):
- tokens = getattr(module,"tokens",None)
- tokens = ldict.get("tokens",None)
- raise YaccError,"module does not define a list 'tokens'"
- if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
- raise YaccError,"tokens must be a list or tuple."
- # Check to see if a requires dictionary is defined.
- requires = ldict.get("require",None)
- if not (isinstance(requires,types.DictType)):
- raise YaccError,"require must be a dictionary."
- for r,v in requires.items():
- if not (isinstance(v,types.ListType)):
- v1 = [x.split(".") for x in v]
- except StandardError:
- print >>sys.stderr, "Invalid specification for rule '%s' in require. Expected a list of strings" % r
- # Build the dictionary of terminals. We a record a 0 in the
- # dictionary to track whether or not a terminal is actually
- # used in the grammar
- if 'error' in tokens:
- print >>sys.stderr, "yacc: Illegal token 'error'. Is a reserved word."
- raise YaccError,"Illegal token name"
- if Terminals.has_key(n):
- print >>sys.stderr, "yacc: Warning. Token '%s' multiply defined." % n
- Terminals[n] = [ ]
- Terminals['error'] = [ ]
- # Get the precedence map (if any)
- prec = ldict.get("precedence",None)
- if not (isinstance(prec,types.ListType) or isinstance(prec,types.TupleType)):
- raise YaccError,"precedence must be a list or tuple."
- add_precedence(prec)
- Signature.update(repr(prec))
- if not Precedence.has_key(n):
- Precedence[n] = ('right',0) # Default, right associative, 0 precedence
- # Get the list of built-in functions with p_ prefix
- symbols = [ldict[f] for f in ldict.keys()
- if (type(ldict[f]) in (types.FunctionType, types.MethodType) and ldict[f].__name__[:2] == 'p_'
- and ldict[f].__name__ != 'p_error')]
- # Check for non-empty symbols
- if len(symbols) == 0:
- raise YaccError,"no rules of the form p_rulename are defined."
- # Sort the symbols by line number
- symbols.sort(lambda x,y: cmp(x.func_code.co_firstlineno,y.func_code.co_firstlineno))
- # Add all of the symbols to the grammar
- if (add_function(f)) < 0:
- files[f.func_code.co_filename] = None
- # Make a signature of the docstrings
- Signature.update(f.__doc__)
- raise YaccError,"Unable to construct parser."
- if not lr_read_tables(tabmodule):
- for filename in files.keys():
- if not validate_file(filename):
- # Validate dictionary
- validate_dict(ldict)
- if start and not Prodnames.has_key(start):
- raise YaccError,"Bad starting symbol '%s'" % start
- augment_grammar(start)
- error = verify_productions(cycle_check=check_recursion)
- otherfunc = [ldict[f] for f in ldict.keys()
- if (type(f) in (types.FunctionType,types.MethodType) and ldict[f].__name__[:2] != 'p_')]
- # Check precedence rules
- if check_precedence():
- raise YaccError,"Unable to construct parser."
- compute_follow(start)
- if method in ['SLR','LALR']:
- lr_parse_table(method)
- raise YaccError, "Unknown parsing method '%s'" % method
- lr_write_tables(tabmodule,outputdir)
- f = open(os.path.join(outputdir,debugfile),"w")
- f.write(_vfc.getvalue())
- f.write(_vf.getvalue())
- print >>sys.stderr, "yacc: can't create '%s'" % debugfile,e
- # Made it here. Create a parser object and set up its internal state.
- # Set global parse() method to bound method of parser object.
- p = Parser("xyzzy")
- p.productions = Productions
- p.errorfunc = Errorfunc
- p.action = _lr_action
- p.method = _lr_method
- p.require = Requires
- # Clean up all of the globals we created
- if (not optimize):
-# yacc_cleanup function. Delete all of the global variables
-# used during table construction
-def yacc_cleanup():
- global _lr_action, _lr_goto, _lr_method, _lr_goto_cache
- del _lr_action, _lr_goto, _lr_method, _lr_goto_cache
- global Productions, Prodnames, Prodmap, Terminals
- global Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
- global Errorfunc, Signature, Requires
- del Productions, Prodnames, Prodmap, Terminals
- del Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
- del Errorfunc, Signature, Requires
-# Stub that raises an error if parsing is attempted without first calling yacc()
-def parse(*args,**kwargs):
- raise YaccError, "yacc: No parser built with yacc()"
+ grammar.set_start(start)
+ except GrammarError:
+ e = sys.exc_info()[1]
+ errorlog.error(str(e))
+ raise YaccError("Unable to build parser")
+ # Verify the grammar structure
+ undefined_symbols = grammar.undefined_symbols()
+ for sym, prod in undefined_symbols:
+ errorlog.error("%s:%d: Symbol '%s' used, but not defined as a token or a rule",prod.file,prod.line,sym)
+ unused_terminals = grammar.unused_terminals()
+ if unused_terminals:
+ debuglog.info("Unused terminals:")
+ for term in unused_terminals:
+ errorlog.warning("Token '%s' defined, but not used", term)
+ debuglog.info(" %s", term)
+ # Print out all productions to the debug log
+ debuglog.info("Grammar")
+ for n,p in enumerate(grammar.Productions):
+ debuglog.info("Rule %-5d %s", n, p)
+ # Find unused non-terminals
+ unused_rules = grammar.unused_rules()
+ for prod in unused_rules:
+ errorlog.warning("%s:%d: Rule '%s' defined, but not used", prod.file, prod.line, prod.name)
+ if len(unused_terminals) == 1:
+ errorlog.warning("There is 1 unused token")
+ if len(unused_terminals) > 1:
+ errorlog.warning("There are %d unused tokens", len(unused_terminals))
+ if len(unused_rules) == 1:
+ errorlog.warning("There is 1 unused rule")
+ if len(unused_rules) > 1:
+ errorlog.warning("There are %d unused rules", len(unused_rules))
+ debuglog.info("Terminals, with rules where they appear")
+ terms = list(grammar.Terminals)
+ for term in terms:
+ debuglog.info("%-20s : %s", term, " ".join([str(s) for s in grammar.Terminals[term]]))
+ debuglog.info("Nonterminals, with rules where they appear")
+ nonterms = list(grammar.Nonterminals)
+ for nonterm in nonterms:
+ debuglog.info("%-20s : %s", nonterm, " ".join([str(s) for s in grammar.Nonterminals[nonterm]]))
+ if check_recursion:
+ unreachable = grammar.find_unreachable()
+ for u in unreachable:
+ errorlog.warning("Symbol '%s' is unreachable",u)
+ infinite = grammar.infinite_cycles()
+ for inf in infinite:
+ errorlog.error("Infinite recursion detected for symbol '%s'", inf)
+ unused_prec = grammar.unused_precedence()
+ for term, assoc in unused_prec:
+ errorlog.error("Precedence rule '%s' defined for unknown symbol '%s'", assoc, term)
+ raise YaccError("Unable to build parser")
+ # Run the LRGeneratedTable on the grammar
+ errorlog.debug("Generating %s tables", method)
+ lr = LRGeneratedTable(grammar,method,debuglog)
+ num_sr = len(lr.sr_conflicts)
+ # Report shift/reduce and reduce/reduce conflicts
+ errorlog.warning("1 shift/reduce conflict")
+ errorlog.warning("%d shift/reduce conflicts", num_sr)
+ num_rr = len(lr.rr_conflicts)
+ errorlog.warning("1 reduce/reduce conflict")
+ errorlog.warning("%d reduce/reduce conflicts", num_rr)
+ # Write out conflicts to the output file
+ if debug and (lr.sr_conflicts or lr.rr_conflicts):
+ debuglog.warning("")
+ debuglog.warning("Conflicts:")
+ debuglog.warning("")
+ for state, tok, resolution in lr.sr_conflicts:
+ debuglog.warning("shift/reduce conflict for %s in state %d resolved as %s", tok, state, resolution)
+ already_reported = {}
+ for state, rule, rejected in lr.rr_conflicts:
+ if (state,id(rule),id(rejected)) in already_reported:
+ debuglog.warning("reduce/reduce conflict in state %d resolved using rule (%s)", state, rule)
+ debuglog.warning("rejected rule (%s) in state %d", rejected,state)
+ errorlog.warning("reduce/reduce conflict in state %d resolved using rule (%s)", state, rule)
+ errorlog.warning("rejected rule (%s) in state %d", rejected, state)
+ already_reported[state,id(rule),id(rejected)] = 1
+ for state, rule, rejected in lr.rr_conflicts:
+ if not rejected.reduced and (rejected not in warned_never):
+ debuglog.warning("Rule (%s) is never reduced", rejected)
+ errorlog.warning("Rule (%s) is never reduced", rejected)
+ warned_never.append(rejected)
+ # Write the table file if requested
+ lr.write_table(tabmodule,outputdir,signature)
+ # Write a pickled version of the tables
+ lr.pickle_table(picklefile,signature)
+ # Build the parser
+ lr.bind_callables(pinfo.pdict)
+ parser = LRParser(lr,pinfo.error_func)
+ parse = parser.parse
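(End-to-end sketch of the rebuilt pipeline from the user's point of view.
'calclex' is a hypothetical companion lexer module; everything else follows
the flow above: reflection, validation, grammar construction, table
generation, then binding the callables into an LRParser:)

import ply.yacc as yacc
import calclex                    # hypothetical module defining lexer + tokens

tokens = calclex.tokens

def p_expr_plus(p):
    'expr : expr PLUS expr'
    p[0] = p[1] + p[3]

def p_expr_number(p):
    'expr : NUMBER'
    p[0] = p[1]

def p_error(p):
    print("Syntax error at %r" % (p,))

parser = yacc.yacc(debug=True)    # conflicts, if any, land in the debug log
result = parser.parse("1 + 2", lexer=calclex.lexer)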