[aur-mirror.git] / firefox-pgo-beta / update-ply-2.5-to-3.3.patch
blob fecfc348df1fcbcc5810bccf68f54af180a39588
1 # HG changeset patch
2 # Parent 414ff9016349c51367bc5f9c1a815ac14177109b
4 diff --git a/other-licenses/ply/COPYING b/other-licenses/ply/COPYING
5 --- a/other-licenses/ply/COPYING
6 +++ b/other-licenses/ply/COPYING
7 @@ -1,504 +1,28 @@
8 - GNU LESSER GENERAL PUBLIC LICENSE
9 - Version 2.1, February 1999
10 +Copyright (C) 2001-2009,
11 +David M. Beazley (Dabeaz LLC)
12 +All rights reserved.
14 - Copyright (C) 1991, 1999 Free Software Foundation, Inc.
15 - 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
16 - Everyone is permitted to copy and distribute verbatim copies
17 - of this license document, but changing it is not allowed.
18 +Redistribution and use in source and binary forms, with or without
19 +modification, are permitted provided that the following conditions are
20 +met:
22 -[This is the first released version of the Lesser GPL. It also counts
23 - as the successor of the GNU Library Public License, version 2, hence
24 - the version number 2.1.]
25 +* Redistributions of source code must retain the above copyright notice,
26 + this list of conditions and the following disclaimer.
27 +* Redistributions in binary form must reproduce the above copyright notice,
28 + this list of conditions and the following disclaimer in the documentation
29 + and/or other materials provided with the distribution.
30 +* Neither the name of the David Beazley or Dabeaz LLC may be used to
31 + endorse or promote products derived from this software without
32 + specific prior written permission.
34 - Preamble
36 - The licenses for most software are designed to take away your
37 -freedom to share and change it. By contrast, the GNU General Public
38 -Licenses are intended to guarantee your freedom to share and change
39 -free software--to make sure the software is free for all its users.
41 - This license, the Lesser General Public License, applies to some
42 -specially designated software packages--typically libraries--of the
43 -Free Software Foundation and other authors who decide to use it. You
44 -can use it too, but we suggest you first think carefully about whether
45 -this license or the ordinary General Public License is the better
46 -strategy to use in any particular case, based on the explanations below.
48 - When we speak of free software, we are referring to freedom of use,
49 -not price. Our General Public Licenses are designed to make sure that
50 -you have the freedom to distribute copies of free software (and charge
51 -for this service if you wish); that you receive source code or can get
52 -it if you want it; that you can change the software and use pieces of
53 -it in new free programs; and that you are informed that you can do
54 -these things.
56 - To protect your rights, we need to make restrictions that forbid
57 -distributors to deny you these rights or to ask you to surrender these
58 -rights. These restrictions translate to certain responsibilities for
59 -you if you distribute copies of the library or if you modify it.
61 - For example, if you distribute copies of the library, whether gratis
62 -or for a fee, you must give the recipients all the rights that we gave
63 -you. You must make sure that they, too, receive or can get the source
64 -code. If you link other code with the library, you must provide
65 -complete object files to the recipients, so that they can relink them
66 -with the library after making changes to the library and recompiling
67 -it. And you must show them these terms so they know their rights.
69 - We protect your rights with a two-step method: (1) we copyright the
70 -library, and (2) we offer you this license, which gives you legal
71 -permission to copy, distribute and/or modify the library.
73 - To protect each distributor, we want to make it very clear that
74 -there is no warranty for the free library. Also, if the library is
75 -modified by someone else and passed on, the recipients should know
76 -that what they have is not the original version, so that the original
77 -author's reputation will not be affected by problems that might be
78 -introduced by others.
79 -\f
80 - Finally, software patents pose a constant threat to the existence of
81 -any free program. We wish to make sure that a company cannot
82 -effectively restrict the users of a free program by obtaining a
83 -restrictive license from a patent holder. Therefore, we insist that
84 -any patent license obtained for a version of the library must be
85 -consistent with the full freedom of use specified in this license.
87 - Most GNU software, including some libraries, is covered by the
88 -ordinary GNU General Public License. This license, the GNU Lesser
89 -General Public License, applies to certain designated libraries, and
90 -is quite different from the ordinary General Public License. We use
91 -this license for certain libraries in order to permit linking those
92 -libraries into non-free programs.
94 - When a program is linked with a library, whether statically or using
95 -a shared library, the combination of the two is legally speaking a
96 -combined work, a derivative of the original library. The ordinary
97 -General Public License therefore permits such linking only if the
98 -entire combination fits its criteria of freedom. The Lesser General
99 -Public License permits more lax criteria for linking other code with
100 -the library.
102 - We call this license the "Lesser" General Public License because it
103 -does Less to protect the user's freedom than the ordinary General
104 -Public License. It also provides other free software developers Less
105 -of an advantage over competing non-free programs. These disadvantages
106 -are the reason we use the ordinary General Public License for many
107 -libraries. However, the Lesser license provides advantages in certain
108 -special circumstances.
110 - For example, on rare occasions, there may be a special need to
111 -encourage the widest possible use of a certain library, so that it becomes
112 -a de-facto standard. To achieve this, non-free programs must be
113 -allowed to use the library. A more frequent case is that a free
114 -library does the same job as widely used non-free libraries. In this
115 -case, there is little to gain by limiting the free library to free
116 -software only, so we use the Lesser General Public License.
118 - In other cases, permission to use a particular library in non-free
119 -programs enables a greater number of people to use a large body of
120 -free software. For example, permission to use the GNU C Library in
121 -non-free programs enables many more people to use the whole GNU
122 -operating system, as well as its variant, the GNU/Linux operating
123 -system.
125 - Although the Lesser General Public License is Less protective of the
126 -users' freedom, it does ensure that the user of a program that is
127 -linked with the Library has the freedom and the wherewithal to run
128 -that program using a modified version of the Library.
130 - The precise terms and conditions for copying, distribution and
131 -modification follow. Pay close attention to the difference between a
132 -"work based on the library" and a "work that uses the library". The
133 -former contains code derived from the library, whereas the latter must
134 -be combined with the library in order to run.
136 - GNU LESSER GENERAL PUBLIC LICENSE
137 - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
139 - 0. This License Agreement applies to any software library or other
140 -program which contains a notice placed by the copyright holder or
141 -other authorized party saying it may be distributed under the terms of
142 -this Lesser General Public License (also called "this License").
143 -Each licensee is addressed as "you".
145 - A "library" means a collection of software functions and/or data
146 -prepared so as to be conveniently linked with application programs
147 -(which use some of those functions and data) to form executables.
149 - The "Library", below, refers to any such software library or work
150 -which has been distributed under these terms. A "work based on the
151 -Library" means either the Library or any derivative work under
152 -copyright law: that is to say, a work containing the Library or a
153 -portion of it, either verbatim or with modifications and/or translated
154 -straightforwardly into another language. (Hereinafter, translation is
155 -included without limitation in the term "modification".)
157 - "Source code" for a work means the preferred form of the work for
158 -making modifications to it. For a library, complete source code means
159 -all the source code for all modules it contains, plus any associated
160 -interface definition files, plus the scripts used to control compilation
161 -and installation of the library.
163 - Activities other than copying, distribution and modification are not
164 -covered by this License; they are outside its scope. The act of
165 -running a program using the Library is not restricted, and output from
166 -such a program is covered only if its contents constitute a work based
167 -on the Library (independent of the use of the Library in a tool for
168 -writing it). Whether that is true depends on what the Library does
169 -and what the program that uses the Library does.
171 - 1. You may copy and distribute verbatim copies of the Library's
172 -complete source code as you receive it, in any medium, provided that
173 -you conspicuously and appropriately publish on each copy an
174 -appropriate copyright notice and disclaimer of warranty; keep intact
175 -all the notices that refer to this License and to the absence of any
176 -warranty; and distribute a copy of this License along with the
177 -Library.
179 - You may charge a fee for the physical act of transferring a copy,
180 -and you may at your option offer warranty protection in exchange for a
181 -fee.
183 - 2. You may modify your copy or copies of the Library or any portion
184 -of it, thus forming a work based on the Library, and copy and
185 -distribute such modifications or work under the terms of Section 1
186 -above, provided that you also meet all of these conditions:
188 - a) The modified work must itself be a software library.
190 - b) You must cause the files modified to carry prominent notices
191 - stating that you changed the files and the date of any change.
193 - c) You must cause the whole of the work to be licensed at no
194 - charge to all third parties under the terms of this License.
196 - d) If a facility in the modified Library refers to a function or a
197 - table of data to be supplied by an application program that uses
198 - the facility, other than as an argument passed when the facility
199 - is invoked, then you must make a good faith effort to ensure that,
200 - in the event an application does not supply such function or
201 - table, the facility still operates, and performs whatever part of
202 - its purpose remains meaningful.
204 - (For example, a function in a library to compute square roots has
205 - a purpose that is entirely well-defined independent of the
206 - application. Therefore, Subsection 2d requires that any
207 - application-supplied function or table used by this function must
208 - be optional: if the application does not supply it, the square
209 - root function must still compute square roots.)
211 -These requirements apply to the modified work as a whole. If
212 -identifiable sections of that work are not derived from the Library,
213 -and can be reasonably considered independent and separate works in
214 -themselves, then this License, and its terms, do not apply to those
215 -sections when you distribute them as separate works. But when you
216 -distribute the same sections as part of a whole which is a work based
217 -on the Library, the distribution of the whole must be on the terms of
218 -this License, whose permissions for other licensees extend to the
219 -entire whole, and thus to each and every part regardless of who wrote
220 -it.
222 -Thus, it is not the intent of this section to claim rights or contest
223 -your rights to work written entirely by you; rather, the intent is to
224 -exercise the right to control the distribution of derivative or
225 -collective works based on the Library.
227 -In addition, mere aggregation of another work not based on the Library
228 -with the Library (or with a work based on the Library) on a volume of
229 -a storage or distribution medium does not bring the other work under
230 -the scope of this License.
232 - 3. You may opt to apply the terms of the ordinary GNU General Public
233 -License instead of this License to a given copy of the Library. To do
234 -this, you must alter all the notices that refer to this License, so
235 -that they refer to the ordinary GNU General Public License, version 2,
236 -instead of to this License. (If a newer version than version 2 of the
237 -ordinary GNU General Public License has appeared, then you can specify
238 -that version instead if you wish.) Do not make any other change in
239 -these notices.
241 - Once this change is made in a given copy, it is irreversible for
242 -that copy, so the ordinary GNU General Public License applies to all
243 -subsequent copies and derivative works made from that copy.
245 - This option is useful when you wish to copy part of the code of
246 -the Library into a program that is not a library.
248 - 4. You may copy and distribute the Library (or a portion or
249 -derivative of it, under Section 2) in object code or executable form
250 -under the terms of Sections 1 and 2 above provided that you accompany
251 -it with the complete corresponding machine-readable source code, which
252 -must be distributed under the terms of Sections 1 and 2 above on a
253 -medium customarily used for software interchange.
255 - If distribution of object code is made by offering access to copy
256 -from a designated place, then offering equivalent access to copy the
257 -source code from the same place satisfies the requirement to
258 -distribute the source code, even though third parties are not
259 -compelled to copy the source along with the object code.
261 - 5. A program that contains no derivative of any portion of the
262 -Library, but is designed to work with the Library by being compiled or
263 -linked with it, is called a "work that uses the Library". Such a
264 -work, in isolation, is not a derivative work of the Library, and
265 -therefore falls outside the scope of this License.
267 - However, linking a "work that uses the Library" with the Library
268 -creates an executable that is a derivative of the Library (because it
269 -contains portions of the Library), rather than a "work that uses the
270 -library". The executable is therefore covered by this License.
271 -Section 6 states terms for distribution of such executables.
273 - When a "work that uses the Library" uses material from a header file
274 -that is part of the Library, the object code for the work may be a
275 -derivative work of the Library even though the source code is not.
276 -Whether this is true is especially significant if the work can be
277 -linked without the Library, or if the work is itself a library. The
278 -threshold for this to be true is not precisely defined by law.
280 - If such an object file uses only numerical parameters, data
281 -structure layouts and accessors, and small macros and small inline
282 -functions (ten lines or less in length), then the use of the object
283 -file is unrestricted, regardless of whether it is legally a derivative
284 -work. (Executables containing this object code plus portions of the
285 -Library will still fall under Section 6.)
287 - Otherwise, if the work is a derivative of the Library, you may
288 -distribute the object code for the work under the terms of Section 6.
289 -Any executables containing that work also fall under Section 6,
290 -whether or not they are linked directly with the Library itself.
292 - 6. As an exception to the Sections above, you may also combine or
293 -link a "work that uses the Library" with the Library to produce a
294 -work containing portions of the Library, and distribute that work
295 -under terms of your choice, provided that the terms permit
296 -modification of the work for the customer's own use and reverse
297 -engineering for debugging such modifications.
299 - You must give prominent notice with each copy of the work that the
300 -Library is used in it and that the Library and its use are covered by
301 -this License. You must supply a copy of this License. If the work
302 -during execution displays copyright notices, you must include the
303 -copyright notice for the Library among them, as well as a reference
304 -directing the user to the copy of this License. Also, you must do one
305 -of these things:
307 - a) Accompany the work with the complete corresponding
308 - machine-readable source code for the Library including whatever
309 - changes were used in the work (which must be distributed under
310 - Sections 1 and 2 above); and, if the work is an executable linked
311 - with the Library, with the complete machine-readable "work that
312 - uses the Library", as object code and/or source code, so that the
313 - user can modify the Library and then relink to produce a modified
314 - executable containing the modified Library. (It is understood
315 - that the user who changes the contents of definitions files in the
316 - Library will not necessarily be able to recompile the application
317 - to use the modified definitions.)
319 - b) Use a suitable shared library mechanism for linking with the
320 - Library. A suitable mechanism is one that (1) uses at run time a
321 - copy of the library already present on the user's computer system,
322 - rather than copying library functions into the executable, and (2)
323 - will operate properly with a modified version of the library, if
324 - the user installs one, as long as the modified version is
325 - interface-compatible with the version that the work was made with.
327 - c) Accompany the work with a written offer, valid for at
328 - least three years, to give the same user the materials
329 - specified in Subsection 6a, above, for a charge no more
330 - than the cost of performing this distribution.
332 - d) If distribution of the work is made by offering access to copy
333 - from a designated place, offer equivalent access to copy the above
334 - specified materials from the same place.
336 - e) Verify that the user has already received a copy of these
337 - materials or that you have already sent this user a copy.
339 - For an executable, the required form of the "work that uses the
340 -Library" must include any data and utility programs needed for
341 -reproducing the executable from it. However, as a special exception,
342 -the materials to be distributed need not include anything that is
343 -normally distributed (in either source or binary form) with the major
344 -components (compiler, kernel, and so on) of the operating system on
345 -which the executable runs, unless that component itself accompanies
346 -the executable.
348 - It may happen that this requirement contradicts the license
349 -restrictions of other proprietary libraries that do not normally
350 -accompany the operating system. Such a contradiction means you cannot
351 -use both them and the Library together in an executable that you
352 -distribute.
354 - 7. You may place library facilities that are a work based on the
355 -Library side-by-side in a single library together with other library
356 -facilities not covered by this License, and distribute such a combined
357 -library, provided that the separate distribution of the work based on
358 -the Library and of the other library facilities is otherwise
359 -permitted, and provided that you do these two things:
361 - a) Accompany the combined library with a copy of the same work
362 - based on the Library, uncombined with any other library
363 - facilities. This must be distributed under the terms of the
364 - Sections above.
366 - b) Give prominent notice with the combined library of the fact
367 - that part of it is a work based on the Library, and explaining
368 - where to find the accompanying uncombined form of the same work.
370 - 8. You may not copy, modify, sublicense, link with, or distribute
371 -the Library except as expressly provided under this License. Any
372 -attempt otherwise to copy, modify, sublicense, link with, or
373 -distribute the Library is void, and will automatically terminate your
374 -rights under this License. However, parties who have received copies,
375 -or rights, from you under this License will not have their licenses
376 -terminated so long as such parties remain in full compliance.
378 - 9. You are not required to accept this License, since you have not
379 -signed it. However, nothing else grants you permission to modify or
380 -distribute the Library or its derivative works. These actions are
381 -prohibited by law if you do not accept this License. Therefore, by
382 -modifying or distributing the Library (or any work based on the
383 -Library), you indicate your acceptance of this License to do so, and
384 -all its terms and conditions for copying, distributing or modifying
385 -the Library or works based on it.
387 - 10. Each time you redistribute the Library (or any work based on the
388 -Library), the recipient automatically receives a license from the
389 -original licensor to copy, distribute, link with or modify the Library
390 -subject to these terms and conditions. You may not impose any further
391 -restrictions on the recipients' exercise of the rights granted herein.
392 -You are not responsible for enforcing compliance by third parties with
393 -this License.
395 - 11. If, as a consequence of a court judgment or allegation of patent
396 -infringement or for any other reason (not limited to patent issues),
397 -conditions are imposed on you (whether by court order, agreement or
398 -otherwise) that contradict the conditions of this License, they do not
399 -excuse you from the conditions of this License. If you cannot
400 -distribute so as to satisfy simultaneously your obligations under this
401 -License and any other pertinent obligations, then as a consequence you
402 -may not distribute the Library at all. For example, if a patent
403 -license would not permit royalty-free redistribution of the Library by
404 -all those who receive copies directly or indirectly through you, then
405 -the only way you could satisfy both it and this License would be to
406 -refrain entirely from distribution of the Library.
408 -If any portion of this section is held invalid or unenforceable under any
409 -particular circumstance, the balance of the section is intended to apply,
410 -and the section as a whole is intended to apply in other circumstances.
412 -It is not the purpose of this section to induce you to infringe any
413 -patents or other property right claims or to contest validity of any
414 -such claims; this section has the sole purpose of protecting the
415 -integrity of the free software distribution system which is
416 -implemented by public license practices. Many people have made
417 -generous contributions to the wide range of software distributed
418 -through that system in reliance on consistent application of that
419 -system; it is up to the author/donor to decide if he or she is willing
420 -to distribute software through any other system and a licensee cannot
421 -impose that choice.
423 -This section is intended to make thoroughly clear what is believed to
424 -be a consequence of the rest of this License.
426 - 12. If the distribution and/or use of the Library is restricted in
427 -certain countries either by patents or by copyrighted interfaces, the
428 -original copyright holder who places the Library under this License may add
429 -an explicit geographical distribution limitation excluding those countries,
430 -so that distribution is permitted only in or among countries not thus
431 -excluded. In such case, this License incorporates the limitation as if
432 -written in the body of this License.
434 - 13. The Free Software Foundation may publish revised and/or new
435 -versions of the Lesser General Public License from time to time.
436 -Such new versions will be similar in spirit to the present version,
437 -but may differ in detail to address new problems or concerns.
439 -Each version is given a distinguishing version number. If the Library
440 -specifies a version number of this License which applies to it and
441 -"any later version", you have the option of following the terms and
442 -conditions either of that version or of any later version published by
443 -the Free Software Foundation. If the Library does not specify a
444 -license version number, you may choose any version ever published by
445 -the Free Software Foundation.
447 - 14. If you wish to incorporate parts of the Library into other free
448 -programs whose distribution conditions are incompatible with these,
449 -write to the author to ask for permission. For software which is
450 -copyrighted by the Free Software Foundation, write to the Free
451 -Software Foundation; we sometimes make exceptions for this. Our
452 -decision will be guided by the two goals of preserving the free status
453 -of all derivatives of our free software and of promoting the sharing
454 -and reuse of software generally.
456 - NO WARRANTY
458 - 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
459 -WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
460 -EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
461 -OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
462 -KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
463 -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
464 -PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
465 -LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
466 -THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
468 - 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
469 -WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
470 -AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
471 -FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
472 -CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
473 -LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
474 -RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
475 -FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
476 -SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
477 -DAMAGES.
479 - END OF TERMS AND CONDITIONS
481 - How to Apply These Terms to Your New Libraries
483 - If you develop a new library, and you want it to be of the greatest
484 -possible use to the public, we recommend making it free software that
485 -everyone can redistribute and change. You can do so by permitting
486 -redistribution under these terms (or, alternatively, under the terms of the
487 -ordinary General Public License).
489 - To apply these terms, attach the following notices to the library. It is
490 -safest to attach them to the start of each source file to most effectively
491 -convey the exclusion of warranty; and each file should have at least the
492 -"copyright" line and a pointer to where the full notice is found.
494 - <one line to give the library's name and a brief idea of what it does.>
495 - Copyright (C) <year> <name of author>
497 - This library is free software; you can redistribute it and/or
498 - modify it under the terms of the GNU Lesser General Public
499 - License as published by the Free Software Foundation; either
500 - version 2.1 of the License, or (at your option) any later version.
502 - This library is distributed in the hope that it will be useful,
503 - but WITHOUT ANY WARRANTY; without even the implied warranty of
504 - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
505 - Lesser General Public License for more details.
507 - You should have received a copy of the GNU Lesser General Public
508 - License along with this library; if not, write to the Free Software
509 - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
511 -Also add information on how to contact you by electronic and paper mail.
513 -You should also get your employer (if you work as a programmer) or your
514 -school, if any, to sign a "copyright disclaimer" for the library, if
515 -necessary. Here is a sample; alter the names:
517 - Yoyodyne, Inc., hereby disclaims all copyright interest in the
518 - library `Frob' (a library for tweaking knobs) written by James Random Hacker.
520 - <signature of Ty Coon>, 1 April 1990
521 - Ty Coon, President of Vice
523 -That's all there is to it!
526 +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
527 +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
528 +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
529 +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
530 +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
531 +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
532 +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
533 +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
534 +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
535 +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
536 +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
537 diff --git a/other-licenses/ply/README b/other-licenses/ply/README
538 --- a/other-licenses/ply/README
539 +++ b/other-licenses/ply/README
540 @@ -1,9 +1,9 @@
541 David Beazley's PLY (Python Lex-Yacc)
542 http://www.dabeaz.com/ply/
544 -Licensed under the GPL (v2.1 or later).
545 +Licensed under BSD.
547 -This directory contains just the code and license from PLY version 2.5;
548 +This directory contains just the code and license from PLY version 3.3;
549 the full distribution (see the URL) also contains examples, tests,
550 documentation, and a longer README.
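
Before the lex.py diff itself, a minimal sketch of the lexer API that file implements may help orient the changes below. The token names and input are illustrative, not anything defined by this patch; it assumes PLY is importable as ply.lex:

    import ply.lex as lex

    tokens = ('NUMBER', 'PLUS')      # every returned token type must be declared

    t_PLUS = r'\+'                   # rules may be plain strings...
    t_ignore = ' \t'                 # ...and t_ignore lists characters to skip

    def t_NUMBER(t):                 # ...or functions whose docstring is the regex
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):                  # called on otherwise-unmatched characters
        t.lexer.skip(1)

    lexer = lex.lex()                # builds the lexer from this module's t_ names
    lexer.input("1 + 2")
    while True:
        tok = lexer.token()
        if not tok:
            break
        print(tok)   # LexToken(NUMBER,1,1,0), LexToken(PLUS,'+',1,2), ...
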
552 diff --git a/other-licenses/ply/ply/lex.py b/other-licenses/ply/ply/lex.py
553 --- a/other-licenses/ply/ply/lex.py
554 +++ b/other-licenses/ply/ply/lex.py
555 @@ -1,88 +1,119 @@
556 # -----------------------------------------------------------------------------
557 # ply: lex.py
559 -# Author: David M. Beazley (dave@dabeaz.com)
560 +# Copyright (C) 2001-2009,
561 +# David M. Beazley (Dabeaz LLC)
562 +# All rights reserved.
564 -# Copyright (C) 2001-2008, David M. Beazley
565 +# Redistribution and use in source and binary forms, with or without
566 +# modification, are permitted provided that the following conditions are
567 +# met:
569 +# * Redistributions of source code must retain the above copyright notice,
570 +# this list of conditions and the following disclaimer.
571 +# * Redistributions in binary form must reproduce the above copyright notice,
572 +# this list of conditions and the following disclaimer in the documentation
573 +# and/or other materials provided with the distribution.
574 +# * Neither the name of the David Beazley or Dabeaz LLC may be used to
575 +# endorse or promote products derived from this software without
576 +# specific prior written permission.
578 -# This library is free software; you can redistribute it and/or
579 -# modify it under the terms of the GNU Lesser General Public
580 -# License as published by the Free Software Foundation; either
581 -# version 2.1 of the License, or (at your option) any later version.
583 -# This library is distributed in the hope that it will be useful,
584 -# but WITHOUT ANY WARRANTY; without even the implied warranty of
585 -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
586 -# Lesser General Public License for more details.
588 -# You should have received a copy of the GNU Lesser General Public
589 -# License along with this library; if not, write to the Free Software
590 -# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
592 -# See the file COPYING for a complete copy of the LGPL.
593 +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
594 +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
595 +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
596 +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
597 +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
598 +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
599 +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
600 +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
601 +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
602 +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
603 +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
604 # -----------------------------------------------------------------------------
606 -__version__ = "2.5"
607 -__tabversion__ = "2.4" # Version of table file used
608 +__version__ = "3.3"
609 +__tabversion__ = "3.2" # Version of table file used
611 import re, sys, types, copy, os
613 +# This tuple contains known string types
614 +try:
615 + # Python 2.6
616 + StringTypes = (types.StringType, types.UnicodeType)
617 +except AttributeError:
618 + # Python 3.0
619 + StringTypes = (str, bytes)
621 +# Extract the code attribute of a function. Different implementations
622 +# are for Python 2/3 compatibility.
624 +if sys.version_info[0] < 3:
625 + def func_code(f):
626 + return f.func_code
627 +else:
628 + def func_code(f):
629 + return f.__code__
631 # This regular expression is used to match valid token names
632 _is_identifier = re.compile(r'^[a-zA-Z0-9_]+$')
634 -# _INSTANCETYPE sets the valid set of instance types recognized
635 -# by PLY when lexers are defined by a class. In order to maintain
636 -# backwards compatibility with Python-2.0, we have to check for
637 -# the existence of ObjectType.
639 -try:
640 - _INSTANCETYPE = (types.InstanceType, types.ObjectType)
641 -except AttributeError:
642 - _INSTANCETYPE = types.InstanceType
643 - class object: pass # Note: needed if no new-style classes present
645 # Exception thrown when invalid token encountered and no default error
646 # handler is defined.
648 class LexError(Exception):
649 def __init__(self,message,s):
650 self.args = (message,)
651 self.text = s
653 -# An object used to issue one-time warning messages for various features
655 -class LexWarning(object):
656 - def __init__(self):
657 - self.warned = 0
658 - def __call__(self,msg):
659 - if not self.warned:
660 - sys.stderr.write("ply.lex: Warning: " + msg+"\n")
661 - self.warned = 1
663 -_SkipWarning = LexWarning() # Warning for use of t.skip() on tokens
665 # Token class. This class is used to represent the tokens produced.
666 class LexToken(object):
667 def __str__(self):
668 return "LexToken(%s,%r,%d,%d)" % (self.type,self.value,self.lineno,self.lexpos)
669 def __repr__(self):
670 return str(self)
671 - def skip(self,n):
672 - self.lexer.skip(n)
673 - _SkipWarning("Calling t.skip() on a token is deprecated. Please use t.lexer.skip()")
675 +# This object is a stand-in for a logging object created by the
676 +# logging module.
678 +class PlyLogger(object):
679 + def __init__(self,f):
680 + self.f = f
681 + def critical(self,msg,*args,**kwargs):
682 + self.f.write((msg % args) + "\n")
684 + def warning(self,msg,*args,**kwargs):
685 + self.f.write("WARNING: "+ (msg % args) + "\n")
687 + def error(self,msg,*args,**kwargs):
688 + self.f.write("ERROR: " + (msg % args) + "\n")
690 + info = critical
691 + debug = critical
693 +# Null logger is used when no output is generated. Does nothing.
694 +class NullLogger(object):
695 + def __getattribute__(self,name):
696 + return self
697 + def __call__(self,*args,**kwargs):
698 + return self
700 # -----------------------------------------------------------------------------
701 -# Lexer class
702 +# === Lexing Engine ===
704 -# This class encapsulates all of the methods and data associated with a lexer.
705 +# The following Lexer class implements the lexer runtime. There are only
706 +# a few public methods and attributes:
708 # input() - Store a new string in the lexer
709 # token() - Get the next token
710 +# clone() - Clone the lexer
712 +# lineno - Current line number
713 +# lexpos - Current position in the input string
714 # -----------------------------------------------------------------------------
716 class Lexer:
717 def __init__(self):
718 self.lexre = None # Master regular expression. This is a list of
719 # tuples (re,findex) where re is a compiled
720 # regular expression and findex is a list
721 # mapping regex group numbers to rules
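
The StringTypes and func_code() shims added in the hunk above are the patch's main portability device; the rest of the file is rewritten in terms of them. A standalone sketch of the same idiom (demo is a hypothetical function), runnable under either interpreter:

    import sys, types

    try:
        StringTypes = (types.StringType, types.UnicodeType)   # Python 2
    except AttributeError:
        StringTypes = (str, bytes)                            # Python 3

    if sys.version_info[0] < 3:
        def func_code(f):
            return f.func_code                                # 2.x attribute
    else:
        def func_code(f):
            return f.__code__                                 # 3.x attribute

    def demo():
        pass

    print(isinstance("abc", StringTypes))   # True on both 2 and 3
    print(func_code(demo).co_name)          # 'demo'
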
722 @@ -100,17 +131,16 @@ class Lexer:
723 self.lexpos = 0 # Current position in input text
724 self.lexlen = 0 # Length of the input text
725 self.lexerrorf = None # Error rule (if any)
726 self.lextokens = None # List of valid tokens
727 self.lexignore = "" # Ignored characters
728 self.lexliterals = "" # Literal characters that can be passed through
729 self.lexmodule = None # Module
730 self.lineno = 1 # Current line number
731 - self.lexdebug = 0 # Debugging mode
732 self.lexoptimize = 0 # Optimized mode
734 def clone(self,object=None):
735 c = copy.copy(self)
737 # If the object parameter has been supplied, it means we are attaching the
738 # lexer to a new object. In this case, we have to rebind all methods in
739 # the lexstatere and lexstateerrorf tables.
740 @@ -140,16 +170,17 @@ class Lexer:
741 # ------------------------------------------------------------
742 def writetab(self,tabfile,outputdir=""):
743 if isinstance(tabfile,types.ModuleType):
744 return
745 basetabfilename = tabfile.split(".")[-1]
746 filename = os.path.join(outputdir,basetabfilename)+".py"
747 tf = open(filename,"w")
748 tf.write("# %s.py. This file automatically created by PLY (version %s). Don't edit!\n" % (tabfile,__version__))
749 + tf.write("_tabversion = %s\n" % repr(__version__))
750 tf.write("_lextokens = %s\n" % repr(self.lextokens))
751 tf.write("_lexreflags = %s\n" % repr(self.lexreflags))
752 tf.write("_lexliterals = %s\n" % repr(self.lexliterals))
753 tf.write("_lexstateinfo = %s\n" % repr(self.lexstateinfo))
755 tabre = { }
756 # Collect all functions in the initial state
757 initial = self.lexstatere["INITIAL"]
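
The _tabversion line written above is what readtab(), in the next hunk, compares against __version__, raising ImportError("Inconsistent PLY version") on mismatch. A generated table file now begins roughly like this (token names and values are illustrative):

    # lextab.py. This file automatically created by PLY (version 3.3). Don't edit!
    _tabversion = '3.3'
    _lextokens = {'NUMBER': 1, 'PLUS': 1}
    _lexreflags = 0
    _lexliterals = ''
    _lexstateinfo = {'INITIAL': 'inclusive'}
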
758 @@ -179,55 +210,64 @@ class Lexer:
760 # ------------------------------------------------------------
761 # readtab() - Read lexer information from a tab file
762 # ------------------------------------------------------------
763 def readtab(self,tabfile,fdict):
764 if isinstance(tabfile,types.ModuleType):
765 lextab = tabfile
766 else:
767 - exec "import %s as lextab" % tabfile
768 + if sys.version_info[0] < 3:
769 + exec("import %s as lextab" % tabfile)
770 + else:
771 + env = { }
772 + exec("import %s as lextab" % tabfile, env,env)
773 + lextab = env['lextab']
775 + if getattr(lextab,"_tabversion","0.0") != __version__:
776 + raise ImportError("Inconsistent PLY version")
778 self.lextokens = lextab._lextokens
779 self.lexreflags = lextab._lexreflags
780 self.lexliterals = lextab._lexliterals
781 self.lexstateinfo = lextab._lexstateinfo
782 self.lexstateignore = lextab._lexstateignore
783 self.lexstatere = { }
784 self.lexstateretext = { }
785 for key,lre in lextab._lexstatere.items():
786 titem = []
787 txtitem = []
788 for i in range(len(lre)):
789 - titem.append((re.compile(lre[i][0],lextab._lexreflags),_names_to_funcs(lre[i][1],fdict)))
790 + titem.append((re.compile(lre[i][0],lextab._lexreflags | re.VERBOSE),_names_to_funcs(lre[i][1],fdict)))
791 txtitem.append(lre[i][0])
792 self.lexstatere[key] = titem
793 self.lexstateretext[key] = txtitem
794 self.lexstateerrorf = { }
795 for key,ef in lextab._lexstateerrorf.items():
796 self.lexstateerrorf[key] = fdict[ef]
797 self.begin('INITIAL')
799 # ------------------------------------------------------------
800 # input() - Push a new string into the lexer
801 # ------------------------------------------------------------
802 def input(self,s):
803 # Pull off the first character to see if s looks like a string
804 c = s[:1]
805 - if not (isinstance(c,types.StringType) or isinstance(c,types.UnicodeType)):
806 - raise ValueError, "Expected a string"
807 + if not isinstance(c,StringTypes):
808 + raise ValueError("Expected a string")
809 self.lexdata = s
810 self.lexpos = 0
811 self.lexlen = len(s)
813 # ------------------------------------------------------------
814 # begin() - Changes the lexing state
815 # ------------------------------------------------------------
816 def begin(self,state):
817 - if not self.lexstatere.has_key(state):
818 - raise ValueError, "Undefined state"
819 + if not state in self.lexstatere:
820 + raise ValueError("Undefined state")
821 self.lexre = self.lexstatere[state]
822 self.lexretext = self.lexstateretext[state]
823 self.lexignore = self.lexstateignore.get(state,"")
824 self.lexerrorf = self.lexstateerrorf.get(state,None)
825 self.lexstate = state
827 # ------------------------------------------------------------
828 # push_state() - Changes the lexing state and saves old on stack
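
begin(), whose membership test the hunk above modernizes (has_key is gone in Python 3), together with push_state()/pop_state(), drives PLY's lexer states. A small hypothetical example; the state and rule names are illustrative only:

    import ply.lex as lex

    tokens = ('WORD',)
    states = (('comment', 'exclusive'),)     # (statename, 'exclusive'|'inclusive')

    t_WORD = r'\w+'
    t_ignore = ' \t'

    def t_begin_comment(t):
        r'/\*'
        t.lexer.begin('comment')             # raises ValueError for undefined states

    def t_comment_end(t):
        r'\*/'
        t.lexer.begin('INITIAL')

    t_comment_ignore = ' \t'

    def t_comment_error(t):
        t.lexer.skip(1)                      # swallow anything else inside /* ... */

    def t_error(t):
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.input("hello /* skipped */ world")
    print([tok.value for tok in lexer])      # ['hello', 'world']
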
829 @@ -250,17 +290,17 @@ class Lexer:
831 # ------------------------------------------------------------
832 # skip() - Skip ahead n characters
833 # ------------------------------------------------------------
834 def skip(self,n):
835 self.lexpos += n
837 # ------------------------------------------------------------
838 - # token() - Return the next token from the Lexer
839 + # opttoken() - Return the next token from the Lexer
841 # Note: This function has been carefully implemented to be as fast
842 # as possible. Don't make changes unless you really know what
843 # you are doing
844 # ------------------------------------------------------------
845 def token(self):
846 # Make local copies of frequently referenced attributes
847 lexpos = self.lexpos
848 @@ -294,39 +334,35 @@ class Lexer:
849 self.lexpos = m.end()
850 return tok
851 else:
852 lexpos = m.end()
853 break
855 lexpos = m.end()
857 - # if func not callable, it means it's an ignored token
858 - if not callable(func):
859 - break
861 # If token is processed by a function, call it
863 tok.lexer = self # Set additional attributes useful in token rules
864 self.lexmatch = m
865 self.lexpos = lexpos
867 newtok = func(tok)
869 # Every function must return a token, if nothing, we just move to next token
870 if not newtok:
871 lexpos = self.lexpos # This is here in case user has updated lexpos.
872 lexignore = self.lexignore # This is here in case there was a state change
873 break
875 # Verify type of the token. If not in the token map, raise an error
876 if not self.lexoptimize:
877 - if not self.lextokens.has_key(newtok.type):
878 - raise LexError, ("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
879 - func.func_code.co_filename, func.func_code.co_firstlineno,
880 + if not newtok.type in self.lextokens:
881 + raise LexError("%s:%d: Rule '%s' returned an unknown token type '%s'" % (
882 + func_code(func).co_filename, func_code(func).co_firstlineno,
883 func.__name__, newtok.type),lexdata[lexpos:])
885 return newtok
886 else:
887 # No match, see if in literals
888 if lexdata[lexpos] in self.lexliterals:
889 tok = LexToken()
890 tok.value = lexdata[lexpos]
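
The retyped check above means that, outside optimized mode, a rule returning an undeclared token type still raises LexError, with the message now built through the portable func_code() accessor instead of f.func_code. A hypothetical trigger:

    import ply.lex as lex

    tokens = ('NUMBER',)
    t_ignore = ' '

    def t_NUMBER(t):
        r'\d+'
        t.type = 'FLOAT'        # 'FLOAT' was never declared in tokens
        return t

    def t_error(t):
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.input("3")
    try:
        lexer.token()
    except lex.LexError as e:
        print(e)   # ...Rule 't_NUMBER' returned an unknown token type 'FLOAT'
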
891 @@ -343,70 +379,70 @@ class Lexer:
892 tok.lineno = self.lineno
893 tok.type = "error"
894 tok.lexer = self
895 tok.lexpos = lexpos
896 self.lexpos = lexpos
897 newtok = self.lexerrorf(tok)
898 if lexpos == self.lexpos:
899 # Error method didn't change text position at all. This is an error.
900 - raise LexError, ("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
901 + raise LexError("Scanning error. Illegal character '%s'" % (lexdata[lexpos]), lexdata[lexpos:])
902 lexpos = self.lexpos
903 if not newtok: continue
904 return newtok
906 self.lexpos = lexpos
907 - raise LexError, ("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
908 + raise LexError("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
910 self.lexpos = lexpos + 1
911 if self.lexdata is None:
912 - raise RuntimeError, "No input string given with input()"
913 + raise RuntimeError("No input string given with input()")
914 return None
916 + # Iterator interface
917 + def __iter__(self):
918 + return self
920 + def next(self):
921 + t = self.token()
922 + if t is None:
923 + raise StopIteration
924 + return t
926 + __next__ = next
928 # -----------------------------------------------------------------------------
929 -# _validate_file()
930 +# ==== Lex Builder ===
932 -# This checks to see if there are duplicated t_rulename() functions or strings
933 -# in the parser input file. This is done using a simple regular expression
934 -# match on each line in the given file. If the file can't be located or opened,
935 -# a true result is returned by default.
936 +# The functions and classes below are used to collect lexing information
937 +# and build a Lexer object from it.
938 # -----------------------------------------------------------------------------
940 -def _validate_file(filename):
941 - import os.path
942 - base,ext = os.path.splitext(filename)
943 - if ext != '.py': return 1 # No idea what the file is. Return OK
944 +# -----------------------------------------------------------------------------
945 +# get_caller_module_dict()
947 +# This function returns a dictionary containing all of the symbols defined within
948 +# a caller further down the call stack. This is used to get the environment
949 +# associated with the yacc() call if none was provided.
950 +# -----------------------------------------------------------------------------
952 +def get_caller_module_dict(levels):
953 try:
954 - f = open(filename)
955 - lines = f.readlines()
956 - f.close()
957 - except IOError:
958 - return 1 # Couldn't find the file. Don't worry about it
959 + raise RuntimeError
960 + except RuntimeError:
961 + e,b,t = sys.exc_info()
962 + f = t.tb_frame
963 + while levels > 0:
964 + f = f.f_back
965 + levels -= 1
966 + ldict = f.f_globals.copy()
967 + if f.f_globals != f.f_locals:
968 + ldict.update(f.f_locals)
970 - fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
971 - sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
973 - counthash = { }
974 - linen = 1
975 - noerror = 1
976 - for l in lines:
977 - m = fre.match(l)
978 - if not m:
979 - m = sre.match(l)
980 - if m:
981 - name = m.group(1)
982 - prev = counthash.get(name)
983 - if not prev:
984 - counthash[name] = linen
985 - else:
986 - print >>sys.stderr, "%s:%d: Rule %s redefined. Previously defined on line %d" % (filename,linen,name,prev)
987 - noerror = 0
988 - linen += 1
989 - return noerror
990 + return ldict
992 # -----------------------------------------------------------------------------
993 # _funcs_to_names()
995 # Given a list of regular expression functions, this converts it to a list
996 # suitable for output to a table file
997 # -----------------------------------------------------------------------------
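
get_caller_module_dict(), added above, recovers a caller's namespace by raising and catching a throwaway exception, then walking f_back up the stack. A self-contained sketch of the trick (caller_dict and demo are hypothetical names mirroring the code above):

    import sys

    def caller_dict(levels):
        try:
            raise RuntimeError
        except RuntimeError:
            f = sys.exc_info()[2].tb_frame   # frame where we raised
            while levels > 0:
                f = f.f_back                 # step outward toward the caller
                levels -= 1
            d = f.f_globals.copy()
            if f.f_globals != f.f_locals:
                d.update(f.f_locals)         # merge function-local names too
            return d

    def demo():
        local_name = 42
        return caller_dict(1)                # one level out: demo()'s own frame

    print('local_name' in demo())            # True
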
999 @@ -461,17 +497,17 @@ def _form_master_re(relist,reflags,ldict
1000 elif handle is not None:
1001 lexindexnames[i] = f
1002 if f.find("ignore_") > 0:
1003 lexindexfunc[i] = (None,None)
1004 else:
1005 lexindexfunc[i] = (None, toknames[f])
1007 return [(lexre,lexindexfunc)],[regex],[lexindexnames]
1008 - except Exception,e:
1009 + except Exception:
1010 m = int(len(relist)/2)
1011 if m == 0: m = 1
1012 llist, lre, lnames = _form_master_re(relist[:m],reflags,ldict,toknames)
1013 rlist, rre, rnames = _form_master_re(relist[m:],reflags,ldict,toknames)
1014 return llist+rlist, lre+rre, lnames+rnames
1016 # -----------------------------------------------------------------------------
1017 # def _statetoken(s,names)
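
_form_master_re() joins every rule into one master regex of named groups and, when the combined pattern grows too large for Python's re module, recursively splits the rule list in half (the retyped except clause above). The named-group mechanics, reduced to a sketch; PLY's real code maps m.lastindex through a rule table rather than using m.lastgroup:

    import re

    # Each t_ rule becomes one (?P<name>...) group in a combined pattern.
    master = re.compile(r"(?P<NUMBER>\d+)|(?P<PLUS>\+)", re.VERBOSE)

    m = master.match("42")
    print(m.lastgroup, m.group())   # NUMBER 42
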
1018 @@ -481,360 +517,487 @@ def _form_master_re(relist,reflags,ldict
1019 # is a tuple of state names and tokenname is the name of the token. For example,
1020 # calling this with s = "t_foo_bar_SPAM" might return (('foo','bar'),'SPAM')
1021 # -----------------------------------------------------------------------------
1023 def _statetoken(s,names):
1024 nonstate = 1
1025 parts = s.split("_")
1026 for i in range(1,len(parts)):
1027 - if not names.has_key(parts[i]) and parts[i] != 'ANY': break
1028 + if not parts[i] in names and parts[i] != 'ANY': break
1029 if i > 1:
1030 states = tuple(parts[1:i])
1031 else:
1032 states = ('INITIAL',)
1034 if 'ANY' in states:
1035 - states = tuple(names.keys())
1036 + states = tuple(names)
1038 tokenname = "_".join(parts[i:])
1039 return (states,tokenname)
1042 +# -----------------------------------------------------------------------------
1043 +# LexerReflect()
1045 +# This class represents information needed to build a lexer as extracted from a
1046 +# user's input file.
1047 +# -----------------------------------------------------------------------------
1048 +class LexerReflect(object):
1049 + def __init__(self,ldict,log=None,reflags=0):
1050 + self.ldict = ldict
1051 + self.error_func = None
1052 + self.tokens = []
1053 + self.reflags = reflags
1054 + self.stateinfo = { 'INITIAL' : 'inclusive'}
1055 + self.files = {}
1056 + self.error = 0
1058 + if log is None:
1059 + self.log = PlyLogger(sys.stderr)
1060 + else:
1061 + self.log = log
1063 + # Get all of the basic information
1064 + def get_all(self):
1065 + self.get_tokens()
1066 + self.get_literals()
1067 + self.get_states()
1068 + self.get_rules()
1070 + # Validate all of the information
1071 + def validate_all(self):
1072 + self.validate_tokens()
1073 + self.validate_literals()
1074 + self.validate_rules()
1075 + return self.error
1077 + # Get the tokens map
1078 + def get_tokens(self):
1079 + tokens = self.ldict.get("tokens",None)
1080 + if not tokens:
1081 + self.log.error("No token list is defined")
1082 + self.error = 1
1083 + return
1085 + if not isinstance(tokens,(list, tuple)):
1086 + self.log.error("tokens must be a list or tuple")
1087 + self.error = 1
1088 + return
1090 + if not tokens:
1091 + self.log.error("tokens is empty")
1092 + self.error = 1
1093 + return
1095 + self.tokens = tokens
1097 + # Validate the tokens
1098 + def validate_tokens(self):
1099 + terminals = {}
1100 + for n in self.tokens:
1101 + if not _is_identifier.match(n):
1102 + self.log.error("Bad token name '%s'",n)
1103 + self.error = 1
1104 + if n in terminals:
1105 + self.log.warning("Token '%s' multiply defined", n)
1106 + terminals[n] = 1
1108 + # Get the literals specifier
1109 + def get_literals(self):
1110 + self.literals = self.ldict.get("literals","")
1112 + # Validate literals
1113 + def validate_literals(self):
1114 + try:
1115 + for c in self.literals:
1116 + if not isinstance(c,StringTypes) or len(c) > 1:
1117 + self.log.error("Invalid literal %s. Must be a single character", repr(c))
1118 + self.error = 1
1119 + continue
1121 + except TypeError:
1122 + self.log.error("Invalid literals specification. literals must be a sequence of characters")
1123 + self.error = 1
1125 + def get_states(self):
1126 + self.states = self.ldict.get("states",None)
1127 + # Build statemap
1128 + if self.states:
1129 + if not isinstance(self.states,(tuple,list)):
1130 + self.log.error("states must be defined as a tuple or list")
1131 + self.error = 1
1132 + else:
1133 + for s in self.states:
1134 + if not isinstance(s,tuple) or len(s) != 2:
1135 + self.log.error("Invalid state specifier %s. Must be a tuple (statename,'exclusive|inclusive')",repr(s))
1136 + self.error = 1
1137 + continue
1138 + name, statetype = s
1139 + if not isinstance(name,StringTypes):
1140 + self.log.error("State name %s must be a string", repr(name))
1141 + self.error = 1
1142 + continue
1143 + if not (statetype == 'inclusive' or statetype == 'exclusive'):
1144 + self.log.error("State type for state %s must be 'inclusive' or 'exclusive'",name)
1145 + self.error = 1
1146 + continue
1147 + if name in self.stateinfo:
1148 + self.log.error("State '%s' already defined",name)
1149 + self.error = 1
1150 + continue
1151 + self.stateinfo[name] = statetype
1153 + # Get all of the symbols with a t_ prefix and sort them into various
1154 + # categories (functions, strings, error functions, and ignore characters)
1156 + def get_rules(self):
1157 + tsymbols = [f for f in self.ldict if f[:2] == 't_' ]
1159 + # Now build up a list of functions and a list of strings
1161 + self.toknames = { } # Mapping of symbols to token names
1162 + self.funcsym = { } # Symbols defined as functions
1163 + self.strsym = { } # Symbols defined as strings
1164 + self.ignore = { } # Ignore strings by state
1165 + self.errorf = { } # Error functions by state
1167 + for s in self.stateinfo:
1168 + self.funcsym[s] = []
1169 + self.strsym[s] = []
1171 + if len(tsymbols) == 0:
1172 + self.log.error("No rules of the form t_rulename are defined")
1173 + self.error = 1
1174 + return
1176 + for f in tsymbols:
1177 + t = self.ldict[f]
1178 + states, tokname = _statetoken(f,self.stateinfo)
1179 + self.toknames[f] = tokname
1181 + if hasattr(t,"__call__"):
1182 + if tokname == 'error':
1183 + for s in states:
1184 + self.errorf[s] = t
1185 + elif tokname == 'ignore':
1186 + line = func_code(t).co_firstlineno
1187 + file = func_code(t).co_filename
1188 + self.log.error("%s:%d: Rule '%s' must be defined as a string",file,line,t.__name__)
1189 + self.error = 1
1190 + else:
1191 + for s in states:
1192 + self.funcsym[s].append((f,t))
1193 + elif isinstance(t, StringTypes):
1194 + if tokname == 'ignore':
1195 + for s in states:
1196 + self.ignore[s] = t
1197 + if "\\" in t:
1198 + self.log.warning("%s contains a literal backslash '\\'",f)
1200 + elif tokname == 'error':
1201 + self.log.error("Rule '%s' must be defined as a function", f)
1202 + self.error = 1
1203 + else:
1204 + for s in states:
1205 + self.strsym[s].append((f,t))
1206 + else:
1207 + self.log.error("%s not defined as a function or string", f)
1208 + self.error = 1
1210 + # Sort the functions by line number
1211 + for f in self.funcsym.values():
1212 + if sys.version_info[0] < 3:
1213 + f.sort(lambda x,y: cmp(func_code(x[1]).co_firstlineno,func_code(y[1]).co_firstlineno))
1214 + else:
1215 + # Python 3.0
1216 + f.sort(key=lambda x: func_code(x[1]).co_firstlineno)
1218 + # Sort the strings by regular expression length
1219 + for s in self.strsym.values():
1220 + if sys.version_info[0] < 3:
1221 + s.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
1222 + else:
1223 + # Python 3.0
1224 + s.sort(key=lambda x: len(x[1]),reverse=True)
1226 + # Validate all of the t_rules collected
1227 + def validate_rules(self):
1228 + for state in self.stateinfo:
1229 + # Validate all rules defined by functions
1233 + for fname, f in self.funcsym[state]:
1234 + line = func_code(f).co_firstlineno
1235 + file = func_code(f).co_filename
1236 + self.files[file] = 1
1238 + tokname = self.toknames[fname]
1239 + if isinstance(f, types.MethodType):
1240 + reqargs = 2
1241 + else:
1242 + reqargs = 1
1243 + nargs = func_code(f).co_argcount
1244 + if nargs > reqargs:
1245 + self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,f.__name__)
1246 + self.error = 1
1247 + continue
1249 + if nargs < reqargs:
1250 + self.log.error("%s:%d: Rule '%s' requires an argument", file,line,f.__name__)
1251 + self.error = 1
1252 + continue
1254 + if not f.__doc__:
1255 + self.log.error("%s:%d: No regular expression defined for rule '%s'",file,line,f.__name__)
1256 + self.error = 1
1257 + continue
1259 + try:
1260 + c = re.compile("(?P<%s>%s)" % (fname,f.__doc__), re.VERBOSE | self.reflags)
1261 + if c.match(""):
1262 + self.log.error("%s:%d: Regular expression for rule '%s' matches empty string", file,line,f.__name__)
1263 + self.error = 1
1264 + except re.error:
1265 + _etype, e, _etrace = sys.exc_info()
1266 + self.log.error("%s:%d: Invalid regular expression for rule '%s'. %s", file,line,f.__name__,e)
1267 + if '#' in f.__doc__:
1268 + self.log.error("%s:%d. Make sure '#' in rule '%s' is escaped with '\\#'",file,line, f.__name__)
1269 + self.error = 1
1271 + # Validate all rules defined by strings
1272 + for name,r in self.strsym[state]:
1273 + tokname = self.toknames[name]
1274 + if tokname == 'error':
1275 + self.log.error("Rule '%s' must be defined as a function", name)
1276 + self.error = 1
1277 + continue
1279 + if not tokname in self.tokens and tokname.find("ignore_") < 0:
1280 + self.log.error("Rule '%s' defined for an unspecified token %s",name,tokname)
1281 + self.error = 1
1282 + continue
1284 + try:
1285 + c = re.compile("(?P<%s>%s)" % (name,r),re.VERBOSE | self.reflags)
1286 + if (c.match("")):
1287 + self.log.error("Regular expression for rule '%s' matches empty string",name)
1288 + self.error = 1
1289 + except re.error:
1290 + _etype, e, _etrace = sys.exc_info()
1291 + self.log.error("Invalid regular expression for rule '%s'. %s",name,e)
1292 + if '#' in r:
1293 + self.log.error("Make sure '#' in rule '%s' is escaped with '\\#'",name)
1294 + self.error = 1
1296 + if not self.funcsym[state] and not self.strsym[state]:
1297 + self.log.error("No rules defined for state '%s'",state)
1298 + self.error = 1
1300 + # Validate the error function
1301 + efunc = self.errorf.get(state,None)
1302 + if efunc:
1303 + f = efunc
1304 + line = func_code(f).co_firstlineno
1305 + file = func_code(f).co_filename
1306 + self.files[file] = 1
1308 + if isinstance(f, types.MethodType):
1309 + reqargs = 2
1310 + else:
1311 + reqargs = 1
1312 + nargs = func_code(f).co_argcount
1313 + if nargs > reqargs:
1314 + self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,f.__name__)
1315 + self.error = 1
1317 + if nargs < reqargs:
1318 + self.log.error("%s:%d: Rule '%s' requires an argument", file,line,f.__name__)
1319 + self.error = 1
1321 + for f in self.files:
1322 + self.validate_file(f)
1325 + # -----------------------------------------------------------------------------
1326 + # validate_file()
1328 + # This checks to see if there are duplicated t_rulename() functions or strings
1329 +    # in the lexer input file.  This is done using a simple regular expression
1330 + # match on each line in the given file.
1331 + # -----------------------------------------------------------------------------
1333 + def validate_file(self,filename):
1334 + import os.path
1335 + base,ext = os.path.splitext(filename)
1336 + if ext != '.py': return # No idea what the file is. Return OK
1338 + try:
1339 + f = open(filename)
1340 + lines = f.readlines()
1341 + f.close()
1342 + except IOError:
1343 + return # Couldn't find the file. Don't worry about it
1345 + fre = re.compile(r'\s*def\s+(t_[a-zA-Z_0-9]*)\(')
1346 + sre = re.compile(r'\s*(t_[a-zA-Z_0-9]*)\s*=')
1348 + counthash = { }
1349 + linen = 1
1350 + for l in lines:
1351 + m = fre.match(l)
1352 + if not m:
1353 + m = sre.match(l)
1354 + if m:
1355 + name = m.group(1)
1356 + prev = counthash.get(name)
1357 + if not prev:
1358 + counthash[name] = linen
1359 + else:
1360 + self.log.error("%s:%d: Rule %s redefined. Previously defined on line %d",filename,linen,name,prev)
1361 + self.error = 1
1362 + linen += 1
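A hedged illustration of the mistake this scan catches (hypothetical module fragment, not from the patch): a copy-pasted rule silently replaces the earlier definition when Python executes the module, so only the line-by-line regex scan above notices it.

    def t_NUMBER(t):
        r'\d+'
        return t

    def t_NUMBER(t):   # duplicate name: validate_file() reports this redefinition
        r'0x[0-9a-fA-F]+'
        return t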
1364 # -----------------------------------------------------------------------------
1365 # lex(module)
1367 # Build all of the regular expression rules from definitions in the supplied module
1368 # -----------------------------------------------------------------------------
1369 -def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,nowarn=0,outputdir=""):
1370 +def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,nowarn=0,outputdir="", debuglog=None, errorlog=None):
1371 global lexer
1372 ldict = None
1373 stateinfo = { 'INITIAL' : 'inclusive'}
1374 - error = 0
1375 - files = { }
1376 lexobj = Lexer()
1377 - lexobj.lexdebug = debug
1378 lexobj.lexoptimize = optimize
1379 global token,input
1381 - if nowarn: warn = 0
1382 - else: warn = 1
1383 + if errorlog is None:
1384 + errorlog = PlyLogger(sys.stderr)
1386 + if debug:
1387 + if debuglog is None:
1388 + debuglog = PlyLogger(sys.stderr)
1390 + # Get the module dictionary used for the lexer
1391 if object: module = object
1393 if module:
1394 - # User supplied a module object.
1395 - if isinstance(module, types.ModuleType):
1396 - ldict = module.__dict__
1397 - elif isinstance(module, _INSTANCETYPE):
1398 - _items = [(k,getattr(module,k)) for k in dir(module)]
1399 - ldict = { }
1400 - for (i,v) in _items:
1401 - ldict[i] = v
1402 - else:
1403 - raise ValueError,"Expected a module or instance"
1404 - lexobj.lexmodule = module
1405 + _items = [(k,getattr(module,k)) for k in dir(module)]
1406 + ldict = dict(_items)
1407 + else:
1408 + ldict = get_caller_module_dict(2)
1410 - else:
1411 - # No module given. We might be able to get information from the caller.
1412 - try:
1413 - raise RuntimeError
1414 - except RuntimeError:
1415 - e,b,t = sys.exc_info()
1416 - f = t.tb_frame
1417 - f = f.f_back # Walk out to our calling function
1418 - if f.f_globals is f.f_locals: # Collect global and local variations from caller
1419 - ldict = f.f_globals
1420 - else:
1421 - ldict = f.f_globals.copy()
1422 - ldict.update(f.f_locals)
1423 +    # Collect lexer information from the dictionary
1424 + linfo = LexerReflect(ldict,log=errorlog,reflags=reflags)
1425 + linfo.get_all()
1426 + if not optimize:
1427 + if linfo.validate_all():
1428 + raise SyntaxError("Can't build lexer")
1430 if optimize and lextab:
1431 try:
1432 lexobj.readtab(lextab,ldict)
1433 token = lexobj.token
1434 input = lexobj.input
1435 lexer = lexobj
1436 return lexobj
1438 except ImportError:
1439 pass
1441 - # Get the tokens, states, and literals variables (if any)
1443 - tokens = ldict.get("tokens",None)
1444 - states = ldict.get("states",None)
1445 - literals = ldict.get("literals","")
1447 - if not tokens:
1448 - raise SyntaxError,"lex: module does not define 'tokens'"
1450 - if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
1451 - raise SyntaxError,"lex: tokens must be a list or tuple."
1452 + # Dump some basic debugging information
1453 + if debug:
1454 + debuglog.info("lex: tokens = %r", linfo.tokens)
1455 + debuglog.info("lex: literals = %r", linfo.literals)
1456 + debuglog.info("lex: states = %r", linfo.stateinfo)
1458 # Build a dictionary of valid token names
1459 lexobj.lextokens = { }
1460 - if not optimize:
1461 - for n in tokens:
1462 - if not _is_identifier.match(n):
1463 - print >>sys.stderr, "lex: Bad token name '%s'" % n
1464 - error = 1
1465 - if warn and lexobj.lextokens.has_key(n):
1466 - print >>sys.stderr, "lex: Warning. Token '%s' multiply defined." % n
1467 - lexobj.lextokens[n] = None
1468 + for n in linfo.tokens:
1469 + lexobj.lextokens[n] = 1
1471 + # Get literals specification
1472 + if isinstance(linfo.literals,(list,tuple)):
1473 + lexobj.lexliterals = type(linfo.literals[0])().join(linfo.literals)
1474 else:
1475 - for n in tokens: lexobj.lextokens[n] = None
1476 + lexobj.lexliterals = linfo.literals
1478 - if debug:
1479 - print "lex: tokens = '%s'" % lexobj.lextokens.keys()
1481 - try:
1482 - for c in literals:
1483 - if not (isinstance(c,types.StringType) or isinstance(c,types.UnicodeType)) or len(c) > 1:
1484 - print >>sys.stderr, "lex: Invalid literal %s. Must be a single character" % repr(c)
1485 - error = 1
1486 - continue
1488 - except TypeError:
1489 - print >>sys.stderr, "lex: Invalid literals specification. literals must be a sequence of characters."
1490 - error = 1
1492 - lexobj.lexliterals = literals
1494 - # Build statemap
1495 - if states:
1496 - if not (isinstance(states,types.TupleType) or isinstance(states,types.ListType)):
1497 - print >>sys.stderr, "lex: states must be defined as a tuple or list."
1498 - error = 1
1499 - else:
1500 - for s in states:
1501 - if not isinstance(s,types.TupleType) or len(s) != 2:
1502 - print >>sys.stderr, "lex: invalid state specifier %s. Must be a tuple (statename,'exclusive|inclusive')" % repr(s)
1503 - error = 1
1504 - continue
1505 - name, statetype = s
1506 - if not isinstance(name,types.StringType):
1507 - print >>sys.stderr, "lex: state name %s must be a string" % repr(name)
1508 - error = 1
1509 - continue
1510 - if not (statetype == 'inclusive' or statetype == 'exclusive'):
1511 - print >>sys.stderr, "lex: state type for state %s must be 'inclusive' or 'exclusive'" % name
1512 - error = 1
1513 - continue
1514 - if stateinfo.has_key(name):
1515 - print >>sys.stderr, "lex: state '%s' already defined." % name
1516 - error = 1
1517 - continue
1518 - stateinfo[name] = statetype
1520 - # Get a list of symbols with the t_ or s_ prefix
1521 - tsymbols = [f for f in ldict.keys() if f[:2] == 't_' ]
1523 - # Now build up a list of functions and a list of strings
1525 - funcsym = { } # Symbols defined as functions
1526 - strsym = { } # Symbols defined as strings
1527 - toknames = { } # Mapping of symbols to token names
1529 - for s in stateinfo.keys():
1530 - funcsym[s] = []
1531 - strsym[s] = []
1533 - ignore = { } # Ignore strings by state
1534 - errorf = { } # Error functions by state
1536 - if len(tsymbols) == 0:
1537 - raise SyntaxError,"lex: no rules of the form t_rulename are defined."
1539 - for f in tsymbols:
1540 - t = ldict[f]
1541 - states, tokname = _statetoken(f,stateinfo)
1542 - toknames[f] = tokname
1544 - if callable(t):
1545 - for s in states: funcsym[s].append((f,t))
1546 - elif (isinstance(t, types.StringType) or isinstance(t,types.UnicodeType)):
1547 - for s in states: strsym[s].append((f,t))
1548 - else:
1549 - print >>sys.stderr, "lex: %s not defined as a function or string" % f
1550 - error = 1
1552 - # Sort the functions by line number
1553 - for f in funcsym.values():
1554 - f.sort(lambda x,y: cmp(x[1].func_code.co_firstlineno,y[1].func_code.co_firstlineno))
1556 - # Sort the strings by regular expression length
1557 - for s in strsym.values():
1558 - s.sort(lambda x,y: (len(x[1]) < len(y[1])) - (len(x[1]) > len(y[1])))
1559 + # Get the stateinfo dictionary
1560 + stateinfo = linfo.stateinfo
1562 regexs = { }
1564 # Build the master regular expressions
1565 - for state in stateinfo.keys():
1566 + for state in stateinfo:
1567 regex_list = []
1569 # Add rules defined by functions first
1570 - for fname, f in funcsym[state]:
1571 - line = f.func_code.co_firstlineno
1572 - file = f.func_code.co_filename
1573 - files[file] = None
1574 - tokname = toknames[fname]
1576 - ismethod = isinstance(f, types.MethodType)
1578 - if not optimize:
1579 - nargs = f.func_code.co_argcount
1580 - if ismethod:
1581 - reqargs = 2
1582 - else:
1583 - reqargs = 1
1584 - if nargs > reqargs:
1585 - print >>sys.stderr, "%s:%d: Rule '%s' has too many arguments." % (file,line,f.__name__)
1586 - error = 1
1587 - continue
1589 - if nargs < reqargs:
1590 - print >>sys.stderr, "%s:%d: Rule '%s' requires an argument." % (file,line,f.__name__)
1591 - error = 1
1592 - continue
1594 - if tokname == 'ignore':
1595 - print >>sys.stderr, "%s:%d: Rule '%s' must be defined as a string." % (file,line,f.__name__)
1596 - error = 1
1597 - continue
1599 - if tokname == 'error':
1600 - errorf[state] = f
1601 - continue
1603 - if f.__doc__:
1604 - if not optimize:
1605 - try:
1606 - c = re.compile("(?P<%s>%s)" % (fname,f.__doc__), re.VERBOSE | reflags)
1607 - if c.match(""):
1608 - print >>sys.stderr, "%s:%d: Regular expression for rule '%s' matches empty string." % (file,line,f.__name__)
1609 - error = 1
1610 - continue
1611 - except re.error,e:
1612 - print >>sys.stderr, "%s:%d: Invalid regular expression for rule '%s'. %s" % (file,line,f.__name__,e)
1613 - if '#' in f.__doc__:
1614 - print >>sys.stderr, "%s:%d. Make sure '#' in rule '%s' is escaped with '\\#'." % (file,line, f.__name__)
1615 - error = 1
1616 - continue
1618 - if debug:
1619 - print "lex: Adding rule %s -> '%s' (state '%s')" % (f.__name__,f.__doc__, state)
1621 - # Okay. The regular expression seemed okay. Let's append it to the master regular
1622 - # expression we're building
1624 - regex_list.append("(?P<%s>%s)" % (fname,f.__doc__))
1625 - else:
1626 - print >>sys.stderr, "%s:%d: No regular expression defined for rule '%s'" % (file,line,f.__name__)
1627 + for fname, f in linfo.funcsym[state]:
1628 + line = func_code(f).co_firstlineno
1629 + file = func_code(f).co_filename
1630 + regex_list.append("(?P<%s>%s)" % (fname,f.__doc__))
1631 + if debug:
1632 + debuglog.info("lex: Adding rule %s -> '%s' (state '%s')",fname,f.__doc__, state)
1634 # Now add all of the simple rules
1635 - for name,r in strsym[state]:
1636 - tokname = toknames[name]
1638 - if tokname == 'ignore':
1639 - if "\\" in r:
1640 - print >>sys.stderr, "lex: Warning. %s contains a literal backslash '\\'" % name
1641 - ignore[state] = r
1642 - continue
1644 - if not optimize:
1645 - if tokname == 'error':
1646 - raise SyntaxError,"lex: Rule '%s' must be defined as a function" % name
1647 - error = 1
1648 - continue
1650 - if not lexobj.lextokens.has_key(tokname) and tokname.find("ignore_") < 0:
1651 - print >>sys.stderr, "lex: Rule '%s' defined for an unspecified token %s." % (name,tokname)
1652 - error = 1
1653 - continue
1654 - try:
1655 - c = re.compile("(?P<%s>%s)" % (name,r),re.VERBOSE | reflags)
1656 - if (c.match("")):
1657 - print >>sys.stderr, "lex: Regular expression for rule '%s' matches empty string." % name
1658 - error = 1
1659 - continue
1660 - except re.error,e:
1661 - print >>sys.stderr, "lex: Invalid regular expression for rule '%s'. %s" % (name,e)
1662 - if '#' in r:
1663 - print >>sys.stderr, "lex: Make sure '#' in rule '%s' is escaped with '\\#'." % name
1665 - error = 1
1666 - continue
1667 - if debug:
1668 - print "lex: Adding rule %s -> '%s' (state '%s')" % (name,r,state)
1670 + for name,r in linfo.strsym[state]:
1671 regex_list.append("(?P<%s>%s)" % (name,r))
1673 - if not regex_list:
1674 - print >>sys.stderr, "lex: No rules defined for state '%s'" % state
1675 - error = 1
1676 + if debug:
1677 + debuglog.info("lex: Adding rule %s -> '%s' (state '%s')",name,r, state)
1679 regexs[state] = regex_list
1682 - if not optimize:
1683 - for f in files.keys():
1684 - if not _validate_file(f):
1685 - error = 1
1687 - if error:
1688 - raise SyntaxError,"lex: Unable to build lexer."
1690 - # From this point forward, we're reasonably confident that we can build the lexer.
1691 - # No more errors will be generated, but there might be some warning messages.
1693 # Build the master regular expressions
1695 - for state in regexs.keys():
1696 - lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,toknames)
1697 + if debug:
1698 + debuglog.info("lex: ==== MASTER REGEXS FOLLOW ====")
1700 + for state in regexs:
1701 + lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,linfo.toknames)
1702 lexobj.lexstatere[state] = lexre
1703 lexobj.lexstateretext[state] = re_text
1704 lexobj.lexstaterenames[state] = re_names
1705 if debug:
1706 for i in range(len(re_text)):
1707 - print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
1708 + debuglog.info("lex: state '%s' : regex[%d] = '%s'",state, i, re_text[i])
1710 - # For inclusive states, we need to add the INITIAL state
1711 - for state,type in stateinfo.items():
1712 - if state != "INITIAL" and type == 'inclusive':
1713 + # For inclusive states, we need to add the regular expressions from the INITIAL state
1714 + for state,stype in stateinfo.items():
1715 + if state != "INITIAL" and stype == 'inclusive':
1716 lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
1717 lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
1718 lexobj.lexstaterenames[state].extend(lexobj.lexstaterenames['INITIAL'])
1720 lexobj.lexstateinfo = stateinfo
1721 lexobj.lexre = lexobj.lexstatere["INITIAL"]
1722 lexobj.lexretext = lexobj.lexstateretext["INITIAL"]
1723 + lexobj.lexreflags = reflags
1725 # Set up ignore variables
1726 - lexobj.lexstateignore = ignore
1727 + lexobj.lexstateignore = linfo.ignore
1728 lexobj.lexignore = lexobj.lexstateignore.get("INITIAL","")
1730 # Set up error functions
1731 - lexobj.lexstateerrorf = errorf
1732 - lexobj.lexerrorf = errorf.get("INITIAL",None)
1733 - if warn and not lexobj.lexerrorf:
1734 - print >>sys.stderr, "lex: Warning. no t_error rule is defined."
1735 + lexobj.lexstateerrorf = linfo.errorf
1736 + lexobj.lexerrorf = linfo.errorf.get("INITIAL",None)
1737 + if not lexobj.lexerrorf:
1738 + errorlog.warning("No t_error rule is defined")
1740 # Check state information for ignore and error rules
1741 for s,stype in stateinfo.items():
1742 if stype == 'exclusive':
1743 - if warn and not errorf.has_key(s):
1744 - print >>sys.stderr, "lex: Warning. no error rule is defined for exclusive state '%s'" % s
1745 - if warn and not ignore.has_key(s) and lexobj.lexignore:
1746 - print >>sys.stderr, "lex: Warning. no ignore rule is defined for exclusive state '%s'" % s
1747 + if not s in linfo.errorf:
1748 + errorlog.warning("No error rule is defined for exclusive state '%s'", s)
1749 + if not s in linfo.ignore and lexobj.lexignore:
1750 + errorlog.warning("No ignore rule is defined for exclusive state '%s'", s)
1751 elif stype == 'inclusive':
1752 - if not errorf.has_key(s):
1753 - errorf[s] = errorf.get("INITIAL",None)
1754 - if not ignore.has_key(s):
1755 - ignore[s] = ignore.get("INITIAL","")
1757 + if not s in linfo.errorf:
1758 + linfo.errorf[s] = linfo.errorf.get("INITIAL",None)
1759 + if not s in linfo.ignore:
1760 + linfo.ignore[s] = linfo.ignore.get("INITIAL","")
1762 # Create global versions of the token() and input() functions
1763 token = lexobj.token
1764 input = lexobj.input
1765 lexer = lexobj
1767 # If in optimize mode, we write the lextab
1768 if lextab and optimize:
1769 @@ -851,45 +1014,44 @@ def lex(module=None,object=None,debug=0,
1770 def runmain(lexer=None,data=None):
1771 if not data:
1772 try:
1773 filename = sys.argv[1]
1774 f = open(filename)
1775 data = f.read()
1776 f.close()
1777 except IndexError:
1778 - print "Reading from standard input (type EOF to end):"
1779 + sys.stdout.write("Reading from standard input (type EOF to end):\n")
1780 data = sys.stdin.read()
1782 if lexer:
1783 _input = lexer.input
1784 else:
1785 _input = input
1786 _input(data)
1787 if lexer:
1788 _token = lexer.token
1789 else:
1790 _token = token
1792 while 1:
1793 tok = _token()
1794 if not tok: break
1795 - print "(%s,%r,%d,%d)" % (tok.type, tok.value, tok.lineno,tok.lexpos)
1797 + sys.stdout.write("(%s,%r,%d,%d)\n" % (tok.type, tok.value, tok.lineno,tok.lexpos))
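A minimal, self-contained usage sketch of runmain() (the token rules are illustrative assumptions, not part of the patch):

    import ply.lex as lex

    tokens = ('NUMBER', 'PLUS')
    t_PLUS = r'\+'
    t_ignore = ' '

    def t_NUMBER(t):
        r'\d+'
        return t

    def t_error(t):
        t.lexer.skip(1)

    lexer = lex.lex()
    lex.runmain(lexer, data="3 + 4")   # prints one (type,value,lineno,lexpos) tuple per token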
1799 # -----------------------------------------------------------------------------
1800 # @TOKEN(regex)
1802 # This decorator function can be used to set the regular expression on a function
1803 # when its docstring might need to be set in an alternative way
1804 # -----------------------------------------------------------------------------
1806 def TOKEN(r):
1807 def set_doc(f):
1808 - if callable(r):
1809 + if hasattr(r,"__call__"):
1810 f.__doc__ = r.__doc__
1811 else:
1812 f.__doc__ = r
1813 return f
1814 return set_doc
1816 # Alternative spelling of the TOKEN decorator
1817 Token = TOKEN
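A usage sketch of the decorator, assuming the identifier pattern below (names are illustrative, in the style of the PLY documentation):

    import ply.lex as lex
    from ply.lex import TOKEN

    tokens = ('ID',)
    t_ignore = ' '

    digit      = r'([0-9])'
    nondigit   = r'([_A-Za-z])'
    identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

    @TOKEN(identifier)          # sets t_ID.__doc__ to the composed regex
    def t_ID(t):
        return t

    def t_error(t):
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.input("foo bar42")
    while True:
        tok = lexer.token()
        if not tok:
            break
        print(tok.type, tok.value)   # ID foo / ID bar42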
1818 diff --git a/other-licenses/ply/ply/yacc.py b/other-licenses/ply/ply/yacc.py
1819 --- a/other-licenses/ply/ply/yacc.py
1820 +++ b/other-licenses/ply/ply/yacc.py
1821 @@ -1,31 +1,40 @@
1822 -#-----------------------------------------------------------------------------
1823 +# -----------------------------------------------------------------------------
1824 # ply: yacc.py
1826 -# Author(s): David M. Beazley (dave@dabeaz.com)
1827 +# Copyright (C) 2001-2009,
1828 +# David M. Beazley (Dabeaz LLC)
1829 +# All rights reserved.
1831 -# Copyright (C) 2001-2008, David M. Beazley
1832 +# Redistribution and use in source and binary forms, with or without
1833 +# modification, are permitted provided that the following conditions are
1834 +# met:
1836 +# * Redistributions of source code must retain the above copyright notice,
1837 +# this list of conditions and the following disclaimer.
1838 +# * Redistributions in binary form must reproduce the above copyright notice,
1839 +# this list of conditions and the following disclaimer in the documentation
1840 +# and/or other materials provided with the distribution.
1841 +# * Neither the name of David Beazley nor Dabeaz LLC may be used to
1842 +# endorse or promote products derived from this software without
1843 +# specific prior written permission.
1845 -# This library is free software; you can redistribute it and/or
1846 -# modify it under the terms of the GNU Lesser General Public
1847 -# License as published by the Free Software Foundation; either
1848 -# version 2.1 of the License, or (at your option) any later version.
1850 -# This library is distributed in the hope that it will be useful,
1851 -# but WITHOUT ANY WARRANTY; without even the implied warranty of
1852 -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1853 -# Lesser General Public License for more details.
1855 -# You should have received a copy of the GNU Lesser General Public
1856 -# License along with this library; if not, write to the Free Software
1857 -# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
1859 -# See the file COPYING for a complete copy of the LGPL.
1861 +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
1862 +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
1863 +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
1864 +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
1865 +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
1866 +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
1867 +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
1868 +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
1869 +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
1870 +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
1871 +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1872 +# -----------------------------------------------------------------------------
1874 # This implements an LR parser that is constructed from grammar rules defined
1875 # as Python functions. The grammar is specified by supplying the BNF inside
1876 # Python documentation strings. The inspiration for this technique was borrowed
1877 # from John Aycock's Spark parsing system. PLY might be viewed as a cross between
1878 # Spark and the GNU bison utility.
1880 # The current implementation is only somewhat object-oriented. The
1881 @@ -45,18 +54,18 @@
1883 # Construction of LR parsing tables is fairly complicated and expensive.
1884 # To make this module run fast, a *LOT* of work has been put into
1885 # optimization---often at the expense of readability and what might
1886 # be considered good Python "coding style."   Modify the code at your
1887 # own risk!
1888 # ----------------------------------------------------------------------------
1890 -__version__ = "2.5"
1891 -__tabversion__ = "2.4" # Table version
1892 +__version__ = "3.3"
1893 +__tabversion__ = "3.2" # Table version
1895 #-----------------------------------------------------------------------------
1896 # === User configurable parameters ===
1898 # Change these to modify the default behavior of yacc (if you wish)
1899 #-----------------------------------------------------------------------------
1901 yaccdebug = 1 # Debugging mode. If set, yacc generates a
1902 @@ -66,34 +75,93 @@ debug_file = 'parser.out' # Default
1903 tab_module = 'parsetab' # Default name of the table module
1904 default_lr = 'LALR' # Default LR table generation method
1906 error_count = 3 # Number of symbols that must be shifted to leave recovery mode
1908 yaccdevel = 0 # Set to True if developing yacc. This turns off optimized
1909 # implementations of certain functions.
1911 -import re, types, sys, cStringIO, md5, os.path
1913 +resultlimit = 40 # Size limit of results when running in debug mode.
1915 +pickle_protocol = 0 # Protocol to use when writing pickle files
1917 +import re, types, sys, os.path
1919 +# Compatibility function for python 2.6/3.0
1920 +if sys.version_info[0] < 3:
1921 + def func_code(f):
1922 + return f.func_code
1923 +else:
1924 + def func_code(f):
1925 + return f.__code__
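Restated standalone, the shim lets later code read code-object attributes uniformly across Python 2 and 3; a small sketch:

    import sys

    if sys.version_info[0] < 3:
        def func_code(f):
            return f.func_code
    else:
        def func_code(f):
            return f.__code__

    def sample(a, b):
        return a + b

    # Same calls work on both major versions:
    print(func_code(sample).co_argcount)     # 2
    print(func_code(sample).co_firstlineno)  # line where 'def sample' appears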
1927 +# Compatibility
1928 +try:
1929 + MAXINT = sys.maxint
1930 +except AttributeError:
1931 + MAXINT = sys.maxsize
1933 +# Python 2.x/3.0 compatibility.
1934 +def load_ply_lex():
1935 + if sys.version_info[0] < 3:
1936 + import lex
1937 + else:
1938 + import ply.lex as lex
1939 + return lex
1941 +# This object is a stand-in for a logging object created by the
1942 +# logging module. PLY will use this by default to create things
1943 +# such as the parser.out file. If a user wants more detailed
1944 +# information, they can create their own logging object and pass
1945 +# it into PLY.
1947 +class PlyLogger(object):
1948 + def __init__(self,f):
1949 + self.f = f
1950 + def debug(self,msg,*args,**kwargs):
1951 + self.f.write((msg % args) + "\n")
1952 + info = debug
1954 + def warning(self,msg,*args,**kwargs):
1955 + self.f.write("WARNING: "+ (msg % args) + "\n")
1957 + def error(self,msg,*args,**kwargs):
1958 + self.f.write("ERROR: " + (msg % args) + "\n")
1960 + critical = debug
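Because PLY only duck-types the logging interface, either PlyLogger or a standard logging.Logger can be supplied; a hedged sketch using the class defined just above:

    import sys, logging

    log = PlyLogger(sys.stderr)
    log.warning("Token '%s' multiply defined", 'NUM')
    # -> WARNING: Token 'NUM' multiply defined

    # Any object with debug/info/warning/error methods works equally well,
    # e.g. a logging.Logger handed to yacc() or lex() via errorlog=/debuglog=.
    real = logging.getLogger('ply')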
1962 +# Null logger is used when no output is generated. Does nothing.
1963 +class NullLogger(object):
1964 + def __getattribute__(self,name):
1965 + return self
1966 + def __call__(self,*args,**kwargs):
1967 + return self
1969 # Exception raised for yacc-related errors
1970 class YaccError(Exception): pass
1972 -# Exception raised for errors raised in production rules
1973 -class SyntaxError(Exception): pass
1976 -# Available instance types. This is used when parsers are defined by a class.
1977 -# it's a little funky because I want to preserve backwards compatibility
1978 -# with Python 2.0 where types.ObjectType is undefined.
1980 -try:
1981 - _INSTANCETYPE = (types.InstanceType, types.ObjectType)
1982 -except AttributeError:
1983 - _INSTANCETYPE = types.InstanceType
1984 - class object: pass # Note: needed if no new-style classes present
1985 +# Format the result message that the parser produces when running in debug mode.
1986 +def format_result(r):
1987 + repr_str = repr(r)
1988 + if '\n' in repr_str: repr_str = repr(repr_str)
1989 + if len(repr_str) > resultlimit:
1990 + repr_str = repr_str[:resultlimit]+" ..."
1991 + result = "<%s @ 0x%x> (%s)" % (type(r).__name__,id(r),repr_str)
1992 + return result
1995 +# Format stack entries when the parser is running in debug mode
1996 +def format_stack_entry(r):
1997 + repr_str = repr(r)
1998 + if '\n' in repr_str: repr_str = repr(repr_str)
1999 + if len(repr_str) < 16:
2000 + return repr_str
2001 + else:
2002 + return "<%s @ 0x%x>" % (type(r).__name__,id(r))
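A hedged illustration of the two formatters above (the 0x addresses vary per run):

    # Uses format_result / format_stack_entry and resultlimit defined above.
    print(format_result(list(range(100))))   # <list @ 0x...> ([0, 1, 2, 3, ... truncated)
    print(format_stack_entry('ab'))          # short values keep their repr: 'ab'
    print(format_stack_entry('x' * 40))      # long values collapse to <str @ 0x...>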
2004 #-----------------------------------------------------------------------------
2005 # === LR Parsing Engine ===
2007 # The following classes are used for the LR parser itself. These are not
2008 # used during table construction and are independent of the actual LR
2009 # table generation algorithm
2010 #-----------------------------------------------------------------------------
2011 @@ -137,16 +205,19 @@ class YaccProduction:
2012 return [s.value for s in self.slice[i:j]]
2014 def __len__(self):
2015 return len(self.slice)
2017 def lineno(self,n):
2018 return getattr(self.slice[n],"lineno",0)
2020 + def set_lineno(self,n,lineno):
2021 + self.slice[n].lineno = lineno
2023 def linespan(self,n):
2024 startline = getattr(self.slice[n],"lineno",0)
2025 endline = getattr(self.slice[n],"endlineno",startline)
2026 return startline,endline
2028 def lexpos(self,n):
2029 return getattr(self.slice[n],"lexpos",0)
2031 @@ -154,51 +225,44 @@ class YaccProduction:
2032 startpos = getattr(self.slice[n],"lexpos",0)
2033 endpos = getattr(self.slice[n],"endlexpos",startpos)
2034 return startpos,endpos
2036 def error(self):
2037 raise SyntaxError
2040 -# The LR Parsing engine. This is defined as a class so that multiple parsers
2041 -# can exist in the same process. A user never instantiates this directly.
2042 -# Instead, the global yacc() function should be used to create a suitable Parser
2043 -# object.
2045 -class Parser:
2046 - def __init__(self,magic=None):
2048 - # This is a hack to keep users from trying to instantiate a Parser
2049 - # object directly.
2051 - if magic != "xyzzy":
2052 - raise YaccError, "Can't directly instantiate Parser. Use yacc() instead."
2054 - # Reset internal state
2055 - self.productions = None # List of productions
2056 - self.errorfunc = None # Error handling function
2057 - self.action = { } # LR Action table
2058 - self.goto = { } # LR goto table
2059 - self.require = { } # Attribute require table
2060 - self.method = "Unknown LR" # Table construction method used
2061 +# -----------------------------------------------------------------------------
2062 +# == LRParser ==
2064 +# The LR Parsing engine.
2065 +# -----------------------------------------------------------------------------
2067 +class LRParser:
2068 + def __init__(self,lrtab,errorf):
2069 + self.productions = lrtab.lr_productions
2070 + self.action = lrtab.lr_action
2071 + self.goto = lrtab.lr_goto
2072 + self.errorfunc = errorf
2074 def errok(self):
2075 self.errorok = 1
2077 def restart(self):
2078 del self.statestack[:]
2079 del self.symstack[:]
2080 sym = YaccSymbol()
2081 sym.type = '$end'
2082 self.symstack.append(sym)
2083 self.statestack.append(0)
2085 def parse(self,input=None,lexer=None,debug=0,tracking=0,tokenfunc=None):
2086 if debug or yaccdevel:
2087 + if isinstance(debug,int):
2088 + debug = PlyLogger(sys.stderr)
2089 return self.parsedebug(input,lexer,debug,tracking,tokenfunc)
2090 elif tracking:
2091 return self.parseopt(input,lexer,debug,tracking,tokenfunc)
2092 else:
2093 return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
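A self-contained sketch showing how the three dispatch targets above are reached from user code (the grammar and token rules are illustrative assumptions, not part of the patch):

    import ply.lex as lex
    import ply.yacc as yacc

    tokens = ('NUMBER', 'PLUS')
    t_PLUS = r'\+'
    t_ignore = ' '

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):
        t.lexer.skip(1)

    def p_expr_plus(p):
        'expr : expr PLUS NUMBER'
        p[0] = p[1] + p[3]

    def p_expr_num(p):
        'expr : NUMBER'
        p[0] = p[1]

    def p_error(p):
        pass

    lexer = lex.lex()
    parser = yacc.yacc()

    print(parser.parse("1 + 2 + 3"))             # fastest path: parseopt_notrack()
    print(parser.parse("1 + 2", tracking=True))  # parseopt(): keeps line/position info
    print(parser.parse("1 + 2", debug=True))     # parsedebug(): full trace via PlyLogger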
2096 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2097 @@ -210,30 +274,34 @@ class Parser:
2098 # enclosed in:
2100 # #--! DEBUG
2101 # statements
2102 # #--! DEBUG
2104 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2106 - def parsedebug(self,input=None,lexer=None,debug=0,tracking=0,tokenfunc=None):
2107 + def parsedebug(self,input=None,lexer=None,debug=None,tracking=0,tokenfunc=None):
2108 lookahead = None # Current lookahead symbol
2109 lookaheadstack = [ ] # Stack of lookahead symbols
2110 actions = self.action # Local reference to action table (to avoid lookup on self.)
2111 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2112 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2113 pslice = YaccProduction(None) # Production object passed to grammar rules
2114 errorcount = 0 # Used during error recovery
2115 - endsym = "$end" # End symbol
2117 + # --! DEBUG
2118 + debug.info("PLY: PARSE DEBUG START")
2119 + # --! DEBUG
2121 # If no lexer was given, we will try to use the lex module
2122 if not lexer:
2123 - import lex
2124 + lex = load_ply_lex()
2125 lexer = lex.lexer
2128 # Set up the lexer and parser objects on pslice
2129 pslice.lexer = lexer
2130 pslice.parser = self
2132 # If input was supplied, pass to lexer
2133 if input is not None:
2134 lexer.input(input)
2136 @@ -252,65 +320,55 @@ class Parser:
2138 pslice.stack = symstack # Put in the production
2139 errtoken = None # Err token
2141 # The start state is assumed to be (0,$end)
2143 statestack.append(0)
2144 sym = YaccSymbol()
2145 - sym.type = endsym
2146 + sym.type = "$end"
2147 symstack.append(sym)
2148 state = 0
2149 while 1:
2150 # Get the next symbol on the input. If a lookahead symbol
2151 # is already set, we just use that. Otherwise, we'll pull
2152 # the next token off of the lookaheadstack or from the lexer
2154 # --! DEBUG
2155 - if debug > 1:
2156 - print 'state', state
2157 + debug.debug('')
2158 + debug.debug('State : %s', state)
2159 # --! DEBUG
2161 if not lookahead:
2162 if not lookaheadstack:
2163 lookahead = get_token() # Get the next token
2164 else:
2165 lookahead = lookaheadstack.pop()
2166 if not lookahead:
2167 lookahead = YaccSymbol()
2168 - lookahead.type = endsym
2169 + lookahead.type = "$end"
2171 # --! DEBUG
2172 - if debug:
2173 - errorlead = ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip()
2174 + debug.debug('Stack : %s',
2175 + ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip())
2176 # --! DEBUG
2178 # Check the action table
2179 ltype = lookahead.type
2180 t = actions[state].get(ltype)
2182 - # --! DEBUG
2183 - if debug > 1:
2184 - print 'action', t
2185 - # --! DEBUG
2187 if t is not None:
2188 if t > 0:
2189 # shift a symbol on the stack
2190 - if ltype is endsym:
2191 - # Error, end of input
2192 - sys.stderr.write("yacc: Parse error. EOF\n")
2193 - return
2194 statestack.append(t)
2195 state = t
2197 # --! DEBUG
2198 - if debug > 1:
2199 - sys.stderr.write("%-60s shift state %s\n" % (errorlead, t))
2200 + debug.debug("Action : Shift and goto state %s", t)
2201 # --! DEBUG
2203 symstack.append(lookahead)
2204 lookahead = None
2206 # Decrease error count on successful shift
2207 if errorcount: errorcount -=1
2208 continue
2209 @@ -322,18 +380,21 @@ class Parser:
2210 plen = p.len
2212 # Get production function
2213 sym = YaccSymbol()
2214 sym.type = pname # Production name
2215 sym.value = None
2217 # --! DEBUG
2218 - if debug > 1:
2219 - sys.stderr.write("%-60s reduce %d\n" % (errorlead, -t))
2220 + if plen:
2221 + debug.info("Action : Reduce rule [%s] with %s and goto state %d", p.str, "["+",".join([format_stack_entry(_v.value) for _v in symstack[-plen:]])+"]",-t)
2222 + else:
2223 + debug.info("Action : Reduce rule [%s] with %s and goto state %d", p.str, [],-t)
2225 # --! DEBUG
2227 if plen:
2228 targ = symstack[-plen-1:]
2229 targ[0] = sym
2231 # --! TRACKING
2232 if tracking:
2233 @@ -350,19 +411,22 @@ class Parser:
2234 # The code enclosed in this section is duplicated
2235 # below as a performance optimization. Make sure
2236 # changes get made in both locations.
2238 pslice.slice = targ
2240 try:
2241 # Call the grammar rule with our special slice object
2242 - p.func(pslice)
2243 del symstack[-plen:]
2244 del statestack[-plen:]
2245 + p.callable(pslice)
2246 + # --! DEBUG
2247 + debug.info("Result : %s", format_result(pslice[0]))
2248 + # --! DEBUG
2249 symstack.append(sym)
2250 state = goto[statestack[-1]][pname]
2251 statestack.append(state)
2252 except SyntaxError:
2253 # If an error was set. Enter error recovery state
2254 lookaheadstack.append(lookahead)
2255 symstack.pop()
2256 statestack.pop()
2257 @@ -388,17 +452,20 @@ class Parser:
2258 # The code enclosed in this section is duplicated
2259 # above as a performance optimization. Make sure
2260 # changes get made in both locations.
2262 pslice.slice = targ
2264 try:
2265 # Call the grammar rule with our special slice object
2266 - p.func(pslice)
2267 + p.callable(pslice)
2268 + # --! DEBUG
2269 + debug.info("Result : %s", format_result(pslice[0]))
2270 + # --! DEBUG
2271 symstack.append(sym)
2272 state = goto[statestack[-1]][pname]
2273 statestack.append(state)
2274 except SyntaxError:
2275 # If an error was set. Enter error recovery state
2276 lookaheadstack.append(lookahead)
2277 symstack.pop()
2278 statestack.pop()
2279 @@ -407,46 +474,53 @@ class Parser:
2280 lookahead = sym
2281 errorcount = error_count
2282 self.errorok = 0
2283 continue
2284 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2286 if t == 0:
2287 n = symstack[-1]
2288 - return getattr(n,"value",None)
2289 + result = getattr(n,"value",None)
2290 + # --! DEBUG
2291 + debug.info("Done : Returning %s", format_result(result))
2292 + debug.info("PLY: PARSE DEBUG END")
2293 + # --! DEBUG
2294 + return result
2296 if t == None:
2298 # --! DEBUG
2299 - if debug:
2300 - sys.stderr.write(errorlead + "\n")
2301 + debug.error('Error : %s',
2302 + ("%s . %s" % (" ".join([xx.type for xx in symstack][1:]), str(lookahead))).lstrip())
2303 # --! DEBUG
2305 # We have some kind of parsing error here. To handle
2306 # this, we are going to push the current token onto
2307 # the tokenstack and replace it with an 'error' token.
2308 # If there are any synchronization rules, they may
2309 # catch it.
2311 #  In addition to pushing the error token, we call
2312 # the user defined p_error() function if this is the
2313 # first syntax error. This function is only called if
2314 # errorcount == 0.
2315 if errorcount == 0 or self.errorok:
2316 errorcount = error_count
2317 self.errorok = 0
2318 errtoken = lookahead
2319 - if errtoken.type is endsym:
2320 + if errtoken.type == "$end":
2321 errtoken = None # End of file!
2322 if self.errorfunc:
2323 global errok,token,restart
2324 errok = self.errok # Set some special functions available in error recovery
2325 token = get_token
2326 restart = self.restart
2327 + if errtoken and not hasattr(errtoken,'lexer'):
2328 + errtoken.lexer = lexer
2329 tok = self.errorfunc(errtoken)
2330 del errok, token, restart # Delete special functions
2332 if self.errorok:
2333 # User must have done some kind of panic
2334 # mode recovery on their own. The
2335 # returned token is the next lookahead
2336 lookahead = tok
2337 @@ -466,29 +540,29 @@ class Parser:
2339 else:
2340 errorcount = error_count
2342 # case 1: the statestack only has 1 entry on it. If we're in this state, the
2343 # entire parse has been rolled back and we're completely hosed. The token is
2344 # discarded and we just keep going.
2346 - if len(statestack) <= 1 and lookahead.type is not endsym:
2347 + if len(statestack) <= 1 and lookahead.type != "$end":
2348 lookahead = None
2349 errtoken = None
2350 state = 0
2351 # Nuke the pushback stack
2352 del lookaheadstack[:]
2353 continue
2355 # case 2: the statestack has a couple of entries on it, but we're
2356 # at the end of the file. nuke the top entry and generate an error token
2358 # Start nuking entries on the stack
2359 - if lookahead.type is endsym:
2360 + if lookahead.type == "$end":
2361 # Whoa. We're really hosed here. Bail out
2362 return
2364 if lookahead.type != 'error':
2365 sym = symstack[-1]
2366 if sym.type == 'error':
2367 # Hmmm. Error is on top of stack, we'll just nuke input
2368 # symbol and continue
2369 @@ -504,17 +578,17 @@ class Parser:
2370 else:
2371 symstack.pop()
2372 statestack.pop()
2373 state = statestack[-1] # Potential bug fix
2375 continue
2377 # Call an error function here
2378 - raise RuntimeError, "yacc: internal parser error!!!\n"
2379 + raise RuntimeError("yacc: internal parser error!!!\n")
2381 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2382 # parseopt().
2384 # Optimized version of parse() method. DO NOT EDIT THIS CODE DIRECTLY.
2385 # Edit the debug version above, then copy any modifications to the method
2386 # below while removing #--! DEBUG sections.
2387 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2388 @@ -526,17 +600,17 @@ class Parser:
2389 actions = self.action # Local reference to action table (to avoid lookup on self.)
2390 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2391 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2392 pslice = YaccProduction(None) # Production object passed to grammar rules
2393 errorcount = 0 # Used during error recovery
2395 # If no lexer was given, we will try to use the lex module
2396 if not lexer:
2397 - import lex
2398 + lex = load_ply_lex()
2399 lexer = lex.lexer
2401 # Set up the lexer and parser objects on pslice
2402 pslice.lexer = lexer
2403 pslice.parser = self
2405 # If input was supplied, pass to lexer
2406 if input is not None:
2407 @@ -581,20 +655,16 @@ class Parser:
2409 # Check the action table
2410 ltype = lookahead.type
2411 t = actions[state].get(ltype)
2413 if t is not None:
2414 if t > 0:
2415 # shift a symbol on the stack
2416 - if ltype == '$end':
2417 - # Error, end of input
2418 - sys.stderr.write("yacc: Parse error. EOF\n")
2419 - return
2420 statestack.append(t)
2421 state = t
2423 symstack.append(lookahead)
2424 lookahead = None
2426 # Decrease error count on successful shift
2427 if errorcount: errorcount -=1
2428 @@ -630,19 +700,19 @@ class Parser:
2429 # The code enclosed in this section is duplicated
2430 # below as a performance optimization. Make sure
2431 # changes get made in both locations.
2433 pslice.slice = targ
2435 try:
2436 # Call the grammar rule with our special slice object
2437 - p.func(pslice)
2438 del symstack[-plen:]
2439 del statestack[-plen:]
2440 + p.callable(pslice)
2441 symstack.append(sym)
2442 state = goto[statestack[-1]][pname]
2443 statestack.append(state)
2444 except SyntaxError:
2445 # If an error was set. Enter error recovery state
2446 lookaheadstack.append(lookahead)
2447 symstack.pop()
2448 statestack.pop()
2449 @@ -668,17 +738,17 @@ class Parser:
2450 # The code enclosed in this section is duplicated
2451 # above as a performance optimization. Make sure
2452 # changes get made in both locations.
2454 pslice.slice = targ
2456 try:
2457 # Call the grammar rule with our special slice object
2458 - p.func(pslice)
2459 + p.callable(pslice)
2460 symstack.append(sym)
2461 state = goto[statestack[-1]][pname]
2462 statestack.append(state)
2463 except SyntaxError:
2464 # If an error was set. Enter error recovery state
2465 lookaheadstack.append(lookahead)
2466 symstack.pop()
2467 statestack.pop()
2468 @@ -712,16 +782,18 @@ class Parser:
2469 errtoken = lookahead
2470 if errtoken.type == '$end':
2471 errtoken = None # End of file!
2472 if self.errorfunc:
2473 global errok,token,restart
2474 errok = self.errok # Set some special functions available in error recovery
2475 token = get_token
2476 restart = self.restart
2477 + if errtoken and not hasattr(errtoken,'lexer'):
2478 + errtoken.lexer = lexer
2479 tok = self.errorfunc(errtoken)
2480 del errok, token, restart # Delete special functions
2482 if self.errorok:
2483 # User must have done some kind of panic
2484 # mode recovery on their own. The
2485 # returned token is the next lookahead
2486 lookahead = tok
2487 @@ -779,17 +851,17 @@ class Parser:
2488 else:
2489 symstack.pop()
2490 statestack.pop()
2491 state = statestack[-1] # Potential bug fix
2493 continue
2495 # Call an error function here
2496 - raise RuntimeError, "yacc: internal parser error!!!\n"
2497 + raise RuntimeError("yacc: internal parser error!!!\n")
2499 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2500 # parseopt_notrack().
2502 # Optimized version of parseopt() with line number tracking removed.
2503 # DO NOT EDIT THIS CODE DIRECTLY. Copy the optimized version and remove
2504 # code in the #--! TRACKING sections
2505 # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2506 @@ -800,17 +872,17 @@ class Parser:
2507 actions = self.action # Local reference to action table (to avoid lookup on self.)
2508 goto = self.goto # Local reference to goto table (to avoid lookup on self.)
2509 prod = self.productions # Local reference to production list (to avoid lookup on self.)
2510 pslice = YaccProduction(None) # Production object passed to grammar rules
2511 errorcount = 0 # Used during error recovery
2513 # If no lexer was given, we will try to use the lex module
2514 if not lexer:
2515 - import lex
2516 + lex = load_ply_lex()
2517 lexer = lex.lexer
2519 # Set up the lexer and parser objects on pslice
2520 pslice.lexer = lexer
2521 pslice.parser = self
2523 # If input was supplied, pass to lexer
2524 if input is not None:
2525 @@ -855,20 +927,16 @@ class Parser:
2527 # Check the action table
2528 ltype = lookahead.type
2529 t = actions[state].get(ltype)
2531 if t is not None:
2532 if t > 0:
2533 # shift a symbol on the stack
2534 - if ltype == '$end':
2535 - # Error, end of input
2536 - sys.stderr.write("yacc: Parse error. EOF\n")
2537 - return
2538 statestack.append(t)
2539 state = t
2541 symstack.append(lookahead)
2542 lookahead = None
2544 # Decrease error count on successful shift
2545 if errorcount: errorcount -=1
2546 @@ -893,19 +961,19 @@ class Parser:
2547 # The code enclosed in this section is duplicated
2548 # below as a performance optimization. Make sure
2549 # changes get made in both locations.
2551 pslice.slice = targ
2553 try:
2554 # Call the grammar rule with our special slice object
2555 - p.func(pslice)
2556 del symstack[-plen:]
2557 del statestack[-plen:]
2558 + p.callable(pslice)
2559 symstack.append(sym)
2560 state = goto[statestack[-1]][pname]
2561 statestack.append(state)
2562 except SyntaxError:
2563 # If an error was set. Enter error recovery state
2564 lookaheadstack.append(lookahead)
2565 symstack.pop()
2566 statestack.pop()
2567 @@ -925,17 +993,17 @@ class Parser:
2568 # The code enclosed in this section is duplicated
2569 # above as a performance optimization. Make sure
2570 # changes get made in both locations.
2572 pslice.slice = targ
2574 try:
2575 # Call the grammar rule with our special slice object
2576 - p.func(pslice)
2577 + p.callable(pslice)
2578 symstack.append(sym)
2579 state = goto[statestack[-1]][pname]
2580 statestack.append(state)
2581 except SyntaxError:
2582 # If an error was set. Enter error recovery state
2583 lookaheadstack.append(lookahead)
2584 symstack.pop()
2585 statestack.pop()
2586 @@ -969,16 +1037,18 @@ class Parser:
2587 errtoken = lookahead
2588 if errtoken.type == '$end':
2589 errtoken = None # End of file!
2590 if self.errorfunc:
2591 global errok,token,restart
2592 errok = self.errok # Set some special functions available in error recovery
2593 token = get_token
2594 restart = self.restart
2595 + if errtoken and not hasattr(errtoken,'lexer'):
2596 + errtoken.lexer = lexer
2597 tok = self.errorfunc(errtoken)
2598 del errok, token, restart # Delete special functions
2600 if self.errorok:
2601 # User must have done some kind of panic
2602 # mode recovery on their own. The
2603 # returned token is the next lookahead
2604 lookahead = tok
2605 @@ -1036,1115 +1106,783 @@ class Parser:
2606 else:
2607 symstack.pop()
2608 statestack.pop()
2609 state = statestack[-1] # Potential bug fix
2611 continue
2613 # Call an error function here
2614 - raise RuntimeError, "yacc: internal parser error!!!\n"
2616 + raise RuntimeError("yacc: internal parser error!!!\n")
2618 # -----------------------------------------------------------------------------
2619 -# === Parser Construction ===
2620 +# === Grammar Representation ===
2622 -# The following functions and variables are used to implement the yacc() function
2623 -# itself. This is pretty hairy stuff involving lots of error checking,
2624 -# construction of LR items, kernels, and so forth. Although a lot of
2625 -# this work is done using global variables, the resulting Parser object
2626 -# is completely self contained--meaning that it is safe to repeatedly
2627 -# call yacc() with different grammars in the same application.
2628 +# The following functions, classes, and variables are used to represent and
2629 +# manipulate the rules that make up a grammar.
2630 # -----------------------------------------------------------------------------
2632 -# -----------------------------------------------------------------------------
2633 -# validate_file()
2635 -# This function checks to see if there are duplicated p_rulename() functions
2636 -# in the parser module file. Without this function, it is really easy for
2637 -# users to make mistakes by cutting and pasting code fragments (and it's a real
2638 -# bugger to try and figure out why the resulting parser doesn't work). Therefore,
2639 -# we just do a little regular expression pattern matching of def statements
2640 -# to try and detect duplicates.
2641 -# -----------------------------------------------------------------------------
2643 -def validate_file(filename):
2644 - base,ext = os.path.splitext(filename)
2645 - if ext != '.py': return 1 # No idea. Assume it's okay.
2647 - try:
2648 - f = open(filename)
2649 - lines = f.readlines()
2650 - f.close()
2651 - except IOError:
2652 - return 1 # Oh well
2654 - # Match def p_funcname(
2655 - fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
2656 - counthash = { }
2657 - linen = 1
2658 - noerror = 1
2659 - for l in lines:
2660 - m = fre.match(l)
2661 - if m:
2662 - name = m.group(1)
2663 - prev = counthash.get(name)
2664 - if not prev:
2665 - counthash[name] = linen
2666 - else:
2667 - sys.stderr.write("%s:%d: Function %s redefined. Previously defined on line %d\n" % (filename,linen,name,prev))
2668 - noerror = 0
2669 - linen += 1
2670 - return noerror
2672 -# This function looks for functions that might be grammar rules, but which don't have the proper p_suffix.
2673 -def validate_dict(d):
2674 - for n,v in d.items():
2675 - if n[0:2] == 'p_' and type(v) in (types.FunctionType, types.MethodType): continue
2676 - if n[0:2] == 't_': continue
2678 - if n[0:2] == 'p_':
2679 - sys.stderr.write("yacc: Warning. '%s' not defined as a function\n" % n)
2680 - if 1 and isinstance(v,types.FunctionType) and v.func_code.co_argcount == 1:
2681 - try:
2682 - doc = v.__doc__.split(" ")
2683 - if doc[1] == ':':
2684 - sys.stderr.write("%s:%d: Warning. Possible grammar rule '%s' defined without p_ prefix.\n" % (v.func_code.co_filename, v.func_code.co_firstlineno,n))
2685 - except StandardError:
2686 - pass
2688 -# -----------------------------------------------------------------------------
2689 -# === GRAMMAR FUNCTIONS ===
2691 -# The following global variables and functions are used to store, manipulate,
2692 -# and verify the grammar rules specified by the user.
2693 -# -----------------------------------------------------------------------------
2695 -# Initialize all of the global variables used during grammar construction
2696 -def initialize_vars():
2697 - global Productions, Prodnames, Prodmap, Terminals
2698 - global Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
2699 - global Errorfunc, Signature, Requires
2701 - Productions = [None] # A list of all of the productions. The first
2702 - # entry is always reserved for the purpose of
2703 - # building an augmented grammar
2705 - Prodnames = { } # A dictionary mapping the names of nonterminals to a list of all
2706 - # productions of that nonterminal.
2708 - Prodmap = { } # A dictionary that is only used to detect duplicate
2709 - # productions.
2711 - Terminals = { } # A dictionary mapping the names of terminal symbols to a
2712 - # list of the rules where they are used.
2714 - Nonterminals = { } # A dictionary mapping names of nonterminals to a list
2715 - # of rule numbers where they are used.
2717 - First = { } # A dictionary of precomputed FIRST(x) symbols
2719 - Follow = { } # A dictionary of precomputed FOLLOW(x) symbols
2721 - Precedence = { } # Precedence rules for each terminal. Contains tuples of the
2722 - # form ('right',level) or ('nonassoc', level) or ('left',level)
2724 - UsedPrecedence = { } # Precedence rules that were actually used by the grammer.
2725 - # This is only used to provide error checking and to generate
2726 - # a warning about unused precedence rules.
2728 - LRitems = [ ] # A list of all LR items for the grammar. These are the
2729 - # productions with the "dot" like E -> E . PLUS E
2731 - Errorfunc = None # User defined error handler
2733 - Signature = md5.new() # Digital signature of the grammar rules, precedence
2734 - # and other information. Used to determined when a
2735 - # parsing table needs to be regenerated.
2737 - Signature.update(__tabversion__)
2739 - Requires = { } # Requires list
2741 - # File objects used when creating the parser.out debugging file
2742 - global _vf, _vfc
2743 - _vf = cStringIO.StringIO()
2744 - _vfc = cStringIO.StringIO()
2745 +import re
2747 +# regex matching identifiers
2748 +_is_identifier = re.compile(r'^[a-zA-Z0-9_-]+$')
2750 # -----------------------------------------------------------------------------
2751 # class Production:
2753 # This class stores the raw information about a single production or grammar rule.
2754 -# It has a few required attributes:
2755 +# A grammar rule refers to a specification such as this:
2757 -# name - Name of the production (nonterminal)
2758 -# prod - A list of symbols making up its production
2759 +# expr : expr PLUS term
2761 +# Here are the basic attributes defined on all productions
2763 +# name - Name of the production. For example 'expr'
2764 +# prod - A list of symbols on the right side ['expr','PLUS','term']
2765 +# prec - Production precedence level
2766 # number - Production number.
2767 +# func - Function that executes on reduce
2768 +# file - File where production function is defined
2769 +# lineno - Line number where production function is defined
2771 -# In addition, a few additional attributes are used to help with debugging or
2772 -# optimization of table generation.
2773 +# The following additional attributes are also defined; some are optional.
2775 -# file - File where production action is defined.
2776 -# lineno - Line number where action is defined
2777 -# func - Action function
2778 -# prec - Precedence level
2779 -# lr_next - Next LR item. Example, if we are ' E -> E . PLUS E'
2780 -# then lr_next refers to 'E -> E PLUS . E'
2781 -# lr_index - LR item index (location of the ".") in the prod list.
2782 +# len - Length of the production (number of symbols on right hand side)
2783 +# usyms - Set of unique symbols found in the production
2784 +# -----------------------------------------------------------------------------
2786 +class Production(object):
2787 + reduced = 0
2788 + def __init__(self,number,name,prod,precedence=('right',0),func=None,file='',line=0):
2789 + self.name = name
2790 + self.prod = tuple(prod)
2791 + self.number = number
2792 + self.func = func
2793 + self.callable = None
2794 + self.file = file
2795 + self.line = line
2796 + self.prec = precedence
2798 + # Internal settings used during table construction
2800 + self.len = len(self.prod) # Length of the production
2802 + # Create a list of unique production symbols used in the production
2803 + self.usyms = [ ]
2804 + for s in self.prod:
2805 + if s not in self.usyms:
2806 + self.usyms.append(s)
2808 + # List of all LR items for the production
2809 + self.lr_items = []
2810 + self.lr_next = None
2812 + # Create a string representation
2813 + if self.prod:
2814 + self.str = "%s -> %s" % (self.name," ".join(self.prod))
2815 + else:
2816 + self.str = "%s -> <empty>" % self.name
2818 + def __str__(self):
2819 + return self.str
2821 + def __repr__(self):
2822 + return "Production("+str(self)+")"
2824 + def __len__(self):
2825 + return len(self.prod)
2827 + def __nonzero__(self):
2828 + return 1
2830 + def __getitem__(self,index):
2831 + return self.prod[index]
2833 + # Return the nth lr_item from the production (or None if at the end)
2834 + def lr_item(self,n):
2835 + if n > len(self.prod): return None
2836 + p = LRItem(self,n)
2838 + # Precompute the list of productions immediately following. Hack. Remove later
2839 + try:
2840 + p.lr_after = Prodnames[p.prod[n+1]]
2841 + except (IndexError,KeyError):
2842 + p.lr_after = []
2843 + try:
2844 + p.lr_before = p.prod[n-1]
2845 + except IndexError:
2846 + p.lr_before = None
2848 + return p
2850 + # Bind the production function name to a callable
2851 + def bind(self,pdict):
2852 + if self.func:
2853 + self.callable = pdict[self.func]
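A hedged sketch of a Production instance matching the attribute list above (values are illustrative; uses the class defined above):

    p = Production(number=1, name='expr', prod=['expr', 'PLUS', 'term'],
                   func='p_expr_plus')
    print(p)           # expr -> expr PLUS term
    print(len(p))      # 3 symbols on the right-hand side
    print(p.usyms)     # ['expr', 'PLUS', 'term']

    def p_expr_plus(t):
        pass

    p.bind({'p_expr_plus': p_expr_plus})   # resolve the stored name to a callable
    print(p.callable is p_expr_plus)       # True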
2855 +# This class serves as a minimal stand-in for Production objects when
2856 +# reading table data from files. It only contains information
2857 +# actually used by the LR parsing engine, plus some additional
2858 +# debugging information.
2859 +class MiniProduction(object):
2860 + def __init__(self,str,name,len,func,file,line):
2861 + self.name = name
2862 + self.len = len
2863 + self.func = func
2864 + self.callable = None
2865 + self.file = file
2866 + self.line = line
2867 + self.str = str
2868 + def __str__(self):
2869 + return self.str
2870 + def __repr__(self):
2871 + return "MiniProduction(%s)" % self.str
2873 + # Bind the production function name to a callable
2874 + def bind(self,pdict):
2875 + if self.func:
2876 + self.callable = pdict[self.func]
2879 +# -----------------------------------------------------------------------------
2880 +# class LRItem
2882 +# This class represents a specific stage of parsing a production rule. For
2883 +# example:
2885 +# expr : expr . PLUS term
2887 +# In the above, the "." represents the current location of the parse. Here
2888 +# are the basic attributes:
2890 +# name - Name of the production. For example 'expr'
2891 +# prod - A list of symbols on the right side ['expr','.', 'PLUS','term']
2892 +# number - Production number.
2894 +# lr_next    - Next LR item. For example, if we are 'expr -> expr . PLUS term'
2895 +# then lr_next refers to 'expr -> expr PLUS . term'
2896 +# lr_index - LR item index (location of the ".") in the prod list.
2897 # lookaheads - LALR lookahead symbols for this item
2898 -# len - Length of the production (number of symbols on right hand side)
2899 +# len - Length of the production (number of symbols on right hand side)
2900 +# lr_after    - List of all productions for the symbol immediately following the "."
2901 +# lr_before   - Grammar symbol immediately before the "."
2902 # -----------------------------------------------------------------------------
2904 -class Production:
2905 - def __init__(self,**kw):
2906 - for k,v in kw.items():
2907 - setattr(self,k,v)
2908 - self.lr_index = -1
2909 - self.lr0_added = 0 # Flag indicating whether or not added to LR0 closure
2910 - self.lr1_added = 0 # Flag indicating whether or not added to LR1
2911 - self.usyms = [ ]
2912 +class LRItem(object):
2913 + def __init__(self,p,n):
2914 + self.name = p.name
2915 + self.prod = list(p.prod)
2916 + self.number = p.number
2917 + self.lr_index = n
2918 self.lookaheads = { }
2919 - self.lk_added = { }
2920 - self.setnumbers = [ ]
2921 + self.prod.insert(n,".")
2922 + self.prod = tuple(self.prod)
2923 + self.len = len(self.prod)
2924 + self.usyms = p.usyms
2926 def __str__(self):
2927 if self.prod:
2928 s = "%s -> %s" % (self.name," ".join(self.prod))
2929 else:
2930 s = "%s -> <empty>" % self.name
2931 return s
2933 def __repr__(self):
2934 - return str(self)
2936 - # Compute lr_items from the production
2937 - def lr_item(self,n):
2938 - if n > len(self.prod): return None
2939 - p = Production()
2940 - p.name = self.name
2941 - p.prod = list(self.prod)
2942 - p.number = self.number
2943 - p.lr_index = n
2944 - p.lookaheads = { }
2945 - p.setnumbers = self.setnumbers
2946 - p.prod.insert(n,".")
2947 - p.prod = tuple(p.prod)
2948 - p.len = len(p.prod)
2949 - p.usyms = self.usyms
2951 - # Precompute list of productions immediately following
2952 + return "LRItem("+str(self)+")"
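The heart of LRItem is the dot splice in __init__; a standalone sketch of just
that step:

    # Sketch of the dot insertion done by LRItem.__init__ (n = dot position).
    prod = ['expr', 'PLUS', 'term']
    n = 1
    item = list(prod)
    item.insert(n, '.')
    print('expr -> ' + ' '.join(item))    # expr -> expr . PLUS term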
2954 +# -----------------------------------------------------------------------------
2955 +# rightmost_terminal()
2957 +# Return the rightmost terminal from a list of symbols. Used in add_production()
2958 +# -----------------------------------------------------------------------------
2959 +def rightmost_terminal(symbols, terminals):
2960 + i = len(symbols) - 1
2961 + while i >= 0:
2962 + if symbols[i] in terminals:
2963 + return symbols[i]
2964 + i -= 1
2965 + return None
2967 +# -----------------------------------------------------------------------------
2968 +# === GRAMMAR CLASS ===
2970 +# The following class represents the contents of the specified grammar along
2971 +# with various computed properties such as first sets, follow sets, LR items, etc.
2972 +# This data is used for critical parts of the table generation process later.
2973 +# -----------------------------------------------------------------------------
2975 +class GrammarError(YaccError): pass
2977 +class Grammar(object):
2978 + def __init__(self,terminals):
2979 + self.Productions = [None] # A list of all of the productions. The first
2980 + # entry is always reserved for the purpose of
2981 + # building an augmented grammar
2983 + self.Prodnames = { } # A dictionary mapping the names of nonterminals to a list of all
2984 + # productions of that nonterminal.
2986 + self.Prodmap = { } # A dictionary that is only used to detect duplicate
2987 + # productions.
2989 + self.Terminals = { } # A dictionary mapping the names of terminal symbols to a
2990 + # list of the rules where they are used.
2992 + for term in terminals:
2993 + self.Terminals[term] = []
2995 + self.Terminals['error'] = []
2997 + self.Nonterminals = { } # A dictionary mapping names of nonterminals to a list
2998 + # of rule numbers where they are used.
3000 + self.First = { } # A dictionary of precomputed FIRST(x) symbols
3002 + self.Follow = { } # A dictionary of precomputed FOLLOW(x) symbols
3004 + self.Precedence = { } # Precedence rules for each terminal. Contains tuples of the
3005 + # form ('right',level) or ('nonassoc', level) or ('left',level)
3007 +        self.UsedPrecedence = { }     # Precedence rules that were actually used by the grammar.
3008 + # This is only used to provide error checking and to generate
3009 + # a warning about unused precedence rules.
3011 + self.Start = None # Starting symbol for the grammar
3014 + def __len__(self):
3015 + return len(self.Productions)
3017 + def __getitem__(self,index):
3018 + return self.Productions[index]
3020 + # -----------------------------------------------------------------------------
3021 + # set_precedence()
3023 + # Sets the precedence for a given terminal. assoc is the associativity such as
3024 + # 'left','right', or 'nonassoc'. level is a numeric level.
3026 + # -----------------------------------------------------------------------------
3028 + def set_precedence(self,term,assoc,level):
3029 + assert self.Productions == [None],"Must call set_precedence() before add_production()"
3030 + if term in self.Precedence:
3031 + raise GrammarError("Precedence already specified for terminal '%s'" % term)
3032 + if assoc not in ['left','right','nonassoc']:
3033 + raise GrammarError("Associativity must be one of 'left','right', or 'nonassoc'")
3034 + self.Precedence[term] = (assoc,level)
3036 + # -----------------------------------------------------------------------------
3037 + # add_production()
3039 + # Given an action function, this function assembles a production rule and
3040 + # computes its precedence level.
3042 + # The production rule is supplied as a list of symbols. For example,
3043 + # a rule such as 'expr : expr PLUS term' has a production name of 'expr' and
3044 + # symbols ['expr','PLUS','term'].
3046 +    # Precedence is determined by the precedence of the rightmost terminal symbol
3047 +    # in the rule, or by the precedence of a terminal specified with %prec.
3049 + # A variety of error checks are performed to make sure production symbols
3050 + # are valid and that %prec is used correctly.
3051 + # -----------------------------------------------------------------------------
3053 + def add_production(self,prodname,syms,func=None,file='',line=0):
3055 + if prodname in self.Terminals:
3056 + raise GrammarError("%s:%d: Illegal rule name '%s'. Already defined as a token" % (file,line,prodname))
3057 + if prodname == 'error':
3058 + raise GrammarError("%s:%d: Illegal rule name '%s'. error is a reserved word" % (file,line,prodname))
3059 + if not _is_identifier.match(prodname):
3060 + raise GrammarError("%s:%d: Illegal rule name '%s'" % (file,line,prodname))
3062 + # Look for literal tokens
3063 + for n,s in enumerate(syms):
3064 + if s[0] in "'\"":
3065 + try:
3066 + c = eval(s)
3067 + if (len(c) > 1):
3068 + raise GrammarError("%s:%d: Literal token %s in rule '%s' may only be a single character" % (file,line,s, prodname))
3069 + if not c in self.Terminals:
3070 + self.Terminals[c] = []
3071 + syms[n] = c
3072 + continue
3073 + except SyntaxError:
3074 + pass
3075 + if not _is_identifier.match(s) and s != '%prec':
3076 + raise GrammarError("%s:%d: Illegal name '%s' in rule '%s'" % (file,line,s, prodname))
3078 + # Determine the precedence level
3079 + if '%prec' in syms:
3080 + if syms[-1] == '%prec':
3081 + raise GrammarError("%s:%d: Syntax error. Nothing follows %%prec" % (file,line))
3082 + if syms[-2] != '%prec':
3083 + raise GrammarError("%s:%d: Syntax error. %%prec can only appear at the end of a grammar rule" % (file,line))
3084 + precname = syms[-1]
3085 + prodprec = self.Precedence.get(precname,None)
3086 + if not prodprec:
3087 + raise GrammarError("%s:%d: Nothing known about the precedence of '%s'" % (file,line,precname))
3088 + else:
3089 + self.UsedPrecedence[precname] = 1
3090 + del syms[-2:] # Drop %prec from the rule
3091 + else:
3092 + # If no %prec, precedence is determined by the rightmost terminal symbol
3093 + precname = rightmost_terminal(syms,self.Terminals)
3094 + prodprec = self.Precedence.get(precname,('right',0))
3096 + # See if the rule is already in the rulemap
3097 + map = "%s -> %s" % (prodname,syms)
3098 + if map in self.Prodmap:
3099 + m = self.Prodmap[map]
3100 + raise GrammarError("%s:%d: Duplicate rule %s. " % (file,line, m) +
3101 + "Previous definition at %s:%d" % (m.file, m.line))
3103 + # From this point on, everything is valid. Create a new Production instance
3104 + pnumber = len(self.Productions)
3105 + if not prodname in self.Nonterminals:
3106 + self.Nonterminals[prodname] = [ ]
3108 + # Add the production number to Terminals and Nonterminals
3109 + for t in syms:
3110 + if t in self.Terminals:
3111 + self.Terminals[t].append(pnumber)
3112 + else:
3113 + if not t in self.Nonterminals:
3114 + self.Nonterminals[t] = [ ]
3115 + self.Nonterminals[t].append(pnumber)
3117 + # Create a production and add it to the list of productions
3118 + p = Production(pnumber,prodname,syms,prodprec,func,file,line)
3119 + self.Productions.append(p)
3120 + self.Prodmap[map] = p
3122 + # Add to the global productions list
3123 try:
3124 - p.lrafter = Prodnames[p.prod[n+1]]
3125 - except (IndexError,KeyError),e:
3126 - p.lrafter = []
3127 - try:
3128 - p.lrbefore = p.prod[n-1]
3129 - except IndexError:
3130 - p.lrbefore = None
3132 - return p
3134 -class MiniProduction:
3135 - pass
3137 -# regex matching identifiers
3138 -_is_identifier = re.compile(r'^[a-zA-Z0-9_-]+$')
3140 -# -----------------------------------------------------------------------------
3141 -# add_production()
3143 -# Given an action function, this function assembles a production rule.
3144 -# The production rule is assumed to be found in the function's docstring.
3145 -# This rule has the general syntax:
3147 -# name1 ::= production1
3148 -# | production2
3149 -# | production3
3150 -# ...
3151 -# | productionn
3152 -# name2 ::= production1
3153 -# | production2
3154 -# ...
3155 -# -----------------------------------------------------------------------------
3157 -def add_production(f,file,line,prodname,syms):
3159 - if Terminals.has_key(prodname):
3160 - sys.stderr.write("%s:%d: Illegal rule name '%s'. Already defined as a token.\n" % (file,line,prodname))
3161 - return -1
3162 - if prodname == 'error':
3163 - sys.stderr.write("%s:%d: Illegal rule name '%s'. error is a reserved word.\n" % (file,line,prodname))
3164 - return -1
3166 - if not _is_identifier.match(prodname):
3167 - sys.stderr.write("%s:%d: Illegal rule name '%s'\n" % (file,line,prodname))
3168 - return -1
3170 - for x in range(len(syms)):
3171 - s = syms[x]
3172 - if s[0] in "'\"":
3173 - try:
3174 - c = eval(s)
3175 - if (len(c) > 1):
3176 - sys.stderr.write("%s:%d: Literal token %s in rule '%s' may only be a single character\n" % (file,line,s, prodname))
3177 - return -1
3178 - if not Terminals.has_key(c):
3179 - Terminals[c] = []
3180 - syms[x] = c
3181 - continue
3182 - except SyntaxError:
3183 - pass
3184 - if not _is_identifier.match(s) and s != '%prec':
3185 - sys.stderr.write("%s:%d: Illegal name '%s' in rule '%s'\n" % (file,line,s, prodname))
3186 - return -1
3188 - # See if the rule is already in the rulemap
3189 - map = "%s -> %s" % (prodname,syms)
3190 - if Prodmap.has_key(map):
3191 - m = Prodmap[map]
3192 - sys.stderr.write("%s:%d: Duplicate rule %s.\n" % (file,line, m))
3193 - sys.stderr.write("%s:%d: Previous definition at %s:%d\n" % (file,line, m.file, m.line))
3194 - return -1
3196 - p = Production()
3197 - p.name = prodname
3198 - p.prod = syms
3199 - p.file = file
3200 - p.line = line
3201 - p.func = f
3202 - p.number = len(Productions)
3205 - Productions.append(p)
3206 - Prodmap[map] = p
3207 - if not Nonterminals.has_key(prodname):
3208 - Nonterminals[prodname] = [ ]
3210 - # Add all terminals to Terminals
3211 - i = 0
3212 - while i < len(p.prod):
3213 - t = p.prod[i]
3214 - if t == '%prec':
3215 - try:
3216 - precname = p.prod[i+1]
3217 - except IndexError:
3218 - sys.stderr.write("%s:%d: Syntax error. Nothing follows %%prec.\n" % (p.file,p.line))
3219 - return -1
3221 - prec = Precedence.get(precname,None)
3222 - if not prec:
3223 - sys.stderr.write("%s:%d: Nothing known about the precedence of '%s'\n" % (p.file,p.line,precname))
3224 - return -1
3225 - else:
3226 - p.prec = prec
3227 - UsedPrecedence[precname] = 1
3228 - del p.prod[i]
3229 - del p.prod[i]
3230 - continue
3232 - if Terminals.has_key(t):
3233 - Terminals[t].append(p.number)
3234 - # Is a terminal. We'll assign a precedence to p based on this
3235 - if not hasattr(p,"prec"):
3236 - p.prec = Precedence.get(t,('right',0))
3237 - else:
3238 - if not Nonterminals.has_key(t):
3239 - Nonterminals[t] = [ ]
3240 - Nonterminals[t].append(p.number)
3241 - i += 1
3243 - if not hasattr(p,"prec"):
3244 - p.prec = ('right',0)
3246 - # Set final length of productions
3247 - p.len = len(p.prod)
3248 - p.prod = tuple(p.prod)
3250 - # Calculate unique syms in the production
3251 - p.usyms = [ ]
3252 - for s in p.prod:
3253 - if s not in p.usyms:
3254 - p.usyms.append(s)
3256 - # Add to the global productions list
3257 - try:
3258 - Prodnames[p.name].append(p)
3259 - except KeyError:
3260 - Prodnames[p.name] = [ p ]
3261 - return 0
3263 -# Given a raw rule function, this function rips out its doc string
3264 -# and adds rules to the grammar
3266 -def add_function(f):
3267 - line = f.func_code.co_firstlineno
3268 - file = f.func_code.co_filename
3269 - error = 0
3271 - if isinstance(f,types.MethodType):
3272 - reqdargs = 2
3273 - else:
3274 - reqdargs = 1
3276 - if f.func_code.co_argcount > reqdargs:
3277 - sys.stderr.write("%s:%d: Rule '%s' has too many arguments.\n" % (file,line,f.__name__))
3278 - return -1
3280 - if f.func_code.co_argcount < reqdargs:
3281 - sys.stderr.write("%s:%d: Rule '%s' requires an argument.\n" % (file,line,f.__name__))
3282 - return -1
3284 - if f.__doc__:
3285 - # Split the doc string into lines
3286 - pstrings = f.__doc__.splitlines()
3287 - lastp = None
3288 - dline = line
3289 - for ps in pstrings:
3290 - dline += 1
3291 - p = ps.split()
3292 + self.Prodnames[prodname].append(p)
3293 + except KeyError:
3294 + self.Prodnames[prodname] = [ p ]
3295 + return 0
3297 + # -----------------------------------------------------------------------------
3298 + # set_start()
3300 + # Sets the starting symbol and creates the augmented grammar. Production
3301 + # rule 0 is S' -> start where start is the start symbol.
3302 + # -----------------------------------------------------------------------------
3304 + def set_start(self,start=None):
3305 + if not start:
3306 + start = self.Productions[1].name
3307 + if start not in self.Nonterminals:
3308 + raise GrammarError("start symbol %s undefined" % start)
3309 + self.Productions[0] = Production(0,"S'",[start])
3310 + self.Nonterminals[start].append(0)
3311 + self.Start = start
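Taken together, a hedged usage sketch of the class so far. It assumes PLY 3.x
is installed so that ply.yacc exposes Grammar; the token and rule names are
illustrative:

    from ply.yacc import Grammar            # assumes a PLY 3.x installation

    g = Grammar(['PLUS', 'TIMES', 'NUMBER'])
    g.set_precedence('PLUS',  'left', 1)    # must precede add_production()
    g.set_precedence('TIMES', 'left', 2)
    g.add_production('expr', ['expr', 'PLUS', 'expr'])
    g.add_production('expr', ['expr', 'TIMES', 'expr'])
    g.add_production('expr', ['NUMBER'])
    g.set_start('expr')                     # installs S' -> expr as rule 0
    print(g.Productions[1])                 # expr -> expr PLUS expr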
3313 + # -----------------------------------------------------------------------------
3314 + # find_unreachable()
3316 + # Find all of the nonterminal symbols that can't be reached from the starting
3317 + # symbol. Returns a list of nonterminals that can't be reached.
3318 + # -----------------------------------------------------------------------------
3320 + def find_unreachable(self):
3322 + # Mark all symbols that are reachable from a symbol s
3323 + def mark_reachable_from(s):
3324 + if reachable[s]:
3325 + # We've already reached symbol s.
3326 + return
3327 + reachable[s] = 1
3328 + for p in self.Prodnames.get(s,[]):
3329 + for r in p.prod:
3330 + mark_reachable_from(r)
3332 + reachable = { }
3333 + for s in list(self.Terminals) + list(self.Nonterminals):
3334 + reachable[s] = 0
3336 + mark_reachable_from( self.Productions[0].prod[0] )
3338 + return [s for s in list(self.Nonterminals)
3339 + if not reachable[s]]
3341 + # -----------------------------------------------------------------------------
3342 + # infinite_cycles()
3344 + # This function looks at the various parsing rules and tries to detect
3345 + # infinite recursion cycles (grammar rules where there is no possible way
3346 + # to derive a string of only terminals).
3347 + # -----------------------------------------------------------------------------
3349 + def infinite_cycles(self):
3350 + terminates = {}
3352 + # Terminals:
3353 + for t in self.Terminals:
3354 + terminates[t] = 1
3356 + terminates['$end'] = 1
3358 + # Nonterminals:
3360 + # Initialize to false:
3361 + for n in self.Nonterminals:
3362 + terminates[n] = 0
3364 + # Then propagate termination until no change:
3365 + while 1:
3366 + some_change = 0
3367 + for (n,pl) in self.Prodnames.items():
3368 + # Nonterminal n terminates iff any of its productions terminates.
3369 + for p in pl:
3370 + # Production p terminates iff all of its rhs symbols terminate.
3371 + for s in p.prod:
3372 + if not terminates[s]:
3373 + # The symbol s does not terminate,
3374 + # so production p does not terminate.
3375 + p_terminates = 0
3376 + break
3377 + else:
3378 + # didn't break from the loop,
3379 + # so every symbol s terminates
3380 + # so production p terminates.
3381 + p_terminates = 1
3383 + if p_terminates:
3384 + # symbol n terminates!
3385 + if not terminates[n]:
3386 + terminates[n] = 1
3387 + some_change = 1
3388 + # Don't need to consider any more productions for this n.
3389 + break
3391 + if not some_change:
3392 + break
3394 + infinite = []
3395 + for (s,term) in terminates.items():
3396 + if not term:
3397 + if not s in self.Prodnames and not s in self.Terminals and s != 'error':
3398 + # s is used-but-not-defined, and we've already warned of that,
3399 + # so it would be overkill to say that it's also non-terminating.
3400 + pass
3401 + else:
3402 + infinite.append(s)
3404 + return infinite
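For instance, a nonterminal whose only production mentions itself can never
derive a pure-terminal string; a sketch under the same PLY 3.x assumption:

    g2 = Grammar(['NUMBER'])
    g2.add_production('expr', ['expr', 'expr'])   # only rule; never terminates
    g2.set_start('expr')
    print(g2.infinite_cycles())                   # ['expr']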
3407 + # -----------------------------------------------------------------------------
3408 + # undefined_symbols()
3410 +    # Find all symbols that were used in the grammar, but not defined as tokens or
3411 +    # grammar rules. Returns a list of tuples (sym, prod) where sym is the symbol
3412 + # and prod is the production where the symbol was used.
3413 + # -----------------------------------------------------------------------------
3414 + def undefined_symbols(self):
3415 + result = []
3416 + for p in self.Productions:
3417 if not p: continue
3418 - try:
3419 - if p[0] == '|':
3420 - # This is a continuation of a previous rule
3421 - if not lastp:
3422 - sys.stderr.write("%s:%d: Misplaced '|'.\n" % (file,dline))
3423 - return -1
3424 - prodname = lastp
3425 - if len(p) > 1:
3426 - syms = p[1:]
3427 - else:
3428 - syms = [ ]
3430 + for s in p.prod:
3431 + if not s in self.Prodnames and not s in self.Terminals and s != 'error':
3432 + result.append((s,p))
3433 + return result
3435 + # -----------------------------------------------------------------------------
3436 + # unused_terminals()
3438 + # Find all terminals that were defined, but not used by the grammar. Returns
3439 +    # a list of the unused terminal names.
3440 + # -----------------------------------------------------------------------------
3441 + def unused_terminals(self):
3442 + unused_tok = []
3443 + for s,v in self.Terminals.items():
3444 + if s != 'error' and not v:
3445 + unused_tok.append(s)
3447 + return unused_tok
3449 + # ------------------------------------------------------------------------------
3450 + # unused_rules()
3452 +    # Find all grammar rules that were defined, but not used (maybe not reachable).
3453 + # Returns a list of productions.
3454 + # ------------------------------------------------------------------------------
3456 + def unused_rules(self):
3457 + unused_prod = []
3458 + for s,v in self.Nonterminals.items():
3459 + if not v:
3460 + p = self.Prodnames[s][0]
3461 + unused_prod.append(p)
3462 + return unused_prod
3464 + # -----------------------------------------------------------------------------
3465 + # unused_precedence()
3467 + # Returns a list of tuples (term,precedence) corresponding to precedence
3468 + # rules that were never used by the grammar. term is the name of the terminal
3469 + # on which precedence was applied and precedence is a string such as 'left' or
3470 + # 'right' corresponding to the type of precedence.
3471 + # -----------------------------------------------------------------------------
3473 + def unused_precedence(self):
3474 + unused = []
3475 + for termname in self.Precedence:
3476 + if not (termname in self.Terminals or termname in self.UsedPrecedence):
3477 + unused.append((termname,self.Precedence[termname][0]))
3479 + return unused
3481 + # -------------------------------------------------------------------------
3482 + # _first()
3484 + # Compute the value of FIRST1(beta) where beta is a tuple of symbols.
3486 +    # During execution of compute_first(), the result may be incomplete.
3487 + # Afterward (e.g., when called from compute_follow()), it will be complete.
3488 + # -------------------------------------------------------------------------
3489 + def _first(self,beta):
3491 + # We are computing First(x1,x2,x3,...,xn)
3492 + result = [ ]
3493 + for x in beta:
3494 + x_produces_empty = 0
3496 + # Add all the non-<empty> symbols of First[x] to the result.
3497 + for f in self.First[x]:
3498 + if f == '<empty>':
3499 + x_produces_empty = 1
3500 else:
3501 - prodname = p[0]
3502 - lastp = prodname
3503 - assign = p[1]
3504 - if len(p) > 2:
3505 - syms = p[2:]
3506 - else:
3507 - syms = [ ]
3508 - if assign != ':' and assign != '::=':
3509 - sys.stderr.write("%s:%d: Syntax error. Expected ':'\n" % (file,dline))
3510 - return -1
3513 - e = add_production(f,file,dline,prodname,syms)
3514 - error += e
3517 - except StandardError:
3518 - sys.stderr.write("%s:%d: Syntax error in rule '%s'\n" % (file,dline,ps))
3519 - error -= 1
3520 - else:
3521 - sys.stderr.write("%s:%d: No documentation string specified in function '%s'\n" % (file,line,f.__name__))
3522 - return error
3525 -# Cycle checking code (Michael Dyck)
3527 -def compute_reachable():
3528 - '''
3529 - Find each symbol that can be reached from the start symbol.
3530 - Print a warning for any nonterminals that can't be reached.
3531 - (Unused terminals have already had their warning.)
3532 - '''
3533 - Reachable = { }
3534 - for s in Terminals.keys() + Nonterminals.keys():
3535 - Reachable[s] = 0
3537 - mark_reachable_from( Productions[0].prod[0], Reachable )
3539 - for s in Nonterminals.keys():
3540 - if not Reachable[s]:
3541 - sys.stderr.write("yacc: Symbol '%s' is unreachable.\n" % s)
3543 -def mark_reachable_from(s, Reachable):
3544 - '''
3545 - Mark all symbols that are reachable from symbol s.
3546 - '''
3547 - if Reachable[s]:
3548 - # We've already reached symbol s.
3549 - return
3550 - Reachable[s] = 1
3551 - for p in Prodnames.get(s,[]):
3552 - for r in p.prod:
3553 - mark_reachable_from(r, Reachable)
3555 -# -----------------------------------------------------------------------------
3556 -# compute_terminates()
3558 -# This function looks at the various parsing rules and tries to detect
3559 -# infinite recursion cycles (grammar rules where there is no possible way
3560 -# to derive a string of only terminals).
3561 -# -----------------------------------------------------------------------------
3562 -def compute_terminates():
3563 - '''
3564 - Raise an error for any symbols that don't terminate.
3565 - '''
3566 - Terminates = {}
3568 - # Terminals:
3569 - for t in Terminals.keys():
3570 - Terminates[t] = 1
3572 - Terminates['$end'] = 1
3574 - # Nonterminals:
3576 - # Initialize to false:
3577 - for n in Nonterminals.keys():
3578 - Terminates[n] = 0
3580 - # Then propagate termination until no change:
3581 - while 1:
3582 - some_change = 0
3583 - for (n,pl) in Prodnames.items():
3584 - # Nonterminal n terminates iff any of its productions terminates.
3585 - for p in pl:
3586 - # Production p terminates iff all of its rhs symbols terminate.
3587 - for s in p.prod:
3588 - if not Terminates[s]:
3589 - # The symbol s does not terminate,
3590 - # so production p does not terminate.
3591 - p_terminates = 0
3592 - break
3593 - else:
3594 - # didn't break from the loop,
3595 - # so every symbol s terminates
3596 - # so production p terminates.
3597 - p_terminates = 1
3599 - if p_terminates:
3600 - # symbol n terminates!
3601 - if not Terminates[n]:
3602 - Terminates[n] = 1
3603 - some_change = 1
3604 - # Don't need to consider any more productions for this n.
3605 - break
3607 - if not some_change:
3608 - break
3610 - some_error = 0
3611 - for (s,terminates) in Terminates.items():
3612 - if not terminates:
3613 - if not Prodnames.has_key(s) and not Terminals.has_key(s) and s != 'error':
3614 - # s is used-but-not-defined, and we've already warned of that,
3615 - # so it would be overkill to say that it's also non-terminating.
3616 + if f not in result: result.append(f)
3618 + if x_produces_empty:
3619 + # We have to consider the next x in beta,
3620 + # i.e. stay in the loop.
3621 pass
3622 else:
3623 - sys.stderr.write("yacc: Infinite recursion detected for symbol '%s'.\n" % s)
3624 - some_error = 1
3626 - return some_error
3627 + # We don't have to consider any further symbols in beta.
3628 + break
3629 + else:
3630 + # There was no 'break' from the loop,
3631 + # so x_produces_empty was true for all x in beta,
3632 + # so beta produces empty as well.
3633 + result.append('<empty>')
3635 + return result
3637 + # -------------------------------------------------------------------------
3638 + # compute_first()
3640 + # Compute the value of FIRST1(X) for all symbols
3641 + # -------------------------------------------------------------------------
3642 + def compute_first(self):
3643 + if self.First:
3644 + return self.First
3646 + # Terminals:
3647 + for t in self.Terminals:
3648 + self.First[t] = [t]
3650 + self.First['$end'] = ['$end']
3652 + # Nonterminals:
3654 + # Initialize to the empty set:
3655 + for n in self.Nonterminals:
3656 + self.First[n] = []
3658 + # Then propagate symbols until no change:
3659 + while 1:
3660 + some_change = 0
3661 + for n in self.Nonterminals:
3662 + for p in self.Prodnames[n]:
3663 + for f in self._first(p.prod):
3664 + if f not in self.First[n]:
3665 + self.First[n].append( f )
3666 + some_change = 1
3667 + if not some_change:
3668 + break
3670 + return self.First
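Continuing the sketch grammar built after set_start() above, the fixed point
this computes looks like:

    first = g.compute_first()
    print(first['expr'])    # ['NUMBER'] -- every expr derivation starts with NUMBER
    print(first['PLUS'])    # ['PLUS']   -- FIRST of a terminal is itself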
3672 + # ---------------------------------------------------------------------
3673 + # compute_follow()
3675 + # Computes all of the follow sets for every non-terminal symbol. The
3676 + # follow set is the set of all symbols that might follow a given
3677 + # non-terminal. See the Dragon book, 2nd Ed. p. 189.
3678 + # ---------------------------------------------------------------------
3679 + def compute_follow(self,start=None):
3680 + # If already computed, return the result
3681 + if self.Follow:
3682 + return self.Follow
3684 + # If first sets not computed yet, do that first.
3685 + if not self.First:
3686 + self.compute_first()
3688 + # Add '$end' to the follow list of the start symbol
3689 + for k in self.Nonterminals:
3690 + self.Follow[k] = [ ]
3692 + if not start:
3693 + start = self.Productions[1].name
3695 + self.Follow[start] = [ '$end' ]
3697 + while 1:
3698 + didadd = 0
3699 + for p in self.Productions[1:]:
3700 +                # Scan each symbol on the right-hand side of this production
3701 + for i in range(len(p.prod)):
3702 + B = p.prod[i]
3703 + if B in self.Nonterminals:
3704 + # Okay. We got a non-terminal in a production
3705 + fst = self._first(p.prod[i+1:])
3706 + hasempty = 0
3707 + for f in fst:
3708 + if f != '<empty>' and f not in self.Follow[B]:
3709 + self.Follow[B].append(f)
3710 + didadd = 1
3711 + if f == '<empty>':
3712 + hasempty = 1
3713 + if hasempty or i == (len(p.prod)-1):
3714 +                        # Add elements of Follow(p.name) to Follow(B)
3715 + for f in self.Follow[p.name]:
3716 + if f not in self.Follow[B]:
3717 + self.Follow[B].append(f)
3718 + didadd = 1
3719 + if not didadd: break
3720 + return self.Follow
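Again with the sketch grammar from above:

    follow = g.compute_follow()
    print(sorted(follow['expr']))   # ['$end', 'PLUS', 'TIMES']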
3723 + # -----------------------------------------------------------------------------
3724 + # build_lritems()
3726 + # This function walks the list of productions and builds a complete set of the
3727 +    # LR items. The LR items are stored in two ways: first, each production keeps
3728 +    # its own items in the list p.lr_items. Second, the items are chained together
3729 +    # through their lr_next attributes. For example:
3731 + # E -> E PLUS E
3733 + # Creates the list
3735 + # [E -> . E PLUS E, E -> E . PLUS E, E -> E PLUS . E, E -> E PLUS E . ]
3736 + # -----------------------------------------------------------------------------
3738 + def build_lritems(self):
3739 + for p in self.Productions:
3740 + lastlri = p
3741 + i = 0
3742 + lr_items = []
3743 + while 1:
3744 + if i > len(p):
3745 + lri = None
3746 + else:
3747 + lri = LRItem(p,i)
3748 + # Precompute the list of productions immediately following
3749 + try:
3750 + lri.lr_after = self.Prodnames[lri.prod[i+1]]
3751 + except (IndexError,KeyError):
3752 + lri.lr_after = []
3753 + try:
3754 + lri.lr_before = lri.prod[i-1]
3755 + except IndexError:
3756 + lri.lr_before = None
3758 + lastlri.lr_next = lri
3759 + if not lri: break
3760 + lr_items.append(lri)
3761 + lastlri = lri
3762 + i += 1
3763 + p.lr_items = lr_items
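Continuing the sketch, the lr_next chain can be walked directly:

    g.build_lritems()
    item = g.Productions[1].lr_next     # first item of 'expr -> expr PLUS expr'
    while item:
        print(item)                     # the '.' advances one symbol per item
        item = item.lr_next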
3765 # -----------------------------------------------------------------------------
3766 -# verify_productions()
3767 +# == Class LRTable ==
3769 -# This function examines all of the supplied rules to see if they seem valid.
3770 +# This class represents a basic table of LR parsing information.
3771 +# Methods for generating the tables are not defined here. They are defined
3772 +# in the derived class LRGeneratedTable.
3773 # -----------------------------------------------------------------------------
3774 -def verify_productions(cycle_check=1):
3775 - error = 0
3776 - for p in Productions:
3777 - if not p: continue
3779 - for s in p.prod:
3780 - if not Prodnames.has_key(s) and not Terminals.has_key(s) and s != 'error':
3781 - sys.stderr.write("%s:%d: Symbol '%s' used, but not defined as a token or a rule.\n" % (p.file,p.line,s))
3782 - error = 1
3783 - continue
3785 - unused_tok = 0
3786 - # Now verify all of the tokens
3787 - if yaccdebug:
3788 - _vf.write("Unused terminals:\n\n")
3789 - for s,v in Terminals.items():
3790 - if s != 'error' and not v:
3791 - sys.stderr.write("yacc: Warning. Token '%s' defined, but not used.\n" % s)
3792 - if yaccdebug: _vf.write(" %s\n"% s)
3793 - unused_tok += 1
3795 - # Print out all of the productions
3796 - if yaccdebug:
3797 - _vf.write("\nGrammar\n\n")
3798 - for i in range(1,len(Productions)):
3799 - _vf.write("Rule %-5d %s\n" % (i, Productions[i]))
3801 - unused_prod = 0
3802 - # Verify the use of all productions
3803 - for s,v in Nonterminals.items():
3804 - if not v:
3805 - p = Prodnames[s][0]
3806 - sys.stderr.write("%s:%d: Warning. Rule '%s' defined, but not used.\n" % (p.file,p.line, s))
3807 - unused_prod += 1
3810 - if unused_tok == 1:
3811 - sys.stderr.write("yacc: Warning. There is 1 unused token.\n")
3812 - if unused_tok > 1:
3813 - sys.stderr.write("yacc: Warning. There are %d unused tokens.\n" % unused_tok)
3815 - if unused_prod == 1:
3816 - sys.stderr.write("yacc: Warning. There is 1 unused rule.\n")
3817 - if unused_prod > 1:
3818 - sys.stderr.write("yacc: Warning. There are %d unused rules.\n" % unused_prod)
3820 - if yaccdebug:
3821 - _vf.write("\nTerminals, with rules where they appear\n\n")
3822 - ks = Terminals.keys()
3823 - ks.sort()
3824 - for k in ks:
3825 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Terminals[k]])))
3826 - _vf.write("\nNonterminals, with rules where they appear\n\n")
3827 - ks = Nonterminals.keys()
3828 - ks.sort()
3829 - for k in ks:
3830 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Nonterminals[k]])))
3832 - if (cycle_check):
3833 - compute_reachable()
3834 - error += compute_terminates()
3835 -# error += check_cycles()
3836 - return error
3839 +class VersionError(YaccError): pass
3841 +class LRTable(object):
3842 + def __init__(self):
3843 + self.lr_action = None
3844 + self.lr_goto = None
3845 + self.lr_productions = None
3846 + self.lr_method = None
3848 + def read_table(self,module):
3849 + if isinstance(module,types.ModuleType):
3850 + parsetab = module
3851 + else:
3852 + if sys.version_info[0] < 3:
3853 + exec("import %s as parsetab" % module)
3854 + else:
3855 + env = { }
3856 + exec("import %s as parsetab" % module, env, env)
3857 + parsetab = env['parsetab']
3859 + if parsetab._tabversion != __tabversion__:
3860 + raise VersionError("yacc table file version is out of date")
3862 + self.lr_action = parsetab._lr_action
3863 + self.lr_goto = parsetab._lr_goto
3865 + self.lr_productions = []
3866 + for p in parsetab._lr_productions:
3867 + self.lr_productions.append(MiniProduction(*p))
3869 + self.lr_method = parsetab._lr_method
3870 + return parsetab._lr_signature
3872 + def read_pickle(self,filename):
3873 + try:
3874 + import cPickle as pickle
3875 + except ImportError:
3876 + import pickle
3878 + in_f = open(filename,"rb")
3880 + tabversion = pickle.load(in_f)
3881 + if tabversion != __tabversion__:
3882 + raise VersionError("yacc table file version is out of date")
3883 + self.lr_method = pickle.load(in_f)
3884 + signature = pickle.load(in_f)
3885 + self.lr_action = pickle.load(in_f)
3886 + self.lr_goto = pickle.load(in_f)
3887 + productions = pickle.load(in_f)
3889 + self.lr_productions = []
3890 + for p in productions:
3891 + self.lr_productions.append(MiniProduction(*p))
3893 + in_f.close()
3894 + return signature
3896 + # Bind all production function names to callable objects in pdict
3897 + def bind_callables(self,pdict):
3898 + for p in self.lr_productions:
3899 + p.bind(pdict)
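A sketch of reading tables back in, assuming a parsetab module was previously
written by a PLY 3.x yacc() run into the current directory ('parsetab' is
PLY's default module name, hypothetical here):

    from ply.yacc import LRTable             # assumes a PLY 3.x installation

    lr = LRTable()
    signature = lr.read_table('parsetab')    # raises VersionError if stale
    lr.bind_callables(globals())             # assumes the p_* functions exist here
    print(lr.lr_method, len(lr.lr_productions))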
3901 # -----------------------------------------------------------------------------
3902 -# build_lritems()
3903 +# === LR Generator ===
3905 -# This function walks the list of productions and builds a complete set of the
3906 -# LR items. The LR items are stored in two ways: First, they are uniquely
3907 -# numbered and placed in the list _lritems. Second, a linked list of LR items
3908 -# is built for each production. For example:
3910 -# E -> E PLUS E
3912 -# Creates the list
3914 -# [E -> . E PLUS E, E -> E . PLUS E, E -> E PLUS . E, E -> E PLUS E . ]
3915 +# The following classes and functions are used to generate LR parsing tables from
3916 +# a grammar.
3917 # -----------------------------------------------------------------------------
3919 -def build_lritems():
3920 - for p in Productions:
3921 - lastlri = p
3922 - lri = p.lr_item(0)
3923 - i = 0
3924 - while 1:
3925 - lri = p.lr_item(i)
3926 - lastlri.lr_next = lri
3927 - if not lri: break
3928 - lri.lr_num = len(LRitems)
3929 - LRitems.append(lri)
3930 - lastlri = lri
3931 - i += 1
3933 - # In order for the rest of the parser generator to work, we need to
3934 - # guarantee that no more lritems are generated. Therefore, we nuke
3935 - # the p.lr_item method. (Only used in debugging)
3936 - # Production.lr_item = None
3938 -# -----------------------------------------------------------------------------
3939 -# add_precedence()
3941 -# Given a list of precedence rules, add to the precedence table.
3942 -# -----------------------------------------------------------------------------
3944 -def add_precedence(plist):
3945 - plevel = 0
3946 - error = 0
3947 - for p in plist:
3948 - plevel += 1
3949 - try:
3950 - prec = p[0]
3951 - terms = p[1:]
3952 - if prec != 'left' and prec != 'right' and prec != 'nonassoc':
3953 - sys.stderr.write("yacc: Invalid precedence '%s'\n" % prec)
3954 - return -1
3955 - for t in terms:
3956 - if Precedence.has_key(t):
3957 - sys.stderr.write("yacc: Precedence already specified for terminal '%s'\n" % t)
3958 - error += 1
3959 - continue
3960 - Precedence[t] = (prec,plevel)
3961 - except:
3962 - sys.stderr.write("yacc: Invalid precedence table.\n")
3963 - error += 1
3965 - return error
3967 -# -----------------------------------------------------------------------------
3968 -# check_precedence()
3970 -# Checks the use of the Precedence tables. This makes sure all of the symbols
3971 -# are terminals or were used with %prec
3972 -# -----------------------------------------------------------------------------
3974 -def check_precedence():
3975 - error = 0
3976 - for precname in Precedence.keys():
3977 - if not (Terminals.has_key(precname) or UsedPrecedence.has_key(precname)):
3978 - sys.stderr.write("yacc: Precedence rule '%s' defined for unknown symbol '%s'\n" % (Precedence[precname][0],precname))
3979 - error += 1
3980 - return error
3982 -# -----------------------------------------------------------------------------
3983 -# augment_grammar()
3985 -# Compute the augmented grammar. This is just a rule S' -> start where start
3986 -# is the starting symbol.
3987 -# -----------------------------------------------------------------------------
3989 -def augment_grammar(start=None):
3990 - if not start:
3991 - start = Productions[1].name
3992 - Productions[0] = Production(name="S'",prod=[start],number=0,len=1,prec=('right',0),func=None)
3993 - Productions[0].usyms = [ start ]
3994 - Nonterminals[start].append(0)
3997 -# -------------------------------------------------------------------------
3998 -# first()
4000 -# Compute the value of FIRST1(beta) where beta is a tuple of symbols.
4002 -# During execution of compute_first1, the result may be incomplete.
4003 -# Afterward (e.g., when called from compute_follow()), it will be complete.
4004 -# -------------------------------------------------------------------------
4005 -def first(beta):
4007 - # We are computing First(x1,x2,x3,...,xn)
4008 - result = [ ]
4009 - for x in beta:
4010 - x_produces_empty = 0
4012 - # Add all the non-<empty> symbols of First[x] to the result.
4013 - for f in First[x]:
4014 - if f == '<empty>':
4015 - x_produces_empty = 1
4016 - else:
4017 - if f not in result: result.append(f)
4019 - if x_produces_empty:
4020 - # We have to consider the next x in beta,
4021 - # i.e. stay in the loop.
4022 - pass
4023 - else:
4024 - # We don't have to consider any further symbols in beta.
4025 - break
4026 - else:
4027 - # There was no 'break' from the loop,
4028 - # so x_produces_empty was true for all x in beta,
4029 - # so beta produces empty as well.
4030 - result.append('<empty>')
4032 - return result
4035 -# FOLLOW(x)
4036 -# Given a non-terminal. This function computes the set of all symbols
4037 -# that might follow it. Dragon book, p. 189.
4039 -def compute_follow(start=None):
4040 - # Add '$end' to the follow list of the start symbol
4041 - for k in Nonterminals.keys():
4042 - Follow[k] = [ ]
4044 - if not start:
4045 - start = Productions[1].name
4047 - Follow[start] = [ '$end' ]
4049 - while 1:
4050 - didadd = 0
4051 - for p in Productions[1:]:
4052 - # Here is the production set
4053 - for i in range(len(p.prod)):
4054 - B = p.prod[i]
4055 - if Nonterminals.has_key(B):
4056 - # Okay. We got a non-terminal in a production
4057 - fst = first(p.prod[i+1:])
4058 - hasempty = 0
4059 - for f in fst:
4060 - if f != '<empty>' and f not in Follow[B]:
4061 - Follow[B].append(f)
4062 - didadd = 1
4063 - if f == '<empty>':
4064 - hasempty = 1
4065 - if hasempty or i == (len(p.prod)-1):
4066 - # Add elements of follow(a) to follow(b)
4067 - for f in Follow[p.name]:
4068 - if f not in Follow[B]:
4069 - Follow[B].append(f)
4070 - didadd = 1
4071 - if not didadd: break
4073 - if 0 and yaccdebug:
4074 - _vf.write('\nFollow:\n')
4075 - for k in Nonterminals.keys():
4076 - _vf.write("%-20s : %s\n" % (k, " ".join([str(s) for s in Follow[k]])))
4078 -# -------------------------------------------------------------------------
4079 -# compute_first1()
4081 -# Compute the value of FIRST1(X) for all symbols
4082 -# -------------------------------------------------------------------------
4083 -def compute_first1():
4085 - # Terminals:
4086 - for t in Terminals.keys():
4087 - First[t] = [t]
4089 - First['$end'] = ['$end']
4090 - First['#'] = ['#'] # what's this for?
4092 - # Nonterminals:
4094 - # Initialize to the empty set:
4095 - for n in Nonterminals.keys():
4096 - First[n] = []
4098 - # Then propagate symbols until no change:
4099 - while 1:
4100 - some_change = 0
4101 - for n in Nonterminals.keys():
4102 - for p in Prodnames[n]:
4103 - for f in first(p.prod):
4104 - if f not in First[n]:
4105 - First[n].append( f )
4106 - some_change = 1
4107 - if not some_change:
4108 - break
4110 - if 0 and yaccdebug:
4111 - _vf.write('\nFirst:\n')
4112 - for k in Nonterminals.keys():
4113 - _vf.write("%-20s : %s\n" %
4114 - (k, " ".join([str(s) for s in First[k]])))
4116 -# -----------------------------------------------------------------------------
4117 -# === SLR Generation ===
4119 -# The following functions are used to construct SLR (Simple LR) parsing tables
4120 -# as described on p.221-229 of the dragon book.
4121 -# -----------------------------------------------------------------------------
4123 -# Global variables for the LR parsing engine
4124 -def lr_init_vars():
4125 - global _lr_action, _lr_goto, _lr_method
4126 - global _lr_goto_cache, _lr0_cidhash
4128 - _lr_action = { } # Action table
4129 - _lr_goto = { } # Goto table
4130 - _lr_method = "Unknown" # LR method used
4131 - _lr_goto_cache = { }
4132 - _lr0_cidhash = { }
4135 -# Compute the LR(0) closure operation on I, where I is a set of LR(0) items.
4136 -# prodlist is a list of productions.
4138 -_add_count = 0 # Counter used to detect cycles
4140 -def lr0_closure(I):
4141 - global _add_count
4143 - _add_count += 1
4144 - prodlist = Productions
4146 - # Add everything in I to J
4147 - J = I[:]
4148 - didadd = 1
4149 - while didadd:
4150 - didadd = 0
4151 - for j in J:
4152 - for x in j.lrafter:
4153 - if x.lr0_added == _add_count: continue
4154 - # Add B --> .G to J
4155 - J.append(x.lr_next)
4156 - x.lr0_added = _add_count
4157 - didadd = 1
4159 - return J
4161 -# Compute the LR(0) goto function goto(I,X) where I is a set
4162 -# of LR(0) items and X is a grammar symbol. This function is written
4163 -# in a way that guarantees uniqueness of the generated goto sets
4164 -# (i.e. the same goto set will never be returned as two different Python
4165 -# objects). With uniqueness, we can later do fast set comparisons using
4166 -# id(obj) instead of element-wise comparison.
4168 -def lr0_goto(I,x):
4169 - # First we look for a previously cached entry
4170 - g = _lr_goto_cache.get((id(I),x),None)
4171 - if g: return g
4173 - # Now we generate the goto set in a way that guarantees uniqueness
4174 - # of the result
4176 - s = _lr_goto_cache.get(x,None)
4177 - if not s:
4178 - s = { }
4179 - _lr_goto_cache[x] = s
4181 - gs = [ ]
4182 - for p in I:
4183 - n = p.lr_next
4184 - if n and n.lrbefore == x:
4185 - s1 = s.get(id(n),None)
4186 - if not s1:
4187 - s1 = { }
4188 - s[id(n)] = s1
4189 - gs.append(n)
4190 - s = s1
4191 - g = s.get('$end',None)
4192 - if not g:
4193 - if gs:
4194 - g = lr0_closure(gs)
4195 - s['$end'] = g
4196 - else:
4197 - s['$end'] = gs
4198 - _lr_goto_cache[(id(I),x)] = g
4199 - return g
4201 -_lr0_cidhash = { }
4203 -# Compute the LR(0) sets of item function
4204 -def lr0_items():
4206 - C = [ lr0_closure([Productions[0].lr_next]) ]
4207 - i = 0
4208 - for I in C:
4209 - _lr0_cidhash[id(I)] = i
4210 - i += 1
4212 - # Loop over the items in C and each grammar symbols
4213 - i = 0
4214 - while i < len(C):
4215 - I = C[i]
4216 - i += 1
4218 - # Collect all of the symbols that could possibly be in the goto(I,X) sets
4219 - asyms = { }
4220 - for ii in I:
4221 - for s in ii.usyms:
4222 - asyms[s] = None
4224 - for x in asyms.keys():
4225 - g = lr0_goto(I,x)
4226 - if not g: continue
4227 - if _lr0_cidhash.has_key(id(g)): continue
4228 - _lr0_cidhash[id(g)] = len(C)
4229 - C.append(g)
4231 - return C
4233 -# -----------------------------------------------------------------------------
4234 -# ==== LALR(1) Parsing ====
4236 -# LALR(1) parsing is almost exactly the same as SLR except that instead of
4237 -# relying upon Follow() sets when performing reductions, a more selective
4238 -# lookahead set that incorporates the state of the LR(0) machine is utilized.
4239 -# Thus, we mainly just have to focus on calculating the lookahead sets.
4241 -# The method used here is due to DeRemer and Pennelo (1982).
4243 -# DeRemer, F. L., and T. J. Pennelo: "Efficient Computation of LALR(1)
4244 -# Lookahead Sets", ACM Transactions on Programming Languages and Systems,
4245 -# Vol. 4, No. 4, Oct. 1982, pp. 615-649
4247 -# Further details can also be found in:
4249 -# J. Tremblay and P. Sorenson, "The Theory and Practice of Compiler Writing",
4250 -# McGraw-Hill Book Company, (1985).
4252 -# Note: This implementation is a complete replacement of the LALR(1)
4253 -# implementation in PLY-1.x releases. That version was based on
4254 -# a less efficient algorithm and it had bugs in its implementation.
4255 -# -----------------------------------------------------------------------------
4257 -# -----------------------------------------------------------------------------
4258 -# compute_nullable_nonterminals()
4260 -# Creates a dictionary containing all of the non-terminals that might produce
4261 -# an empty production.
4262 -# -----------------------------------------------------------------------------
4264 -def compute_nullable_nonterminals():
4265 - nullable = {}
4266 - num_nullable = 0
4267 - while 1:
4268 - for p in Productions[1:]:
4269 - if p.len == 0:
4270 - nullable[p.name] = 1
4271 - continue
4272 - for t in p.prod:
4273 - if not nullable.has_key(t): break
4274 - else:
4275 - nullable[p.name] = 1
4276 - if len(nullable) == num_nullable: break
4277 - num_nullable = len(nullable)
4278 - return nullable
4280 -# -----------------------------------------------------------------------------
4281 -# find_nonterminal_trans(C)
4283 -# Given a set of LR(0) items, this functions finds all of the non-terminal
4284 -# transitions. These are transitions in which a dot appears immediately before
4285 -# a non-terminal. Returns a list of tuples of the form (state,N) where state
4286 -# is the state number and N is the nonterminal symbol.
4288 -# The input C is the set of LR(0) items.
4289 -# -----------------------------------------------------------------------------
4291 -def find_nonterminal_transitions(C):
4292 - trans = []
4293 - for state in range(len(C)):
4294 - for p in C[state]:
4295 - if p.lr_index < p.len - 1:
4296 - t = (state,p.prod[p.lr_index+1])
4297 - if Nonterminals.has_key(t[1]):
4298 - if t not in trans: trans.append(t)
4299 - state = state + 1
4300 - return trans
4302 -# -----------------------------------------------------------------------------
4303 -# dr_relation()
4305 -# Computes the DR(p,A) relationships for non-terminal transitions. The input
4306 -# is a tuple (state,N) where state is a number and N is a nonterminal symbol.
4308 -# Returns a list of terminals.
4309 -# -----------------------------------------------------------------------------
4311 -def dr_relation(C,trans,nullable):
4312 - dr_set = { }
4313 - state,N = trans
4314 - terms = []
4316 - g = lr0_goto(C[state],N)
4317 - for p in g:
4318 - if p.lr_index < p.len - 1:
4319 - a = p.prod[p.lr_index+1]
4320 - if Terminals.has_key(a):
4321 - if a not in terms: terms.append(a)
4323 - # This extra bit is to handle the start state
4324 - if state == 0 and N == Productions[0].prod[0]:
4325 - terms.append('$end')
4327 - return terms
4329 -# -----------------------------------------------------------------------------
4330 -# reads_relation()
4332 -# Computes the READS() relation (p,A) READS (t,C).
4333 -# -----------------------------------------------------------------------------
4335 -def reads_relation(C, trans, empty):
4336 - # Look for empty transitions
4337 - rel = []
4338 - state, N = trans
4340 - g = lr0_goto(C[state],N)
4341 - j = _lr0_cidhash.get(id(g),-1)
4342 - for p in g:
4343 - if p.lr_index < p.len - 1:
4344 - a = p.prod[p.lr_index + 1]
4345 - if empty.has_key(a):
4346 - rel.append((j,a))
4348 - return rel
4350 -# -----------------------------------------------------------------------------
4351 -# compute_lookback_includes()
4353 -# Determines the lookback and includes relations
4355 -# LOOKBACK:
4357 -# This relation is determined by running the LR(0) state machine forward.
4358 -# For example, starting with a production "N : . A B C", we run it forward
4359 -# to obtain "N : A B C ." We then build a relationship between this final
4360 -# state and the starting state. These relationships are stored in a dictionary
4361 -# lookdict.
4363 -# INCLUDES:
4365 -# Computes the INCLUDE() relation (p,A) INCLUDES (p',B).
4367 -# This relation is used to determine non-terminal transitions that occur
4368 -# inside of other non-terminal transition states. (p,A) INCLUDES (p', B)
4369 -# if the following holds:
4371 -# B -> LAT, where T -> epsilon and p' -L-> p
4373 -# L is essentially a prefix (which may be empty), T is a suffix that must be
4374 -# able to derive an empty string. State p' must lead to state p with the string L.
4376 -# -----------------------------------------------------------------------------
4378 -def compute_lookback_includes(C,trans,nullable):
4380 - lookdict = {} # Dictionary of lookback relations
4381 - includedict = {} # Dictionary of include relations
4383 - # Make a dictionary of non-terminal transitions
4384 - dtrans = {}
4385 - for t in trans:
4386 - dtrans[t] = 1
4388 - # Loop over all transitions and compute lookbacks and includes
4389 - for state,N in trans:
4390 - lookb = []
4391 - includes = []
4392 - for p in C[state]:
4393 - if p.name != N: continue
4395 - # Okay, we have a name match. We now follow the production all the way
4396 - # through the state machine until we get the . on the right hand side
4398 - lr_index = p.lr_index
4399 - j = state
4400 - while lr_index < p.len - 1:
4401 - lr_index = lr_index + 1
4402 - t = p.prod[lr_index]
4404 - # Check to see if this symbol and state are a non-terminal transition
4405 - if dtrans.has_key((j,t)):
4406 - # Yes. Okay, there is some chance that this is an includes relation
4407 - # the only way to know for certain is whether the rest of the
4408 - # production derives empty
4410 - li = lr_index + 1
4411 - while li < p.len:
4412 - if Terminals.has_key(p.prod[li]): break # No forget it
4413 - if not nullable.has_key(p.prod[li]): break
4414 - li = li + 1
4415 - else:
4416 - # Appears to be a relation between (j,t) and (state,N)
4417 - includes.append((j,t))
4419 - g = lr0_goto(C[j],t) # Go to next set
4420 - j = _lr0_cidhash.get(id(g),-1) # Go to next state
4422 - # When we get here, j is the final state, now we have to locate the production
4423 - for r in C[j]:
4424 - if r.name != p.name: continue
4425 - if r.len != p.len: continue
4426 - i = 0
4427 - # This look is comparing a production ". A B C" with "A B C ."
4428 - while i < r.lr_index:
4429 - if r.prod[i] != p.prod[i+1]: break
4430 - i = i + 1
4431 - else:
4432 - lookb.append((j,r))
4433 - for i in includes:
4434 - if not includedict.has_key(i): includedict[i] = []
4435 - includedict[i].append((state,N))
4436 - lookdict[(state,N)] = lookb
4438 - return lookdict,includedict
4440 # -----------------------------------------------------------------------------
4441 # digraph()
4442 # traverse()
4444 # The following two functions are used to compute set valued functions
4445 # of the form:
4447 # F(x) = F'(x) U U{F(y) | x R y}
4448 @@ -2176,720 +1914,1363 @@ def traverse(x,N,stack,F,X,R,FP):
4449 rel = R(x) # Get y's related to x
4450 for y in rel:
4451 if N[y] == 0:
4452 traverse(y,N,stack,F,X,R,FP)
4453 N[x] = min(N[x],N[y])
4454 for a in F.get(y,[]):
4455 if a not in F[x]: F[x].append(a)
4456 if N[x] == d:
4457 - N[stack[-1]] = sys.maxint
4458 + N[stack[-1]] = MAXINT
4459 F[stack[-1]] = F[x]
4460 element = stack.pop()
4461 while element != x:
4462 - N[stack[-1]] = sys.maxint
4463 + N[stack[-1]] = MAXINT
4464 F[stack[-1]] = F[x]
4465 element = stack.pop()
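A toy illustration of that set-valued fixpoint, assuming PLY 3.x, where
digraph(X, R, FP) is a module-level helper in ply.yacc (X: nodes, R: the
relation, FP: the base function F'):

    from ply.yacc import digraph    # assumes a PLY 3.x installation

    X  = ['a', 'b']
    R  = lambda x: ['b'] if x == 'a' else []   # the only relation is a R b
    FP = lambda x: [x.upper()]                 # F'(a) = ['A'], F'(b) = ['B']
    print(digraph(X, R, FP))                   # {'a': ['A', 'B'], 'b': ['B']}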
4467 +class LALRError(YaccError): pass
4469 # -----------------------------------------------------------------------------
4470 -# compute_read_sets()
4471 +# == LRGeneratedTable ==
4473 -# Given a set of LR(0) items, this function computes the read sets.
4475 -# Inputs: C = Set of LR(0) items
4476 -# ntrans = Set of nonterminal transitions
4477 -# nullable = Set of empty transitions
4479 -# Returns a set containing the read sets
4480 +# This class implements the LR table generation algorithm. There are no
4481 +# public methods except for write().
4482 # -----------------------------------------------------------------------------
4484 -def compute_read_sets(C, ntrans, nullable):
4485 - FP = lambda x: dr_relation(C,x,nullable)
4486 - R = lambda x: reads_relation(C,x,nullable)
4487 - F = digraph(ntrans,R,FP)
4488 - return F
4490 -# -----------------------------------------------------------------------------
4491 -# compute_follow_sets()
4493 -# Given a set of LR(0) items, a set of non-terminal transitions, a readset,
4494 -# and an include set, this function computes the follow sets
4496 -# Follow(p,A) = Read(p,A) U U {Follow(p',B) | (p,A) INCLUDES (p',B)}
4498 -# Inputs:
4499 -# ntrans = Set of nonterminal transitions
4500 -# readsets = Readset (previously computed)
4501 -# inclsets = Include sets (previously computed)
4503 -# Returns a set containing the follow sets
4504 -# -----------------------------------------------------------------------------
4506 -def compute_follow_sets(ntrans,readsets,inclsets):
4507 - FP = lambda x: readsets[x]
4508 - R = lambda x: inclsets.get(x,[])
4509 - F = digraph(ntrans,R,FP)
4510 - return F
4512 -# -----------------------------------------------------------------------------
4513 -# add_lookaheads()
4515 -# Attaches the lookahead symbols to grammar rules.
4517 -# Inputs: lookbacks - Set of lookback relations
4518 -# followset - Computed follow set
4520 -# This function directly attaches the lookaheads to productions contained
4521 -# in the lookbacks set
4522 -# -----------------------------------------------------------------------------
4524 -def add_lookaheads(lookbacks,followset):
4525 - for trans,lb in lookbacks.items():
4526 - # Loop over productions in lookback
4527 - for state,p in lb:
4528 - if not p.lookaheads.has_key(state):
4529 - p.lookaheads[state] = []
4530 - f = followset.get(trans,[])
4531 - for a in f:
4532 - if a not in p.lookaheads[state]: p.lookaheads[state].append(a)
4534 -# -----------------------------------------------------------------------------
4535 -# add_lalr_lookaheads()
4537 -# This function does all of the work of adding lookahead information for use
4538 -# with LALR parsing
4539 -# -----------------------------------------------------------------------------
4541 -def add_lalr_lookaheads(C):
4542 - # Determine all of the nullable nonterminals
4543 - nullable = compute_nullable_nonterminals()
4545 - # Find all non-terminal transitions
4546 - trans = find_nonterminal_transitions(C)
4548 - # Compute read sets
4549 - readsets = compute_read_sets(C,trans,nullable)
4551 - # Compute lookback/includes relations
4552 - lookd, included = compute_lookback_includes(C,trans,nullable)
4554 - # Compute LALR FOLLOW sets
4555 - followsets = compute_follow_sets(trans,readsets,included)
4557 - # Add all of the lookaheads
4558 - add_lookaheads(lookd,followsets)
4560 -# -----------------------------------------------------------------------------
4561 -# lr_parse_table()
4563 -# This function constructs the parse tables for SLR or LALR
4564 -# -----------------------------------------------------------------------------
4565 -def lr_parse_table(method):
4566 - global _lr_method
4567 - goto = _lr_goto # Goto array
4568 - action = _lr_action # Action array
4569 - actionp = { } # Action production array (temporary)
4571 - _lr_method = method
4573 - n_srconflict = 0
4574 - n_rrconflict = 0
4576 - if yaccdebug:
4577 - sys.stderr.write("yacc: Generating %s parsing table...\n" % method)
4578 - _vf.write("\n\nParsing method: %s\n\n" % method)
4580 - # Step 1: Construct C = { I0, I1, ... IN}, collection of LR(0) items
4581 - # This determines the number of states
4583 - C = lr0_items()
4585 - if method == 'LALR':
4586 - add_lalr_lookaheads(C)
4589 - # Build the parser table, state by state
4590 - st = 0
4591 - for I in C:
4592 - # Loop over each production in I
4593 - actlist = [ ] # List of actions
4594 - st_action = { }
4595 - st_actionp = { }
4596 - st_goto = { }
4597 - if yaccdebug:
4598 - _vf.write("\nstate %d\n\n" % st)
4599 +class LRGeneratedTable(LRTable):
4600 + def __init__(self,grammar,method='LALR',log=None):
4601 + if method not in ['SLR','LALR']:
4602 + raise LALRError("Unsupported method %s" % method)
4604 + self.grammar = grammar
4605 + self.lr_method = method
4607 + # Set up the logger
4608 + if not log:
4609 + log = NullLogger()
4610 + self.log = log
4612 + # Internal attributes
4613 + self.lr_action = {} # Action table
4614 + self.lr_goto = {} # Goto table
4615 + self.lr_productions = grammar.Productions # Copy of grammar Production array
4616 + self.lr_goto_cache = {} # Cache of computed gotos
4617 + self.lr0_cidhash = {} # Cache of closures
4619 + self._add_count = 0 # Internal counter used to detect cycles
4621 +        # Diagnostic information filled in by the table generator
4622 + self.sr_conflict = 0
4623 + self.rr_conflict = 0
4624 + self.conflicts = [] # List of conflicts
4626 + self.sr_conflicts = []
4627 + self.rr_conflicts = []
4629 + # Build the tables
4630 + self.grammar.build_lritems()
4631 + self.grammar.compute_first()
4632 + self.grammar.compute_follow()
4633 + self.lr_parse_table()
4635 + # Compute the LR(0) closure operation on I, where I is a set of LR(0) items.
4637 + def lr0_closure(self,I):
4638 + self._add_count += 1
4640 + # Add everything in I to J
4641 + J = I[:]
4642 + didadd = 1
4643 + while didadd:
4644 + didadd = 0
4645 + for j in J:
4646 + for x in j.lr_after:
4647 + if getattr(x,"lr0_added",0) == self._add_count: continue
4648 + # Add B --> .G to J
4649 + J.append(x.lr_next)
4650 + x.lr0_added = self._add_count
4651 + didadd = 1
4653 + return J
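The same worklist idea, sketched standalone on a toy grammar with (lhs, rhs, dot) tuples in place of PLY's LRItem objects (all names below are illustrative, not part of PLY):

    TOY = {
        "S": [("E",)],
        "E": [("E", "+", "T"), ("T",)],
        "T": [("num",)],
    }

    def toy_closure(items):
        J = list(items)
        added = True
        while added:
            added = False
            for (lhs, rhs, dot) in list(J):
                if dot < len(rhs) and rhs[dot] in TOY:   # dot before a nonterminal
                    for prod in TOY[rhs[dot]]:
                        item = (rhs[dot], prod, 0)       # add B -> . gamma
                        if item not in J:
                            J.append(item)
                            added = True
        return J

    # closure of { S -> . E } also pulls in E -> . E + T, E -> . T, T -> . num
    toy_closure([("S", ("E",), 0)])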
4655 + # Compute the LR(0) goto function goto(I,X) where I is a set
4656 + # of LR(0) items and X is a grammar symbol. This function is written
4657 + # in a way that guarantees uniqueness of the generated goto sets
4658 + # (i.e. the same goto set will never be returned as two different Python
4659 + # objects). With uniqueness, we can later do fast set comparisons using
4660 + # id(obj) instead of element-wise comparison.
4662 + def lr0_goto(self,I,x):
4663 + # First we look for a previously cached entry
4664 + g = self.lr_goto_cache.get((id(I),x),None)
4665 + if g: return g
4667 + # Now we generate the goto set in a way that guarantees uniqueness
4668 + # of the result
4670 + s = self.lr_goto_cache.get(x,None)
4671 + if not s:
4672 + s = { }
4673 + self.lr_goto_cache[x] = s
4675 + gs = [ ]
4676 + for p in I:
4677 + n = p.lr_next
4678 + if n and n.lr_before == x:
4679 + s1 = s.get(id(n),None)
4680 + if not s1:
4681 + s1 = { }
4682 + s[id(n)] = s1
4683 + gs.append(n)
4684 + s = s1
4685 + g = s.get('$end',None)
4686 + if not g:
4687 + if gs:
4688 + g = self.lr0_closure(gs)
4689 + s['$end'] = g
4690 + else:
4691 + s['$end'] = gs
4692 + self.lr_goto_cache[(id(I),x)] = g
4693 + return g
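A design point worth noting: lr0_items() below keys states by id(), so lr0_goto() must hand back the identical list object whenever it is asked for the same goto set. The real code also unifies sets reached from different item lists via the trie keyed on item identities; a much-simplified sketch of just the memoization guarantee (placeholder names throughout):

    cache = {}
    def goto_like(I, x, compute):
        key = (id(I), x)
        if key not in cache:
            cache[key] = compute(I, x)
        return cache[key]

    I = ['item1', 'item2']                        # stand-in for a state
    g1 = goto_like(I, 'PLUS', lambda I, x: [x])   # placeholder computation
    g2 = goto_like(I, 'PLUS', lambda I, x: [x])
    assert g1 is g2                               # same object, so id()-based hashing works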
4695 + # Compute the LR(0) sets of item function
4696 + def lr0_items(self):
4698 + C = [ self.lr0_closure([self.grammar.Productions[0].lr_next]) ]
4699 + i = 0
4700 + for I in C:
4701 + self.lr0_cidhash[id(I)] = i
4702 + i += 1
4704 + # Loop over the items in C and each grammar symbol
4705 + i = 0
4706 + while i < len(C):
4707 + I = C[i]
4708 + i += 1
4710 + # Collect all of the symbols that could possibly be in the goto(I,X) sets
4711 + asyms = { }
4712 + for ii in I:
4713 + for s in ii.usyms:
4714 + asyms[s] = None
4716 + for x in asyms:
4717 + g = self.lr0_goto(I,x)
4718 + if not g: continue
4719 + if id(g) in self.lr0_cidhash: continue
4720 + self.lr0_cidhash[id(g)] = len(C)
4721 + C.append(g)
4723 + return C
4725 + # -----------------------------------------------------------------------------
4726 + # ==== LALR(1) Parsing ====
4728 + # LALR(1) parsing is almost exactly the same as SLR except that instead of
4729 + # relying upon Follow() sets when performing reductions, a more selective
4730 + # lookahead set that incorporates the state of the LR(0) machine is utilized.
4731 + # Thus, we mainly just have to focus on calculating the lookahead sets.
4733 + # The method used here is due to DeRemer and Pennello (1982).
4735 + # DeRemer, F. L., and T. J. Pennello: "Efficient Computation of LALR(1)
4736 + # Lookahead Sets", ACM Transactions on Programming Languages and Systems,
4737 + # Vol. 4, No. 4, Oct. 1982, pp. 615-649
4739 + # Further details can also be found in:
4741 + # J. Tremblay and P. Sorenson, "The Theory and Practice of Compiler Writing",
4742 + # McGraw-Hill Book Company, (1985).
4744 + # -----------------------------------------------------------------------------
4746 + # -----------------------------------------------------------------------------
4747 + # compute_nullable_nonterminals()
4749 + # Creates a dictionary containing all of the non-terminals that might produce
4750 + # an empty production.
4751 + # -----------------------------------------------------------------------------
4753 + def compute_nullable_nonterminals(self):
4754 + nullable = {}
4755 + num_nullable = 0
4756 + while 1:
4757 + for p in self.grammar.Productions[1:]:
4758 + if p.len == 0:
4759 + nullable[p.name] = 1
4760 + continue
4761 + for t in p.prod:
4762 + if not t in nullable: break
4763 + else:
4764 + nullable[p.name] = 1
4765 + if len(nullable) == num_nullable: break
4766 + num_nullable = len(nullable)
4767 + return nullable
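The fixed point above, restated as a self-contained sketch on a toy rule list (illustrative names only): a nonterminal is nullable if some production for it is empty or consists entirely of nullable nonterminals.

    RULES = [("A", ()), ("B", ("A", "A")), ("C", ("B", "x"))]

    def toy_nullable(rules):
        nullable = set()
        changed = True
        while changed:                # iterate until no new names appear
            changed = False
            for lhs, rhs in rules:
                if lhs not in nullable and all(s in nullable for s in rhs):
                    nullable.add(lhs)
                    changed = True
        return nullable

    toy_nullable(RULES)   # {'A', 'B'} -- 'C' needs the terminal 'x'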
4769 + # -----------------------------------------------------------------------------
4770 + # find_nonterminal_trans(C)
4772 + # Given a set of LR(0) items, this function finds all of the non-terminal
4773 + # transitions. These are transitions in which a dot appears immediately before
4774 + # a non-terminal. Returns a list of tuples of the form (state,N) where state
4775 + # is the state number and N is the nonterminal symbol.
4777 + # The input C is the set of LR(0) items.
4778 + # -----------------------------------------------------------------------------
4780 + def find_nonterminal_transitions(self,C):
4781 + trans = []
4782 + for state in range(len(C)):
4783 + for p in C[state]:
4784 + if p.lr_index < p.len - 1:
4785 + t = (state,p.prod[p.lr_index+1])
4786 + if t[1] in self.grammar.Nonterminals:
4787 + if t not in trans: trans.append(t)
4788 + state = state + 1
4789 + return trans
4791 + # -----------------------------------------------------------------------------
4792 + # dr_relation()
4794 + # Computes the DR(p,A) relationships for non-terminal transitions. The input
4795 + # is a tuple (state,N) where state is a number and N is a nonterminal symbol.
4797 + # Returns a list of terminals.
4798 + # -----------------------------------------------------------------------------
4800 + def dr_relation(self,C,trans,nullable):
4801 + dr_set = { }
4802 + state,N = trans
4803 + terms = []
4805 + g = self.lr0_goto(C[state],N)
4806 + for p in g:
4807 + if p.lr_index < p.len - 1:
4808 + a = p.prod[p.lr_index+1]
4809 + if a in self.grammar.Terminals:
4810 + if a not in terms: terms.append(a)
4812 + # This extra bit is to handle the start state
4813 + if state == 0 and N == self.grammar.Productions[0].prod[0]:
4814 + terms.append('$end')
4816 + return terms
4818 + # -----------------------------------------------------------------------------
4819 + # reads_relation()
4821 + # Computes the READS() relation (p,A) READS (t,C).
4822 + # -----------------------------------------------------------------------------
4824 + def reads_relation(self,C, trans, empty):
4825 + # Look for empty transitions
4826 + rel = []
4827 + state, N = trans
4829 + g = self.lr0_goto(C[state],N)
4830 + j = self.lr0_cidhash.get(id(g),-1)
4831 + for p in g:
4832 + if p.lr_index < p.len - 1:
4833 + a = p.prod[p.lr_index + 1]
4834 + if a in empty:
4835 + rel.append((j,a))
4837 + return rel
4839 + # -----------------------------------------------------------------------------
4840 + # compute_lookback_includes()
4842 + # Determines the lookback and includes relations
4844 + # LOOKBACK:
4846 + # This relation is determined by running the LR(0) state machine forward.
4847 + # For example, starting with a production "N : . A B C", we run it forward
4848 + # to obtain "N : A B C ." We then build a relationship between this final
4849 + # state and the starting state. These relationships are stored in a dictionary
4850 + # lookdict.
4852 + # INCLUDES:
4854 + # Computes the INCLUDE() relation (p,A) INCLUDES (p',B).
4856 + # This relation is used to determine non-terminal transitions that occur
4857 + # inside of other non-terminal transition states. (p,A) INCLUDES (p', B)
4858 + # if the following holds:
4860 + # B -> LAT, where T -> epsilon and p' -L-> p
4862 + # L is essentially a prefix (which may be empty), T is a suffix that must be
4863 + # able to derive an empty string. State p' must lead to state p with the string L.
4865 + # -----------------------------------------------------------------------------
4867 + def compute_lookback_includes(self,C,trans,nullable):
4869 + lookdict = {} # Dictionary of lookback relations
4870 + includedict = {} # Dictionary of include relations
4872 + # Make a dictionary of non-terminal transitions
4873 + dtrans = {}
4874 + for t in trans:
4875 + dtrans[t] = 1
4877 + # Loop over all transitions and compute lookbacks and includes
4878 + for state,N in trans:
4879 + lookb = []
4880 + includes = []
4881 + for p in C[state]:
4882 + if p.name != N: continue
4884 + # Okay, we have a name match. We now follow the production all the way
4885 + # through the state machine until we get the . on the right hand side
4887 + lr_index = p.lr_index
4888 + j = state
4889 + while lr_index < p.len - 1:
4890 + lr_index = lr_index + 1
4891 + t = p.prod[lr_index]
4893 + # Check to see if this symbol and state are a non-terminal transition
4894 + if (j,t) in dtrans:
4895 + # Yes. Okay, there is some chance that this is an includes relation;
4896 + # the only way to know for certain is whether the rest of the
4897 + # production derives empty
4899 + li = lr_index + 1
4900 + while li < p.len:
4901 + if p.prod[li] in self.grammar.Terminals: break # No, forget it
4902 + if not p.prod[li] in nullable: break
4903 + li = li + 1
4904 + else:
4905 + # Appears to be a relation between (j,t) and (state,N)
4906 + includes.append((j,t))
4908 + g = self.lr0_goto(C[j],t) # Go to next set
4909 + j = self.lr0_cidhash.get(id(g),-1) # Go to next state
4911 + # When we get here, j is the final state; now we have to locate the production
4912 + for r in C[j]:
4913 + if r.name != p.name: continue
4914 + if r.len != p.len: continue
4915 + i = 0
4916 + # This loop is comparing a production ". A B C" with "A B C ."
4917 + while i < r.lr_index:
4918 + if r.prod[i] != p.prod[i+1]: break
4919 + i = i + 1
4920 + else:
4921 + lookb.append((j,r))
4922 + for i in includes:
4923 + if not i in includedict: includedict[i] = []
4924 + includedict[i].append((state,N))
4925 + lookdict[(state,N)] = lookb
4927 + return lookdict,includedict
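A concrete instance of the INCLUDES shape above, under an assumed toy grammar (not taken from the patch):

    # B -> L A T with T nullable:
    #   stmt     : IF expr THEN stmt opt_else    # B = stmt, A = the inner stmt
    #   opt_else : ELSE stmt
    #            |                               # empty, so opt_else is nullable
    #
    # Here L = "IF expr THEN" and T = "opt_else". If state p' contains
    # "stmt : . IF expr THEN stmt opt_else" and reading L leads to state p,
    # then (p, stmt) INCLUDES (p', stmt): anything that can follow the outer
    # stmt can also follow the inner one, so Follow(p', stmt) flows into
    # Follow(p, stmt).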
4929 + # -----------------------------------------------------------------------------
4930 + # compute_read_sets()
4932 + # Given a set of LR(0) items, this function computes the read sets.
4934 + # Inputs: C = Set of LR(0) items
4935 + # ntrans = Set of nonterminal transitions
4936 + # nullable = Set of empty transitions
4938 + # Returns a set containing the read sets
4939 + # -----------------------------------------------------------------------------
4941 + def compute_read_sets(self,C, ntrans, nullable):
4942 + FP = lambda x: self.dr_relation(C,x,nullable)
4943 + R = lambda x: self.reads_relation(C,x,nullable)
4944 + F = digraph(ntrans,R,FP)
4945 + return F
4947 + # -----------------------------------------------------------------------------
4948 + # compute_follow_sets()
4950 + # Given a set of LR(0) items, a set of non-terminal transitions, a readset,
4951 + # and an include set, this function computes the follow sets
4953 + # Follow(p,A) = Read(p,A) U U {Follow(p',B) | (p,A) INCLUDES (p',B)}
4955 + # Inputs:
4956 + # ntrans = Set of nonterminal transitions
4957 + # readsets = Readset (previously computed)
4958 + # inclsets = Include sets (previously computed)
4960 + # Returns a set containing the follow sets
4961 + # -----------------------------------------------------------------------------
4963 + def compute_follow_sets(self,ntrans,readsets,inclsets):
4964 + FP = lambda x: readsets[x]
4965 + R = lambda x: inclsets.get(x,[])
4966 + F = digraph(ntrans,R,FP)
4967 + return F
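Both computations delegate to digraph(), defined earlier in this file. A sketch of the DeRemer-Pennello digraph algorithm it implements: seed each node with FP, union along R, and collapse strongly connected components so every member of a cycle ends up with the same set (FP is assumed to return a fresh list per node):

    def toy_digraph(X, R, FP):
        N = dict.fromkeys(X, 0)               # 0 = unvisited
        stack, F = [], {}
        def traverse(x):
            stack.append(x)
            d = len(stack)
            N[x] = d
            F[x] = FP(x)                      # seed with the direct set
            for y in R(x):
                if N[y] == 0:
                    traverse(y)
                N[x] = min(N[x], N[y])
                for a in F.get(y, []):
                    if a not in F[x]:
                        F[x].append(a)        # union in reachable sets
            if N[x] == d:                     # x is the root of its SCC
                while True:
                    z = stack.pop()
                    N[z] = float('inf')       # mark finished
                    F[z] = F[x]               # whole SCC shares one set
                    if z == x:
                        break
        for x in X:
            if N[x] == 0:
                traverse(x)
        return F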
4969 + # -----------------------------------------------------------------------------
4970 + # add_lookaheads()
4972 + # Attaches the lookahead symbols to grammar rules.
4974 + # Inputs: lookbacks - Set of lookback relations
4975 + # followset - Computed follow set
4977 + # This function directly attaches the lookaheads to productions contained
4978 + # in the lookbacks set
4979 + # -----------------------------------------------------------------------------
4981 + def add_lookaheads(self,lookbacks,followset):
4982 + for trans,lb in lookbacks.items():
4983 + # Loop over productions in lookback
4984 + for state,p in lb:
4985 + if not state in p.lookaheads:
4986 + p.lookaheads[state] = []
4987 + f = followset.get(trans,[])
4988 + for a in f:
4989 + if a not in p.lookaheads[state]: p.lookaheads[state].append(a)
4991 + # -----------------------------------------------------------------------------
4992 + # add_lalr_lookaheads()
4994 + # This function does all of the work of adding lookahead information for use
4995 + # with LALR parsing
4996 + # -----------------------------------------------------------------------------
4998 + def add_lalr_lookaheads(self,C):
4999 + # Determine all of the nullable nonterminals
5000 + nullable = self.compute_nullable_nonterminals()
5002 + # Find all non-terminal transitions
5003 + trans = self.find_nonterminal_transitions(C)
5005 + # Compute read sets
5006 + readsets = self.compute_read_sets(C,trans,nullable)
5008 + # Compute lookback/includes relations
5009 + lookd, included = self.compute_lookback_includes(C,trans,nullable)
5011 + # Compute LALR FOLLOW sets
5012 + followsets = self.compute_follow_sets(trans,readsets,included)
5014 + # Add all of the lookaheads
5015 + self.add_lookaheads(lookd,followsets)
5017 + # -----------------------------------------------------------------------------
5018 + # lr_parse_table()
5020 + # This function constructs the parse tables for SLR or LALR
5021 + # -----------------------------------------------------------------------------
5022 + def lr_parse_table(self):
5023 + Productions = self.grammar.Productions
5024 + Precedence = self.grammar.Precedence
5025 + goto = self.lr_goto # Goto array
5026 + action = self.lr_action # Action array
5027 + log = self.log # Logger for output
5029 + actionp = { } # Action production array (temporary)
5031 + log.info("Parsing method: %s", self.lr_method)
5033 + # Step 1: Construct C = { I0, I1, ... IN}, collection of LR(0) items
5034 + # This determines the number of states
5036 + C = self.lr0_items()
5038 + if self.lr_method == 'LALR':
5039 + self.add_lalr_lookaheads(C)
5041 + # Build the parser table, state by state
5042 + st = 0
5043 + for I in C:
5044 + # Loop over each production in I
5045 + actlist = [ ] # List of actions
5046 + st_action = { }
5047 + st_actionp = { }
5048 + st_goto = { }
5049 + log.info("")
5050 + log.info("state %d", st)
5051 + log.info("")
5052 for p in I:
5053 - _vf.write(" (%d) %s\n" % (p.number, str(p)))
5054 - _vf.write("\n")
5056 - for p in I:
5057 - try:
5058 - if p.len == p.lr_index + 1:
5059 - if p.name == "S'":
5060 - # Start symbol. Accept!
5061 - st_action["$end"] = 0
5062 - st_actionp["$end"] = p
5063 + log.info(" (%d) %s", p.number, str(p))
5064 + log.info("")
5066 + for p in I:
5067 + if p.len == p.lr_index + 1:
5068 + if p.name == "S'":
5069 + # Start symbol. Accept!
5070 + st_action["$end"] = 0
5071 + st_actionp["$end"] = p
5072 + else:
5073 + # We are at the end of a production. Reduce!
5074 + if self.lr_method == 'LALR':
5075 + laheads = p.lookaheads[st]
5076 + else:
5077 + laheads = self.grammar.Follow[p.name]
5078 + for a in laheads:
5079 + actlist.append((a,p,"reduce using rule %d (%s)" % (p.number,p)))
5080 + r = st_action.get(a,None)
5081 + if r is not None:
5082 + # Whoa. Have a shift/reduce or reduce/reduce conflict
5083 + if r > 0:
5084 + # Need to decide on shift or reduce here
5085 + # By default we favor shifting. Need to add
5086 + # some precedence rules here.
5087 + sprec,slevel = Productions[st_actionp[a].number].prec
5088 + rprec,rlevel = Precedence.get(a,('right',0))
5089 + if (slevel < rlevel) or ((slevel == rlevel) and (rprec == 'left')):
5090 + # We really need to reduce here.
5091 + st_action[a] = -p.number
5092 + st_actionp[a] = p
5093 + if not slevel and not rlevel:
5094 + log.info(" ! shift/reduce conflict for %s resolved as reduce",a)
5095 + self.sr_conflicts.append((st,a,'reduce'))
5096 + Productions[p.number].reduced += 1
5097 + elif (slevel == rlevel) and (rprec == 'nonassoc'):
5098 + st_action[a] = None
5099 + else:
5100 + # Hmmm. Guess we'll keep the shift
5101 + if not rlevel:
5102 + log.info(" ! shift/reduce conflict for %s resolved as shift",a)
5103 + self.sr_conflicts.append((st,a,'shift'))
5104 + elif r < 0:
5105 + # Reduce/reduce conflict. In this case, we favor the rule
5106 + # that was defined first in the grammar file
5107 + oldp = Productions[-r]
5108 + pp = Productions[p.number]
5109 + if oldp.line > pp.line:
5110 + st_action[a] = -p.number
5111 + st_actionp[a] = p
5112 + chosenp,rejectp = pp,oldp
5113 + Productions[p.number].reduced += 1
5114 + Productions[oldp.number].reduced -= 1
5115 + else:
5116 + chosenp,rejectp = oldp,pp
5117 + self.rr_conflicts.append((st,chosenp,rejectp))
5118 + log.info(" ! reduce/reduce conflict for %s resolved using rule %d (%s)", a,st_actionp[a].number, st_actionp[a])
5119 + else:
5120 + raise LALRError("Unknown conflict in state %d" % st)
5121 + else:
5122 + st_action[a] = -p.number
5123 + st_actionp[a] = p
5124 + Productions[p.number].reduced += 1
5125 else:
5126 - # We are at the end of a production. Reduce!
5127 - if method == 'LALR':
5128 - laheads = p.lookaheads[st]
5129 - else:
5130 - laheads = Follow[p.name]
5131 - for a in laheads:
5132 - actlist.append((a,p,"reduce using rule %d (%s)" % (p.number,p)))
5133 - r = st_action.get(a,None)
5134 - if r is not None:
5135 - # Whoa. Have a shift/reduce or reduce/reduce conflict
5136 - if r > 0:
5137 - # Need to decide on shift or reduce here
5138 - # By default we favor shifting. Need to add
5139 - # some precedence rules here.
5140 - sprec,slevel = Productions[st_actionp[a].number].prec
5141 - rprec,rlevel = Precedence.get(a,('right',0))
5142 - if (slevel < rlevel) or ((slevel == rlevel) and (rprec == 'left')):
5143 - # We really need to reduce here.
5144 - st_action[a] = -p.number
5145 - st_actionp[a] = p
5146 - if not slevel and not rlevel:
5147 - _vfc.write("shift/reduce conflict in state %d resolved as reduce.\n" % st)
5148 - _vf.write(" ! shift/reduce conflict for %s resolved as reduce.\n" % a)
5149 - n_srconflict += 1
5150 - elif (slevel == rlevel) and (rprec == 'nonassoc'):
5151 - st_action[a] = None
5152 + i = p.lr_index
5153 + a = p.prod[i+1] # Get symbol right after the "."
5154 + if a in self.grammar.Terminals:
5155 + g = self.lr0_goto(I,a)
5156 + j = self.lr0_cidhash.get(id(g),-1)
5157 + if j >= 0:
5158 + # We are in a shift state
5159 + actlist.append((a,p,"shift and go to state %d" % j))
5160 + r = st_action.get(a,None)
5161 + if r is not None:
5162 + # Whoa have a shift/reduce or shift/shift conflict
5163 + if r > 0:
5164 + if r != j:
5165 + raise LALRError("Shift/shift conflict in state %d" % st)
5166 + elif r < 0:
5167 + # Do a precedence check.
5168 + # - if precedence of reduce rule is higher, we reduce.
5169 + # - if precedence of reduce is same and left assoc, we reduce.
5170 + # - otherwise we shift
5171 + rprec,rlevel = Productions[st_actionp[a].number].prec
5172 + sprec,slevel = Precedence.get(a,('right',0))
5173 + if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):
5174 + # We decide to shift here... highest precedence to shift
5175 + Productions[st_actionp[a].number].reduced -= 1
5176 + st_action[a] = j
5177 + st_actionp[a] = p
5178 + if not rlevel:
5179 + log.info(" ! shift/reduce conflict for %s resolved as shift",a)
5180 + self.sr_conflicts.append((st,a,'shift'))
5181 + elif (slevel == rlevel) and (rprec == 'nonassoc'):
5182 + st_action[a] = None
5183 + else:
5184 + # Hmmm. Guess we'll keep the reduce
5185 + if not slevel and not rlevel:
5186 + log.info(" ! shift/reduce conflict for %s resolved as reduce",a)
5187 + self.sr_conflicts.append((st,a,'reduce'))
5189 else:
5190 - # Hmmm. Guess we'll keep the shift
5191 - if not rlevel:
5192 - _vfc.write("shift/reduce conflict in state %d resolved as shift.\n" % st)
5193 - _vf.write(" ! shift/reduce conflict for %s resolved as shift.\n" % a)
5194 - n_srconflict +=1
5195 - elif r < 0:
5196 - # Reduce/reduce conflict. In this case, we favor the rule
5197 - # that was defined first in the grammar file
5198 - oldp = Productions[-r]
5199 - pp = Productions[p.number]
5200 - if oldp.line > pp.line:
5201 - st_action[a] = -p.number
5202 - st_actionp[a] = p
5203 - # sys.stderr.write("Reduce/reduce conflict in state %d\n" % st)
5204 - n_rrconflict += 1
5205 - _vfc.write("reduce/reduce conflict in state %d resolved using rule %d (%s).\n" % (st, st_actionp[a].number, st_actionp[a]))
5206 - _vf.write(" ! reduce/reduce conflict for %s resolved using rule %d (%s).\n" % (a,st_actionp[a].number, st_actionp[a]))
5207 + raise LALRError("Unknown conflict in state %d" % st)
5208 else:
5209 - sys.stderr.write("Unknown conflict in state %d\n" % st)
5210 - else:
5211 - st_action[a] = -p.number
5212 - st_actionp[a] = p
5213 - else:
5214 - i = p.lr_index
5215 - a = p.prod[i+1] # Get symbol right after the "."
5216 - if Terminals.has_key(a):
5217 - g = lr0_goto(I,a)
5218 - j = _lr0_cidhash.get(id(g),-1)
5219 - if j >= 0:
5220 - # We are in a shift state
5221 - actlist.append((a,p,"shift and go to state %d" % j))
5222 - r = st_action.get(a,None)
5223 - if r is not None:
5224 - # Whoa have a shift/reduce or shift/shift conflict
5225 - if r > 0:
5226 - if r != j:
5227 - sys.stderr.write("Shift/shift conflict in state %d\n" % st)
5228 - elif r < 0:
5229 - # Do a precedence check.
5230 - # - if precedence of reduce rule is higher, we reduce.
5231 - # - if precedence of reduce is same and left assoc, we reduce.
5232 - # - otherwise we shift
5233 - rprec,rlevel = Productions[st_actionp[a].number].prec
5234 - sprec,slevel = Precedence.get(a,('right',0))
5235 - if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):
5236 - # We decide to shift here... highest precedence to shift
5237 - st_action[a] = j
5238 - st_actionp[a] = p
5239 - if not rlevel:
5240 - n_srconflict += 1
5241 - _vfc.write("shift/reduce conflict in state %d resolved as shift.\n" % st)
5242 - _vf.write(" ! shift/reduce conflict for %s resolved as shift.\n" % a)
5243 - elif (slevel == rlevel) and (rprec == 'nonassoc'):
5244 - st_action[a] = None
5245 - else:
5246 - # Hmmm. Guess we'll keep the reduce
5247 - if not slevel and not rlevel:
5248 - n_srconflict +=1
5249 - _vfc.write("shift/reduce conflict in state %d resolved as reduce.\n" % st)
5250 - _vf.write(" ! shift/reduce conflict for %s resolved as reduce.\n" % a)
5252 - else:
5253 - sys.stderr.write("Unknown conflict in state %d\n" % st)
5254 - else:
5255 - st_action[a] = j
5256 - st_actionp[a] = p
5258 - except StandardError,e:
5259 - print sys.exc_info()
5260 - raise YaccError, "Hosed in lr_parse_table"
5262 - # Print the actions associated with each terminal
5263 - if yaccdebug:
5264 - _actprint = { }
5265 - for a,p,m in actlist:
5266 - if st_action.has_key(a):
5267 - if p is st_actionp[a]:
5268 - _vf.write(" %-15s %s\n" % (a,m))
5269 - _actprint[(a,m)] = 1
5270 - _vf.write("\n")
5271 - for a,p,m in actlist:
5272 - if st_action.has_key(a):
5273 - if p is not st_actionp[a]:
5274 - if not _actprint.has_key((a,m)):
5275 - _vf.write(" ! %-15s [ %s ]\n" % (a,m))
5276 + st_action[a] = j
5277 + st_actionp[a] = p
5279 + # Print the actions associated with each terminal
5280 + _actprint = { }
5281 + for a,p,m in actlist:
5282 + if a in st_action:
5283 + if p is st_actionp[a]:
5284 + log.info(" %-15s %s",a,m)
5285 _actprint[(a,m)] = 1
5287 - # Construct the goto table for this state
5288 - if yaccdebug:
5289 - _vf.write("\n")
5290 - nkeys = { }
5291 - for ii in I:
5292 - for s in ii.usyms:
5293 - if Nonterminals.has_key(s):
5294 - nkeys[s] = None
5295 - for n in nkeys.keys():
5296 - g = lr0_goto(I,n)
5297 - j = _lr0_cidhash.get(id(g),-1)
5298 - if j >= 0:
5299 - st_goto[n] = j
5300 - if yaccdebug:
5301 - _vf.write(" %-30s shift and go to state %d\n" % (n,j))
5303 - action[st] = st_action
5304 - actionp[st] = st_actionp
5305 - goto[st] = st_goto
5307 - st += 1
5309 - if yaccdebug:
5310 - if n_srconflict == 1:
5311 - sys.stderr.write("yacc: %d shift/reduce conflict\n" % n_srconflict)
5312 - if n_srconflict > 1:
5313 - sys.stderr.write("yacc: %d shift/reduce conflicts\n" % n_srconflict)
5314 - if n_rrconflict == 1:
5315 - sys.stderr.write("yacc: %d reduce/reduce conflict\n" % n_rrconflict)
5316 - if n_rrconflict > 1:
5317 - sys.stderr.write("yacc: %d reduce/reduce conflicts\n" % n_rrconflict)
5319 -# -----------------------------------------------------------------------------
5320 -# ==== LR Utility functions ====
5321 -# -----------------------------------------------------------------------------
5323 -# -----------------------------------------------------------------------------
5324 -# _lr_write_tables()
5326 -# This function writes the LR parsing tables to a file
5327 -# -----------------------------------------------------------------------------
5329 -def lr_write_tables(modulename=tab_module,outputdir=''):
5330 - if isinstance(modulename, types.ModuleType):
5331 - print >>sys.stderr, "Warning module %s is inconsistent with the grammar (ignored)" % modulename
5332 - return
5334 - basemodulename = modulename.split(".")[-1]
5335 - filename = os.path.join(outputdir,basemodulename) + ".py"
5336 - try:
5337 - f = open(filename,"w")
5339 - f.write("""
5340 + log.info("")
5341 + # Print the actions that were not used. (debugging)
5342 + not_used = 0
5343 + for a,p,m in actlist:
5344 + if a in st_action:
5345 + if p is not st_actionp[a]:
5346 + if not (a,m) in _actprint:
5347 + log.debug(" ! %-15s [ %s ]",a,m)
5348 + not_used = 1
5349 + _actprint[(a,m)] = 1
5350 + if not_used:
5351 + log.debug("")
5353 + # Construct the goto table for this state
5355 + nkeys = { }
5356 + for ii in I:
5357 + for s in ii.usyms:
5358 + if s in self.grammar.Nonterminals:
5359 + nkeys[s] = None
5360 + for n in nkeys:
5361 + g = self.lr0_goto(I,n)
5362 + j = self.lr0_cidhash.get(id(g),-1)
5363 + if j >= 0:
5364 + st_goto[n] = j
5365 + log.info(" %-30s shift and go to state %d",n,j)
5367 + action[st] = st_action
5368 + actionp[st] = st_actionp
5369 + goto[st] = st_goto
5370 + st += 1
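The shift/reduce arbitration in the loop above follows the classic yacc rules. Distilled into a standalone helper (an illustrative restatement, not the patch's code):

    def resolve(token_prec, rule_prec):
        sprec, slevel = token_prec    # (assoc, level) of the lookahead token
        rprec, rlevel = rule_prec     # (assoc, level) of the reducible rule
        if (slevel > rlevel) or (slevel == rlevel and rprec == 'right'):
            return 'shift'
        if slevel == rlevel and rprec == 'nonassoc':
            return 'error'            # PLY records st_action[a] = None
        return 'reduce'

    # With precedence = [('left','+'), ('left','*')], '+' is level 1 and
    # '*' is level 2. For the item "E -> E + E ." :
    resolve(('left', 2), ('left', 1))    # lookahead '*': shift ('*' binds tighter)
    resolve(('left', 1), ('left', 1))    # lookahead '+': reduce (left associative)
    resolve(('right', 0), ('right', 0))  # no precedence: shift (the default)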
5373 + # -----------------------------------------------------------------------------
5374 + # write_table()
5376 + # This function writes the LR parsing tables to a file
5377 + # -----------------------------------------------------------------------------
5379 + def write_table(self,modulename,outputdir='',signature=""):
5380 + basemodulename = modulename.split(".")[-1]
5381 + filename = os.path.join(outputdir,basemodulename) + ".py"
5382 + try:
5383 + f = open(filename,"w")
5385 + f.write("""
5386 # %s
5387 # This file is automatically generated. Do not edit.
5389 -_lr_method = %s
5391 -_lr_signature = %s
5392 -""" % (filename, repr(_lr_method), repr(Signature.digest())))
5394 - # Change smaller to 0 to go back to original tables
5395 - smaller = 1
5397 - # Factor out names to try and make smaller
5398 - if smaller:
5399 - items = { }
5401 - for s,nd in _lr_action.items():
5402 - for name,v in nd.items():
5403 - i = items.get(name)
5404 - if not i:
5405 - i = ([],[])
5406 - items[name] = i
5407 - i[0].append(s)
5408 - i[1].append(v)
5410 - f.write("\n_lr_action_items = {")
5411 - for k,v in items.items():
5412 - f.write("%r:([" % k)
5413 - for i in v[0]:
5414 - f.write("%r," % i)
5415 - f.write("],[")
5416 - for i in v[1]:
5417 - f.write("%r," % i)
5419 - f.write("]),")
5420 - f.write("}\n")
5422 - f.write("""
5423 +_tabversion = %r
5425 +_lr_method = %r
5427 +_lr_signature = %r
5428 + """ % (filename, __tabversion__, self.lr_method, signature))
5430 + # Change smaller to 0 to go back to original tables
5431 + smaller = 1
5433 + # Factor out names to try and make the tables smaller
5434 + if smaller:
5435 + items = { }
5437 + for s,nd in self.lr_action.items():
5438 + for name,v in nd.items():
5439 + i = items.get(name)
5440 + if not i:
5441 + i = ([],[])
5442 + items[name] = i
5443 + i[0].append(s)
5444 + i[1].append(v)
5446 + f.write("\n_lr_action_items = {")
5447 + for k,v in items.items():
5448 + f.write("%r:([" % k)
5449 + for i in v[0]:
5450 + f.write("%r," % i)
5451 + f.write("],[")
5452 + for i in v[1]:
5453 + f.write("%r," % i)
5455 + f.write("]),")
5456 + f.write("}\n")
5458 + f.write("""
5459 _lr_action = { }
5460 for _k, _v in _lr_action_items.items():
5461 for _x,_y in zip(_v[0],_v[1]):
5462 - if not _lr_action.has_key(_x): _lr_action[_x] = { }
5463 + if not _x in _lr_action: _lr_action[_x] = { }
5464 _lr_action[_x][_k] = _y
5465 del _lr_action_items
5466 """)
5468 - else:
5469 - f.write("\n_lr_action = { ");
5470 - for k,v in _lr_action.items():
5471 - f.write("(%r,%r):%r," % (k[0],k[1],v))
5472 - f.write("}\n");
5474 - if smaller:
5475 - # Factor out names to try and make smaller
5476 - items = { }
5478 - for s,nd in _lr_goto.items():
5479 - for name,v in nd.items():
5480 - i = items.get(name)
5481 - if not i:
5482 - i = ([],[])
5483 - items[name] = i
5484 - i[0].append(s)
5485 - i[1].append(v)
5487 - f.write("\n_lr_goto_items = {")
5488 - for k,v in items.items():
5489 - f.write("%r:([" % k)
5490 - for i in v[0]:
5491 - f.write("%r," % i)
5492 - f.write("],[")
5493 - for i in v[1]:
5494 - f.write("%r," % i)
5496 - f.write("]),")
5497 - f.write("}\n")
5499 - f.write("""
5500 + else:
5501 + f.write("\n_lr_action = { ");
5502 + for k,v in self.lr_action.items():
5503 + f.write("(%r,%r):%r," % (k[0],k[1],v))
5504 + f.write("}\n");
5506 + if smaller:
5507 + # Factor out names to try and make the tables smaller
5508 + items = { }
5510 + for s,nd in self.lr_goto.items():
5511 + for name,v in nd.items():
5512 + i = items.get(name)
5513 + if not i:
5514 + i = ([],[])
5515 + items[name] = i
5516 + i[0].append(s)
5517 + i[1].append(v)
5519 + f.write("\n_lr_goto_items = {")
5520 + for k,v in items.items():
5521 + f.write("%r:([" % k)
5522 + for i in v[0]:
5523 + f.write("%r," % i)
5524 + f.write("],[")
5525 + for i in v[1]:
5526 + f.write("%r," % i)
5528 + f.write("]),")
5529 + f.write("}\n")
5531 + f.write("""
5532 _lr_goto = { }
5533 for _k, _v in _lr_goto_items.items():
5534 for _x,_y in zip(_v[0],_v[1]):
5535 - if not _lr_goto.has_key(_x): _lr_goto[_x] = { }
5536 + if not _x in _lr_goto: _lr_goto[_x] = { }
5537 _lr_goto[_x][_k] = _y
5538 del _lr_goto_items
5539 """)
5540 + else:
5541 + f.write("\n_lr_goto = { ");
5542 + for k,v in self.lr_goto.items():
5543 + f.write("(%r,%r):%r," % (k[0],k[1],v))
5544 + f.write("}\n");
5546 + # Write production table
5547 + f.write("_lr_productions = [\n")
5548 + for p in self.lr_productions:
5549 + if p.func:
5550 + f.write(" (%r,%r,%d,%r,%r,%d),\n" % (p.str,p.name, p.len, p.func,p.file,p.line))
5551 + else:
5552 + f.write(" (%r,%r,%d,None,None,None),\n" % (str(p),p.name, p.len))
5553 + f.write("]\n")
5554 + f.close()
5556 + except IOError:
5557 + e = sys.exc_info()[1]
5558 + sys.stderr.write("Unable to create '%s'\n" % filename)
5559 + sys.stderr.write(str(e)+"\n")
5560 + return
5563 + # -----------------------------------------------------------------------------
5564 + # pickle_table()
5566 + # This function pickles the LR parsing tables to a supplied file object
5567 + # -----------------------------------------------------------------------------
5569 + def pickle_table(self,filename,signature=""):
5570 + try:
5571 + import cPickle as pickle
5572 + except ImportError:
5573 + import pickle
5574 + outf = open(filename,"wb")
5575 + pickle.dump(__tabversion__,outf,pickle_protocol)
5576 + pickle.dump(self.lr_method,outf,pickle_protocol)
5577 + pickle.dump(signature,outf,pickle_protocol)
5578 + pickle.dump(self.lr_action,outf,pickle_protocol)
5579 + pickle.dump(self.lr_goto,outf,pickle_protocol)
5581 + outp = []
5582 + for p in self.lr_productions:
5583 + if p.func:
5584 + outp.append((p.str,p.name, p.len, p.func,p.file,p.line))
5585 + else:
5586 + outp.append((str(p),p.name,p.len,None,None,None))
5587 + pickle.dump(outp,outf,pickle_protocol)
5588 + outf.close()
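The matching read side lives in LRTable.read_pickle() earlier in this file. A hedged sketch of it, assuming only the record order written above (read_pickled_tables is a hypothetical name):

    import pickle

    def read_pickled_tables(filename):
        inf = open(filename, 'rb')
        tabversion  = pickle.load(inf)   # checked against __tabversion__
        lr_method   = pickle.load(inf)   # 'LALR' or 'SLR'
        signature   = pickle.load(inf)   # grammar signature for staleness checks
        lr_action   = pickle.load(inf)
        lr_goto     = pickle.load(inf)
        productions = pickle.load(inf)   # (str, name, len, func, file, line) tuples
        inf.close()
        return tabversion, lr_method, signature, lr_action, lr_goto, productions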
5590 +# -----------------------------------------------------------------------------
5591 +# === INTROSPECTION ===
5593 +# The following functions and classes are used to implement the PLY
5594 +# introspection features followed by the yacc() function itself.
5595 +# -----------------------------------------------------------------------------
5597 +# -----------------------------------------------------------------------------
5598 +# get_caller_module_dict()
5600 +# This function returns a dictionary containing all of the symbols defined within
5601 +# a caller further down the call stack. This is used to get the environment
5602 +# associated with the yacc() call if none was provided.
5603 +# -----------------------------------------------------------------------------
5605 +def get_caller_module_dict(levels):
5606 + try:
5607 + raise RuntimeError
5608 + except RuntimeError:
5609 + e,b,t = sys.exc_info()
5610 + f = t.tb_frame
5611 + while levels > 0:
5612 + f = f.f_back
5613 + levels -= 1
5614 + ldict = f.f_globals.copy()
5615 + if f.f_globals != f.f_locals:
5616 + ldict.update(f.f_locals)
5618 + return ldict
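A small usage sketch (demo() is hypothetical): one level walks back to the immediate caller's frame, which is why yacc() below passes 2 to reach past its own frame to the grammar module:

    def demo():
        start = 'expression'               # a local the helper should pick up
        env = get_caller_module_dict(1)    # 1 = demo's own frame
        assert env['start'] == 'expression'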
5620 +# -----------------------------------------------------------------------------
5621 +# parse_grammar()
5623 +# This takes a raw grammar rule string and parses it into production data
5624 +# -----------------------------------------------------------------------------
5625 +def parse_grammar(doc,file,line):
5626 + grammar = []
5627 + # Split the doc string into lines
5628 + pstrings = doc.splitlines()
5629 + lastp = None
5630 + dline = line
5631 + for ps in pstrings:
5632 + dline += 1
5633 + p = ps.split()
5634 + if not p: continue
5635 + try:
5636 + if p[0] == '|':
5637 + # This is a continuation of a previous rule
5638 + if not lastp:
5639 + raise SyntaxError("%s:%d: Misplaced '|'" % (file,dline))
5640 + prodname = lastp
5641 + syms = p[1:]
5642 + else:
5643 + prodname = p[0]
5644 + lastp = prodname
5645 + syms = p[2:]
5646 + assign = p[1]
5647 + if assign != ':' and assign != '::=':
5648 + raise SyntaxError("%s:%d: Syntax error. Expected ':'" % (file,dline))
5650 + grammar.append((file,dline,prodname,syms))
5651 + except SyntaxError:
5652 + raise
5653 + except Exception:
5654 + raise SyntaxError("%s:%d: Syntax error in rule '%s'" % (file,dline,ps.strip()))
5656 + return grammar
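For example, a typical rule docstring yields one tuple per alternative (file and line values are illustrative):

    doc = '''expression : expression PLUS term
                        | term'''
    parse_grammar(doc, "calc.py", 10)
    # [('calc.py', 11, 'expression', ['expression', 'PLUS', 'term']),
    #  ('calc.py', 12, 'expression', ['term'])]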
5658 +# -----------------------------------------------------------------------------
5659 +# ParserReflect()
5661 +# This class represents information extracted for building a parser including
5662 +# start symbol, error function, tokens, precedence list, action functions,
5663 +# etc.
5664 +# -----------------------------------------------------------------------------
5665 +class ParserReflect(object):
5666 + def __init__(self,pdict,log=None):
5667 + self.pdict = pdict
5668 + self.start = None
5669 + self.error_func = None
5670 + self.tokens = None
5671 + self.files = {}
5672 + self.grammar = []
5673 + self.error = 0
5675 + if log is None:
5676 + self.log = PlyLogger(sys.stderr)
5677 else:
5678 - f.write("\n_lr_goto = { ");
5679 - for k,v in _lr_goto.items():
5680 - f.write("(%r,%r):%r," % (k[0],k[1],v))
5681 - f.write("}\n");
5683 - # Write production table
5684 - f.write("_lr_productions = [\n")
5685 - for p in Productions:
5686 - if p:
5687 - if (p.func):
5688 - f.write(" (%r,%d,%r,%r,%d),\n" % (p.name, p.len, p.func.__name__,p.file,p.line))
5689 - else:
5690 - f.write(" (%r,%d,None,None,None),\n" % (p.name, p.len))
5691 + self.log = log
5693 + # Get all of the basic information
5694 + def get_all(self):
5695 + self.get_start()
5696 + self.get_error_func()
5697 + self.get_tokens()
5698 + self.get_precedence()
5699 + self.get_pfunctions()
5701 + # Validate all of the information
5702 + def validate_all(self):
5703 + self.validate_start()
5704 + self.validate_error_func()
5705 + self.validate_tokens()
5706 + self.validate_precedence()
5707 + self.validate_pfunctions()
5708 + self.validate_files()
5709 + return self.error
5711 + # Compute a signature over the grammar
5712 + def signature(self):
5713 + try:
5714 + from hashlib import md5
5715 + except ImportError:
5716 + from md5 import md5
5717 + try:
5718 + sig = md5()
5719 + if self.start:
5720 + sig.update(self.start.encode('latin-1'))
5721 + if self.prec:
5722 + sig.update("".join(["".join(p) for p in self.prec]).encode('latin-1'))
5723 + if self.tokens:
5724 + sig.update(" ".join(self.tokens).encode('latin-1'))
5725 + for f in self.pfuncs:
5726 + if f[3]:
5727 + sig.update(f[3].encode('latin-1'))
5728 + except (TypeError,ValueError):
5729 + pass
5730 + return sig.digest()
5732 + # -----------------------------------------------------------------------------
5733 + # validate_file()
5735 + # This method checks to see if there are duplicated p_rulename() functions
5736 + # in the parser module file. Without this function, it is really easy for
5737 + # users to make mistakes by cutting and pasting code fragments (and it's a real
5738 + # bugger to try and figure out why the resulting parser doesn't work). Therefore,
5739 + # we just do a little regular expression pattern matching of def statements
5740 + # to try and detect duplicates.
5741 + # -----------------------------------------------------------------------------
5743 + def validate_files(self):
5744 + # Match def p_funcname(
5745 + fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
5747 + for filename in self.files.keys():
5748 + base,ext = os.path.splitext(filename)
5749 + if ext != '.py': return 1 # No idea. Assume it's okay.
5751 + try:
5752 + f = open(filename)
5753 + lines = f.readlines()
5754 + f.close()
5755 + except IOError:
5756 + continue
5758 + counthash = { }
5759 + for linen,l in enumerate(lines):
5760 + linen += 1
5761 + m = fre.match(l)
5762 + if m:
5763 + name = m.group(1)
5764 + prev = counthash.get(name)
5765 + if not prev:
5766 + counthash[name] = linen
5767 + else:
5768 + self.log.warning("%s:%d: Function %s redefined. Previously defined on line %d", filename,linen,name,prev)
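The same check can be exercised in isolation (input lines are illustrative):

    import re
    fre = re.compile(r'\s*def\s+(p_[a-zA-Z_0-9]*)\(')
    src = ["def p_expr(p):", "    pass", "def p_expr(p):", "    pass"]
    seen = {}
    for linen, l in enumerate(src, 1):
        m = fre.match(l)
        if m:
            name = m.group(1)
            if name in seen:
                print("line %d: %s redefined (first defined on line %d)"
                      % (linen, name, seen[name]))
            else:
                seen[name] = linen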
5770 + # Get the start symbol
5771 + def get_start(self):
5772 + self.start = self.pdict.get('start')
5774 + # Validate the start symbol
5775 + def validate_start(self):
5776 + if self.start is not None:
5777 + if not isinstance(self.start,str):
5778 + self.log.error("'start' must be a string")
5780 + # Look for error handler
5781 + def get_error_func(self):
5782 + self.error_func = self.pdict.get('p_error')
5784 + # Validate the error function
5785 + def validate_error_func(self):
5786 + if self.error_func:
5787 + if isinstance(self.error_func,types.FunctionType):
5788 + ismethod = 0
5789 + elif isinstance(self.error_func, types.MethodType):
5790 + ismethod = 1
5791 else:
5792 - f.write(" None,\n")
5793 - f.write("]\n")
5795 - f.close()
5797 - except IOError,e:
5798 - print >>sys.stderr, "Unable to create '%s'" % filename
5799 - print >>sys.stderr, e
5800 - return
5802 -def lr_read_tables(module=tab_module,optimize=0):
5803 - global _lr_action, _lr_goto, _lr_productions, _lr_method
5804 - try:
5805 - if isinstance(module,types.ModuleType):
5806 - parsetab = module
5807 - else:
5808 - exec "import %s as parsetab" % module
5810 - if (optimize) or (Signature.digest() == parsetab._lr_signature):
5811 - _lr_action = parsetab._lr_action
5812 - _lr_goto = parsetab._lr_goto
5813 - _lr_productions = parsetab._lr_productions
5814 - _lr_method = parsetab._lr_method
5815 - return 1
5816 - else:
5817 - return 0
5819 - except (ImportError,AttributeError):
5820 - return 0
5822 + self.log.error("'p_error' defined, but is not a function or method")
5823 + self.error = 1
5824 + return
5826 + eline = func_code(self.error_func).co_firstlineno
5827 + efile = func_code(self.error_func).co_filename
5828 + self.files[efile] = 1
5830 + if (func_code(self.error_func).co_argcount != 1+ismethod):
5831 + self.log.error("%s:%d: p_error() requires 1 argument",efile,eline)
5832 + self.error = 1
5834 + # Get the tokens map
5835 + def get_tokens(self):
5836 + tokens = self.pdict.get("tokens",None)
5837 + if not tokens:
5838 + self.log.error("No token list is defined")
5839 + self.error = 1
5840 + return
5842 + if not isinstance(tokens,(list, tuple)):
5843 + self.log.error("tokens must be a list or tuple")
5844 + self.error = 1
5845 + return
5847 + if not tokens:
5848 + self.log.error("tokens is empty")
5849 + self.error = 1
5850 + return
5852 + self.tokens = tokens
5854 + # Validate the tokens
5855 + def validate_tokens(self):
5856 + # Validate the tokens.
5857 + if 'error' in self.tokens:
5858 + self.log.error("Illegal token name 'error'. Is a reserved word")
5859 + self.error = 1
5860 + return
5862 + terminals = {}
5863 + for n in self.tokens:
5864 + if n in terminals:
5865 + self.log.warning("Token '%s' multiply defined", n)
5866 + terminals[n] = 1
5868 + # Get the precedence map (if any)
5869 + def get_precedence(self):
5870 + self.prec = self.pdict.get("precedence",None)
5872 + # Validate and parse the precedence map
5873 + def validate_precedence(self):
5874 + preclist = []
5875 + if self.prec:
5876 + if not isinstance(self.prec,(list,tuple)):
5877 + self.log.error("precedence must be a list or tuple")
5878 + self.error = 1
5879 + return
5880 + for level,p in enumerate(self.prec):
5881 + if not isinstance(p,(list,tuple)):
5882 + self.log.error("Bad precedence table")
5883 + self.error = 1
5884 + return
5886 + if len(p) < 2:
5887 + self.log.error("Malformed precedence entry %s. Must be (assoc, term, ..., term)",p)
5888 + self.error = 1
5889 + return
5890 + assoc = p[0]
5891 + if not isinstance(assoc,str):
5892 + self.log.error("precedence associativity must be a string")
5893 + self.error = 1
5894 + return
5895 + for term in p[1:]:
5896 + if not isinstance(term,str):
5897 + self.log.error("precedence items must be strings")
5898 + self.error = 1
5899 + return
5900 + preclist.append((term,assoc,level+1))
5901 + self.preclist = preclist
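For instance, a typical precedence table flattens as follows (levels start at 1; a higher level binds tighter):

    precedence = (
        ('left',  'PLUS', 'MINUS'),
        ('left',  'TIMES', 'DIVIDE'),
        ('right', 'UMINUS'),
    )
    # preclist becomes:
    # [('PLUS','left',1), ('MINUS','left',1),
    #  ('TIMES','left',2), ('DIVIDE','left',2),
    #  ('UMINUS','right',3)]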
5903 + # Get all p_functions from the grammar
5904 + def get_pfunctions(self):
5905 + p_functions = []
5906 + for name, item in self.pdict.items():
5907 + if name[:2] != 'p_': continue
5908 + if name == 'p_error': continue
5909 + if isinstance(item,(types.FunctionType,types.MethodType)):
5910 + line = func_code(item).co_firstlineno
5911 + file = func_code(item).co_filename
5912 + p_functions.append((line,file,name,item.__doc__))
5914 + # Sort all of the actions by line number
5915 + p_functions.sort()
5916 + self.pfuncs = p_functions
5919 + # Validate all of the p_functions
5920 + def validate_pfunctions(self):
5921 + grammar = []
5922 + # Check for non-empty symbols
5923 + if len(self.pfuncs) == 0:
5924 + self.log.error("no rules of the form p_rulename are defined")
5925 + self.error = 1
5926 + return
5928 + for line, file, name, doc in self.pfuncs:
5929 + func = self.pdict[name]
5930 + if isinstance(func, types.MethodType):
5931 + reqargs = 2
5932 + else:
5933 + reqargs = 1
5934 + if func_code(func).co_argcount > reqargs:
5935 + self.log.error("%s:%d: Rule '%s' has too many arguments",file,line,func.__name__)
5936 + self.error = 1
5937 + elif func_code(func).co_argcount < reqargs:
5938 + self.log.error("%s:%d: Rule '%s' requires an argument",file,line,func.__name__)
5939 + self.error = 1
5940 + elif not func.__doc__:
5941 + self.log.warning("%s:%d: No documentation string specified in function '%s' (ignored)",file,line,func.__name__)
5942 + else:
5943 + try:
5944 + parsed_g = parse_grammar(doc,file,line)
5945 + for g in parsed_g:
5946 + grammar.append((name, g))
5947 + except SyntaxError:
5948 + e = sys.exc_info()[1]
5949 + self.log.error(str(e))
5950 + self.error = 1
5952 + # Looks like a valid grammar rule
5953 + # Mark the file in which defined.
5954 + self.files[file] = 1
5956 + # Secondary validation step that looks for p_ definitions that are not functions
5957 + # or functions that look like they might be grammar rules.
5959 + for n,v in self.pdict.items():
5960 + if n[0:2] == 'p_' and isinstance(v, (types.FunctionType, types.MethodType)): continue
5961 + if n[0:2] == 't_': continue
5962 + if n[0:2] == 'p_' and n != 'p_error':
5963 + self.log.warning("'%s' not defined as a function", n)
5964 + if ((isinstance(v,types.FunctionType) and func_code(v).co_argcount == 1) or
5965 + (isinstance(v,types.MethodType) and func_code(v).co_argcount == 2)):
5966 + try:
5967 + doc = v.__doc__.split(" ")
5968 + if doc[1] == ':':
5969 + self.log.warning("%s:%d: Possible grammar rule '%s' defined without p_ prefix",
5970 + func_code(v).co_filename, func_code(v).co_firstlineno,n)
5971 + except Exception:
5972 + pass
5974 + self.grammar = grammar
5976 # -----------------------------------------------------------------------------
5977 # yacc(module)
5979 -# Build the parser module
5980 +# Build a parser
5981 # -----------------------------------------------------------------------------
5983 -def yacc(method=default_lr, debug=yaccdebug, module=None, tabmodule=tab_module, start=None, check_recursion=1, optimize=0,write_tables=1,debugfile=debug_file,outputdir=''):
5984 - global yaccdebug
5985 - yaccdebug = debug
5987 - initialize_vars()
5988 - files = { }
5989 - error = 0
5992 - # Add parsing method to signature
5993 - Signature.update(method)
5995 - # If a "module" parameter was supplied, extract its dictionary.
5996 - # Note: a module may in fact be an instance as well.
5998 +def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, start=None,
5999 + check_recursion=1, optimize=0, write_tables=1, debugfile=debug_file,outputdir='',
6000 + debuglog=None, errorlog = None, picklefile=None):
6002 + global parse # Reference to the parsing method of the last built parser
6004 + # If pickling is enabled, table files are not created
6006 + if picklefile:
6007 + write_tables = 0
6009 + if errorlog is None:
6010 + errorlog = PlyLogger(sys.stderr)
6012 + # Get the module dictionary used for the parser
6013 if module:
6014 - # User supplied a module object.
6015 - if isinstance(module, types.ModuleType):
6016 - ldict = module.__dict__
6017 - elif isinstance(module, _INSTANCETYPE):
6018 - _items = [(k,getattr(module,k)) for k in dir(module)]
6019 - ldict = { }
6020 - for i in _items:
6021 - ldict[i[0]] = i[1]
6022 + _items = [(k,getattr(module,k)) for k in dir(module)]
6023 + pdict = dict(_items)
6024 + else:
6025 + pdict = get_caller_module_dict(2)
6027 + # Collect parser information from the dictionary
6028 + pinfo = ParserReflect(pdict,log=errorlog)
6029 + pinfo.get_all()
6031 + if pinfo.error:
6032 + raise YaccError("Unable to build parser")
6034 + # Check signature against table files (if any)
6035 + signature = pinfo.signature()
6037 + # Read the tables
6038 + try:
6039 + lr = LRTable()
6040 + if picklefile:
6041 + read_signature = lr.read_pickle(picklefile)
6042 else:
6043 - raise ValueError,"Expected a module"
6045 - else:
6046 - # No module given. We might be able to get information from the caller.
6047 - # Throw an exception and unwind the traceback to get the globals
6049 + read_signature = lr.read_table(tabmodule)
6050 + if optimize or (read_signature == signature):
6051 + try:
6052 + lr.bind_callables(pinfo.pdict)
6053 + parser = LRParser(lr,pinfo.error_func)
6054 + parse = parser.parse
6055 + return parser
6056 + except Exception:
6057 + e = sys.exc_info()[1]
6058 + errorlog.warning("There was a problem loading the table file: %s", repr(e))
6059 + except VersionError:
6060 + e = sys.exc_info()
6061 + errorlog.warning(str(e))
6062 + except Exception:
6063 + pass
6065 + if debuglog is None:
6066 + if debug:
6067 + debuglog = PlyLogger(open(debugfile,"w"))
6068 + else:
6069 + debuglog = NullLogger()
6071 + debuglog.info("Created by PLY version %s (http://www.dabeaz.com/ply)", __version__)
6074 + errors = 0
6076 + # Validate the parser information
6077 + if pinfo.validate_all():
6078 + raise YaccError("Unable to build parser")
6080 + if not pinfo.error_func:
6081 + errorlog.warning("no p_error() function is defined")
6083 + # Create a grammar object
6084 + grammar = Grammar(pinfo.tokens)
6086 + # Set precedence level for terminals
6087 + for term, assoc, level in pinfo.preclist:
6088 try:
6089 - raise RuntimeError
6090 - except RuntimeError:
6091 - e,b,t = sys.exc_info()
6092 - f = t.tb_frame
6093 - f = f.f_back # Walk out to our calling function
6094 - if f.f_globals is f.f_locals: # Collect global and local variations from caller
6095 - ldict = f.f_globals
6096 - else:
6097 - ldict = f.f_globals.copy()
6098 - ldict.update(f.f_locals)
6100 - # Add starting symbol to signature
6101 - if not start:
6102 - start = ldict.get("start",None)
6103 - if start:
6104 - Signature.update(start)
6106 - # Look for error handler
6107 - ef = ldict.get('p_error',None)
6108 - if ef:
6109 - if isinstance(ef,types.FunctionType):
6110 - ismethod = 0
6111 - elif isinstance(ef, types.MethodType):
6112 - ismethod = 1
6113 + grammar.set_precedence(term,assoc,level)
6114 + except GrammarError:
6115 + e = sys.exc_info()[1]
6116 + errorlog.warning("%s",str(e))
6118 + # Add productions to the grammar
6119 + for funcname, gram in pinfo.grammar:
6120 + file, line, prodname, syms = gram
6121 + try:
6122 + grammar.add_production(prodname,syms,funcname,file,line)
6123 + except GrammarError:
6124 + e = sys.exc_info()[1]
6125 + errorlog.error("%s",str(e))
6126 + errors = 1
6128 + # Set the grammar start symbols
6129 + try:
6130 + if start is None:
6131 + grammar.set_start(pinfo.start)
6132 else:
6133 - raise YaccError,"'p_error' defined, but is not a function or method."
6134 - eline = ef.func_code.co_firstlineno
6135 - efile = ef.func_code.co_filename
6136 - files[efile] = None
6138 - if (ef.func_code.co_argcount != 1+ismethod):
6139 - raise YaccError,"%s:%d: p_error() requires 1 argument." % (efile,eline)
6140 - global Errorfunc
6141 - Errorfunc = ef
6142 - else:
6143 - print >>sys.stderr, "yacc: Warning. no p_error() function is defined."
6145 - # If running in optimized mode. We're going to read tables instead
6147 - if (optimize and lr_read_tables(tabmodule,1)):
6148 - # Read parse table
6149 - del Productions[:]
6150 - for p in _lr_productions:
6151 - if not p:
6152 - Productions.append(None)
6153 - else:
6154 - m = MiniProduction()
6155 - m.name = p[0]
6156 - m.len = p[1]
6157 - m.file = p[3]
6158 - m.line = p[4]
6159 - if p[2]:
6160 - m.func = ldict[p[2]]
6161 - Productions.append(m)
6163 - else:
6164 - # Get the tokens map
6165 - if (module and isinstance(module,_INSTANCETYPE)):
6166 - tokens = getattr(module,"tokens",None)
6167 - else:
6168 - tokens = ldict.get("tokens",None)
6170 - if not tokens:
6171 - raise YaccError,"module does not define a list 'tokens'"
6172 - if not (isinstance(tokens,types.ListType) or isinstance(tokens,types.TupleType)):
6173 - raise YaccError,"tokens must be a list or tuple."
6175 - # Check to see if a requires dictionary is defined.
6176 - requires = ldict.get("require",None)
6177 - if requires:
6178 - if not (isinstance(requires,types.DictType)):
6179 - raise YaccError,"require must be a dictionary."
6181 - for r,v in requires.items():
6182 - try:
6183 - if not (isinstance(v,types.ListType)):
6184 - raise TypeError
6185 - v1 = [x.split(".") for x in v]
6186 - Requires[r] = v1
6187 - except StandardError:
6188 - print >>sys.stderr, "Invalid specification for rule '%s' in require. Expected a list of strings" % r
6191 - # Build the dictionary of terminals. We a record a 0 in the
6192 - # dictionary to track whether or not a terminal is actually
6193 - # used in the grammar
6195 - if 'error' in tokens:
6196 - print >>sys.stderr, "yacc: Illegal token 'error'. Is a reserved word."
6197 - raise YaccError,"Illegal token name"
6199 - for n in tokens:
6200 - if Terminals.has_key(n):
6201 - print >>sys.stderr, "yacc: Warning. Token '%s' multiply defined." % n
6202 - Terminals[n] = [ ]
6204 - Terminals['error'] = [ ]
6206 - # Get the precedence map (if any)
6207 - prec = ldict.get("precedence",None)
6208 - if prec:
6209 - if not (isinstance(prec,types.ListType) or isinstance(prec,types.TupleType)):
6210 - raise YaccError,"precedence must be a list or tuple."
6211 - add_precedence(prec)
6212 - Signature.update(repr(prec))
6214 - for n in tokens:
6215 - if not Precedence.has_key(n):
6216 - Precedence[n] = ('right',0) # Default, right associative, 0 precedence
6218 - # Get the list of built-in functions with p_ prefix
6219 - symbols = [ldict[f] for f in ldict.keys()
6220 - if (type(ldict[f]) in (types.FunctionType, types.MethodType) and ldict[f].__name__[:2] == 'p_'
6221 - and ldict[f].__name__ != 'p_error')]
6223 - # Check for non-empty symbols
6224 - if len(symbols) == 0:
6225 - raise YaccError,"no rules of the form p_rulename are defined."
6227 - # Sort the symbols by line number
6228 - symbols.sort(lambda x,y: cmp(x.func_code.co_firstlineno,y.func_code.co_firstlineno))
6230 - # Add all of the symbols to the grammar
6231 - for f in symbols:
6232 - if (add_function(f)) < 0:
6233 - error += 1
6234 - else:
6235 - files[f.func_code.co_filename] = None
6237 - # Make a signature of the docstrings
6238 - for f in symbols:
6239 - if f.__doc__:
6240 - Signature.update(f.__doc__)
6242 - lr_init_vars()
6244 - if error:
6245 - raise YaccError,"Unable to construct parser."
6247 - if not lr_read_tables(tabmodule):
6249 - # Validate files
6250 - for filename in files.keys():
6251 - if not validate_file(filename):
6252 - error = 1
6254 - # Validate dictionary
6255 - validate_dict(ldict)
6257 - if start and not Prodnames.has_key(start):
6258 - raise YaccError,"Bad starting symbol '%s'" % start
6260 - augment_grammar(start)
6261 - error = verify_productions(cycle_check=check_recursion)
6262 - otherfunc = [ldict[f] for f in ldict.keys()
6263 - if (type(f) in (types.FunctionType,types.MethodType) and ldict[f].__name__[:2] != 'p_')]
6265 - # Check precedence rules
6266 - if check_precedence():
6267 - error = 1
6269 - if error:
6270 - raise YaccError,"Unable to construct parser."
6272 - build_lritems()
6273 - compute_first1()
6274 - compute_follow(start)
6276 - if method in ['SLR','LALR']:
6277 - lr_parse_table(method)
6278 - else:
6279 - raise YaccError, "Unknown parsing method '%s'" % method
6281 - if write_tables:
6282 - lr_write_tables(tabmodule,outputdir)
6284 - if yaccdebug:
6285 - try:
6286 - f = open(os.path.join(outputdir,debugfile),"w")
6287 - f.write(_vfc.getvalue())
6288 - f.write("\n\n")
6289 - f.write(_vf.getvalue())
6290 - f.close()
6291 - except IOError,e:
6292 - print >>sys.stderr, "yacc: can't create '%s'" % debugfile,e
6294 - # Made it here. Create a parser object and set up its internal state.
6295 - # Set global parse() method to bound method of parser object.
6297 - p = Parser("xyzzy")
6298 - p.productions = Productions
6299 - p.errorfunc = Errorfunc
6300 - p.action = _lr_action
6301 - p.goto = _lr_goto
6302 - p.method = _lr_method
6303 - p.require = Requires
6305 - global parse
6306 - parse = p.parse
6308 - global parser
6309 - parser = p
6311 - # Clean up all of the globals we created
6312 - if (not optimize):
6313 - yacc_cleanup()
6314 - return p
6316 -# yacc_cleanup function. Delete all of the global variables
6317 -# used during table construction
6319 -def yacc_cleanup():
6320 - global _lr_action, _lr_goto, _lr_method, _lr_goto_cache
6321 - del _lr_action, _lr_goto, _lr_method, _lr_goto_cache
6323 - global Productions, Prodnames, Prodmap, Terminals
6324 - global Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
6325 - global Errorfunc, Signature, Requires
6327 - del Productions, Prodnames, Prodmap, Terminals
6328 - del Nonterminals, First, Follow, Precedence, UsedPrecedence, LRitems
6329 - del Errorfunc, Signature, Requires
6331 - global _vf, _vfc
6332 - del _vf, _vfc
6335 -# Stub that raises an error if parsing is attempted without first calling yacc()
6336 -def parse(*args,**kwargs):
6337 - raise YaccError, "yacc: No parser built with yacc()"
6338 + grammar.set_start(start)
6339 + except GrammarError:
6340 + e = sys.exc_info()[1]
6341 + errorlog.error(str(e))
6342 + errors = 1
6344 + if errors:
6345 + raise YaccError("Unable to build parser")
+    # Verify the grammar structure
+    undefined_symbols = grammar.undefined_symbols()
+    for sym, prod in undefined_symbols:
+        errorlog.error("%s:%d: Symbol '%s' used, but not defined as a token or a rule",prod.file,prod.line,sym)
+        errors = 1
+
+    unused_terminals = grammar.unused_terminals()
+    if unused_terminals:
+        debuglog.info("")
+        debuglog.info("Unused terminals:")
+        debuglog.info("")
+        for term in unused_terminals:
+            errorlog.warning("Token '%s' defined, but not used", term)
+            debuglog.info("    %s", term)
+
+    # Print out all productions to the debug log
+    if debug:
+        debuglog.info("")
+        debuglog.info("Grammar")
+        debuglog.info("")
+        for n,p in enumerate(grammar.Productions):
+            debuglog.info("Rule %-5d %s", n, p)
+
+    # Find unused non-terminals
+    unused_rules = grammar.unused_rules()
+    for prod in unused_rules:
+        errorlog.warning("%s:%d: Rule '%s' defined, but not used", prod.file, prod.line, prod.name)
+
+    if len(unused_terminals) == 1:
+        errorlog.warning("There is 1 unused token")
+    if len(unused_terminals) > 1:
+        errorlog.warning("There are %d unused tokens", len(unused_terminals))
+
+    if len(unused_rules) == 1:
+        errorlog.warning("There is 1 unused rule")
+    if len(unused_rules) > 1:
+        errorlog.warning("There are %d unused rules", len(unused_rules))
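The unused-token warning above is easy to trigger. A self-contained sketch (names are illustrative) that builds successfully but produces "Token 'UNUSED' defined, but not used" followed by "There is 1 unused token"; a p_* rule whose nonterminal never appears elsewhere draws the analogous unused-rule warning:

    import ply.lex as lex
    import ply.yacc as yacc

    tokens = ('NUMBER', 'UNUSED')   # 'UNUSED' never appears in any p_* rule

    t_UNUSED = r'@'
    t_ignore = ' '

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):
        t.lexer.skip(1)

    def p_expr(p):
        'expr : NUMBER'
        p[0] = p[1]

    def p_error(p):
        pass

    lex.lex()
    parser = yacc.yacc()   # warns about 'UNUSED', then builds normally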
+    if debug:
+        debuglog.info("")
+        debuglog.info("Terminals, with rules where they appear")
+        debuglog.info("")
+        terms = list(grammar.Terminals)
+        terms.sort()
+        for term in terms:
+            debuglog.info("%-20s : %s", term, " ".join([str(s) for s in grammar.Terminals[term]]))
+
+        debuglog.info("")
+        debuglog.info("Nonterminals, with rules where they appear")
+        debuglog.info("")
+        nonterms = list(grammar.Nonterminals)
+        nonterms.sort()
+        for nonterm in nonterms:
+            debuglog.info("%-20s : %s", nonterm, " ".join([str(s) for s in grammar.Nonterminals[nonterm]]))
+        debuglog.info("")
+
+    if check_recursion:
+        unreachable = grammar.find_unreachable()
+        for u in unreachable:
+            errorlog.warning("Symbol '%s' is unreachable",u)
+
+        infinite = grammar.infinite_cycles()
+        for inf in infinite:
+            errorlog.error("Infinite recursion detected for symbol '%s'", inf)
+            errors = 1
+
+    unused_prec = grammar.unused_precedence()
+    for term, assoc in unused_prec:
+        errorlog.error("Precedence rule '%s' defined for unknown symbol '%s'", assoc, term)
+        errors = 1
+
+    if errors:
+        raise YaccError("Unable to build parser")
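The infinite-recursion check above is fatal, as this second "Unable to build parser" shows. A self-contained sketch (hypothetical grammar, not from the patch) that trips it, since 'loop' has no non-recursive alternative and can never derive a finite token string:

    import ply.lex as lex
    import ply.yacc as yacc

    tokens = ('PLUS',)
    t_PLUS = r'\+'
    t_ignore = ' '

    def t_error(t):
        t.lexer.skip(1)

    def p_loop(p):
        'loop : loop PLUS loop'   # -> Infinite recursion detected for symbol 'loop'

    def p_error(p):
        pass

    lex.lex()
    try:
        yacc.yacc()
    except yacc.YaccError as e:
        print(e)                  # Unable to build parser

A precedence entry naming a symbol that is neither a token nor used in the grammar trips unused_precedence() and fails the build the same way.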
+    # Run the LRGeneratedTable on the grammar
+    if debug:
+        errorlog.debug("Generating %s tables", method)
+
+    lr = LRGeneratedTable(grammar,method,debuglog)
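LRGeneratedTable builds the tables for whichever algorithm was requested through yacc()'s method argument; PLY accepts 'LALR' (the default) and 'SLR'. For example:

    parser = yacc.yacc(method='LALR')  # default
    parser = yacc.yacc(method='SLR')   # simpler table construction, accepts fewer grammars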
+
+    if debug:
+        num_sr = len(lr.sr_conflicts)
+
+        # Report shift/reduce and reduce/reduce conflicts
+        if num_sr == 1:
+            errorlog.warning("1 shift/reduce conflict")
+        elif num_sr > 1:
+            errorlog.warning("%d shift/reduce conflicts", num_sr)
+
+        num_rr = len(lr.rr_conflicts)
+        if num_rr == 1:
+            errorlog.warning("1 reduce/reduce conflict")
+        elif num_rr > 1:
+            errorlog.warning("%d reduce/reduce conflicts", num_rr)
+
+    # Write out conflicts to the output file
+    if debug and (lr.sr_conflicts or lr.rr_conflicts):
+        debuglog.warning("")
+        debuglog.warning("Conflicts:")
+        debuglog.warning("")
+
+        for state, tok, resolution in lr.sr_conflicts:
+            debuglog.warning("shift/reduce conflict for %s in state %d resolved as %s", tok, state, resolution)
+
+        already_reported = {}
+        for state, rule, rejected in lr.rr_conflicts:
+            if (state,id(rule),id(rejected)) in already_reported:
+                continue
+            debuglog.warning("reduce/reduce conflict in state %d resolved using rule (%s)", state, rule)
+            debuglog.warning("rejected rule (%s) in state %d", rejected,state)
+            errorlog.warning("reduce/reduce conflict in state %d resolved using rule (%s)", state, rule)
+            errorlog.warning("rejected rule (%s) in state %d", rejected, state)
+            already_reported[state,id(rule),id(rejected)] = 1
+
+        warned_never = []
+        for state, rule, rejected in lr.rr_conflicts:
+            if not rejected.reduced and (rejected not in warned_never):
+                debuglog.warning("Rule (%s) is never reduced", rejected)
+                errorlog.warning("Rule (%s) is never reduced", rejected)
+                warned_never.append(rejected)
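A self-contained sketch of a grammar that produces the shift/reduce warning reported above; the ambiguity in '1+2+3' yields exactly one conflict, resolved as a shift by default, and adding precedence = (('left', 'PLUS'),) removes it:

    import ply.lex as lex
    import ply.yacc as yacc

    tokens = ('NUM', 'PLUS')
    t_PLUS = r'\+'
    t_ignore = ' '

    def t_NUM(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):
        t.lexer.skip(1)

    def p_expr_plus(p):
        'expr : expr PLUS expr'   # ambiguous -> 1 shift/reduce conflict
        p[0] = p[1] + p[3]

    def p_expr_num(p):
        'expr : NUM'
        p[0] = p[1]

    def p_error(p):
        pass

    lex.lex()
    parser = yacc.yacc()          # warns: 1 shift/reduce conflict
    print(parser.parse('1+2+3'))  # 6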
+    # Write the table file if requested
+    if write_tables:
+        lr.write_table(tabmodule,outputdir,signature)
+
+    # Write a pickled version of the tables
+    if picklefile:
+        lr.pickle_table(picklefile,signature)
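Both table caches are controlled from the yacc() call: write_tables/tabmodule produce an importable Python module, picklefile a pickled equivalent. For instance (file names illustrative):

    parser = yacc.yacc(write_tables=True, tabmodule='myparsetab', outputdir='.')
    parser = yacc.yacc(picklefile='parser.pkl')  # pickled tables instead of a module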
+    # Build the parser
+    lr.bind_callables(pinfo.pdict)
+    parser = LRParser(lr,pinfo.error_func)
+
+    parse = parser.parse
+    return parser
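The LRParser returned here is the object client code receives from yacc.yacc(). Typical use, assuming a module that defines the usual tokens, t_* and p_* entries:

    import ply.lex as lex
    import ply.yacc as yacc

    lexer = lex.lex()
    parser = yacc.yacc()
    result = parser.parse("1 + 2", lexer=lexer)  # lexer is optional; the last lex.lex() is the default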