.\"	$NetBSD: pxin2.n,v 1.2 1998/01/09 06:41:56 perry Exp $
.\"
.\" Copyright (c) 1979 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)pxin2.n	5.2 (Berkeley) 4/17/91
.\"
.if !\n(xx .so tmac.p
.nr H1 1
.if n .ND
.NH
Operations
.NH 2
Naming conventions and operation summary
.PP
Table 2.1 outlines the opcode typing convention.
The expression ``a above b'' means that `a' is on top
of the stack with `b' below it.
Table 2.3 describes each of the opcodes.
The character `*' at the end of a name specifies that
all operations with the root prefix
before the `*'
are summarized by one entry.
Table 2.2 gives the codes used
to describe the type of inline data expected by each instruction.
.sp 2
.so table2.1.n
.sp 2
.so table2.2.n
.bp
.so table2.3.n
.bp
.NH 2
Basic control operations
.LP
.SH
HALT
.IP
Corresponds to the Pascal procedure
.I halt ;
causes execution to end with a post-mortem backtrace as if a run-time
error had occurred.
.SH
BEG s,W,w,"
.IP
Causes the second part of the block mark to be created, and
.I W
bytes of local variable space to be allocated and cleared to zero.
Stack overflow is detected here.
.I w
is the first line of the body of this section for error traceback,
and the inline string (length s) is the character representation of its name.
.SH
NODUMP s,W,w,"
.IP
Equivalent to
.SM BEG ,
and used to begin the main program when the ``p''
option is disabled so that the post-mortem backtrace will be inhibited.
.SH
END
.IP
Complementary to the operators
.SM CALL
and
.SM BEG ,
exits the current block, calling the procedure
.I pclose
to flush buffers for and release any local files.
Restores the environment of the caller from the block mark.
If this is the end for the main program, all files are
.I flushed,
and the interpreter is exited.
.SH
CALL l,A
.IP
Saves the current line number, return address, and active display entry pointer
.I dp
in the first part of the block mark, then transfers to the entry point
given by the relative address
.I A ,
which is the beginning of a
.B procedure
or
.B function
at level
.I l .
.SH
PUSH s
.IP
Clears
.I s
bytes on the stack.
Used to make space for the return value of a
.B function
just before calling it.
.SH
POP s
.IP
Pops
.I s
bytes off the stack.
Used after a
.B function
or
.B procedure
returns to remove the arguments from the stack.
.SH
TRA a
.IP
Transfer control to relative address
.I a
as a local
.B goto
or part of a structured statement.
.SH
TRA4 A
.IP
Transfer control to an absolute address as part of a non-local
.B goto
or to branch over procedure bodies.
.SH
LINO s
.IP
Set current line number to
.I s .
For consistency, check that the expression stack is empty
as it should be (as this is the start of a statement).
This consistency check will fail only if there is a bug in the
interpreter or the interpreter code has somehow been damaged.
Increment the statement count and if it exceeds the statement limit,
generate a fault.
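.LP
As a rough illustration of the bookkeeping described for
.SM LINO ,
the following C sketch models the line number update, the empty-stack check,
and the statement limit.
The structure and all names in it (px_state, lino, and the result codes)
are hypothetical and are not taken from the interpreter source.

```c
/* Hypothetical interpreter state; field names are illustrative only. */
struct px_state {
    int  line;        /* current source line number */
    long stmt_count;  /* statements executed so far */
    long stmt_limit;  /* fault once stmt_count exceeds this */
    int  stack_depth; /* words currently on the expression stack */
};

enum lino_result { LINO_OK, LINO_STACK_NOT_EMPTY, LINO_STMT_LIMIT };

/* Model of LINO s: record the line, check consistency, count the statement. */
static enum lino_result lino(struct px_state *st, int s)
{
    st->line = s;
    if (st->stack_depth != 0)          /* stack must be empty between statements */
        return LINO_STACK_NOT_EMPTY;
    if (++st->stmt_count > st->stmt_limit)
        return LINO_STMT_LIMIT;        /* the interpreter would generate a fault */
    return LINO_OK;
}
```

A caller would invoke lino once per statement and treat any result other than LINO_OK as a run-time fault.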
.SH
GOTO l,A
.IP
Transfer control to address
.I A
that is in the block at level
.I l
of the display.
This is a non-local
.B goto .
Causes each block to be exited as if with
.SM END ,
flushing and freeing files with
.I pclose ,
until the current display entry is at level
.I l .
.SH
SDUP*
.IP
Duplicate the word or long on the top of
the stack.
This is used mostly for constructing sets.
See section 2.11.
.NH 2
If and relational operators
.SH
IF a
.IP
All of the interpreter's conditional transfers take place using this operator,
which examines the Boolean value on the top of the stack.
If the value is
.I true ,
the next code is executed,
otherwise control transfers to the specified address.
.SH
REL* r
.IP
These take two arguments on the stack,
and the sub-operation code specifies the relational operation to
be done, coded as follows with `a' above `b' on the stack:
.DS
.TS
lb lb
c a.
Code	Operation
_
0	a = b
2	a <> b
4	a < b
6	a > b
8	a <= b
10	a >= b
.TE
.DE
.IP
Each operation does a test to set the condition code
appropriately and then does an indexed branch based on the
sub-operation code to a test of the condition here specified,
pushing a Boolean value on the stack.
.IP
Consider the statement fragment:
.DS
\*bif\fR a = b \*bthen\fR
.DE
.IP
If
.I a
and
.I b
are integers this generates the following code:
.DS
.TS
lp-2w(8) l.
RV4:\fIl	a\fR
RV4:\fIl	b\fR
REL4	\&=
IF	\fIElse part offset\fR
.sp
.T&
c s.
\fI\&... Then part code ...\fR
.TE
.DE
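.LP
The sub-operation coding in the table above can be mimicked in C with a switch
standing in for the indexed branch.
This is only a sketch for 4-byte integers; the enumeration names and the
function rel4 are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sub-operation codes from the table; the enum names are illustrative. */
enum rel_code {
    REL_EQ = 0, REL_NE = 2, REL_LT = 4,
    REL_GT = 6, REL_LE = 8, REL_GE = 10
};

/* Evaluate "a <op> b" and return the Boolean that REL4 would push. */
static bool rel4(enum rel_code code, int32_t a, int32_t b)
{
    switch (code) {
    case REL_EQ: return a == b;
    case REL_NE: return a != b;
    case REL_LT: return a <  b;
    case REL_GT: return a >  b;
    case REL_LE: return a <= b;
    case REL_GE: return a >= b;
    }
    return false; /* unknown code: treated as false in this sketch */
}
```

An IF operator would then test the returned Boolean and either fall through or transfer to the else part.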
.NH 2
Boolean operators
.PP
The Boolean operators
.SM AND ,
.SM OR ,
and
.SM NOT
manipulate values on the top of the stack.
All Boolean values are kept in single bytes in memory,
or in single words on the stack.
Zero represents a Boolean \fIfalse\fP, and one a Boolean \fItrue\fP.
.NH 2
Right value, constant, and assignment operators
.SH
LRV* l,A
.br
RV* l,a
.IP
The right value operators load values onto the stack.
They take a block number as a sub-opcode and load the appropriate
number of bytes from that block at the offset specified
in the following word onto the stack.
As an example, consider
.SM LRV4 :
.DS
_LRV4:
	\fBcvtbl\fR	(lc)+,r0	#r0 has display index
	\fBaddl3\fR	_display(r0),(lc)+,r1	#r1 has variable address
	\fBpushl\fR	(r1)	#put value on the stack
	\fBjmp\fR	(loop)
.DE
.IP
Here the interpreter places the display level in r0.
It then adds the appropriate display value to the inline offset and
pushes the value at this location onto the stack.
Control then returns to the main
interpreter loop.
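.LP
The display-plus-offset addressing used by
.SM LRV4
can be sketched in C as follows.
The display array, its size, and the helper names lv and rv4 are hypothetical;
the sketch only shows the address arithmetic, not the decoding of inline data.

```c
#include <stdint.h>
#include <string.h>

#define MAX_LEVEL 20                      /* hypothetical display size */
static unsigned char *display[MAX_LEVEL]; /* one frame base per block level */

/* "Left value": address of the variable at (level, offset). */
static unsigned char *lv(int level, int32_t offset)
{
    return display[level] + offset;
}

/* "Right value" of a 4-byte variable: load from the computed address. */
static int32_t rv4(int level, int32_t offset)
{
    int32_t value;
    memcpy(&value, lv(level, offset), sizeof value);
    return value;
}
```

The same lv helper corresponds to what the addressing operators compute before any load takes place.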
.IP
The
.SM RV*
operators have short inline data that
reduces the space required to address the first 32K of
stack space in each stack frame.
The operators
.SM RV14
and
.SM RV24
provide explicit conversion to long as the data
is pushed.
This saves the generation of
.SM STOI
to align arguments to
.SM C
subroutines.
.SH
CON* r
.IP
The constant operators load a value onto the stack from inline code.
Small integer values are condensed and loaded by the
.SM CON1
operator, which is given by
.DS
_CON1:
	\fBcvtbw\fR	(lc)+,\-(sp)
	\fBjmp\fR	(loop)
.DE
.IP
Note that little work is needed here, as the required constant
is available at (lc)+.
For longer constants,
.I lc
must be incremented before moving the constant.
The operator
.SM CON
takes a length specification in the sub-opcode and can be used to load
strings and other variable length data onto the stack.
The operators
.SM CON14
and
.SM CON24
provide explicit conversion to long as the constant is pushed.
.SH
AS*
.IP
The assignment operators are similar to arithmetic and relational operators
in that they take two operands, both in the stack,
but the lengths given for them specify
first the length of the value on the stack and then the length
of the target in memory.
The target address in memory is under the value to be stored.
Thus the statement
.DS
i := 1
.DE
.IP
where
.I i
is a full-length, 4 byte, integer,
will generate the code sequence
.DS
.TS
lp-2w(8) l.
LV:\fIl	i\fP
CON1:1
AS24
.TE
.DE
.IP
Here
.SM LV
will load the address of
.I i ,
which is really given as a block number in the sub-opcode and an
offset in the following word,
onto the stack, occupying a single word.
.SM CON1 ,
which is a single word instruction,
then loads the constant 1,
which is in its sub-opcode,
onto the stack.
Since there are no one byte constants on the stack,
this becomes a 2 byte, single word integer.
The interpreter then assigns a length 2 integer to a length 4 integer using
.SM AS24 \&.
The code sequence for
.SM AS24
is given by:
.DS
_AS24:
	\fBincl\fR	lc
	\fBcvtwl\fR	(sp)+,*(sp)+
	\fBjmp\fR	(loop)
.DE
.IP
Thus the interpreter gets the single word off the stack,
extends it to be a 4 byte integer,
gets the target address off the stack,
and finally stores the value in the target.
This is a typical use of the constant and assignment operators.
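.LP
The widening steps in this sequence can be shown with fixed-width C types.
The functions con1 and as24 below are hypothetical stand-ins for the operators,
taking the popped operands as arguments rather than modelling the stack.

```c
#include <stdint.h>

/* CON1: a one-byte inline constant becomes a 2-byte word (as cvtbw does). */
static int16_t con1(int8_t inline_byte)
{
    return (int16_t)inline_byte;   /* sign-extends byte to word */
}

/* AS24: store a 2-byte value into a 4-byte target (as cvtwl does). */
static void as24(int16_t value, int32_t *target)
{
    *target = (int32_t)value;      /* sign-extends word to long */
}
```

For the statement i := 1 the sketch amounts to as24(con1(1), &i), and a negative constant keeps its sign through both conversions.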
.NH 2
Addressing operations
.SH
LLV l,W
.br
LV l,w
.IP
The most common operation done by the interpreter
is the ``left value'' or ``address of'' operation.
It is given by:
.DS
_LLV:
	\fBcvtbl\fR	(lc)+,r0	#r0 has display index
	\fBaddl3\fR	_display(r0),(lc)+,\-(sp)	#push address onto the stack
	\fBjmp\fR	(loop)
.DE
.IP
It calculates an address in the block specified in the sub-opcode
by adding the associated display entry to the
offset that appears in the following word.
The
.SM LV
operator has short inline data that reduces the space
required to address the first 32K of stack space in each call frame.
.SH
OFF s
.IP
The offset operator is used in field names.
Thus to get the address of
.DS
p^.f1
.DE
.IP
.I pi
would generate the sequence
.DS
.TS
lp-2w(8) l.
RV:\fIl	p\fP
OFF	\fIf1\fP
.TE
.DE
.IP
where the
.SM RV
loads the value of
.I p ,
given its block in the sub-opcode and offset in the following word,
and the interpreter then adds the offset of the field
.I f1
in its record to get the correct address.
.SM OFF
takes its argument in the sub-opcode if it is small enough.
.IP
The example above is incomplete, lacking a check for a
.B nil
pointer.
The code generated would be
.DS
.TS
lp-2w(8) l.
RV:\fIl	p\fP
NIL
OFF	\fIf1\fP
.TE
.DE
.IP
where the
.SM NIL
operation checks for a
.I nil
pointer and generates the appropriate runtime error if it finds one.
.SH
LVCON s,"
.IP
A pointer to the specified length inline data is pushed
onto the stack.
This is primarily used for
.I printf
type strings used by
.SM WRITEF .
(See sections 3.6 and 3.8.)
.SH
INX* s,w,w
.IP
The operators
.SM INX2
and
.SM INX4
are used for subscripting.
For example, the statement
.DS
a[i] := 2.0
.DE
.IP
with
.I i
an integer and
.I a
an
``array [1..1000] of real''
would generate
.DS
.TS
lp-2w(8) l.
LV:\fIl	a\fP
RV4:\fIl	i\fP
INX4:8	1,999
CON8	2.0
AS8
.TE
.DE
.IP
Here the
.SM LV
operation takes the address of
.I a
and places it on the stack.
The value of
.I i
is then placed on top of this on the stack.
The array address is indexed by the
length 4 index (a length 2 index would use
.SM INX2 )
where the individual elements have a size of 8 bytes.
The code for
.SM INX4
is:
.DS
_INX4:
	\fBcvtbl\fR	(lc)+,r0
	\fBbneq\fR	L1
	\fBcvtwl\fR	(lc)+,r0	#r0 has size of records
L1:
	\fBcvtwl\fR	(lc)+,r1	#r1 has lower bound
	\fBmovzwl\fR	(lc)+,r2	#r2 has upper-lower bound
	\fBsubl3\fR	r1,(sp)+,r3	#r3 has base subscript
	\fBcmpl\fR	r3,r2	#check for out of bounds
	\fBbgtru\fR	esubscr
	\fBmull2\fR	r0,r3	#calculate byte offset
	\fBaddl2\fR	r3,(sp)	#calculate actual address
	\fBjmp\fR	(loop)
esubscr:
	\fBmovw\fR	$ESUBSCR,_perrno
	\fBjbr\fR	error
.DE
.IP
Here the lower bound is subtracted, and range checked against the
upper minus lower bound.
The offset is then scaled to a byte offset into the array
and added to the base address on the stack.
Multi-dimension subscripts are translated as a sequence of single subscriptings.
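.LP
The subscript calculation can be restated in C.
The sketch below uses an unsigned comparison in the same spirit as the
unsigned branch in the code above, so that one test rejects subscripts
on either side of the bounds.
The function name inx4 and its parameter names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Index the array whose base address is *addr.
 * span is (upper bound - lower bound); returns false on a subscript error.
 */
static bool inx4(size_t *addr, int32_t subscript,
                 uint32_t elem_size, int32_t lower, uint32_t span)
{
    /* Subtract the lower bound; a subscript below it wraps to a huge value. */
    uint32_t base = (uint32_t)subscript - (uint32_t)lower;

    if (base > span)                 /* single unsigned out-of-bounds check */
        return false;
    *addr += (size_t)base * elem_size;
    return true;
}
```

For ``array [1..1000] of real'' the call would pass an element size of 8, a lower bound of 1, and a span of 999.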
.SH
IND*
.IP
For indirect references through
.B var
parameters and pointers,
the interpreter has a set of indirection operators that convert a pointer
on the stack into a value on the stack from that address.
Different
.SM IND
operators are necessary because of the possibility of different
length operands.
The
.SM IND14
and
.SM IND24
operators do conversions to long
as they push their data.
.NH 2
Arithmetic operators
.PP
The interpreter has many arithmetic operators.
All operators produce results long enough to prevent overflow
unless the bounds of the base type are exceeded.
The basic operators available are
.DS
Addition:	ADD*, SUCC*
Subtraction:	SUB*, PRED*
Multiplication:	MUL*, SQR*
Division:	DIV*, DVD*, MOD*
Unary:	NEG*, ABS*
.DE
.NH 2
Range checking
.PP
The interpreter has several range checking operators.
The important distinction among these operators is between values whose
legal range begins at zero and those that do not begin at zero,
for example
a subrange variable whose values range from 45 to 70.
For those that begin at zero, a simpler ``logical'' comparison against
the upper bound suffices.
For others, both the low and upper bounds must be checked independently,
requiring two comparisons.
On the
.SM "VAX 11/780"
both checks are done using a single index instruction
so the only gain is in reducing the inline data.
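.LP
The two kinds of range check can be contrasted in a few lines of C.
Both function names are hypothetical; the first relies on an unsigned
(``logical'') comparison so that negative values fail the single test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range 0..upper: one unsigned comparison also rejects negative values. */
static bool in_range0(int32_t v, uint32_t upper)
{
    return (uint32_t)v <= upper;
}

/* Range lower..upper with lower != 0: two independent comparisons. */
static bool in_range(int32_t v, int32_t lower, int32_t upper)
{
    return v >= lower && v <= upper;
}
```

The subrange 45..70 from the text would use in_range(v, 45, 70).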
.NH 2
Case operators
.PP
The interpreter includes three operators for
.B case
statements that are used depending on the width of the
.B case
label type.
For each width, the structure of the case data is the same, and
is represented in figure 2.4.
.sp 1
.so fig2.4.n
.PP
The
.SM CASEOP
case statement operators do a sequential search through the
case label values.
If they find the label value, they take the corresponding entry
from the transfer table and cause the interpreter to branch to the
specified statement.
If the specified label is not found, an error results.
.PP
The
.SM CASE
operators take the number of cases as a sub-opcode
if possible.
Three different operators are needed to handle single byte,
word, and long case transfer table values.
For example, the
.SM CASEOP1
operator has the following code sequence:
.DS
_CASEOP1:
	\fBcvtbl\fR	(lc)+,r0
	\fBbneq\fR	L1
	\fBcvtwl\fR	(lc)+,r0	#r0 has length of case table
L1:
	\fBmovaw\fR	(lc)[r0],r2	#r2 has pointer to case labels
	\fBmovzwl\fR	(sp)+,r3	#r3 has the element to find
	\fBlocc\fR	r3,r0,(r2)	#r0 has index of located element
	\fBbeql\fR	caserr	#element not found
	\fBmnegl\fR	r0,r0	#calculate new lc
	\fBcvtwl\fR	(r2)[r0],r1	#r1 has lc offset
	\fBaddl2\fR	r1,lc
	\fBjmp\fR	(loop)
caserr:
	\fBmovw\fR	$ECASE,_perrno
	\fBjbr\fR	error
.DE
.PP
Here the interpreter first computes the address of the beginning
of the case label value area by adding twice the number of case label
values to the address of the transfer table, since the transfer
table entries are 2 byte address offsets.
It then searches through the label values, and generates an ECASE
error if the label is not found.
If the label is found, the index of the corresponding entry
in the transfer table is extracted and that offset is added
to the interpreter location counter.
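.LP
The search described above can be modelled in C with two parallel arrays,
one for the transfer table and one for the one-byte label values.
The pairing of label i with transfer entry i is an assumption of this sketch,
as are the function and parameter names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Look up value among n case labels.
 * On success store the 2-byte transfer offset in *offset and return 0;
 * return -1 when no label matches (the ECASE situation).
 */
static int caseop1(const int16_t *transfer, const uint8_t *labels,
                   size_t n, uint8_t value, int16_t *offset)
{
    for (size_t i = 0; i < n; i++) {      /* sequential search of the labels */
        if (labels[i] == value) {
            *offset = transfer[i];        /* offset to add to the location counter */
            return 0;
        }
    }
    return -1;
}
```

An interpreter loop would add the returned offset to its location counter, or raise the case error when the result is -1.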
.NH 2
Operations supporting pxp
.PP
The following operations are defined to do execution profiling.
.SH
PXPBUF w
.IP
Causes the interpreter to allocate a count buffer
with
.I w
four byte counters
and to clear them to zero.
The count buffer is placed within an image of the
.I pmon.out
file as described in the
.I "PXP Implementation Notes."
The contents of this buffer are written to the file
.I pmon.out
when the program ends.
.SH
COUNT w
.IP
Increments the counter specified by
.I w .
.SH
TRACNT w,A
.IP
Used at the entry point to procedures and functions,
combining a transfer to the entry point of the block with
an incrementing of its entry count.
.NH 2
Set operations
.PP
The set operations:
union
.SM ADDT ,
intersection
.SM MULT ,
element removal
.SM SUBT ,
and the set relationals
.SM RELT
are straightforward.
The following operations are more interesting.
.SH
CARD s
.IP
Takes the cardinality of a set of size
.I s
bytes on top of the stack, leaving a 2 byte integer count.
.SM CARD
uses the
.B ffs
opcode to successively count the number of set bits in the set.
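.LP
A portable C sketch of the same counting idea follows.
It does not use a find-first-set instruction; instead it clears the
lowest set bit on each pass, which visits each set bit exactly once.
The function name card and the byte-array representation of the set are assumptions.

```c
#include <stddef.h>

/* Count the set bits in a set stored as nbytes bytes. */
static int card(const unsigned char *set, size_t nbytes)
{
    int count = 0;

    for (size_t i = 0; i < nbytes; i++) {
        unsigned char byte = set[i];
        while (byte != 0) {
            byte &= (unsigned char)(byte - 1); /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}
```

The loop body runs once per member of the set, so sparse sets are counted quickly.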
.SH
CTTOT s,w,w
.IP
Constructs a set.
This operation requires a non-trivial amount of work,
checking bounds and setting individual bits or ranges of bits.
This operation sequence is slow,
and motivates the presence of the operator
.SM INCT
below.
The arguments to
.SM CTTOT
include the number of elements
.I s
in the constructed set,
the lower and upper bounds of the set,
the two
.I w
values,
and a pair of values on the stack for each range in the set, single
elements in constructed sets being duplicated with
.SM SDUP
to form degenerate ranges.
.SH
IN s,w,w
.IP
The operator
.B in
for sets.
The value
.I s
specifies the size of the set,
the two
.I w
values the lower and upper bounds of the set.
The value on the stack is checked to be in the set on the stack,
and a Boolean value of
.I true
or
.I false
replaces the operands.
.SH
INCT
.IP
The operator
.B in
on a constructed set without constructing it.
The left operand of
.B in
is on top of the stack followed by the number of pairs in the
constructed set,
and then the pairs themselves, all as single word integers.
Pairs designate runs of values and single values are represented by
a degenerate pair with both values equal.
This operator is generated in grammatical constructs such as
.DS
\fBif\fR character \fBin\fR [`+', `\-', `*', `/']
.DE
.IP
or
.DS
\fBif\fR character \fBin\fR [`a'..`z', `$', `_']
.DE
.IP
These constructs are common in Pascal, and
.SM INCT
makes them run much faster in the interpreter,
as if they were written as an efficient series of
.B if
statements.
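.LP
The membership test on pairs can be written directly in C.
In this sketch the pairs are passed as an array rather than popped from the
stack, and the names pair and inct are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One run of values; a single element has lo == hi (a degenerate pair). */
struct pair {
    int16_t lo;
    int16_t hi;
};

/* True when value lies inside any of the npairs runs. */
static bool inct(int16_t value, const struct pair *pairs, size_t npairs)
{
    for (size_t i = 0; i < npairs; i++) {
        if (value >= pairs[i].lo && value <= pairs[i].hi)
            return true;
    }
    return false;
}
```

The second example above, [`a'..`z', `$', `_'], becomes three pairs, two of them degenerate.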
.NH 2
Miscellaneous
.PP
Other miscellaneous operators that are present in the interpreter
are
.SM ASRT
that causes the program to end if the Boolean value on the stack is not
.I true ,
and
.SM STOI ,
.SM STOD ,
.SM ITOD ,
and
.SM ITOS
that convert between different length arithmetic operands for
use in aligning the arguments in
.B procedure
and
.B function
calls, and with some untyped built-ins, such as
.SM SIN
and
.SM COS \&.
.PP
Finally, if the program is run with run-time testing disabled, there
are special operators for
.B for
statements
and special indexing operators for arrays
whose individual element size is a power of 2.
The code can run significantly faster using these operators.
.NH 2
Mathematical Functions
.PP
The transcendental functions
.SM SIN ,
.SM COS ,
.SM ATAN ,
.SM EXP ,
.SM LN ,
.SM SQRT ,
.SM SEED ,
and
.SM RANDOM
are taken from the standard UNIX
mathematical package.
These functions take double precision floating point
values and return the same.
.PP
The functions
.SM EXPO ,
.SM TRUNC ,
and
.SM ROUND
take a double precision floating point number.
.SM EXPO
returns an integer representing the machine
representation of its argument's exponent,
.SM TRUNC
returns the integer part of its argument, and
.SM ROUND
returns the rounded integer part of its argument.
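.LP
For reference, the usual Pascal meaning of truncation and rounding can be
expressed in C as below.
This follows the common Pascal definition (truncate toward zero, round half away from zero)
and is an assumption about behavior, not a transcription of the interpreter's code;
the names p_trunc and p_round are hypothetical, and overflow is not handled.

```c
/* Integer part of x, discarding the fraction (toward zero). */
static long p_trunc(double x)
{
    return (long)x;
}

/* Nearest integer to x, with halves rounded away from zero. */
static long p_round(double x)
{
    return (long)(x < 0.0 ? x - 0.5 : x + 0.5);
}
```

With these definitions 2.5 rounds to 3 and \-2.5 rounds to \-3.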
.NH 2
System functions and procedures
.SH
LLIMIT
.IP
A line limit and a file pointer are passed on the stack.
If the limit is non-negative the line limit is set to the
specified value, otherwise it is set to unlimited.
The default is unlimited.
.SH
STLIM
.IP
A statement limit is passed on the stack.
The statement limit
is set as specified.
The default is 500,000.
No limit is enforced when the ``p'' option is disabled.
.SH
CLCK
.br
SCLCK
.IP
.SM CLCK
returns the number of milliseconds of user time used by the program;
.SM SCLCK
returns the number of milliseconds of system time used by the program.
.SH
WCLCK
.IP
The number of seconds since some predefined time is
returned.
Its primary usefulness is in determining
elapsed time and in providing a unique time stamp.
.LP
The other system time procedures are
.SM DATE
and
.SM TIME
that copy an appropriate text string into a Pascal string array.
The function
.SM ARGC
returns the number of command line arguments passed to the program.
The procedure
.SM ARGV
takes an index on the stack and copies the specified
command line argument into a Pascal string array.
.NH 2
Pascal procedures and functions
.SH
PACK s,w,w,w
.br
UNPACK s,w,w,w
.IP
They function as a memory to memory move with several
semantic checks.
They do no ``unpacking'' or ``packing'' in the true sense as the
interpreter supports no packed data types.
.SH
NEW s
.br
DISPOSE s
.IP
An
.SM LV
of a pointer is passed.
.SM NEW
allocates a record of a specified size and puts a pointer
to it into the pointer variable.
.SM DISPOSE
deallocates the record pointed to by the pointer
and sets the pointer to
.SM NIL .
.LP
The function
.SM CHR*
converts a suitably small integer into an ASCII character.
Its primary purpose is to do a range check.
The function
.SM ODD*
returns
.I true
if its argument is odd and returns
.I false
if its argument is even.
The function
.SM UNDEF
always returns the value
.I false .
921 .I false .