WARNING! This document is historical. It is kept out of the historical
directory because the basic compiler architecture it describes still mostly
applies as of this writing. Refer to the code itself for up-to-date
information, such as which AST nodes are in use.
6 OMG INTERFACE DEFINITION LANGUAGE COMPILER FRONT END PROTOCOLS
7 ==============================================================
12 Welcome to the publicly available source release of SunSoft's
13 implementation of the compiler front end (CFE) for OMG Interface Definition
16 This document explains how to use the release to create a fully functional
17 OMG Interface Definition Language to target language compiler for your
18 selected target system configuration. The section OVERVIEW explains this
24 The implementation has three parts:
26 1. A main program driving the compilation process
27 2. A parser and attendant utilities for converting the IDL input into
29 3. One or more back ends which take as input the internal form representing
30 the IDL input, and which produce output in a target language and target
33 The release contains components 1 and 2, and a demonstration implementation
34 of component 3. To use this release, you
36 - write a back end which takes the internal representation of the parsed input
37 and translates it to the target language and format. You may replace or
38 modify the demonstration back end provided.
39 - link the back end with the provided main program and parser sources
40 to produce a complete compiler.
45 This document does not explain IDL nor does it introduce IDL features.
46 For this information, refer to the OMG CORBA specification, available by
47 anonymous FTP from omg.org.
49 This document does not explain C++, except to demonstrate how it is
50 used to construct the CFE. The ARM by Stroustrup and Ellis provides a
51 thorough explanation of C++.
This document consists of two independent parts. The first part
describes all CFE supported protocols and the required
application programmer's interface entry points that a conformant
BE must provide. The second part steps through the process of
constructing a working BE.
59 The first part describes:
61 - The compilation process
62 - The Abstract Syntax Tree (AST) internal representation of parsed IDL
64 - How access to member data fields is managed
65 - How the AST is generated from the IDL input (Generator protocol)
66 - How definition scopes are nested and how name lookup works
67 - The narrowing mechanism
68 - How definition scopes are managed and how nodes are added to scopes
69 - How BEs get control during the AST construction process (Add protocol)
70 - The inheritance scheme used by the AST and how it affects BEs
71 - How errors are handled and reported
72 - How the CFE is initialized
73 - How the command line arguments are parsed
74 - What global variables and functions are provided
75 - What API is required to be supported by a BE in order to link
77 - What files must be included in each BE file
79 The second part describes
81 - The API to be supplied by each BE
82 - How to subclass from the AST to add BE specific functionality
83 - How to subclass from the Generator protocol to create BE specific
85 - How to write constructors for the derived BE classes
86 - How to use the Add protocol to store BE specific information
87 - How to maintain BE specific information which applies to the entire
88 AST generated from the IDL input
89 - How to use data members in your BE
90 - How to build a complete compiler
92 PART I. FEATURES OF THE CFE
93 -=========================-
95 THE COMPILATION PROCESS
96 -----------------------
98 The OMG IDL compiler operates as follows:
100 - Parses command line arguments. If an option is directed at a
101 BE, an appropriate operation provided by the BE is invoked to process
103 - Performs global initialization.
104 - Forks a copy of the compiler for each file specified as input.
- Preprocesses the IDL input using an ANSI-compatible preprocessor.
106 - Parses the file using the CFE parser, and constructs an AST describing the
108 - Prints the AST for verification, if requested.
109 - Invokes the BE to process the AST and produce the output
110 characteristic of that BE.
115 The AST (Abstract Syntax Tree) is the primary mechanism for communication
116 between a BE and the CFE. It consists of a tree of instances of classes
117 defined in the CFE or refinements of those classes as defined in a BE.
118 The class hierarchy of the AST closely resembles the structure of the IDL
119 syntax. Most AST classes have direct equivalents in IDL constructs.
121 The UTL_Scope class defines common functionality for definition scope
122 management and name lookup. This is explained in a following section.
123 UTL_Scope is defined in include/utl_scope.hh and implemented in
126 The AST provides the following classes:
128 AST_Decl Base of the AST class hierarchy. Each class in the AST
129 inherits from AST_Decl. Defined in include/ast_decl.hh
130 and implemented in ast/ast_decl.cc
132 AST_Type Common base class for all classes which represent IDL
133 type constructs. Defined in include/ast_type.hh and
134 implemented in ast/ast_type.cc. Inherits from AST_Decl.
136 AST_ConcreteType Common base class for all classes which represent IDL
137 types other than interfaces. Defined in the file
138 include/ast_concrete_type.hh and implemented in
139 ast/ast_concrete_type.cc. Inherits from AST_Type.
141 AST_PredefinedType Instances of this class represent all predefined types
142 such as long, char and so forth. Defined in the file
143 include/ast_predefined_type.hh and implemented in
144 ast/ast_predefined_type.cc. Inherits from
147 AST_Module Represents the IDL module construct. Defined in the
148 file include/ast_module.hh and implemented in
149 ast/ast_module.cc. Inherits from AST_Decl and
152 AST_Root Represents the root of the abstract syntax tree being
153 constructed. Is a subclass of AST_Module. Can be
154 subclassed in BEs to store information associated with
155 the entire AST. Defined in the file include/ast_root.hh
156 and implemented in ast/ast_root.cc. Inherits from
159 AST_Interface Represents the IDL interface construct. Defined in
160 include/ast_interface.hh and implemented in the file
161 ast/ast_interface.cc. Inherits from AST_Type and
164 AST_InterfaceFwd Represents a forward declaration of an IDL interface.
165 Defined in include/ast_interface_fwd.hh and implemented
166 in ast/ast_interface_fwd.cc. Inherits from AST_Decl.
168 AST_Attribute Represents an IDL attribute construct. Defined in
169 include/ast_attribute.hh and implemented in the file
170 ast/ast_attribute.cc. Inherits from AST_Decl.
172 AST_Exception Represents an IDL exception construct. Defined in
173 include/ast_exception.hh and implemented in the file
174 ast/ast_exception.cc. Inherits from AST_Decl.
176 AST_Structure Represents an IDL struct construct. Defined in the file
177 include/ast_structure.hh and implemented in the file
178 ast/ast_structure.cc. Inherits from AST_ConcreteType
181 AST_Field Represents a field in an IDL struct or exception
182 construct. Defined in include/ast_field.hh and
183 implemented in ast/ast_field.cc. Inherits from
186 AST_Operation Represents an IDL operation construct. Defined in the
187 file include/ast_operation.hh and implemented in
188 ast/ast_operation.cc. Inherits from AST_Decl and
191 AST_Argument Represents an argument to an IDL operation construct.
192 Defined in include/ast_argument.hh and implemented in
193 ast/ast_argument.cc. Inherits from AST_Field.
195 AST_Union Represents an IDL union construct. Defined in
196 include/ast_union.hh and implemented in
197 ast/ast_union.cc. Inherits from AST_ConcreteType and
200 AST_UnionBranch Represents an individual branch in an IDL union
201 construct. Defined in include/ast_union_branch.hh and
202 implemented in ast/ast_union_branch.cc. Inherits from
205 AST_UnionLabel Represents the label of an individual branch in an IDL
206 union construct. Defined in include/ast_union_label.hh
207 and implemented in ast/ast_union_label.cc
209 AST_Constant Represents an IDL constant construct. Defined in
210 include/ast_constant.hh and implemented in the file
211 ast/ast_constant.cc. Inherits from AST_Decl.
213 AST_Enum Represents an IDL enum construct. Defined in the file
214 include/ast_enum.hh and implemented in ast/ast_enum.cc.
215 Inherits from AST_ConcreteType and UTL_Scope.
217 AST_EnumVal Represents an enumerator in an IDL enum construct.
218 Defined in include/ast_enum_val.hh and implemented in
219 ast/ast_enum_val.cc. Inherits from AST_Constant.
221 AST_Sequence Represents an IDL sequence construct. Defined in
222 include/ast_sequence.hh and implemented in
223 ast/ast_sequence.cc. Inherits from AST_Decl.
225 AST_String Represents an IDL string construct. Defined in the file
226 include/ast_string.hh and implemented in
227 ast/ast_string.cc. Inherits from AST_Decl.
229 AST_Array Represents an array modifier to the type of an IDL
230 field or typedef declaration. Defined in the file
231 include/ast_array.hh and implemented in
232 ast/ast_array.cc. Inherits from AST_Decl.
234 AST_Typedef Represents an IDL typedef construct. Defined in the file
235 include/ast_typedef.hh and implemented in
236 ast/ast_typedef.cc. Inherits from AST_Decl.
238 AST_Expression Represents an IDL expression. Defined in the file
239 include/ast_expression.hh and implemented in
240 ast/ast_expression.cc.
242 AST_Root A subclass of AST_Module, an instance of this class
243 is used to represent the distinguished root node of
244 the AST. Defined in include/ast_root.hh and implemented
245 in ast/ast_root.cc. Inherits from AST_Module.
251 The AST classes define member data fields in addition to defining
252 operations on instances. These member data fields are all private, to allow
253 only the instance in which they are stored direct access. Other objects
254 (including other instances of the same class) can obtain access to the
255 member data fields of an instance through accessor functions. These
256 accessor functions allow retrieval of the data, and in some cases update
257 functions are also provided to store new values.
There are several reasons why this approach is taken. First, it hides the
actual implementation of the member data fields from outside the class. For
example, a Thermometer class would not expose whether its temperature
reading is stored in Fahrenheit or Celsius units, and it could provide
accessors that return the reading in either unit.
265 Second, protecting access to member data in this manner restricts the
266 ability to update it to the instance itself, save where update functions
267 are explicitly provided. This makes for more reliable implementations,
268 since the manipulation of the data is isolated in the class implementation
Third, wrapping a function call around access to member data allows such
access and update operations to be protected in a multithreaded
environment. While the CFE itself is not multithreaded and the access
operations as currently defined do no special work to protect against
multiple conflicting access operations, this may be changed in a future
version. Moving the CFE to a multithreaded environment without protecting
access to member data in this manner would be extremely difficult.
279 The protocol defined in the CFE is that member data fields are all private
280 and have names which start with the prefix "pd_" (denoting Private Data).
281 The access functions have names which are the same as the name of the field
282 sans the prefix. For example, AST_Decl has a field pd_defined_in and an
283 access function defined_in().
285 The update functions have names starting with "set_" followed by the name
286 of the corresponding access function. Thus, AST_Decl defines a function
287 set_in_main_file(boolean) which sets the pd_in_main_file data member's
288 value to the boolean provided.
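
The naming protocol above can be illustrated with a minimal sketch. Demo_Decl
is a hypothetical stand-in written for this example, not the real AST_Decl;
only the pd_ prefix, the accessor name and the set_ prefix follow the
convention described in the text.

```cpp
#include <cassert>

// Hypothetical stand-in for an AST class; NOT the real AST_Decl.
class Demo_Decl {
public:
  Demo_Decl() : pd_in_main_file(false) {}

  // Accessor: same name as the field, without the "pd_" prefix.
  bool in_main_file() const { return pd_in_main_file; }

  // Update function: "set_" followed by the accessor name.
  void set_in_main_file(bool v) { pd_in_main_file = v; }

private:
  bool pd_in_main_file;  // "pd_" marks private data
};
```

Code outside the class reads the value with d.in_main_file() and changes it
with d.set_in_main_file(true); the field itself is never touched directly.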
290 GENERATION OF THE AST
291 ---------------------
293 The CFE generates the abstract syntax tree after parsing IDL
294 input. The nodes of the AST are defined by classes introduced in the
295 previous section, or by subclasses thereof as defined by each BE. In
296 writing the CFE, we were faced with the following problem: how to generate
297 the AST containing nodes of the derived classes as defined in each BE
298 without knowledge of the types and conventions of these BE classes.
300 One alternative was to define a naming scheme which predetermines the names
301 of each subclass a BE can define. The AST would then be generated by
302 calling an appropriate constructor on the BE derived class. This scheme
303 suffers from some shortcomings:
305 - It breaks the modularity of the compiler and imports knowledge about
306 types defined in a BE into the CFE, where this information does not belong.
307 - It restricts a compiler to having only one BE loaded at a time because the
308 names of these classes can be in use in only one BE at a time.
309 - It requires a BE to provide derived classes for all AST classes, even for
310 those classes where the BE adds no functionality.
312 The mechanism we chose is different. We define the AST_Generator class
313 which has an operation for each constructor defined on each AST class. The
314 operation takes arguments appropriate to the constructor, invokes it and
315 returns the created AST node, using the type known to the CFE. All such
316 operations on the generator are declared virtual. The names of all
317 operations start with "create_" and contain the name of the construct.
318 Thus, an operation which invokes a constructor of an AST_Module is named
319 create_module. AST_Generator is defined in include/ast_generator.hh and
320 implemented in ast/ast_generator.cc.
322 If a BE derives from any AST class, it must also derive from the
323 AST_Generator class and redefine the relevant operations to invoke
324 constructors of the BE provided class instead of the AST provided class.
325 For example, if BE_Module is a subclass of AST_Module in a BE, the BE would
326 also define BE_Generator and redefine create_module to call the constructor
327 of BE_Module instead of that provided by AST_Module.
329 During initialization, the CFE causes an instance of the BE derived
330 generator to be created and saved. This is explained in the section on
331 REQUIRED ENTRY POINTS SUPPLIED BY A BE. During parsing, actions in the Yacc
332 grammar invoke operations on the saved instance to create new nodes for the
333 AST as it is being built. These operations invoke constructors for BE
334 derived classes or for AST provided classes if they were not overridden.
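
The following is a minimal sketch of the Generator protocol, under simplifying
assumptions. The classes below are reduced stand-ins that only borrow the names
used in the text: the real constructors take different arguments, and
is_be_node() and parse_action_make_module() are hypothetical helpers added for
illustration.

```cpp
#include <cassert>
#include <string>

// Reduced stand-in for the CFE node class (the real constructor differs).
class AST_Module {
public:
  explicit AST_Module(const std::string &name) : pd_name(name) {}
  virtual ~AST_Module() {}
  const std::string &name() const { return pd_name; }
  virtual bool is_be_node() const { return false; }  // hypothetical helper
private:
  std::string pd_name;
};

// The CFE generator: one virtual create_ operation per node kind.
class AST_Generator {
public:
  virtual ~AST_Generator() {}
  virtual AST_Module *create_module(const std::string &name) {
    return new AST_Module(name);
  }
};

// A BE refinement of the node class.
class BE_Module : public AST_Module {
public:
  explicit BE_Module(const std::string &name) : AST_Module(name) {}
  bool is_be_node() const override { return true; }
};

// The BE generator redefines only the operations it needs.
class BE_Generator : public AST_Generator {
public:
  AST_Module *create_module(const std::string &name) override {
    return new BE_Module(name);  // returned under the type known to the CFE
  }
};

// What a parser action might look like: it knows only the CFE types.
inline AST_Module *parse_action_make_module(AST_Generator &gen,
                                            const std::string &name) {
  return gen.create_module(name);
}
```

The caller never names BE_Module, so the CFE stays independent of the BE while
the tree still ends up populated with BE nodes.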
339 IDL is a nested scoped language. The scoping rules are defined by the CORBA
340 spec and closely follow those of C++.
342 Scope management is implemented in two classes provided in the utilities
343 library, UTL_Scope and UTL_Stack. UTL_Scope manages associations between
344 names and AST nodes, and UTL_Stack manages scope nesting and entry and exit
345 from definition scopes as the parse is proceeding. UTL_Scope is defined in
346 include/utl_scope.hh and implemented in util/utl_scope.cc. UTL_Stack is
347 defined in include/utl_stack.hh and implemented in util/utl_stack.cc.
349 During initialization, the CFE creates an instance of UTL_Stack and saves
350 it. During parsing, as definition scopes are entered and exited, AST nodes
351 are pushed onto, or popped from, the stack represented by the saved
352 instances. Nodes on the stack are stored as instances of UTL_Scope. Section
353 THE NARROWING MECHANISM explains how to obtain the real type of a node
354 retrieved from the stack.
356 All definition scopes are linked in a tree rooted in the distinguished AST
357 root node. This linkage is implemented by UTL_Scope and AST_Decl. The
358 linkage is a permanent record of the scope nesting while the stack is a
359 dynamic record which at each instant represents the current state of the
362 The nesting information is used to do name lookup. IDL uses scoped names
363 which are concatenations of definition scope names ending with individual
364 construct names. For example, in
376 the name a::b::c represents the long field in the struct b inside the
379 Lookup is performed by searching down the linkage chain for the first component
380 of the name, then, when found, recursively resolving the remaining
381 components in the scope defined by the first component. Lookup is relative
382 to the scope of use; in the above example, k could also have been referred to
383 as a::k within the struct s.
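
The lookup rule can be sketched as follows. This is an illustration under
stated assumptions, not the UTL_Scope implementation: Scope and lookup() are
hypothetical, every declaration is modelled as a scope for brevity, and the
search for the first component is assumed to proceed from the scope of use
outward through the enclosing scopes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a definition scope; NOT the real UTL_Scope.
struct Scope {
  std::string name;
  Scope *parent;                           // enclosing scope, nullptr at root
  std::map<std::string, Scope *> members;  // names defined directly in here

  Scope(const std::string &n, Scope *p) : name(n), parent(p) {
    if (p != nullptr) p->members[n] = this;  // register in enclosing scope
  }
};

// Resolve a scoped name, already split into components, as seen from `use`.
inline Scope *lookup(Scope *use, const std::vector<std::string> &parts) {
  if (parts.empty()) return nullptr;
  // Find the first component, starting at the scope of use and moving outward.
  for (Scope *s = use; s != nullptr; s = s->parent) {
    std::map<std::string, Scope *>::iterator first = s->members.find(parts[0]);
    if (first == s->members.end()) continue;
    // Resolve the remaining components inside the scope just found.
    Scope *cur = first->second;
    for (std::size_t i = 1; i < parts.size() && cur != nullptr; ++i) {
      std::map<std::string, Scope *>::iterator next = cur->members.find(parts[i]);
      cur = (next == cur->members.end()) ? nullptr : next->second;
    }
    return cur;
  }
  return nullptr;  // first component not visible from the scope of use
}
```

With scopes a, a::b, a::b::c and a::k registered, lookup from inside b resolves
both k and a::k to the same node, while k alone is not visible from the root.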
385 Nodes are stored in a definition scope as instances of AST_Decl. Thus, name
386 lookup returns instances of AST_Decl. The next section, THE NARROWING
387 MECHANISM, explains how to obtain the real type of a node retrieved from a
390 THE NARROWING MECHANISM
391 -----------------------
393 Here we give only a cursory explanation of how narrowing works. We
394 concentrate on defining the problem and showing how to use our narrowing
395 mechanism. The narrowing mechanism is defined in include/idl_narrow.hh.
397 As explained above, nodes are stored on the scope stack as instances of
398 UTL_Scope, and inside definition scopes as instances of AST_Decl. Also,
399 nodes are linked in a nesting tree as instances of AST_Decl. Given a node
400 retrieved from the stack or a definition scope, one is faced with the task
401 of obtaining its real class. C++ does not currently provide an implicit
402 mechanism for narrowing to a derived class, so the CFE defines its own
403 mechanism. This mechanism requires some work on your part as BE implementor
404 and requires some explicit code to be written when it is to be used.
406 The class AST_Decl defines an enum whose members encode specific AST node
407 classes. AST_Decl provides an accessor function, node_type(), which
408 retrieves a member of the enum representing the AST type of the node. Thus,
409 if an instance of AST_Decl really is an instance of AST_Module, the
410 node_type() accessor returns AST_Decl::NT_module.
The class UTL_Scope also provides an accessor function, scope_node_type(),
which returns a member of the enum encoding the actual type of the node.
Thus, given a UTL_Scope instance which is really an instance of
AST_Operation, scope_node_type() would return AST_Decl::NT_op.
417 Perusing the header files for classes provided by the AST, you will note
418 the use of some macros defined in include/idl_narrow.hh. These macros
419 define the explicit narrowing mechanism:
421 DEF_NARROW_METHODSx(<class name>,<parent_x>) for x equal to 0,1,2 or 3,
422 defines a narrowing method for the specified class which has 0,1,2 or 3
423 immediate base classes from which it inherits. For example, ast_module.hh
424 which defines AST_Module contains the following line:
426 DEF_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope)
428 This is because AST_Module inherits directly from AST_Decl and UTL_Scope.
430 DEF_NARROW_FROM_DECL(<class name>) appears in class definitions for classes
431 which are derived from AST_Decl and which can be stored in a definition
432 scope. This macro declares a static operation narrow_from_decl(AST_Decl *)
433 on the class in which it appears. The operation returns the provided
434 instance as an instance of <class name> if it can be narrowed, or NULL.
436 DEF_NARROW_FROM_SCOPE(<class name>) appears in class definitions of classes
437 which are derived from UTL_Scope and which can be stored on the scope
438 stack. This macro declares a static operation narrow_from_scope(UTL_Scope *)
439 on the class in which it appears. The operation returns the provided
440 instance as an instance of <class name> if it can be narrowed, or NULL.
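
To show the behaviour these operations are described as having, here is a
small sketch with hand-written narrowing functions. It does not reproduce what
the macros in idl_narrow.hh expand to. The classes are reduced stand-ins, and
the sketch substitutes dynamic_cast, a language feature that the CFE
mechanism predates, for the macro-generated code.

```cpp
#include <cassert>

// Reduced stand-in for the root of the hierarchy.
class AST_Decl {
public:
  enum NodeType { NT_module, NT_union };
  explicit AST_Decl(NodeType nt) : pd_node_type(nt) {}
  virtual ~AST_Decl() {}
  NodeType node_type() const { return pd_node_type; }
private:
  NodeType pd_node_type;
};

class AST_Union : public virtual AST_Decl {
public:
  AST_Union() : AST_Decl(NT_union) {}
  // Hand-written stand-in: returns d as an AST_Union, or a null pointer.
  static AST_Union *narrow_from_decl(AST_Decl *d) {
    return dynamic_cast<AST_Union *>(d);
  }
};

class AST_Module : public virtual AST_Decl {
public:
  AST_Module() : AST_Decl(NT_module) {}
  static AST_Module *narrow_from_decl(AST_Decl *d) {
    return dynamic_cast<AST_Module *>(d);
  }
};
```

Given an AST_Decl pointer that really refers to a union,
AST_Union::narrow_from_decl returns the node under its real class and
AST_Module::narrow_from_decl returns a null pointer.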
442 Now look in the files implementing these classes. You will note occurrences
443 of the following macros:
445 IMPL_NARROW_METHODSx(<class name>,<parent_x>) for x equal to 0,1,2 or 3,
446 implements a narrowing method for the specified class which has 0,1,2 or 3
447 immediate base classes from which it inherits. For example, ast_module.cc
448 which implements AST_Module contains the following line:
450 IMPL_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope)
452 IMPL_NARROW_FROM_DECL(<class name>) implements a method to narrow from an
453 instance of AST_Decl to an instance of <class name> as defined above.
455 IMPL_NARROW_FROM_SCOPE(<class name>) implements a method to narrow from an
456 instance of UTL_Scope to an instance of <class name> as defined above.
458 To put it all together: In the file ast_module.hh, you will find:
461 DEF_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope);
462 DEF_NARROW_FROM_DECL(AST_Module);
463 DEF_NARROW_FROM_SCOPE(AST_Module);
465 In the file ast_module.cc, you will see:
470 IMPL_NARROW_METHODS2(AST_Module, AST_Decl, UTL_Scope)
471 IMPL_NARROW_FROM_DECL(AST_Module)
472 IMPL_NARROW_FROM_SCOPE(AST_Module)
474 The CFE uses narrowing internally to obtain the correct type of nodes in
475 the AST. The CFE contains many code fragments such as the following:

        AST_Decl *d = get_an_AST_Decl_from_somewhere();
        AST_Module *m;

        if (d->node_type() == AST_Decl::NT_module) {
          m = AST_Module::narrow_from_decl(d);
          if (m == NULL) {        // Narrow failed
            ...
          } else {                // Success, do normal processing
            ...
          }
        }

490 Similar code implements narrowing instances of UTL_Scope to their actual
493 In your BE classes which derive from UTL_Scope you must include a line
494 defining how to narrow from a scope, so:
496 DEF_NARROW_FROM_SCOPE(<your BE class>)
498 and similarly for your BE classes which derive from AST_Decl.
500 The narrowing mechanism is defined only for narrowing from AST_Decl and
501 UTL_Scope. If your BE class inherits directly from one or more classes
502 which themselves are derived from AST_Decl and/or UTL_Scope, you must
505 DEF_NARROW_METHODSx(<your class name>,<parent 1>,<parent 2>)
To make this concrete, here is what you'd write in a definition of BE_Union
which inherits from AST_Union:
510 DEF_NARROW_METHODS1(BE_Union, AST_Union);
511 DEF_NARROW_FROM_DECL(BE_Union);
512 DEF_NARROW_FROM_SCOPE(BE_Union);
514 and in the implementation file of BE_Union:
519 IMPL_NARROW_METHODS1(BE_Union, AST_Union)
520 IMPL_NARROW_FROM_DECL(BE_Union)
521 IMPL_NARROW_FROM_SCOPE(BE_Union)
523 Then, in BE code which expects to see an instance of your derived BE_Union
524 class, you will write:

        AST_Decl *d = get_an_AST_Decl_from_somewhere();
        BE_Union *u;

        if (d->node_type() == AST_Decl::NT_union) {
          u = BE_Union::narrow_from_decl(d);
          if (u == NULL) {        // Narrow failed
            ...
          } else {                // Success, do normal processing
            ...
          }
        }

543 Instances of classes which are derived from UTL_Scope implement definition
544 scopes. A definition scope can contain any kind of AST node as long as it
545 is derived from AST_Decl. However, specific kinds of definition scopes such
546 as interfaces and unions can contain only a restricted subset of all AST
549 UTL_Scope provides operations to add instances of each AST provided class
550 to a definition scope. The names of these operations are constructed by
551 prepending the string "add_" to the name of the IDL construct. So, to add
552 an interface to a definition scope, invoke the operation add_interface.
553 The operations are all defined virtual and are intended to be overridden in
554 classes derived from UTL_Scope.
556 If the node was successfully added to the definition scope, the node is
557 returned as the result. Otherwise the node is not added to the definition
558 scope and NULL is returned.
560 All add operation implementations in UTL_Scope return NULL. Thus,
561 only the operations which implement legal additions to a specific kind of
562 definition scope must be overridden in the implementation of that
563 definition scope. For example, in AST_Module the add_interface operation is
564 overridden to add the provided instance of AST_Interface to the scope and
565 to return the provided instance if the addition was successful. Operations
566 which were not overridden return NULL to indicate that the addition is
567 illegal in this context. For example, in AST_Operation the definition of
568 add_interface is not overridden since it is illegal to store an interface
569 inside an operation definition scope.
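
A minimal sketch of this default-refuses, override-to-allow pattern follows.
The classes are reduced stand-ins sharing only their names with the CFE
classes, and interface_count() is a hypothetical helper added so the effect
can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class AST_Interface {};  // reduced stand-in for an interface node

class UTL_Scope {
public:
  virtual ~UTL_Scope() {}
  // Default implementation: the addition is illegal in this kind of scope.
  virtual AST_Interface *add_interface(AST_Interface *) { return nullptr; }
};

// A module may contain interfaces, so it overrides the operation.
class AST_Module : public UTL_Scope {
public:
  AST_Interface *add_interface(AST_Interface *i) override {
    if (i == nullptr) return nullptr;
    pd_interfaces.push_back(i);
    return i;  // the node itself signals success
  }
  std::size_t interface_count() const { return pd_interfaces.size(); }
private:
  std::vector<AST_Interface *> pd_interfaces;
};

// An operation may not contain interfaces: nothing is overridden.
class AST_Operation : public UTL_Scope {};
```

Adding an interface to the module stand-in returns the node, while the same
call on the operation stand-in falls through to the base class and returns a
null pointer.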
571 The add operations are invoked in the actions in the Yacc grammar. The
572 following fragment is a representative example of code using the add

        AST_Constant *d = construct_a_new_constant();

        if (current_scope->add_constant(d) == NULL) {   // Failed
          ...
        } else {                                        // Succeeded
          ...
        }

583 BE INTERACTION DURING THE PARSING PROCESS
584 -----------------------------------------
586 The add operations can be overridden in BE derived classes to let the BE
587 perform additional house-keeping work during the process of constructing
588 the AST. For example, a BE could keep separate lists of interfaces as they
589 are being added to a module.
591 If you override an add operation in your BE, you must invoke the overridden
592 operation in the superclass of your derived class to allow the CFE to
593 perform its own house-keeping tasks. A good rule is to invoke the operation
594 on the superclass before you do your own processing; then, if the
595 superclass operation returns NULL, this indicates that the addition failed
596 and your own code should immediately return NULL. An example explains this:

        AST_Interface *
        BE_Module::add_interface(AST_Interface *i)
        {
          if (AST_Module::add_interface(i) == NULL)     // Failed, bail out!
            return NULL;
          ...                                           // Do your own work here
          return i;                                     // Return success indication
        }

607 We strongly advise you to only define add operations that override add
608 operations provided by the AST classes. Add operations which
609 do not override equivalent operations in the AST in effect
610 extend the semantics of the language accepted by the compiler. For
611 example, the CFE does not have an add_interface operation on
612 AST_Operation. If you were to define one in your BE_Operation class,
613 the resulting compiler would allow an interface to be
614 stored in an operation definition scope. The current CORBA specification
617 AST INHERITANCE SCHEME
618 ----------------------
620 The AST classes all use public virtual inheritance to construct the
621 inheritance tree. This ensures that a class may appear several times in the
622 inheritance tree through different paths and the derived class's instances
623 will have only one copy of the inherited class's data.
625 The use of public virtual inheritance has several important effects on how
626 a BE is constructed. We explain those effects below.
628 First, you must define a default constructor for your BE class, since
629 your class may be used as a virtual base class of some other class. In this
630 case the compiler may want to call a default constructor for your class. It
631 is a good idea to have a default constructor anyway, even if you do not
632 plan to subclass your BE class, since for most C++ compilers this causes
633 the code to be smaller. Your default constructor should initialize all
634 constant data members. Additionally, it may initialize any non-constant
635 data member whose value must be set before the first time the instance is
638 Second, the constructor of your BE derived class must explicitly call all
639 constructors of virtual base classes which perform useful work. For
640 example, if a class in the AST from which your BE class inherits has an
641 initializer for a data member, you must call that constructor. This rule is
642 discussed in detail in the C++ ARM. An example may help here.
Suppose you define a class BE_Attribute which inherits from AST_Attribute.
Its constructor should be as follows:
647 BE_Attribute::BE_Attribute(boolean ro,
651 : AST_Attribute(ro, ft, n, p),
653 AST_Decl(AST_Decl::NT_attr, n, p)
657 The calls to the constructors of AST_Attribute, AST_Field and AST_Decl are
658 needed because these constructors do useful initializations on their
661 Note that there is some redundancy in the data passed to these
662 constructors. We chose to preserve this redundancy since it should be
663 possible to create BEs which subclass only some of the classes supplied by
664 the AST. This means that the constructors on each class provided by the AST
665 should take arguments which are sufficient to construct the instance if
666 the AST class is the most derived one.
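
The effect of this rule can be demonstrated with a small sketch. The classes
Decl, Field and Attribute below are hypothetical stand-ins for the
AST_Decl / AST_Field / AST_Attribute chain, reduced to a single name argument.
The sketch relies only on the standard C++ rule that virtual base classes are
initialized by the most derived class.

```cpp
#include <cassert>
#include <string>

class Decl {
public:
  Decl() : pd_name("<unset>") {}  // default constructor, as advised above
  explicit Decl(const std::string &n) : pd_name(n) {}
  virtual ~Decl() {}
  const std::string &name() const { return pd_name; }
private:
  std::string pd_name;
};

class Field : public virtual Decl {
public:
  Field() {}
  explicit Field(const std::string &n) : Decl(n) {}
};

class Attribute : public virtual Field {
public:
  Attribute() {}
  // Sufficient when Attribute itself is the most derived class.
  explicit Attribute(const std::string &n) : Decl(n), Field(n) {}
};

// Omits the virtual bases: Decl's default constructor runs instead, and the
// Decl(n) initializer inside Attribute's constructor is ignored.
class BE_AttributeIncomplete : public virtual Attribute {
public:
  explicit BE_AttributeIncomplete(const std::string &n) : Attribute(n) {}
};

// Calls every virtual base constructor that does useful work.
class BE_AttributeComplete : public virtual Attribute {
public:
  explicit BE_AttributeComplete(const std::string &n)
    : Decl(n), Field(n), Attribute(n) {}
};
```

The incomplete variant compiles without complaint but leaves the name at its
default value, which is why the redundancy in the constructor arguments is
worth keeping.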
668 The code supplied with this release contains a demonstration BE which
669 subclasses all the AST provided classes. The constructors for each class
670 provided by the BE are found in the file be/be_classes.cc.
675 The following steps take place at initialization:
677 - The global data instance is created, stored in idl_global and filled with
678 default values (in driver/drv_init.cc).
679 - The command line arguments are parsed (in driver/drv_args.cc).
680 - For each IDL input file, a copy of the compiler process is forked (in
682 - The IDL input is preprocessed (in driver/drv_preproc.cc).
683 - FE initialization stage 1 is done: the scopes stack is created and stored
684 in the global data variable idl_global->scopes() field (in fe/fe_init.cc).
685 - BE_init is called to create the generator instance and the returned
686 instance is stored in the global data variable idl_global->gen() field.
687 - FE initialization stage 2 is done: the global scope is created, pushed on
688 the scopes stack and populated with predefined types (in fe/fe_init.cc).
690 GLOBAL STATE AND ENTRY POINTS
691 -----------------------------
693 The CFE has one global variable named idl_global, which stores an instance
694 of a class IDL_GlobalData as explained below:
696 The CFE defines a class IDL_GlobalData which defines the global
697 information used in a specific run of the compiler. IDL_GlobalData is
698 defined in include/idl_global.hh and implemented in the file
701 Initialization creates an instance of this class and stores it in the value
702 of the global variable idl_global. Thus, the individual pieces of
703 information stored in the instance are accessible everywhere.
708 All error handling is defined by a class provided by the CFE, UTL_Error.
709 This class is defined in include/utl_error.hh and implemented in the file
710 util/utl_error.cc. The class provides several methods for reporting
711 specific errors as well as generic error reporting methods taking zero to
714 The CFE instantiates the class and stores the instance as part of the
715 global state, accessible as idl_global->err(). Thus, to cause an error
716 report, you would write code similar to the following:

        if (error condition found)
          idl_global->err()->specific_error_message(arg1, ..);

or

        if (error condition found)
          idl_global->err()->generic_error_message(flag, arg1, ..);

726 The flag argument is one of the predefined error conditions found in the
727 enum at the head of the UTL_Error class definition. The arguments to the
728 specific error message routine are defined by the signature of that
729 routine. The arguments to a generic error message routine are always
730 instances of AST_Decl.
732 The running count of errors is accessible as idl_global->err_count(). If
733 the value returned by this operation is non-zero after the IDL input has
734 been parsed, the BE is not invoked.
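
The gating of the BE on the error count can be sketched as below. Everything
in this block is hypothetical and written only for illustration: Demo_Global
and Demo_Error stand in for the global data and error classes, and
invoke_be_if_clean() stands in for whatever the driver actually does.

```cpp
#include <cassert>

// Hypothetical stand-in for the global data object.
class Demo_Global {
public:
  Demo_Global() : pd_err_count(0) {}
  long err_count() const { return pd_err_count; }
  void set_err_count(long n) { pd_err_count = n; }
private:
  long pd_err_count;
};

// Hypothetical stand-in for the error reporter: each report bumps the count.
class Demo_Error {
public:
  explicit Demo_Error(Demo_Global *g) : pd_global(g) {}
  void generic_error() { pd_global->set_err_count(pd_global->err_count() + 1); }
private:
  Demo_Global *pd_global;
};

// Counts how often the demonstration back end was entered.
inline int &demo_produce_calls() {
  static int calls = 0;
  return calls;
}
inline void demo_be_produce() { ++demo_produce_calls(); }

// Runs the back end only if parsing produced no errors.
// Returns true when the back end was invoked.
inline bool invoke_be_if_clean(const Demo_Global &g, void (*be_produce)()) {
  if (g.err_count() != 0) return false;
  be_produce();
  return true;
}
```

A BE can therefore assume that the AST it receives came from input that was
parsed without reported errors.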
736 HANDLING OF COMMAND LINE ARGUMENTS
737 ----------------------------------
739 Defined command line arguments are specified in the document CLI, in this
740 directory. The CFE calls the required BE API entry point BE_prep_arg to
741 process arguments passed within a -Wb flag.
743 REQUIRED ENTRY POINTS SUPPLIED BY A BE
744 --------------------------------------
746 The following API entry points must be supplied by a BE in order to
747 successfully link with the CFE:
749 extern "C" AST_Generator *BE_init();
751 Creates an instance of the generator object and returns it. Note
752 that the global scope is not yet set up and the scopes stack is
753 empty when this routine is called.
755 extern "C" void BE_produce();
757 Called by the compiler main program after the IDL input has been
758 successfully parsed and processed. The job of this routine is to
759 carry out the specific function of the BE. The AST is accessible as
760 the value of idl_global->root().
762 extern "C" void BE_prep_arg(char *, idl_bool);
764 Called to process an argument passed in with a -Wb flag. The boolean
765 will always be FALSE.
767 extern "C" void BE_abort();
769 Called when the CFE decides to abort the compilation. Can be used in
770 a BE to clean up after itself, e.g. remove temporary files or
771 directories it created while the parse was in progress.
773 extern "C" void BE_version();
775 Called when a -V argument is processed. This should produce a
776 message for the user identifying the BE that is loaded and its
779 PART II. WRITING A BACK END
780 -=========================-
782 REQUIRED API THAT EACH BE MUST SUPPORT
783 --------------------------------------
785 Below are the API entry points that each BE must supply in order to use the
786 CFE framework. This is a repeat of the BE API section:
788 extern "C" AST_Generator *BE_init();
790 Creates an instance of the generator object and returns it. Note
791 that the scopes stack is still not set up at the time this routine
794 extern "C" void BE_produce();
796 Called by the compiler main program after the IDL input has been
797 successfully parsed and processed. The job of this routine is to
798 carry out the specific function of the BE. The AST is accessible as
799 the value of idl_global->root().
801 extern "C" void BE_prep_arg(char *, boolean);
803 Called to process an argument passed in with a -Wb flag. The boolean
804 will always be FALSE.
806 extern "C" void BE_abort();
808 Called when the CFE decides to abort the compilation. Can be used in
809 a BE to clean up after itself, e.g. remove temporary files or
810 directories it created while the parse was in progress.
812 extern "C" void BE_version();
814 Called when a -V argument is processed. This should produce a
815 message for the user identifying the BE that is loaded and its
818 WHAT FILES TO INCLUDE
819 ---------------------
821 To use the CFE, each implementation file of your BE must include the
822 following two header files:
825 #include <idl_extern.hh>
827 Following this, you can include any header files needed by your BE.
829 HOW TO SUBCLASS THE AST
830 -----------------------
832 Your BE may subclass from any of the classes provided by the AST. Your
833 class should use public virtual inheritance to ensure that only one copy of
834 the class's data members is present in each instance. Read the section on
835 HOW TO WRITE CONSTRUCTORS to learn about additional considerations that you
836 must take into account when writing constructors for your BE classes.
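
The effect of public virtual inheritance can be seen in a small, self-contained sketch. The Demo_ classes below are hypothetical stand-ins that only imitate the shape of the AST hierarchy (a common declaration base reached along two inheritance paths); they are not CFE classes.

```cpp
#include <cassert>
#include <string>

class Demo_Decl  // imitates a common base such as AST_Decl
{
public:
  Demo_Decl() : pd_name("<unnamed>") {}
  explicit Demo_Decl(const std::string &n) : pd_name(n) {}
  virtual ~Demo_Decl() {}
  const std::string &name() const { return pd_name; }

private:
  std::string pd_name;
};

class Demo_Field : public virtual Demo_Decl  // imitates an AST class
{
public:
  Demo_Field() {}
  explicit Demo_Field(const std::string &n) : Demo_Decl(n) {}
};

class Demo_BE_Decl : public virtual Demo_Decl  // a BE extension of the base
{
public:
  Demo_BE_Decl() : pd_emitted(false) {}
  bool emitted() const { return pd_emitted; }
  void set_emitted(bool e) { pd_emitted = e; }

private:
  bool pd_emitted;
};

// Reaches Demo_Decl along two paths, yet holds a single Demo_Decl subobject.
class Demo_BE_Field : public virtual Demo_Field, public virtual Demo_BE_Decl
{
public:
  explicit Demo_BE_Field(const std::string &n) : Demo_Decl(n), Demo_Field(n) {}
};

// True when both inheritance paths lead to the same Demo_Decl subobject.
bool shares_one_decl(Demo_BE_Field &f)
{
  Demo_Field &as_field = f;
  Demo_BE_Decl &as_be = f;
  Demo_Decl *via_field = &as_field;
  Demo_Decl *via_be = &as_be;
  return via_field == via_be;
}
```

Because the base is virtual, a call such as f.name() on a Demo_BE_Field is unambiguous. With non-virtual inheritance the same call would not compile, since the object would contain two Demo_Decl subobjects.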
838 HOW TO SUBCLASS THE GENERATOR TO CREATE BE ENHANCED AST NODES
839 -------------------------------------------------------------
841 Your BE subclasses from classes provided by the AST. To ensure that
842 instances of these classes are constructed when the AST is built, you must
843 also subclass AST_Generator and return an instance of your subclass from
846 The AST_Generator class provides operations to create instances of all
847 classes defined in the AST. For example, the operation to create an
848 AST_Attribute node is as follows:

        AST_Attribute *
        AST_Generator::create_attribute(...)
        {
          return new AST_Attribute(...);
        }

856 In your BE_Generator subclass of AST_Generator, you will override methods
857 for creation of nodes of all AST classes which you have subclassed. Thus,
858 if your BE has a class BE_Attribute which is a subclass of AST_Attribute,
859 your BE_Generator class definition has to override the create_attribute
860 method to ensure that instances of BE_Attribute are created.
The definition of each overridden operation should call the constructor of
the derived class and return the new node as an instance of the inherited
class. Thus, the implementation of create_attribute is as follows:

        AST_Attribute *
        BE_Generator::create_attribute(...)
        {
          return new BE_Attribute(...);
        }

The Yacc grammar actions call create_xxx operations on the generator
instance that is stored in the global state and accessible as
idl_global->gen(). Because that instance is of your derived generator class
BE_Generator, instances of the BE classes you defined will be created.

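
The generator mechanism is an instance of the factory method pattern, which the following sketch illustrates in isolation. The Stub_ classes and the parse_attribute function are hypothetical stand-ins for AST_Attribute, BE_Attribute, AST_Generator, BE_Generator and a grammar action; the real create_attribute takes arguments that are elided in this document, so a single name is used here as an assumption.

```cpp
#include <cassert>
#include <string>

class Stub_Attribute  // stand-in for AST_Attribute
{
public:
  explicit Stub_Attribute(const std::string &n) : pd_name(n) {}
  virtual ~Stub_Attribute() {}
  virtual const char *kind() const { return "AST"; }
  const std::string &name() const { return pd_name; }

private:
  std::string pd_name;
};

class Stub_BE_Attribute : public virtual Stub_Attribute  // stand-in for BE_Attribute
{
public:
  explicit Stub_BE_Attribute(const std::string &n) : Stub_Attribute(n) {}
  virtual const char *kind() const { return "BE"; }
};

class Stub_Generator  // stand-in for AST_Generator
{
public:
  virtual ~Stub_Generator() {}
  virtual Stub_Attribute *create_attribute(const std::string &n)
  {
    return new Stub_Attribute(n);
  }
};

class Stub_BE_Generator : public Stub_Generator  // stand-in for BE_Generator
{
public:
  // Override: build the BE node, return it through the base class type.
  virtual Stub_Attribute *create_attribute(const std::string &n)
  {
    return new Stub_BE_Attribute(n);
  }
};

// Plays the role of a grammar action: it knows only the base generator type.
Stub_Attribute *parse_attribute(Stub_Generator *gen, const std::string &n)
{
  assert(gen != 0);
  return gen->create_attribute(n);
}
```

The caller never names Stub_BE_Attribute, yet it receives one whenever the generator it was handed is a Stub_BE_Generator. This is why the parser itself does not need to change when a BE adds its own node classes.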
877 HOW TO WRITE CONSTRUCTORS FOR BE CLASSES
878 ----------------------------------------
880 As mentioned above, the AST uses public virtual inheritance to derive the
881 AST class hierarchy. This has two important effects on how you write a BE,
882 specifically how you write constructors for derived BE classes.
884 First, you must define a default constructor for your BE class, since
885 your class may be used as a virtual base class of some other class. In that
886 case the compiler may want to call a default constructor for your class. It
887 is a good idea to have a default constructor anyway, even if you do not
888 plan to subclass your BE class, since for most C++ compilers this causes
889 the code to be smaller. Your default constructor should initialize all
890 constant data members. Additionally, it may initialize any non-constant
891 data member whose value must be set before the first time the instance is
894 Second, the constructor for your BE class must explicitly call all
895 constructors of virtual base classes which do some useful work. For
896 example, if a class in the AST from which your BE class inherits, directly
897 or indirectly, has an initializer for a data member, your BE class's
898 constructor must call the AST class's constructor. This is discussed
899 extensively in the C++ ARM.
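
The second rule follows from how C++ initializes virtual base classes: only the most derived class's constructor initializes them, and initializers for a virtual base that appear in intermediate classes are ignored. The sketch below demonstrates this with hypothetical Toy_ classes standing in for an AST base, an AST class derived from it, and two BE classes, one of which forgets the explicit call.

```cpp
#include <cassert>

class Toy_Decl  // stand-in for a virtual base that does useful work
{
public:
  Toy_Decl() : pd_node_type(0) {}
  explicit Toy_Decl(int nt) : pd_node_type(nt) {}
  virtual ~Toy_Decl() {}
  int node_type() const { return pd_node_type; }

private:
  int pd_node_type;
};

class Toy_Field : public virtual Toy_Decl  // stand-in for an AST class
{
public:
  Toy_Field() {}
  explicit Toy_Field(int nt) : Toy_Decl(nt) {}
};

// Forgets to name the virtual base. Toy_Decl() runs instead, and the
// Toy_Decl(nt) initializer inside Toy_Field(nt) is ignored.
class Toy_BE_Field_Wrong : public virtual Toy_Field
{
public:
  explicit Toy_BE_Field_Wrong(int nt) : Toy_Field(nt) {}
};

// Names every virtual base whose constructor does useful work, and also
// supplies a default constructor in case it is itself used as a base.
class Toy_BE_Field_Right : public virtual Toy_Field
{
public:
  Toy_BE_Field_Right() {}
  explicit Toy_BE_Field_Right(int nt) : Toy_Decl(nt), Toy_Field(nt) {}
};
```

Constructing Toy_BE_Field_Wrong(7) leaves node_type() at the default value 0, while Toy_BE_Field_Right(7) yields 7.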
Below is a list showing how to write constructors for subclasses of each
class provided by the AST. For each AST class we show a definition of a
constructor for a derived class which calls all necessary constructors on
908 BE_Argument::BE_Argument(AST_Argument::Direction d,
912 : AST_Argument(d, ft, n, p),
913 AST_Field(AST_Decl::NT_argument, ft, n, p),
914 AST_Decl(AST_Decl::NT_argument, n, p)
920 BE_Array::BE_Array(UTL_ScopedName *n,
923 : AST_Array(n, nd, ds),
924 AST_Decl(AST_Decl::NT_array, n, NULL)
931 BE_Attribute::BE_Attribute(boolean ro,
935 : AST_Attribute(ro, ft, n, p),
936 AST_Field(AST_Decl::NT_attr, ft, n, p),
937 AST_Decl(AST_Decl::NT_attr, n, p)
943 BE_ConcreteType::BE_ConcreteType(AST_Decl::NodeType nt,
952 BE_Constant::BE_Constant(AST_Expression::ExprType t,
956 : AST_Constant(t, v, n, p),
957 AST_Decl(AST_Decl::NT_const, n, p)
963 BE_Decl::BE_Decl(AST_Decl::NodeType nt,
972 BE_Enum::BE_Enum(UTL_ScopedName *n,
975 AST_Decl(AST_Decl::NT_enum, n, p),
976 UTL_Scope(AST_Decl::NT_enum)
982 BE_EnumVal::BE_EnumVal(unsigned long v,
985 : AST_EnumVal(v, n, p),
986 AST_Constant(AST_Expression::EV_ulong,
987 AST_Decl::NT_enum_val,
988 new AST_Expression(v),
991 AST_Decl(AST_Decl::NT_enum_val, n, p)
997 BE_Exception::BE_Exception(UTL_ScopedName *n,
999 : AST_Decl(AST_Decl::NT_except, n, p),
1000 AST_Structure(AST_Decl::NT_except, n, p),
1001 UTL_Scope(AST_Decl::NT_except)
1007 BE_Field::BE_Field(AST_Type *ft,
1010 : AST_Field(ft, n, p),
1011 AST_Decl(AST_Decl::NT_field, n, p)
1017 BE_Interface::BE_Interface(UTL_ScopedName *n,
1021 : AST_Interface(n, ih, nih, p),
1022 AST_Decl(AST_Decl::NT_interface, n, p),
1023 UTL_Scope(AST_Decl::NT_interface)
1029 BE_InterfaceFwd::BE_InterfaceFwd(UTL_ScopedName *n,
1031 : AST_InterfaceFwd(n, p),
1032 AST_Decl(AST_Decl::NT_interface_fwd, n, p)
1038 BE_Module::BE_Module(UTL_ScopedName *n,
1040 : AST_Decl(AST_Decl::NT_module, n, p),
1041 UTL_Scope(AST_Decl::NT_module)
1047 BE_Operation::BE_Operation(AST_Type *rt,
1048 AST_Operation::Flags fl,
1051 : AST_Operation(rt, fl, n, p),
1052 AST_Decl(AST_Decl::NT_op, n, p),
1053 UTL_Scope(AST_Decl::NT_op)
1059 BE_PredefinedType::BE_PredefinedType(
1060 AST_PredefinedType::PredefinedType *pt,
1063 : AST_PredefinedType(pt, n, p),
1064 AST_Decl(AST_Decl::NT_pre_defined, n, p)
1070 BE_Root::BE_Root(UTL_ScopedName *n, UTL_StrList *p)
1072 AST_Decl(AST_Decl::NT_module, n, p),
1073 UTL_Scope(AST_Decl::NT_module)
1080 BE_Sequence::BE_Sequence(AST_Expression *ms, AST_Type *bt)
1081 : AST_Sequence(ms, bt),
1082 AST_Decl(AST_Decl::NT_sequence,
1083 new UTL_ScopedName(new String("sequence"), NULL),
1090 BE_String::BE_String(AST_Expression *ms)
1092 AST_Decl(AST_Decl::NT_string,
1093 new UTL_ScopedName(new String("string"), NULL),
1100 BE_Structure::BE_Structure(UTL_ScopedName *n,
1102 : AST_Decl(AST_Decl::NT_struct, n, p),
1103 UTL_Scope(AST_Decl::NT_struct)
1109 BE_Type::BE_Type(AST_Decl::NodeType nt,
1112 : AST_Decl(nt, n, p)
1118 BE_Typedef::BE_Typedef(AST_Type *bt,
1121 : AST_Typedef(bt, n, p),
1122 AST_Decl(AST_Decl::NT_typedef, n, p)
1128 BE_Union::BE_Union(AST_ConcreteType *dt,
1131 : AST_Union(dt, n, p),
1132 AST_Structure(AST_Decl::NT_union, n, p),
1133 AST_Decl(AST_Decl::NT_union, n, p),
1134 UTL_Scope(AST_Decl::NT_union)
1140 BE_UnionBranch::BE_UnionBranch(AST_UnionLabel *fl,
1144 : AST_UnionBranch(fl, ft, n, p),
1145 AST_Field(ft, n, p),
1146 AST_Decl(AST_Decl::NT_union_branch, n, p)
1152 BE_UnionLabel::BE_UnionLabel(AST_UnionLabel::UnionLabel lk,
1154 : AST_UnionLabel(lk, lv)
1158 HOW TO USE THE ADD PROTOCOL
1159 ---------------------------
As explained in the section SCOPE MANAGEMENT, the CFE manages scopes by
calling type-specific functions to add new nodes to the scope to be
augmented. These functions can be overridden in your BE classes to do work
specific to your BE class. For example, in a BE_Module class, you might
override add_interface to do additional work.

1167 The protocol defined by the "add_" functions is that they return NULL to
1168 indicate failure. They return the node that was added (and which was given
1169 as an argument) if the operation succeeded. Your functions in your BE class
1170 should follow the same protocol.
The "add_" functions defined in the BE must call the overridden function in
the base class defined in the CFE in order for the CFE scope management
mechanism to work. Otherwise, the CFE does not get an opportunity to
augment its scopes with the new node to be added. It is good practice to
call the overridden "add_" function as the first action in your BE
function, because the success or failure of the CFE operation indicates
whether your function should complete its task or abort early.

Here is an example. Suppose you have defined a class BE_Module which
inherits from AST_Module. You may wish to override the add_interface
function as follows:

        class BE_Module : public virtual AST_Module
        {
        public:
          ...
          virtual AST_Interface *add_interface(AST_Interface *);
          ...
        };

The implementation of this function would look something like the following:

        AST_Interface *
        BE_Module::add_interface(AST_Interface *new_in)
        {
          /*
           * Check that the CFE operation succeeds. If it returns
           * NULL, stop any further work
           */
          if (AST_Module::add_interface(new_in) == NULL)
            return NULL;
          /*
           * OK, non-NULL, this means the BE can do its own work here
           */
          ...
          /*
           * Finally, don't forget to return the argument to indicate
           * success
           */
          return new_in;
        }

1216 HOW TO MAINTAIN BE SPECIFIC INFORMATION
1217 ---------------------------------------
1219 The CFE provides a special class AST_Root, a subclass of AST_Module. An
1220 instance of the AST_Root class is used as the distinguished root of the
1221 abstract syntax tree built during a parse.
1223 Your BE can subclass BE_Root from AST_Root and override the create_root
1224 operation in your BE_Generator class derived from AST_Generator. This will
1225 cause the CFE to create an instance of your BE_Root class as the root of
1226 the tree being constructed.
1228 You can use the instance of the BE_Root class as a convenient place to
1229 store information specific to an individual tree. For example, you could
1230 add operations on the BE_Root class to count how many nodes of each class
1233 HOW TO USE MEMBER DATA
1234 ----------------------
1236 As explained above, the AST classes provide access and update functions for
1237 manipulating data members. Your BE classes must use these functions when
1238 they require access to data members defined in the AST classes, since the
1239 data members themselves are private.
It is good practice to follow the same scheme in your BE classes. Make all
data members private and prefix their names with "pd_". Name each access
function after its field, without the prefix. Define update functions as
needed, naming each one after the access function with the prefix "set_"
added. For example, a field pd_count has the access function count() and
the update function set_count().

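
A minimal sketch of this naming scheme follows. The class BE_Example_Node and its two fields are hypothetical and serve only to show the "pd_" and "set_" conventions side by side.

```cpp
#include <cassert>
#include <string>

class BE_Example_Node  // hypothetical BE class, not part of the CFE
{
public:
  BE_Example_Node() : pd_header_name(), pd_times_visited(0) {}

  // Access functions: the field name without the "pd_" prefix.
  const std::string &header_name() const { return pd_header_name; }
  long times_visited() const { return pd_times_visited; }

  // Update functions: the access function name with "set_" prepended.
  void set_header_name(const std::string &n) { pd_header_name = n; }
  void set_times_visited(long t)
  {
    assert(t >= 0);  // a visit count cannot be negative
    pd_times_visited = t;
  }

private:
  // All data members are private and carry the "pd_" prefix.
  std::string pd_header_name;
  long pd_times_visited;
};
```

Since all access goes through functions, a later change (for example, adding a lock around pd_times_visited) touches only this class and not its callers.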
1247 Using these techniques will allow your BE to enjoy the same benefits that
1248 are imparted onto the CFE. Your BE will be easier to move to a
1249 multithreaded environment and its data members will be better protected and
1252 HOW TO BUILD A COMPLETE COMPILER
1253 --------------------------------
1255 We now have all information needed to write a BE and to link it in with the
1256 CFE, to produce a complete IDL compiler.
1258 The following assumes that your BE will be stored in the "be" directory
1259 under the "release" directory. See the document ROADMAP for an explanation
1260 of the directory structure of the source release. If you decide to use a
1261 different directory to store your BE, you may have to modify the CPP_FLAGS in
1262 "idl_make_vars" in the top-level directory to allow your BE to find the
1263 include files it needs. You will also need to modify several targets in
1264 the Makefile in the top-level directory to correctly compile your BE into a
1265 library and to correctly link it in with the CFE to produce a complete
You can get started quickly on writing your BE by modifying the sources
found in the "demo_be" directory. The Makefile supports all the targets
that are needed to build a complete system, as well as the maintenance
target "clean", which assists in keeping the files and directories tidy.
The files provided in the "demo_be" directory also provide all the API
entry points that are mandated by this document.

To build a complete compiler, invoke "make" or "make all" in the top-level
directory. On the first invocation, this compiles your BE and all the CFE
sources. On subsequent invocations, it recompiles only the modified files.
You will rarely, if ever, modify the CFE sources, so the overhead of
compiling the CFE is incurred only the first time. To build just your BE,
invoke "make all" or "make" in the "demo_be" directory. You can also, from
the top-level directory, invoke "make
1284 HOW TO OBTAIN ASSISTANCE
1285 ------------------------
1287 First, read all the documents provided. If you have unanswered questions,
1292 Sun does not promise to support the IDL CFE source release in any manner.
1293 However, we will attempt to answer questions and correct problems as time
1298 SunOS, SunSoft, Sun, Solaris, Sun Microsystems or the Sun logo are
1299 trademarks or registered trademarks of Sun Microsystems, Inc.
1304 Copyright 1992, 1993, 1994 Sun Microsystems, Inc. Printed in the United
1305 States of America. All Rights Reserved.
1307 This product is protected by copyright and distributed under the following
1308 license restricting its use.
1310 The Interface Definition Language Compiler Front End (CFE) is made
1311 available for your use provided that you include this license and copyright
1312 notice on all media and documentation and the software program in which
1313 this product is incorporated in whole or part. You may copy and extend
1314 functionality (but may not remove functionality) of the Interface
1315 Definition Language CFE without charge, but you are not authorized to
1316 license or distribute it to anyone else except as part of a product or
1317 program developed by you or with the express written consent of Sun
1318 Microsystems, Inc. ("Sun").
1320 The names of Sun Microsystems, Inc. and any of its subsidiaries or
1321 affiliates may not be used in advertising or publicity pertaining to
1322 distribution of Interface Definition Language CFE as permitted herein.
1324 This license is effective until terminated by Sun for failure to comply
1325 with this license. Upon termination, you shall destroy or return all code
1326 and documentation for the Interface Definition Language CFE.
1328 INTERFACE DEFINITION LANGUAGE CFE IS PROVIDED AS IS WITH NO WARRANTIES OF
1329 ANY KIND INCLUDING THE WARRANTIES OF DESIGN, MERCHANTIBILITY AND FITNESS
1330 FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR ARISING FROM A COURSE OF
1331 DEALING, USAGE OR TRADE PRACTICE.
1333 INTERFACE DEFINITION LANGUAGE CFE IS PROVIDED WITH NO SUPPORT AND WITHOUT
1334 ANY OBLIGATION ON THE PART OF Sun OR ANY OF ITS SUBSIDIARIES OR AFFILIATES
1335 TO ASSIST IN ITS USE, CORRECTION, MODIFICATION OR ENHANCEMENT.
1337 SUN OR ANY OF ITS SUBSIDIARIES OR AFFILIATES SHALL HAVE NO LIABILITY WITH
1338 RESPECT TO THE INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY
1339 INTERFACE DEFINITION LANGUAGE CFE OR ANY PART THEREOF.
1341 IN NO EVENT WILL SUN OR ANY OF ITS SUBSIDIARIES OR AFFILIATES BE LIABLE FOR
1342 ANY LOST REVENUE OR PROFITS OR OTHER SPECIAL, INDIRECT AND CONSEQUENTIAL
1343 DAMAGES, EVEN IF SUN HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
1345 Use, duplication, or disclosure by the government is subject to
1346 restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in
1347 Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR
1350 Sun, Sun Microsystems and the Sun logo are trademarks or registered
1351 trademarks of Sun Microsystems, Inc.
1355 Mountain View, California 94043