TableGen's purpose is to help a human develop and maintain records of
domain-specific information. Because there may be a large number of these
records, it is specifically designed to allow writing flexible descriptions and
for common features of these records to be factored out. This reduces the
amount of duplication in the description, reduces the chance of error, and makes
it easier to structure domain-specific information.

The core part of TableGen parses a file, instantiates the declarations, and
hands the result off to a domain-specific `backend`_ for processing.

The current major users of TableGen are :doc:`../CodeGenerator`
and the
`Clang diagnostics and attributes <http://clang.llvm.org/docs/UsersManual.html#controlling-errors-and-warnings>`_.

If you work on TableGen frequently and use emacs or vim, you can find
an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and
``llvm/utils/vim`` directories of your LLVM distribution, respectively.

TableGen files are interpreted by the TableGen program, ``llvm-tblgen``, which
is available in your build directory under ``bin``. It is not installed in the
system (or where your sysroot is set to), since it has no use beyond LLVM's
build process.

TableGen runs just like any other LLVM tool. The first (optional) argument
specifies the file to read. If a filename is not specified, ``llvm-tblgen``
reads from standard input.

To be useful, one of the `backends`_ must be used. These backends are
selectable on the command line (type '``llvm-tblgen -help``' for a list). For
example, to get a list of all of the definitions that subclass a particular type
(which can be useful for building up an enum list of these records), use the
``-print-enums`` option:

.. code-block:: bash

  $ llvm-tblgen X86.td -print-enums -class=Register
  AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX,
  ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP,
  MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D,
  R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15,
  R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI,
  RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7,
  XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5,
  XMM6, XMM7, XMM8, XMM9,

  $ llvm-tblgen X86.td -print-enums -class=Instruction
  ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri,
  ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8,
  ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm,
  ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
  ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...

The default backend prints out all of the records.

If you plan to use TableGen, you will most likely have to write a `backend`_
that extracts the information specific to what you need and formats it in the
appropriate way.

With no other arguments, ``llvm-tblgen`` parses the specified file and prints out all
of the classes, then all of the definitions. This is a good way to see what the
various definitions expand to fully. Running this on the ``X86.td`` file prints
this (at the time of this writing):

.. code-block:: text

  def ADD32rr {   // Instruction X86Inst I
    string Namespace = "X86";
    dag OutOperandList = (outs GR32:$dst);
    dag InOperandList = (ins GR32:$src1, GR32:$src2);
    string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
    list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
    list<Register> Uses = [];
    list<Register> Defs = [EFLAGS];
    list<Predicate> Predicates = [];
    int AddedComplexity = 0;
    bit isIndirectBranch = 0;
    bit canFoldAsLoad = 0;
    bit isImplicitDef = 0;
    bit isConvertibleToThreeAddress = 1;
    bit isCommutable = 1;
    bit isTerminator = 0;
    bit isReMaterializable = 0;
    bit isPredicable = 0;
    bit hasDelaySlot = 0;
    bit usesCustomInserter = 0;
    bit isNotDuplicable = 0;
    bit hasSideEffects = 0;
    InstrItinClass Itinerary = NoItinerary;
    string Constraints = "";
    string DisableEncoding = "";
    bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
    Format Form = MRMDestReg;
    bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
    ImmType ImmT = NoImm;
    bits<3> ImmTypeBits = { 0, 0, 0 };
    bit hasOpSizePrefix = 0;
    bit hasAdSizePrefix = 0;
    bits<4> Prefix = { 0, 0, 0, 0 };
    bit hasREX_WPrefix = 0;
    bits<3> FPFormBits = { 0, 0, 0 };
  }

This definition corresponds to the 32-bit register-register ``add`` instruction
of the x86 architecture. ``def ADD32rr`` defines a record named
``ADD32rr``, and the comment at the end of the line indicates the superclasses
of the definition. The body of the record contains all of the data that
TableGen assembled for the record, indicating that the instruction is part of
the "X86" namespace, the pattern indicating how the instruction is selected by
the code generator, that it is a two-address instruction, has a particular
encoding, etc. The contents and semantics of the information in the record are
specific to the needs of the X86 backend, and are only shown as an example.

As you can see, a lot of information is needed for every instruction supported
by the code generator, and specifying it all manually would be unmaintainable,
prone to bugs, and tiring to do in the first place. Because we are using
TableGen, all of the information was derived from the following definition:

.. code-block:: text

  let Defs = [EFLAGS],
      isCommutable = 1,                  // X = ADD Y,Z --> X = ADD Z,Y
      isConvertibleToThreeAddress = 1 in // Can transform into LEA.
  def ADD32rr  : I<0x01, MRMDestReg, (outs GR32:$dst),
                                     (ins GR32:$src1, GR32:$src2),
                   "add{l}\t{$src2, $dst|$dst, $src2}",
                   [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;

This definition makes use of the custom class ``I`` (extended from the custom
class ``X86Inst``), which is defined in the X86-specific TableGen file, to
factor out the common features that instructions of its class share. A key
feature of TableGen is that it allows the end-user to define the abstractions
they prefer to use when describing their information.

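
The effect of factoring fields into a class can be seen on a much smaller
scale. The following is a minimal sketch, not taken from any LLVM target: the
class ``ToyInst``, its fields, and the records ``ADD``, ``MUL`` and ``SUB`` are
hypothetical names chosen for illustration.

.. code-block:: text

  // Hypothetical class: holds the fields that every toy instruction shares.
  class ToyInst<bits<8> opcode, string asm> {
    string Namespace = "Toy";
    bits<8> Opcode = opcode;
    string AsmString = asm;
    bit isCommutable = 0;   // default, overridden below where needed
  }

  // 'let ... in' overrides a field for every record defined in its scope.
  let isCommutable = 1 in {
    def ADD : ToyInst<0x01, "add">;
    def MUL : ToyInst<0x02, "mul">;
  }

  // Outside the 'let' scope, the class default applies.
  def SUB : ToyInst<0x03, "sub">;

Running ``llvm-tblgen`` on such a file with no other arguments should print each
record with all four fields filled in, in the same expanded form as the
``ADD32rr`` record shown earlier.
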
Each ``def`` record has a special entry called "NAME". This is the name of the
record ("``ADD32rr``" above). In the general case ``def`` names can be formed
from various kinds of string processing expressions and ``NAME`` resolves to the
final value obtained after resolving all of those expressions. The user may
refer to ``NAME`` anywhere they want to use the ultimate name of the ``def``.
To avoid conflicts, ``NAME`` should not be defined anywhere else in user code.

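
As a small illustration, a field can be derived from the record's own name.
This is only a sketch: the class ``Labelled``, its field ``Label``, and the
record ``Foo`` are hypothetical.

.. code-block:: text

  // Hypothetical class: builds a string from the name of the final record.
  class Labelled {
    string Label = "record_" # NAME;
  }

  // NAME is resolved when the record is created, so it refers to Foo here.
  def Foo : Labelled;
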
TableGen has a syntax that is loosely based on C++ templates, with built-in
types and specification. In addition, TableGen's syntax introduces some
automation concepts like multiclass, foreach, let, etc.

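
For instance, ``foreach`` can stamp out a family of similar records from a
list. A minimal sketch, in which the class ``ToyReg`` and the record names
``R0`` to ``R3`` are hypothetical:

.. code-block:: text

  // Hypothetical class with a single integer field.
  class ToyReg<int n> {
    int Index = n;
  }

  // Defines one record per list element: R0, R1, R2 and R3.
  foreach i = [0, 1, 2, 3] in {
    def R#i : ToyReg<i>;
  }
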
TableGen files consist of two key parts: 'classes' and 'definitions', both of
which are considered 'records'.

**TableGen records** have a unique name, a list of values, and a list of
superclasses. The list of values is the main data that TableGen builds for each
record; it is this that holds the domain-specific information for the
application. The interpretation of this data is left to a specific `backend`_,
but the structure and format rules are taken care of and are fixed by
TableGen.

**TableGen definitions** are the concrete form of 'records'. These generally do
not have any undefined values, and are marked with the '``def``' keyword.

.. code-block:: text

  def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true",

In this example, ``FeatureFPARMv8`` is a ``SubtargetFeature`` record initialised
with some values. The classes it refers to are defined with the ``class``
keyword, either in the same file or in an included one. Most target
TableGen files include the generic ones in ``include/llvm/Target``.

**TableGen classes** are abstract records that are used to build and describe
other records. These classes allow the end-user to build abstractions for
either the domain they are targeting (such as "Register", "RegisterClass", and
"Instruction" in the LLVM code generator) or for the implementor to help factor
out common properties of records (such as "FPInst", which is used to represent
floating point instructions in the X86 backend). TableGen keeps track of all of
the classes that are used to build up a definition, so the backend can find all
definitions of a particular class, such as "Instruction".

.. code-block:: text

  class ProcNoItin<string Name, list<SubtargetFeature> Features>
        : Processor<Name, NoItineraries, Features>;

Here, the class ``ProcNoItin`` takes a parameter ``Name`` of type ``string`` and
a list of target features. It specializes the class ``Processor`` by passing
both arguments down and hard-coding ``NoItineraries``.

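
The same pattern can be tried in isolation. The following self-contained
sketch uses hypothetical names (``ToyProc``, ``ScalarProc``, ``GenericCPU``)
and is not taken from any LLVM target:

.. code-block:: text

  // Hypothetical base class with two template arguments.
  class ToyProc<string name, int issueWidth> {
    string Name = name;
    int IssueWidth = issueWidth;
  }

  // Forwards 'name' and hard-codes the issue width to 1.
  class ScalarProc<string name> : ToyProc<name, 1>;

  // The record gets Name = "generic" and IssueWidth = 1.
  def GenericCPU : ScalarProc<"generic">;
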
**TableGen multiclasses** are groups of abstract records that are instantiated
all at once. Each instantiation can result in multiple TableGen definitions.
If a multiclass inherits from another multiclass, the definitions in the
sub-multiclass become part of the current multiclass, as if they were declared
in the current multiclass.

.. code-block:: text

  multiclass ro_signed_pats<string T, string Rm, dag Base, dag Offset, dag Extend,
                            dag address, ValueType sty> {
  def : Pat<(i32 (!cast<SDNode>("sextload" # sty) address)),
            (!cast<Instruction>("LDRS" # T # "w_" # Rm # "_RegOffset")
              Base, Offset, Extend)>;

  def : Pat<(i64 (!cast<SDNode>("sextload" # sty) address)),
            (!cast<Instruction>("LDRS" # T # "x_" # Rm # "_RegOffset")
              Base, Offset, Extend)>;
  }

  defm : ro_signed_pats<"B", Rm, Base, Offset, Extend,
                        !foreach(decls.pattern, address,
                                 !subst(SHIFT, imm_eq0, decls.pattern)),

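
A smaller, self-contained sketch may make the mechanics easier to follow. All
names in it (``ToyInst``, ``RegImm``, ``WithMem``, ``ADD``) are hypothetical and
chosen only for illustration:

.. code-block:: text

  // Hypothetical class shared by all records below.
  class ToyInst<string asm> {
    string AsmString = asm;
  }

  // Each instantiation of this multiclass produces two definitions.
  multiclass RegImm<string mnemonic> {
    def rr : ToyInst<mnemonic # " reg, reg">;
    def ri : ToyInst<mnemonic # " reg, imm">;
  }

  // Inherits the two definitions above and adds a third one.
  multiclass WithMem<string mnemonic> : RegImm<mnemonic> {
    def rm : ToyInst<mnemonic # " reg, mem">;
  }

  // A single defm defines the records ADDrr, ADDri and ADDrm.
  defm ADD : WithMem<"add">;

The name given to ``defm`` is prepended to the name of each ``def`` inside the
multiclass, which is how one line yields three records.
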
See the :doc:`TableGen Language Introduction <LangIntro>` for more generic
information on the usage of the language, and the
:doc:`TableGen Language Reference <LangRef>` for more in-depth description
of the formal language specification.

TableGen files have no real meaning without a back-end. The default operation
of running ``llvm-tblgen`` is to print the information in a textual format, but
that's only useful for debugging the TableGen files themselves. The power
of TableGen, however, lies in parsing the source files into an internal
representation from which a back-end can generate anything you want.

TableGen is currently used to create huge include files with tables that you
can either include directly (if the output is in the language you're coding in)
or use in pre-processing via macros surrounding the include of the file.

Direct output can be used if the back-end already prints a table in C format
or if the output is just a list of strings (for error and warning messages).
Pre-processed output should be used if the same information needs to be used
in different contexts (like Instruction names), so your back-end should print
a meta-information list that can be shaped into different compile-time formats.

See the `TableGen BackEnds <BackEnds.html>`_ for more information.

TableGen Deficiencies
=====================

Despite being very generic, TableGen has some deficiencies that have been
pointed out numerous times. The common theme is that, while TableGen allows
you to build Domain-Specific-Languages, the final languages that you create
lack the power of other DSLs, which in turn considerably increases the size
and complexity of TableGen files.

At the same time, TableGen allows you to give virtually any meaning to
the basic concepts via custom-made back-ends, which can pervert the original
design and make the resulting TableGen files very hard for newcomers to
understand.

There are some in favour of extending the semantics even more, but making sure
back-ends adhere to strict rules. Others suggest that we should move to fewer,
more powerful DSLs designed for specific purposes, or even reuse existing
DSLs.

Either way, this is a discussion that will likely span several years,
if not decades. You can read more in the `TableGen Deficiencies <Deficiencies.html>`_
document.