3 @c % /**-----------------------------------------------------------------**
4 @c % ** OpenScop Library **
5 @c % **-----------------------------------------------------------------**
6 @c % ** openscop.texi **
7 @c % **-----------------------------------------------------------------**
8 @c % ** First version: september 10th 2006 **
9 @c % **-----------------------------------------------------------------**/
11 @c % release 0.0: May 4th 2008
14 @c % /*************************************************************************
15 @c % * PART I: HEADER *
16 @c % *************************************************************************/
18 @setfilename openscop.info
19 @settitle OpenScop Specification and Library
23 @set LIB_VERSION 0.8.3
24 @set UPDATED February 17th 2012
25 @setchapternewpage odd
27 @c % This is to ask for A4 instead of Letter size document.
34 @c % /************************************************************************
35 @c % * PART II: SUMMARY DESCRIPTION AND COPYRIGHT *
36 @c % ************************************************************************/
39 This document describes OpenScop, a specification of a file format and a set
40 of data structures for polyhedral compilation tools to talk
41 together. It also describes briefly the OpenScop Library version @value{LIB_VERSION},
42 a Free Software that provides an example of OpenScop implementation.
44 It would be quite kind to refer at the present document in any publication that
45 results from the use of the OpenScop Library:
49 @ @ author =@ @ @ @ @ @ @{C\'edric Bastoul@},
50 @ @ title =@ @ @ @ @ @ @ @{OpenScop: A Specification and a Library for Data
51 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Exchange in Polyhedral Compilation Tools@},
52 @ @ month =@ @ @ @ @ @ @ @{September@},
53 @ @ year =@ @ @ @ @ @ @ @ 2011,
54 @ @ institution = @{Paris-Sud University, France@}
58 Copyright @copyright{} 2011 Paris-Sud University and INRIA.
61 Permission is granted to copy, distribute and/or modify this document under
62 the terms of the GNU Free Documentation License, Version 1.2 published by the
63 Free Software Foundation; with no Invariant Sections, with no Front-Cover
64 Texts, and with no Back-Cover Texts. To receive a copy of the
65 GNU Free Documentation License, write to the Free Software Foundation, Inc.,
66 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
70 @c % /*************************************************************************
71 @c % * PART III: TITLEPAGE, CONTENTS, COPYRIGHT *
72 @c % *************************************************************************/
75 @subtitle A Specification and a Library for Data Exchange in Polyhedral Compilation Tools
76 @subtitle Edition @value{EDITION}, for OpenScop Specification @value{SPEC_VERSION} and OpenScop Library @value{LIB_VERSION}
77 @subtitle @value{UPDATED}
78 @author C@'edric Bastoul
80 @c The following two commands start the copyright page.
82 @vskip 0pt plus 1filll
86 @c Output the table of contents at the beginning.
89 @c % /*************************************************************************
90 @c % * PART IV: TOP NODE AND MASTER MENU *
91 @c % *************************************************************************/
101 * Polyhedral Representation::
102 * OpenScop Specification::
107 @c % /*************************************************************************
108 @c % * PART V: BODY OF THE DOCUMENT *
109 @c % *************************************************************************/
111 @c % ****************************** INTRODUCTION ******************************
113 @chapter Introduction
114 OpenScop is an open specification that defines a file format and a set of
115 data structures to represent a @emph{static control part} (SCoP for short),
116 i.e., a program part that can be represented in the @emph{polyhedral model}.
117 The goal of OpenScop is to provide a common interface to various
118 polyhedral compilation tools in order to simplify their interaction.
120 Designing a single format for tools that have different purposes
121 (e.g., as different as code generation and data dependence analysis) may
122 sound strange at first. However we could observe that most available
123 polyhedral compilation tools during the last decade were manipulating
124 more or less the same kind of data (polyhedra, affine functions...) and
125 were actually sharing a part of their input (e.g., iteration domains and
126 context concepts are nearly everywhere). We could also observe that
127 those tools may rely on different internal representations, mostly based on
128 one of the major polyhedral libraries (e.g., Polylib, PPL or isl), and
129 this representation may change over time (e.g., when switching to a
130 more convenient polyhedral library).
131 The OpenScop aim is to provide a stable, unified format that offers a
132 durable guarantee that a tool can use an output or provide an input to
133 another tool without breaking a tool chain because of some internal
134 changes in one element of the chain. The other promise of OpenScop is
135 the ability to assemble or replace the basic blocks of a polyhedral
136 compilation framework at no, or at least low engineering cost.
138 The policy that drives OpenScop can be summarized by these three rules:
140 @item Embed the @emph{minimum} information to build a complete polyhedral
141 compilation framework in the so-called @emph{core part}
142 (to avoid as much as possible empty or useless information
144 @item Provide a @emph{very stable} core part (so users have some
145 guarantee that they will not need to update their tool
146 because of frequent specification evolution),
147 @item Provide a @emph{very flexible} extension part (so it can also
148 be used to test wild new ideas).
151 @noindent Another, more technical, rule may be added:
153 @item Avoid any need for external library or tool to support it
154 (i.e., it's not XML or YAML or anything like that).
157 The success of OpenScop in meeting its goals totally depends on its
158 acceptance by the tool developers (that have to support it in their tools).
159 To help them, we provide an example implementation: the OpenScop Library.
160 This library (and in particular its API) is not part of the OpenScop
161 specification (which includes only the file format and the set of data
162 structures). It is licensed under the 3-clause BSD license so developers may
163 feel free to use it in their code (either by linking it or copy-pasting its
164 code). This document also describes this library briefly (readers are
165 invited to read at its technical documentation).
166 The current version of the OpenScop Library is still under evaluation,
167 and there is no guarantee that the upward compatibility will be respected,
168 even if we do think so. A lot of reports are
169 necessary to freeze the library API. Thus you are very welcome and
170 encouraged to send reports on bugs, wishes, critics, comments, suggestions
171 or (please!) successful experiences to the OpenScop mailing list
172 @email{openscop-development@@googlegroups.com}.
174 This document is organized as follows. First, we provide some
175 background on the polyhedral model and how it is used to represent and to
176 manipulate programs (@pxref{Polyhedral Representation}). Next,
177 we describe the OpenScop specification, from the file format
178 (@pxref{OpenScop File Format Specification}) to the data structures
179 and the OpenScop Library API
180 (@pxref{OpenScop Data Structure Specification}).
181 Finally we will detail how to install the OpenScop Library
182 (@pxref{Installation}).
185 @c % ******************* POLYHEDRAL REPRESENTATION OF PROGRAMS ****************
186 @node Polyhedral Representation
187 @chapter Polyhedral Representation of Programs
188 If you are reading at the OpenScop documentation, you probably don't need any
189 explanation about the polyhedral model. It is unlikely that someone will read
190 this paper by mistake. However some vicious advisor may ask their poor
191 engineers/interns/students
192 to work for the very first time on this exciting topic. Most papers on
193 polyhedral compilation are hard to read. Despite my efforts,
194 mine are no exception according to some reviewers. Hence I give there a new
195 try to provide a comprehensive explanation of the polyhedral model without the
196 size and style limits of a classical research paper.
198 Be aware that to be able to understand the polyhedral model, there are a few
199 prerequisites. You should not read the following while you still ignore
202 @item a @code{for} loop construction in C programs (@code{do} loops in FORTRAN are OK too!),
203 @item an @emph{affine expression},
204 @item a @emph{vector},
205 @item a @emph{matrix},
206 @item a @emph{matrix-vector multiply}.
208 If you do not know those concepts, please do some search and come back
209 afterwards. And if you are courageous enough, send me a few lines that
210 describe them so I can insert them here!
214 * Thinking in Polyhedra::
220 @section Motivation: Program Transformations
222 A direct translation of high level programs written, e.g., in C, to assembly
223 then to object code is likely to produce (very) inefficient applications.
225 quite complex, including several levels of cache memory, many cores, deep
226 pipelines, various number of functional units, of registers etc.
228 "architectural features" is growing with each new generation of processors.
229 To achieve the best performance, the object program must use
230 these features in a smart way.
231 Programmers use high level languages for productivity and portability:
232 typically they do not have to take care of the target architecture but
233 to ensure they write programs which produce the right output. Hence,
234 the problem of mapping the program to the target architecture in the most
235 efficient way is left to the compiler.
237 The compiler may see a high level program as a specification
238 @emph{of an output}. The program is a list of instructions to be executed to
239 produce the output. As long as the output is guaranteed to be as the
240 programmer specified in his code, the compiler is free to modify
242 For instance, let us imagine we are working on an architecture with only
243 three registers and we consider the following statements written by
254 It is easy to see that we can reorder the three statements in any way without
255 modifying the semantics (no statement reads or writes a variable that another
256 statement writes). Because of the lack of registers, the solutions such that
257 the first and the third statements are one after the other are better
258 because @code{a} and @code{b} will be put in the processor registers by
259 one statement and can be reused directly by the other one
260 without reading from memory (this is called a @emph{data locality
261 improving} transformation). Hence a better statement order is, e.g.:
266 z = a * b; // a and b are still in processor registers
271 We can also notice that it is possible to run the three statements in
272 parallel (possibly on different processors). The programmer may
273 explicit this in a way the compiler
274 and/or the architecture is able to understand. For instance,
275 we can use OpenMP to describe parallelism
276 (this is called a @emph{parallelizing} transformation):
280 #pragma omp parallel sections
298 However, the right way to optimize this program is probably a tradeoff
299 between these two techniques. This is true if, e.g., the target
300 architecture has some limitations to run too many operations in parallel,
301 or, like in our case, when some data may be reused by some processors.
302 Hence, the best optimization for our small example is probably the
307 #pragma omp parallel sections
322 This example is quite trivial because the statements are
323 executed only once. The real sport begins when we have to deal with loops,
324 as we will see momentarily. However, polyhedral compilation framework users
325 have to be conscious that we @emph{need} to transform programs to achieve
326 the best performance and that the best transformation that has to be
327 discovered (at the price of many, many efforts) and performed may be
328 quite complex. Hence the need of powerful model and tools.
331 @node Thinking in Polyhedra
332 @section Thinking in Polyhedra
335 Since the very first compilers, the internal representation of programs
336 is the @emph{Abstract Syntax Tree}, or AST. In such representation,
337 each statement appears only once even if it is executed many times (e.g.,
338 when it is enclosed inside a loop). This is a limitation
339 for finding and applying complex transformations:
341 @item It limits program analysis power. For instance if a statement
342 @emph{depends} on another statement (i.e., they access the same
343 memory location and at least one of these accesses is a write),
344 we will consider both statements as unique entities while the
345 dependence relation may involve only few statement executions.
346 @item It limits program transformation power. Loop transformations
347 operate on statement executions. For instance, because they
348 consider all statement executions at the same time, present day
349 production compilers are not able to achieve loop fusion
350 (that tries to merge the loop bodies of two loops) if the loop bounds
351 of the two loops do not match (yes, that's ridiculous).
352 @item It limits program manipulation flexibility.
353 Trees are very rigid data structures that are not easy to manipulate.
354 Program transformation may require very complex transformations that will
355 imply deep modifications of the control flow.
358 The Polyhedral Model is a convenient alternative representation which
359 combines analysis power, expressiveness and high flexibility. The drawback
360 is it breaks the classical structure of programs that every programmer
361 is familiar with. It requires some (real) efforts
362 to be smoothly manipulated, but it definitely worth it. It is based on three
363 main concepts, @emph{iteration domain}, @emph{scattering function} and
364 @emph{access function} that are described in depth in the
367 A program part that can be represented using the Polyhedral Model is called
368 a @strong{Static Control Part} or @strong{SCoP} for short.
372 * Scattering Function::
376 @node Iteration Domain
377 @subsection Iteration Domain
379 The key aspect of the polyhedral model is to consider @emph{statement
380 instances}. A statement instance is @emph{one} execution of a statement.
382 outside a loop has only one instance while those inside loops may have many.
383 Let us consider the following code with two statements @code{S1}
389 for (i = 0; i < 5; i++)
394 The list of statement instances is the following (we just have to fully
408 Each instance of a statement which is enclosed inside a loop may be referred
409 thanks to its outer loop counters (or @emph{iterators}). In the polyhedral
410 model we consider statements as functions of the outer loop counters that may
411 produce statement instances:
412 instead of simply "@code{S2}", we use preferably the notation @code{S2(i)}.
413 For instance we denote the statement instance @code{A[3] = pi;} of the
414 previous example as @code{S2(3)}. This means @emph{instance of
415 statement @code{S2} for} @code{i = 3}.
416 If a statement @code{S3} is enclosed inside two loops of iterators @code{i}
417 (outermost loop) and @code{j} (innermost loop), we would denote it
418 @code{S3(i,j)}, and so on with more enclosing loops.
421 of iterators (ordered from the outermost iterator to the innermost iterator)
422 is called the @strong{iteration vector}. For instance the iteration vector for
423 @code{S3} is @code{(i,j)}, for @code{S2} it is @code{(i)}, and for @code{S1}
424 it is empty since it has no enclosing loop: @code{()}. A more precise reading
425 at the notation @code{S2(3)} would show that it denotes the instance of
426 statement @code{S2} for the iteration vector @code{(2)}.
428 Obviously, dealing with statement instances does not mean we have to unroll all
429 loops. First because there would be probably too many instances to deal with,
430 and second because we probably just do not know how many instances there are.
431 For instance in the following loop it is impossible to know (at compile time)
432 how many times the statement @code{S3} will be executed:
436 for (i = 2; i <= N; i++)
437 for (j = 2; j <= N; j++)
442 @noindent Such a loop is said to be @emph{parametric}: it depends on
443 (at least) a value called a @emph{parameter} which is not modified
444 during the execution of the whole loop, but is unknown at compile time.
445 Here, the only parameter is @code{N}.
447 A compact way to represent all the instances of a given statement
448 is to consider the set of all possible values of its iteration vector.
449 This set is called the @strong{iteration domain}. It can be conveniently
450 described thanks to all the constraints on the various iterators the statement
451 depends on. For instance, let us consider
452 the statement @code{S3} of the previous program. The iteration domain is the set
453 of iteration vectors @code{(i,j)}. Because of the parameter, we are not able to
454 achieve a precise list of all possible values. It would look like this:
458 (2,2) (2,3) (2,4) ... (2,N)
459 (3,2) (3,3) (3,4) ... (3,N)
461 (N,2) (N,3) (N,4) ... (N,N)
465 @noindent A better way is to say it is the set
466 of iteration vectors @code{(i,j)} such that @code{i} is an integer greater or
467 equal than 2 and lower or equal than @code{N}, and @code{j} is an integer
468 greater or equal than 2 and lower or equal than @code{N}. This may be written
469 in the following mathematical form:
472 $$D _{S3} = \{(i,j) \in Z^2 \; | \; 2 \leq i \leq N \land 2 \leq j \leq N \}$$
478 D_S3 = @{(i,j) in Z^2 | 2 <= i <= N && 2 <= j <= N @}
483 @noindent It is easy to see that this iteration domain is a part of the
495 We often use in our research papers a graphical representation that gives a
496 better view of this subspace:
498 @image{images/basic1,4cm}
500 @noindent Here, the iteration domain is specified thanks to a set of
501 constraints. When those constraints are affine and
502 depend only on the outer loop counters and some parameters, the set of
503 constraints defines a @emph{polyhedron} (more precisely this is a
504 @emph{Z-polyhedron}, but we use @emph{polyhedron} for short).
505 Hence the @emph{polyhedral model}!
507 To manipulate a set of affine constraints easily, we rely on a matrix
508 representation. To write it, we use the @emph{homogeneous} iteration vector:
509 it is simply the iteration vector with some additional dimensions to
510 represent the parameters and the constant.
511 For instance for the statement @code{S3}, the
512 iteration vector in homogeneous coordinates is @code{(i, j, N, 1)}
513 (we will now call it @emph{iteration vector} directly for short).
514 Then we write all the constraints as affine inequalities of the form
515 @emph{affine constraint}
522 For instance for the statement @code{S3} the set of constraints is:
526 \hbox{$ \cases{ i - 2 &$\geq 0$\cr
544 @noindent Lastly, we translate the constraint system to the form
545 @strong{domain matrix}@code{ * }@emph{iteration vector}@code{ >= 0}:
547 @c Thanks to Harald Devos for the TeX :-) !
580 [ 1 0 0 -2 ] [ i ] [ 0 ]
581 [ -1 0 1 0 ] [ j ] [ 0 ]
582 [ 0 1 0 -2 ] * [ N ] >= [ 0 ]
583 [ 0 -1 1 0 ] [ 1 ] [ 0 ]
588 @noindent @strong{The domain matrix will be used in all our tools to provide the
589 information on the iteration domain of a given statement (the iteration vector
590 is in general an implicit information).}
592 @node Scattering Function
593 @subsection Scattering Function
595 There is no ordering information inside the iteration domain: it only describes
596 the set of statement instances but @strong{not} the order in which they have
597 to be executed relatively to each other. In the past the lexicographic order
598 of the iteration domain was considered, this is no more true
599 (especially when using CLooG). If we do not provide any ordering information, this
600 means that the statement instances may be executed in any order
601 (this is useful, e.g., to specify parallelism). However, some statement instances
602 may depend on some others and it may be critical to enforce a given order (or
603 non-order). Hence we need another information.
605 We call @emph{scattering} any kind of ordering information in the polyhedral
606 model. There exists many kind of ordering, such as @emph{allocation},
607 @emph{scheduling}, @emph{chunking} etc. They are all expressed
608 in the same way, i.e., using @emph{logical stamps}, but they may have
611 In the case of @strong{scheduling}, the logical stamps are logical dates that
612 express at which date a statement instance has to be executed. For instance,
613 let us consider the following three statements:
623 @noindent The scheduling of a statement @code{S} is typically
631 Let us consider the following logical dates for each statement:
653 @noindent It means that statement @code{S3} has to be executed at logical date
654 @code{1}, statement @code{S1} has to be executed at logical date
655 @code{2} and statement @code{S2} has to be executed at logical date
656 @code{3}. The target code has to respect this scheduling (the order of
657 the logical dates), hence it would look like the following where the
658 variable @code{t} denotes the time:
671 @noindent When some statements share the same logical date, this means that,
672 once the program reaches this logical date, the two statements can be executed
673 in any order, or better, in parallel. For instance let us consider the
674 following scheduling:
696 @noindent Statements @code{S1} and @code{S3} have the same logical date,
697 moreover, @code{S2} has a greater logical date than @code{S1} and @code{S3}.
698 Hence the target code would be:
703 #pragma omp parallel sections
719 Logical dates may be multidimensional, as clocks: the first dimension
720 may correspond to days (most significant), the next one to hours (less
721 significant), the third to minutes and so on. For instance we can consider
722 the following multidimensional schedules for our example:
727 $\theta_{S1} = (1,1)$
728 $\theta_{S2} = (2,1)$
729 $\theta_{S3} = (1,2)$
744 @noindent It is not very hard to decypher the meaning of such scheduling.
745 Because of the first dimension, statements @code{S1} and @code{S3} will be
746 executed before statement @code{S2} (@code{S1} and @code{S3} are executed at
747 day 1, while @code{S2} is executed at day 2). The second dimension is not
748 really useful there for @code{S2} because it is the only statement executed
749 at day 2. Nevertheless it allows to order @code{S1} and
750 @code{S3} relatively to each other since @code{S1} is executed at hour 1 of day
751 1 while @code{S3} is executed at hour 2 of day 1.
752 The corresponding target code is the following, with some
753 additional time variables for a better view of the ordering (@code{t1}
754 corresponds to the first time dimension, @code{t2} to the second one):
769 In the case of @strong{allocation} (in the literature we can find some
770 papers calling it @emph{placement}), the logical stamp is a processor
771 number expressing on which processor a statement instance has to be
772 executed. Typically, allocations are written in the same way as scheduling.
773 Here, we denote it @math{P_S} for a statement @code{S}.
774 For instance, let us consider the following allocation:
796 @noindent The corresponding target code has to take into account that both
797 statements @code{S1} and @code{S3} have to be executed on the same processor
798 (they have the same logical number 1) and that statement @code{S2} has
799 to be executed on another processor (logical number 2). A possible target code
804 #pragma omp parallel sections
808 // Logical processor 1
814 // Logical processor 2
821 @noindent We can note that no order has been specified for the
822 statements @code{S1} and @code{S3} that are executed on the same processor.
823 Hence any order is satisfying. For sake of flexibility, it is usual to build
824 a scattering whose various dimensions do not have the same semantics. A
825 typical construction is @emph{space/time mapping} where the first @code{n}
826 dimensions are devoted to allocation, then the last @code{m}
827 dimensions are devoted to scheduling. Typically, space/time mapping is
828 written in the same way as scheduling.
829 Here we denote it @math{M_S} for a statement @code{S}.
830 For instance, let us consider the following space/time mapping for our
831 example where one dimension is devoted to mapping and one dimension is
832 devoted to scheduling:
854 @noindent Here we have the same first dimension as the previous example, thus
855 the allocation of the statements to processors is the same. The second
856 dimension precises on a given processor at which logical date a statement
857 instance has to be executed. Here, the statement @code{S1} is executed at
858 day 2 on processor 1 while the statement @code{S3} is executed at day 1 onto
859 the same processor. It follows this space/time mapping corresponds to the
860 following target code (we added an additional variable to represent the
861 local logical clocks):
865 #pragma omp parallel sections
869 // Logical processor 1
877 // Logical processor 2
885 For the same reason as discussed for iteration domains
886 (@pxref{Iteration Domain}), it is not possible to define a scattering for
887 each statement instance, especially if the statement belongs to a
888 (possibly parametric) loop. The iteration vector fully defines an
889 instance of a given statement. Thus, a practical way to provide a scattering
890 for each instance of a given statement is to use a @emph{function}
891 that depends on the iteration vector. In this way the function may associate
892 to each iteration vector a different scattering. We call these functions
893 @strong{scattering functions}. Scattering functions are @emph{affine}
894 functions of the outer loop counter and the global parameters.
895 For instance, let us consider the following source code:
899 for (i = 2; i <= 4; i++)
900 for (j = 2; j <= 4; j++)
901 P[i+j] += A[i] + B[j]; // S4
905 @noindent The iteration domain of the statement @code{S4} is:
909 $$D _{S4} = \{(i,j) \in Z^2 \; | \; 2 \leq i \leq 4 \land 2 \leq j \leq 4 \}.$$
914 D_S4= @{(i,j) in Z^2 | 2 <= i <= 4 && 2 <= j <= 4 @}.
919 @noindent If you are still not comfortable with the mathematical notation, it
920 corresponds to the following graphical representation:
922 @image{images/basic2,3cm}
924 @noindent The list of the statement instances of @code{S4} (the integer
925 points of its iteration domain) corresponds to the following iteration vectors:
942 @noindent Let us suppose we want to schedule the instances of the statement
943 @code{S4} (the integer points of its iteration domain) using the following
949 $\theta_{S4}(i, j) = (j+2, 3*i+j)$
957 T_S4(i,j) = (j+2,3*i+j)
962 @noindent We only need to apply the function to each iteration vector to find
963 the logical date of each instance:
967 iteration vector logical date
980 The polyhedral model users do not have to take care about the generation of a
981 target code that respects the scattering: the
982 CLooG@footnote{@url{http://www.cloog.org}} tool is there to
983 solve the problem quite easily. For the previous
984 example, the target code would be the following (@code{t1} and @code{t2}
985 correspond to the two dimensions of the logical date, the reader may
986 take care that this code actually implements the scattering function):
990 for (t1 = 4; t1 <= 6; t1++) @{
991 for (t2 = t1+4; t2 <= t1+10; t2++) @{
992 if ((-t1+t2+2)%3 == 0) @{
995 P[i+j] += A[i] + B[j]; // S4
1004 Obviously with such a twisted scheduling, it is hard to see the "meaning"
1005 of the transformation. To name any kind of program transformation as
1006 a magic spell ("tile", "fuse", "skew"...) is an old bad habit which is not
1007 relevant anymore in the polyhedral model: a scheduling may be an arbitrary
1008 complex sequence of basic-old-good transformations. Nevertheless it is most
1009 of the time quite easy to translate well known transformations to schedules.
1010 For instance, let us consider this new scheduling function:
1015 $\theta_{S4}(i,j) = (j,i)$
1028 @noindent Using CLooG, we can generate the target code:
1032 for (t1 = 2; t1 <= 4; t1++) @{
1033 for (t2 = 2; t2 <= 4; t2++) @{
1036 P[i+j] += A[i] + B[j]; // S4
1043 @noindent It is easy to see (and analyze) that it corresponds to a classical
1044 @emph{loop interchange} transformation.
1046 A very useful example of multi-dimensional scattering functions is
1047 the @strong{scheduling of the original program}.
1048 The method to compute it is quite simple (@pxref{Fea92}). The idea is to
1049 build an abstract syntax tree of the program and to read the scheduling for
1050 each statement. For instance, let us consider the following implementation of
1051 a Cholesky factorization:
1055 /* A Cholesky factorization kernel. */
1056 for (i=1;i<=N;i++) @{
1057 for (j=1;j<=i-1;j++) @{
1058 a[i][i] -= a[i][j] ; // S1
1060 a[i][i] = sqrt(a[i][i]) ; // S2
1061 for (j=i+1;j<=N;j++) @{
1062 for (k=1;k<=i-1;k++) @{
1063 a[j][i] -= a[j][k]*a[i][k] ; // S3
1065 a[j][i] /= a[i][i] ; // S4
1072 @noindent The corresponding abstract syntax tree is shown in the following
1073 figure. It directly gives the scattering functions (schedules) for all the
1074 statements of the program (just follow the paths).
1076 @image{images/tree,6cm}
1080 \hbox{$ \cases{ \theta _{S1}(i,j) &$= (0,i,0,j,0)$\cr
1081 \theta _{S2}(i) &$= (0,i,1)$\cr
1082 \theta _{S3}(i,j,k) &$= (0,i,2,j,0,k,0)$\cr
1083 \theta _{S4}(i,j) &$= (0,i,2,j,1)$}$}
1090 T_S1(i,j) = (0,i,0,j,0)
1092 T_S3(i,j,k) = (0,i,2,j,0,k,0)
1093 T_S4(i,j) = (0,i,2,j,1)
1098 @noindent These schedules depend on the iterators and give for each instance
1099 of each statement a unique execution date. Using such scattering functions
1100 allows CLooG to re-generate the input code.
1102 @noindent To easily manipulate the scattering function of any
1103 statement @code{S}, we translate it to the matrix form:
1105 @math{\theta_S}(@emph{iteration vector})
1108 T_S(@emph{iteration vector})
1110 @code{ = }@strong{scattering matrix}@code{ * }@emph{iteration vector}.
1111 For instance let us consider again our previous example
1113 $\theta_{S4}(i, j) = (j+2, 3*i+j).$
1116 T_S4(i,j) = (j+2,3*i+j).
1118 We write it in the following way:
1148 [ i ] [ 0 1 2 ] [ i ]
1149 T_S4([ j ]) = [ 3 1 0 ] * [ j ]
1155 @noindent @strong{The scattering matrix will be used in all our tools to provide
1156 the information on the scattering of a given statement
1157 (the iteration vector is in general an implicit information).}
1160 @node Access Function
1161 @subsection Access Function
1163 Before applying any transformation, it is essential to deeply analyze both the
1164 original program and the transformation to ensure the transformation does not
1165 imply any modification of the original program semantics. In the polyhedral
1166 model, we are able to achieve
1167 an exact analysis when all the memory accesses are made through arrays
1168 (note that variables are a particular case of arrays since they are simply
1169 arrays with only one memory location) with affine subscripts that depend
1170 on outer loop counters and global parameters (note that @emph{subscripts}
1171 are sometimes called @emph{index} or @emph{accesses} in the literature).
1173 For instance let us consider the array access @code{A[2*i+j][j][i+N]}. It has
1174 three dimensions, each subscript dimension is an affine form of some outer loop
1175 iterarors (@code{i} and @code{j}) and global parameters (@code{N}) hence
1176 it corresponds to an acceptable array access to be analyzed in the
1179 Each array access can target a different memory cell depending on the
1180 statement instance, i.e., depending on the iteration vector.
1181 Thus we use access functions (or subscript functions, or index functions, as you
1182 prefer to call it) depending on the iteration vector to describe an array
1183 access. In our example, the access function would be written
1184 @math{F_A(i, j) = (2*i+j, j, i+N)}.
1186 @noindent To easily manipulate the access function of any
1187 array @code{A}, we translate it to the matrix form:
1188 @math{F_A}(@emph{iteration vector})
1189 @code{ = }@strong{access matrix}@code{ * }@emph{iteration vector}.
1190 For instance let us consider again our previous example. We would
1191 write it in the following way:
1224 [ i ] [ 2 1 0 0 ] [ i ]
1225 F_A([ j ]) = [ 0 1 0 0 ] * [ j ]
1226 [ N ] [ 1 0 1 0 ] [ N ]
1232 @noindent @strong{The access matrix will be used in all our tools to provide the
1233 information on the access of a given statement
1234 (the iteration vector is in general an implicit information).}
1237 @section What's Next?
1239 OK, now you have an idea about how we do represent a program part in the
1240 polyhedral model. You know the three main concepts, namely: domain, scattering
1241 and access. What can you do with this?! Well, pretty much anything related
1242 to code restructuring! The core idea will be to rely on the mathematical
1243 representation to extract useful information about your
1244 code (data reuse, parallelism...) and to generate a scattering
1245 to benefit from the properties you analysed.
1246 However, OpenScop's documentation is not the right
1247 place to learn at this (OpenScop is all about representation, not about
1248 manipulation). Probably it is the right time for you to
1251 @item PoCC @url{http://pocc.sourceforge.net}
1252 @item Pluto @url{http://pluto-compiler.sourceforge.net}
1255 @noindent Have fun :-) !
1258 @c % *********************** OpenScop Specification **************************
1259 @node OpenScop Specification
1260 @chapter OpenScop Specification
1262 OpenScop provides an explicit polyhedral representation of a static control
1263 part. It has been designed by various polyhedral compilation tool writers from
1264 various institutions. It builds on previous popular polyhedral file and data
1265 structure formats (such as @emph{.cloog} and CLooG data structures) to provide
1266 a unique, extensible format to most polyhedral compilation tools. It
1267 is composed of two parts. The first part, the so-called
1268 @emph{core part}, is devoted to the polyhedral representation of a SCoP.
1269 It contains what is strictly necessary to build a
1270 complete source-to-source framework in the polyhedral model and to output a
1271 semantically equivalent code for the SCoP, from analysis to code generation.
1272 The second part of the format, the so-called @emph{extension part},
1273 contains extensions to provide additional informations to some tools.
1276 * Preliminary Example::
1277 * OpenScop File Format Specification::
1278 * OpenScop Data Structure Specification::
1283 @c %/*************************************************************************
1284 @c % * PRELIMINARY EXAMPLE *
1285 @c % *************************************************************************/
1286 @node Preliminary Example
1287 @section Preliminary Example
1288 OpenScop is a specification for representing static control program parts in
1289 the polyhedral model. Thus, we can translate some code parts to an equivalent
1290 OpenScop representation. As an example, let us consider the
1291 following matrix-multiply kernel:
1296 for (i = 0; i < N; i++) @{
1297 for (j = 0; j < N; j++) @{
1299 for (k = 0; k < N; k++)
1300 C[i][j] = C[i][j] + A[i][k] * B[k][j];
1306 The Clan@footnote{@url{http://www.lri.fr/~bastoul/development/clan/}}
1307 tool may be used to translate this code part to an OpenScop
1308 representation automatically. The @code{#pragma scop} is used here for
1309 Clan, but some other tool may not need it. Here is the result of the
1310 translation to an OpenScop textual representation.
1313 @center @strong{DON'T PANIC}
1316 @noindent Explanations will follow and it is not
1317 as cryptic as it seems to be !
1323 # =============================================== Global
1333 # Parameter names are provided
1340 # Number of statements
1343 # =============================================== Statement 1
1344 # Number of relations describing the statement
1347 # ---------------------------------------------- 1.1 Domain
1352 1 -1 0 1 -1 ## -i+N-1 >= 0
1354 1 0 -1 1 -1 ## -j+N-1 >= 0
1356 # ---------------------------------------------- 1.2 Scattering
1359 # e/i| s1 s2 s3 s4 s5 | i j | N | 1
1360 0 -1 0 0 0 0 0 0 0 0 ## s1 = 0
1361 0 0 -1 0 0 0 1 0 0 0 ## s2 = i
1362 0 0 0 -1 0 0 0 0 0 0 ## s3 = 0
1363 0 0 0 0 -1 0 0 1 0 0 ## s4 = j
1364 0 0 0 0 0 -1 0 0 0 0 ## s5 = 0
1366 # ---------------------------------------------- 1.3 Access
1369 # e/i| Arr [1] [2] | i j | N | 1
1370 0 -1 0 0 0 0 0 1 ## C
1371 0 0 -1 0 1 0 0 0 ## [i]
1372 0 0 0 -1 0 1 0 0 ## [j]
1374 # ---------------------------------------------- 1.4 Body
1375 # Statement body is provided
1379 # Number of original iterators
1381 # Original iterator names
1383 # Statement body expression
1387 # =============================================== Statement 2
1388 # Number of relations describing the statement
1391 # ---------------------------------------------- 2.1 Domain
1394 # e/i| i j k | N | 1
1395 1 1 0 0 0 0 ## i >= 0
1396 1 -1 0 0 1 -1 ## -i+N-1 >= 0
1397 1 0 1 0 0 0 ## j >= 0
1398 1 0 -1 0 1 -1 ## -j+N-1 >= 0
1399 1 0 0 1 0 0 ## k >= 0
1400 1 0 0 -1 1 -1 ## -k+N-1 >= 0
1402 # ---------------------------------------------- 2.2 Scattering
1405 # e/i| s1 s2 s3 s4 s5 s6 s7 | i j k | N | 1
1406 0 -1 0 0 0 0 0 0 0 0 0 0 0 ## s1 = 0
1407 0 0 -1 0 0 0 0 0 1 0 0 0 0 ## s2 = i
1408 0 0 0 -1 0 0 0 0 0 0 0 0 0 ## s3 = 0
1409 0 0 0 0 -1 0 0 0 0 1 0 0 0 ## s4 = j
1410 0 0 0 0 0 -1 0 0 0 0 0 0 1 ## s5 = 1
1411 0 0 0 0 0 0 -1 0 0 0 1 0 0 ## s6 = k
1412 0 0 0 0 0 0 0 -1 0 0 0 0 0 ## s7 = 0
1414 # ---------------------------------------------- 2.3 Access
1417 # e/i| Arr [1] [2] | i j k | N | 1
1418 0 -1 0 0 0 0 0 0 1 ## C
1419 0 0 -1 0 1 0 0 0 0 ## [i]
1420 0 0 0 -1 0 1 0 0 0 ## [j]
1424 # e/i| Arr [1] [2] | i j k | N | 1
1425 0 -1 0 0 0 0 0 0 1 ## C
1426 0 0 -1 0 1 0 0 0 0 ## [i]
1427 0 0 0 -1 0 1 0 0 0 ## [j]
1431 # e/i| Arr [1] [2] | i j k | N | 1
1432 0 -1 0 0 0 0 0 0 2 ## A
1433 0 0 -1 0 1 0 0 0 0 ## [i]
1434 0 0 0 -1 0 0 1 0 0 ## [k]
1438 # e/i| Arr [1] [2] | i j k | N | 1
1439 0 -1 0 0 0 0 0 0 3 ## B
1440 0 0 -1 0 0 0 1 0 0 ## [k]
1441 0 0 0 -1 0 1 0 0 0 ## [j]
1443 # ---------------------------------------------- 2.4 Body
1444 # Statement body is provided
1448 # Number of original iterators
1450 # Original iterator names
1452 # Statement body expression
1453 C[i][j] = C[i][j] + A[i][k] * B[k][j];
1456 # =============================================== Extensions
1458 May the power of the polyhedral model be with you.
1465 We will not describe here precisely the structure and the components of this
1466 output, this is described in depth in a further section
1467 (@pxref{OpenScop File Format Specification}). This format
1468 has been designed to be a possible input or output file format of most of
1469 polyhedral compilation tools. If you read the description of the polyhedral
1470 representation of programs, you should already feel familiar with this file
1471 format (@pxref{Polyhedral Representation}).
1474 @c %/*************************************************************************
1475 @c % * FILE FORMAT SPECIFICATION *
1476 @c % *************************************************************************/
1477 @node OpenScop File Format Specification
1478 @section OpenScop File Format Specification
1485 The following grammar describes the structure of the
1486 OpenScop file format where terminals are preceeded by "_". Except
1487 stated otherwise, there can be at most one terminal per line in the file.
1488 Moreover, each line may finish with a comment, starting by the @samp{#}
1489 character. Each relevant part will be explained in more details momentarily:
1492 OpenScop ::= Start_tag Data End_tag
1493 Start_tag ::= "<OpenScop>"
1494 End_tag ::= "</OpenScop>"
1495 Data ::= Context Statements Extension_list
1496 Context ::= Language Parameter_Domain Parameters
1497 Statements ::= Nb_statements Statement_list
1498 Statement_list ::= Statement Statement_list | (void)
1499 Relation_list ::= _Relation Relation_list | (void)
1500 Extension_list ::= _Generic Extension_list | (void)
1501 Statement ::= Statement_relations Body
1502 Body ::= "0" | "1" Body_information
1503 Parameters ::= "0" | "1" Parameter_information
1504 Statement_relations ::= Nb_relations Relation_List
1505 Parameter_domain ::= _Relation
1506 Language ::= _String
1507 Nb_Relations ::= _Integer
1508 Parameter_information ::= _Generic
1509 Body_information ::= _Generic
1512 The @samp{Context} and the @samp{Statements} parts compose the
1513 @emph{core part}, i.e., what is strictly necessary to build
1514 a complete source to source framework based on OpenSCop:
1516 @item @samp{Context} represents the global information of the SCoP. It
1517 consists on the target language, the global constraints on the
1518 parameters and optionally the parameter information which may be necessary
1519 for the code generation process. The constraints on the parameters
1520 are represented as a relation (@pxref{Context Domain Relation}).
1521 The parameter information is optional. It is preceded by a
1522 boolean which precises whether it is provided or not.
1523 It is a generic information (@pxref{Generics}), a @code{strings}
1524 (@pxref{Strings Generic}) for instance.
1525 @item @samp{Statements} represents the information about the statements.
1526 @samp{Nb_statements} is the number of statements in the SCoP,
1527 i.e. the number of @samp{Statement} items in the @samp{Statement_list}.
1528 @samp{Statement} represents the information on a given statement.
1529 To each statement is associated a list of relations and,
1530 optionaly, a body. The list of relations may include
1531 one iteration domain (@pxref{Iteration Domain Relation}),
1532 one scattering relation (@pxref{Scattering Relation})
1533 and several access relations (@pxref{Access Relation}).
1534 There is no mandatory ordering, but for consistency reason it would
1535 be much appreciated that iteration domain comes first (if present)
1536 then scattering (if present), then accesses (if present).
1537 The statement body is an optional information. It is preceded by a
1538 boolean which precises whether it is provided or not.
1539 It is a generic information (@pxref{Generics}), a @code{body}
1540 (@pxref{Body Generic}) for instance.
1543 The @samp{Extension_list} represents the @emph{extension part} and may contain
1544 an arbitrary number of generic informations (@pxref{Generics}).
1545 Few examples of possible extensions are presented in a further
1546 section (@pxref{Extensions}).
1548 As shown by the grammar, the input file describes the various pieces of
1549 information based on strings, integers, @emph{relations} and @emph{generics}.
1550 Relations and Generics are specific to OpenScop and are described in depth
1551 in the following Sections (@pxref{Relations} and @pxref{Generics}).
1554 @c %/*************************************************************************
1556 @c % *************************************************************************/
1558 @subsection Relations
1561 * Iteration Domain Relation::
1562 * Context Domain Relation::
1563 * Scattering Relation::
1567 @emph{Relations} are the essence of the OpenScop format and contain the
1568 "polyhedral" information. They are used to describe either an iteration
1569 domain, or a context domain, or a scattering or a memory access.
1571 We use the relation term as a shortcut to denote a
1572 union of convex relations, each element of the union being described by a set of
1573 constraints in an extended PolyLib format (@pxref{Wil93}).
1574 The number of elements in the union is given by an integer on the first line,
1575 optionally followed by a comment starting with @samp{#}.
1576 This number of elements can be omitted when there is only one element.
1577 Each element in the union has the following syntax:
1580 @item Some optional comment lines beginning with @samp{#}.
1581 @item A line with the type of the relation, possibly followed by comments.
1582 The type can be one of the following:
1584 @item @code{UNDEFINED}: generic relation,
1585 @item @code{CONTEXT}: for context information,
1586 @item @code{DOMAIN}: for iteration domains,
1587 @item @code{SCATTERING}: for scattering relation,
1588 @item @code{READ}: for read accesses,
1589 @item @code{WRITE}: for write accesses,
1590 @item @code{MAY_WRITE}: for may-write accesses,
1592 @item A line with 6 numbers, possibly followed by comments:
1594 @item the number of rows of the constraint matrix,
1595 @item the number of columns of the constraint matrix,
1596 @item the number of @emph{output dimensions},
1597 @item the number of @emph{input dimension},
1598 @item the number of @emph{local dimensions}
1599 (existentially quantified dimensions),
1600 @item the number of @emph{parameters}.
1602 The sum of the last four numbers should be equal to the number of columns
1603 minus two. The remaining two columns are the equality/inequality
1604 indicator and the constant term. The number of parameters should be the
1605 same for all relations in the entire OpenScop file or data structure.
1606 @item The constraint rows. Each row corresponds to a constraint the
1607 relation has to satisfy. Each row must be on a single line and is possibly
1608 followed by comments. The constraint is an equality @math{p(x) = 0} if the
1609 first element is 0, an inequality @math{p(x) \geq 0} if the first element
1610 is 1. The next elements are the coefficients of the output dimensions,
1611 followed by coefficients of the input dimensions, the existentially
1612 quantified dimensions and finally the parameters.
1613 The last element is the constant term.
1616 This representation is the basis for several purposes. Examples for
1617 iteration domains (@pxref{Iteration Domain Relation}), context domains
1618 (@pxref{Context Domain Relation}), scattering
1619 relations (@pxref{Scattering Relation}) and
1620 access relations (@pxref{Access Relation}) are provided in further sections.
1622 @c %/************************** ITERATION DOMAIN ****************************
1623 @node Iteration Domain Relation
1624 @subsubsection Iteration Domain Relation
1626 Iteration domain represents the set of instances of the corresponding statement.
1627 OpenScop iteration domains are represented as relations with the following
1630 @item the type is @code{DOMAIN},
1631 @item there is 0 input dimension,
1632 @item loop iterators correspond to output dimensions.
1635 @noindent For instance, assuming that @samp{i}, @samp{j} and @samp{k} are the loop
1636 iterators and @samp{M} and @samp{N} are the parameters, the domain defined by
1637 the following constraints :
1641 \hbox{$ \cases{ -i + M &$\geq 0$\cr
1643 i + j - k &$\geq 0$}$}
1656 @noindent can be written in the input file as follows:
1660 # This is an iteration domain
1662 1 # Number of relations in the union
1663 3 7 3 0 0 2 # 3 rows, 7 cols: 3 output dims and 2 params
1664 # e/i| i j k | M N | 1
1665 1 -1 0 0 1 0 0 # -i + M >= 0
1666 1 0 -1 0 0 1 0 # -j + N >= 0
1667 1 1 1 -1 0 0 0 # i + j - k >= 0
1671 @noindent Equivalently, it can be written in the following way as the number
1672 of relations in the union can be omitted if it is 1:
1676 # This is an iteration domain
1678 3 7 3 0 0 2 # 3 rows, 7 cols: 3 output dims and 2 params
1679 # e/i| i j k | M N | 1
1680 1 -1 0 0 1 0 0 # -i + M >= 0
1681 1 0 -1 0 0 1 0 # -j + N >= 0
1682 1 1 1 -1 0 0 0 # i + j - k >= 0
1686 @noindent As an example for unions, let us consider the following pseudo-code:
1690 for (i = 1; i <= N; i++) @{
1691 if ((i >= M) || (i <= 2*M))
1697 @noindent The iteration domain of @samp{S1} can be divided into two
1698 relations and written in the OpenScop file as follows:
1702 # This is an iteration domain
1704 2 # Number of relations in the union
1706 3 5 1 0 0 2 # 3 rows, 5 cols: 1 output dim and 2 params
1712 3 5 1 0 0 2 # 3 rows, 5 cols: 1 output dim and 2 params
1716 1 -1 2 0 0 # i <= 2*M
1720 @noindent As an example for local dimensions (existentially quantified
1721 dimensions), let us consider the following pseudo-code:
1725 for (i = 1; i <= N; i++) @{
1732 @noindent The iteration domain of @samp{S1} is composed of all even
1733 integer values between 1 and N. The "divisible by two" constraint can
1734 be expressed as follows: there exists an integer @samp{ld} such that
1735 @samp{i = 2*ld}. We encode this thanks to a new local dimension:
1739 # This is an iteration domain
1741 3 5 1 0 1 1 # 3 rows, 5 cols: 1 output dim, 1 local dim, 1 param
1742 # e/i| i |ld | N | 1
1743 0 1 -2 0 0 # i = 2*ld
1750 @c %/************************** CONTEXT DOMAIN ******************************
1751 @node Context Domain Relation
1752 @subsubsection Context Domain Relation
1754 The context domain is a particular case of iteration domain
1755 (@pxref{Iteration Domain Relation}) where there are only
1756 constraints about parameters (no loop iterators). Hence it is the same
1757 as an iteration domain, with the following conventions:
1759 @item the type is @code{CONTEXT},
1760 @item there is 0 input dimension,
1761 @item there is 0 output dimension.
1765 @c %/************************* SCATTERING RELATION **************************
1766 @node Scattering Relation
1767 @subsubsection Scattering Relation
1769 Scattering relation maps an iteration domain to a logical time and/or
1770 space (and/or) anything.
1771 OpenScop scattering information is represented as relations
1772 (@pxref{Relations}) with the following conventions:
1775 @item the type is @code{SCATTERING},
1776 @item output dimensions correspond to scattering dimensions,
1777 @item loop iterators correspond to input dimensions.
1780 As an example of a scattering relation and
1781 assuming that @samp{i}, @samp{j} and @samp{k} are the loop
1782 iterators and @samp{M} and @samp{N} are the parameters, take for instance:
1784 $\theta_{S}(i,j,k) = (j+2,3*i+j,k+N+1).$
1788 T_@{S@}(i,j,k) = (j+2,3*i+j,k+N+1).
1791 We can represent it in the following way:
1795 # A scattering relation
1797 # 3 rows, 10 columns: 3 scattering dimensions, 3 iterators, 2 parameters
1799 # e/i|s1 s2 s3 | i j k | M N | 1
1800 0 -1 0 0 0 1 0 0 0 2 # s1 = j+2
1801 0 0 -1 0 3 1 0 0 0 0 # s2 = 3*i+j
1802 0 0 0 -1 0 0 1 0 1 1 # s3 = k+N+1
1806 @c %/************************** ACCESS RELATION *****************************
1807 @node Access Relation
1808 @subsubsection Access Relation
1810 Access relation maps an iteration domain to an array space.
1811 Each array accessed in the SCoP has a unique identification number.
1812 OpenScop relation information is represented as relations
1813 (@pxref{Relations}) with the following conventions:
1816 @item the type is one of the following:
1818 @item @code{READ}, for read accesses,
1819 @item @code{WRITE}, for write accesses,
1820 @item @code{MAY_WRITE}, for may write accesses,
1822 @item output dimensions correspond to the array identifier and dimensions,
1823 @item the first output dimension corresponds to the array identifier,
1824 @item the (i+1)th output dimension corresponds to the ith array dimension (i>1),
1825 @item loop iterators correspond to input dimensions.
1828 As an example of a scattering relation and
1829 assuming that @samp{i}, @samp{j} and @samp{k} are the loop
1830 iterators and @samp{M} and @samp{N} are the parameters, let us consider
1831 the array access @code{A[2*i+j][j][i+N]} (the identifier of @code{A} is 42),
1832 and let us suppose this is a read access. Its representation would be the
1837 # A read access relation
1839 # 4 rows, 11 columns: 4 array dimensions, 3 iterators, 2 parameters
1841 # e/i|Arr [1] [2] [3]| i j k | M N | 1
1842 0 -1 0 0 0 0 0 0 0 0 42 # A
1843 0 0 -1 0 0 2 1 0 0 0 0 # [2*i+j]
1844 0 0 0 -1 0 0 1 0 0 0 0 # [j]
1845 0 0 0 0 -1 1 0 0 0 1 0 # [i+N]
1849 To understand this representation, consider that OpenScop accesses
1850 are general memory accesses and not array accesses. The memory is
1851 seen as a big array @code{Mem} while usual array names correspond to
1852 the first dimension. Hence our example translates to @code{Mem[42][2*i+j][j][i+N]}.
1854 Unions of access relations are allowed. In this case, each union part must
1855 refer at the same array identifier, and the number of dimensions must be
1858 @c %/*************************************************************************
1860 @c % *************************************************************************/
1862 @subsection Generics
1868 @emph{Generics} represent any elaborated non-polyhedral information in the
1869 OpenScop format. They are used to represent the parameter information, the
1870 statement body information as well as the extensions. Each generic information
1871 is delimited using XML-like tags corresponding to its URI (Unique Resource
1872 Identifier), For instance, if the generic has the URI @code{foo}, the begin
1873 tag is @code{<foo>} and the end tag is @code{</foo>}).
1875 Two generics, namely @code{strings} (@pxref{Strings Generic}) and
1876 @code{body} (@pxref{Body Generic}) are part of the OpenScop
1877 specification to provide the minimum, stricly necessary information to
1878 build a complete source-to-source polyhedral framework based on OpenScop.
1879 However, generics can be basically @emph{anything} as long as they are
1880 properly delimited. OpenScop implementations will simply ignore
1881 non-supported generics and warn the user with the mention of the
1882 non-supported URIs. Support of new generics will be added throught the
1883 extension mechanism.
1885 @c ---------------------------------------------------------------------------
1887 @node Strings Generic
1888 @subsubsection Strings Generic
1890 The purpose of the @code{strings} generic is to represent a list of
1891 textual strings on one line (which may be used, e.g., to represent the list of
1892 parameter names in the order used in the relation). Its URI is @code{strings}
1893 and its file format respects the following grammar:
1895 Strings_generic ::= "<strings>" Strings "</strings>"
1896 Strings ::= _String String_list | (void)
1899 @noindent A possible example of textual @code{strings} is the following:
1903 Not one sentence but 6 strings!
1908 @c ---------------------------------------------------------------------------
1911 @subsubsection Body Generic
1913 The purpose of the @code{body} generic is to represent the textual
1914 information about a statement. It contains the number of original iterators on
1915 the first line, the list of original iterators on the second
1916 line (the loop counters of the statement surrounding loops in the original
1917 program) and the original textual body expression on the third line.
1918 Its URI is @code{body} and its file format respects the following grammar
1919 (the @code{String} rule is reused, @pxref{Strings Generic}):
1921 Body_generic ::= "<body>" Body "</body>"
1922 Body ::= Nb_iterators Iterator_list Expression
1923 Nb_iterators ::= _Integer
1924 Iterator_list ::= Strings
1925 Expression ::= Strings
1928 @noindent A possible example of textual @code{body} is the following:
1932 # Number of original iterators
1934 # Original iterators
1936 # Original statement expression
1937 A[i+j] += B[i] * C[j];
1943 @c %/*************************************************************************
1944 @c % * DATA STRUCTURE SPECIFICATION *
1945 @c % *************************************************************************/
1946 @node OpenScop Data Structure Specification
1947 @section OpenScop Data Structure Specification
1951 * osl_relation_list_t::
1960 The OpenScop specification offers a small set of C data structures devoted to
1961 represent a SCoP in memory in a convenient way. Using them in some tool or
1962 library may greatly facilitate its interaction with other tools or libraries
1963 which rely on this representation as well. Every field may not be useful for
1964 a given tool or library. A general rule for all the data structure is
1965 that a @code{NULL} pointer or a -1 integer value means the information is
1966 not present. Contrary to engineering time, memory is cheap today, so it's much
1967 probably not a big deal that some fields are left empty. Every field may not
1968 be enough for a given tool or library. In this case it is much recommended
1969 to provide a new extension which may be reused by other users
1970 (@pxref{Extensions}).
1972 Each tool or library may have its own implementation of the OpenScop
1973 data structures. The type names should not be the same as those provided
1974 as an example here (they correspond to the OpenScop Library implementation).
1975 The names of the fields, and their ordering, should however be the same. In this
1976 way, the interaction between tools and libraries should be as simple as a cast.
1978 Before reading at the OpenScop data structures, it is much recommended to
1979 read at the OpenScop file format description, as it is quite close to this
1980 representation (@pxref{OpenScop File Format Specification}).
1983 @node osl_relation_t
1984 @subsection osl_relation_t
1988 struct osl_relation @{
1989 int type; /* What this relation is encoding */
1990 int precision; /* Precision of the matrix elements */
1991 int nb_rows; /* Number of rows */
1992 int nb_columns; /* Number of columns */
1993 int nb_output_dims; /* Number of output dimensions */
1994 int nb_input_dims; /* Number of input dimensions */
1995 int nb_local_dims; /* Number of local dimensions */
1996 int nb_parameters; /* Number of parameters */
1997 void ** m; /* Matrix of constraints */
1998 struct osl_relation * next; /* Next relation in the union */
2000 typedef struct osl_relation osl_relation_t;
2001 typedef struct osl_relation * osl_relation_p;
2005 @noindent The @code{osl_relation_t} structure stores a part of an
2006 union of relations. A union of relation is a @code{NULL}-terminated
2007 linked list of union parts (@code{next} field). The @code{type} field
2008 may provide some information about what the relation is encoding:
2010 @item -1: undefined (@code{OSL_UNDEFINED}),
2011 @item 2: context domain (@code{OSL_TYPE_CONTEXT}),
2012 @item 3: iteration domain (@code{OSL_TYPE_DOMAIN}),
2013 @item 4: scattering relation (@code{OSL_TYPE_SCATTERING}),
2014 @item 6: read access relation (@code{OSL_TYPE_READ}),
2015 @item 7: write access relation (@code{OSL_TYPE_WRITE}),
2016 @item 8: may write access relation (@code{OSL_TYPE_MAY_WRITE}),
2018 The various numbers provide the details on the relation itself
2019 (@pxref{Relations}) while the @code{m} field points to
2020 the constraint matrix. The precision of the constraint matrix elements is
2021 provided by the @code{precision} field. It can take the following
2024 @item 32: 32 bits precision, elements are @code{long int}
2025 (@code{OSL_PRECISION_SP}),
2026 @item 64: 64 bits precision, elements are @code{long long int}
2027 (@code{OSL_PRECISION_DP}),
2028 @item 0: multiple precision, elements are GNU GMP Library's
2029 @code{mpz_t} (@code{OSL_PRECISION_MP}).
2032 @c ---------------------------------------------------------------------------
2034 @node osl_relation_list_t
2035 @subsection osl_relation_list_t
2039 struct osl_relation_list @{
2040 osl_relation_p elt; /* Element of the list */
2041 struct osl_relation_list * next; /* Next element of the list */
2043 typedef struct osl_relation_list osl_relation_list_t;
2044 typedef struct osl_relation_list * osl_relation_list_p;
2048 @noindent The @code{osl_relation_list_t} structure is a @code{NULL}-terminated
2049 linked list of @code{osl_relation_t} data structures.
2050 @code{elt} is a relation element of the list and @code{next} is the pointer to
2051 the next element of the list.
2053 @c ---------------------------------------------------------------------------
2055 @node osl_interface_t
2056 @subsection osl_interface_t
2060 typedef void (*osl_idump_f) (FILE *, void *, int);
2061 typedef char * (*osl_sprint_f)(void *);
2062 typedef void * (*osl_sread_f) (char *);
2063 typedef void * (*osl_malloc_f)();
2064 typedef void (*osl_free_f) (void *);
2065 typedef void * (*osl_clone_f) (void *);
2066 typedef int (*osl_equal_f) (void *, void *);
2068 struct osl_interface @{
2069 char * URI; /* Unique interface identifier string */
2070 osl_idump_f idump; /* Pointer to the idump function */
2071 osl_sprint_f sprint; /* Pointer to the sprint function */
2072 osl_sread_f sread; /* Pointer to the sread function */
2073 osl_malloc_f malloc; /* Pointer to the malloc function */
2074 osl_free_f free; /* Pointer to the free function */
2075 osl_clone_f clone; /* Pointer to the clone function */
2076 osl_equal_f equal; /* Pointer to the equal function */
2077 struct osl_interface * next; /* Next interface in the list */
2079 typedef struct osl_interface osl_interface_t;
2080 typedef struct osl_interface * osl_interface_p;
2084 @noindent The @code{osl_interface_t} structure represents a
2085 node in a @code{NULL}-terminated list of interfaces. Each node stores the
2086 @emph{interface} of a generic OpenScop object, i.e., its unique name
2087 (@code{URI}) and the function pointers to all the base functions it has
2088 to provide. Extension providers will find information relative to those
2089 functions in the OpenScop Library description (@pxref{Base Functions})
2090 and the section dedicated to writing extensions
2091 (@pxref{Extension Development}).
2093 @c ---------------------------------------------------------------------------
2096 @subsection osl_generic_t
2100 struct osl_generic @{
2101 void * data; /* Pointer to some data */
2102 osl_interface_p interface; /* Interface to work with the data */
2103 struct osl_generic * next; /* Pointer to the next generic */
2105 typedef struct osl_generic osl_generic_t;
2106 typedef struct osl_generic * osl_generic_p;
2110 @noindent The @code{osl_generic_t} structure represents a node in a
2111 @code{NULL}-terminated list of generic elements. It stores some data
2112 and operations with no pre-defined type. The information is accessible
2113 through the @code{data} pointer while the type and operations are
2114 accessible through the @code{interface} pointer. It is used to represent
2115 data that are allowed to differ in implementations, such as symbols and
2118 @c ---------------------------------------------------------------------------
2121 @subsection osl_strings_t
2125 struct osl_string @{
2126 char ** string; /* NULL-terminated array of strings */
2128 typedef struct osl_strings osl_strings_t;
2129 typedef struct osl_strings * osl_strings_p;
2133 @noindent The @code{osl_strings_t} structure represents a NULL-terminated
2134 list of C character strings. It is encapsulated into a structure to allow
2135 its manipulation through a generic type.
2137 @c ---------------------------------------------------------------------------
2140 @subsection osl_body_t
2145 osl_strings_p iterators; /* Original iterators */
2146 osl_strings_p expression; /* Original statement expression */
2148 typedef struct osl_body osl_body_t;
2149 typedef struct osl_body * osl_body_p;
2153 @noindent The @code{osl_body_t} structure stores a statement body in a
2154 textual form. The complete original expression (directly copy-pasted
2155 from the original code) is in the expression field while the textual forms
2156 of the original iterators are in the iterators field. They may be used for
2157 substitutions inside the expression.
2159 @c ---------------------------------------------------------------------------
2161 @node osl_statement_t
2162 @subsection osl_statement_t
2166 struct osl_statement @{
2167 osl_relation_p domain; /* Iteration domain */
2168 osl_relation_p scattering; /* Scattering relation */
2169 osl_relation_list_p access; /* List of array access relations */
2170 osl_generic_p body; /* Original statement body */
2171 void * usr; /* A user-defined field */
2172 struct osl_statement * next; /* Next statement in the list */
2174 typedef struct osl_statement osl_statement_t;
2175 typedef struct osl_statement * osl_statement_p;
2179 @noindent The @code{osl_statement_t} structure represents a node
2180 in a @code{NULL}-terminated linked list of statements. Each node contains the
2181 useful information for a given statement to process it within a polyhedral
2182 framework. The order in the list may matter for naming conventions
2183 (e.g. "S1" for the first statement in the list). The iteration domain
2184 and the scattering are represented using an @code{osl_relation_p}
2185 structure while the accesses are using a list of
2186 relations: one for each memory access in the statement.
2187 The @code{body} field should provide information about the statement body
2188 (since it has a generic type, the specification is not strict about how it
2189 is used), e.g., using the @code{osl_body_t} data structure (@pxref{osl_body_t}).
2190 It is also possible to use the @code{usr} field, but it has to be
2191 totally managed by the user.
2193 @c ---------------------------------------------------------------------------
2196 @subsection osl_scop_t
2200 int version; /* Version of the data structure */
2201 char * language; /* Target language */
2202 osl_relation_p context; /* Constraints on the parameters */
2203 osl_generic_p parameters; /* Information about parameters */
2204 osl_statement_p statement; /* Statement list */
2205 osl_interface_p registry; /* Registered extension interfaces */
2206 osl_generic_p extension; /* Extension list */
2207 void * usr; /* A user-defined field */
2208 struct osl_scop * next; /* Next scop in the list */
2210 typedef struct osl_scop osl_scop_t;
2211 typedef struct osl_scop * osl_scop_p;
2215 @noindent @code{osl_scop_t} represents a node in a
2216 @code{NULL}-terminated list of scops. It stores the useful informations
2217 of a static control part of a program to process it within a polyhedral
2218 framework. To prepare OpenScop specification evolution, the @code{version}
2219 field tells the version of the data structure. It should be set to 1 for
2220 now (and hopefully a very, very, long time).
2221 First, it contains the informations about the context. The target language
2222 in expressed in the @code{language} field. The constraints on the
2223 global parameters are detailed in the @code{context} field.
2224 The @code{paremeters} field should provide information about the
2225 parameters (since it has a generic type, the specification is not strict
2226 about how it is used), e.g., using the @code{osl_strings_t} data structure
2227 (@pxref{osl_strings_t}).
2228 Finally, it contains the list of statements @code{statement}, the list
2229 of registered interfaces for generic types @code{registry} and the list of
2230 extentions @code{extension}.
2231 It is also possible to use the @code{usr} field, but it has to be
2232 totally managed by the user.
2234 As an example, let us consider again the matrix multiply program
2235 (@pxref{Preliminary Example}).
2236 The next figure gives a possible representation in memory for this
2237 SCoP thanks to the OpenScop data structures (it has been actually printed
2238 by the @code{osl_scop_dump} function), note that symbols like
2239 parameters, original iterators and statement expression are represented
2240 with an @code{osl_strings_t} which does not belong to the
2241 specification but to the OpenScop Library implementation:
2251 | +-- osl_relation_t (CONTEXT, 32 bits)
2257 | | +-- osl_interface_t: URI = strings
2259 | | +-- osl_strings_t: N
2262 | +-- osl_statement_t (S1)
2264 | | +-- osl_relation_t (DOMAIN, 32 bits)
2267 | | | [ 1 -1 0 1 -1 ]
2269 | | | [ 1 0 -1 1 -1 ]
2271 | | +-- osl_relation_t (SCATTERING, 32 bits)
2273 | | | [ 0 -1 0 0 0 0 0 0 0 0 ]
2274 | | | [ 0 0 -1 0 0 0 1 0 0 0 ]
2275 | | | [ 0 0 0 -1 0 0 0 0 0 0 ]
2276 | | | [ 0 0 0 0 -1 0 0 1 0 0 ]
2277 | | | [ 0 0 0 0 0 -1 0 0 0 0 ]
2279 | | +-- osl_relation_list_t
2281 | | | +-- osl_relation_t (WRITE, 32 bits)
2283 | | | | [ 0 -1 0 0 0 0 0 1 ]
2284 | | | | [ 0 0 -1 0 1 0 0 0 ]
2285 | | | | [ 0 0 0 -1 0 1 0 0 ]
2288 | | +-- osl_generic_t
2290 | | | +-- osl_interface_t: URI = body
2292 | | | +-- osl_strings_t: i j
2294 | | | +-- osl_strings_t: C[i][j] = 0.0;
2298 | | osl_statement_t (S2)
2300 | | +-- osl_relation_t (DOMAIN, 32 bits)
2302 | | | [ 1 1 0 0 0 0 ]
2303 | | | [ 1 -1 0 0 1 -1 ]
2304 | | | [ 1 0 1 0 0 0 ]
2305 | | | [ 1 0 -1 0 1 -1 ]
2306 | | | [ 1 0 0 1 0 0 ]
2307 | | | [ 1 0 0 -1 1 -1 ]
2309 | | +-- osl_relation_t (SCATTERING, 32 bits)
2311 | | | [ 0 -1 0 0 0 0 0 0 0 0 0 0 0 ]
2312 | | | [ 0 0 -1 0 0 0 0 0 1 0 0 0 0 ]
2313 | | | [ 0 0 0 -1 0 0 0 0 0 0 0 0 0 ]
2314 | | | [ 0 0 0 0 -1 0 0 0 0 1 0 0 0 ]
2315 | | | [ 0 0 0 0 0 -1 0 0 0 0 0 0 1 ]
2316 | | | [ 0 0 0 0 0 0 -1 0 0 0 1 0 0 ]
2317 | | | [ 0 0 0 0 0 0 0 -1 0 0 0 0 0 ]
2319 | | +-- osl_relation_list_t
2321 | | | +-- osl_relation_t (WRITE, 32 bits)
2323 | | | | [ 0 -1 0 0 0 0 0 0 1 ]
2324 | | | | [ 0 0 -1 0 1 0 0 0 0 ]
2325 | | | | [ 0 0 0 -1 0 1 0 0 0 ]
2328 | | | osl_relation_list_t
2330 | | | +-- osl_relation_t (READ, 32 bits)
2332 | | | | [ 0 -1 0 0 0 0 0 0 1 ]
2333 | | | | [ 0 0 -1 0 1 0 0 0 0 ]
2334 | | | | [ 0 0 0 -1 0 1 0 0 0 ]
2337 | | | osl_relation_list_t
2339 | | | +-- osl_relation_t (READ, 32 bits)
2341 | | | | [ 0 -1 0 0 0 0 0 0 2 ]
2342 | | | | [ 0 0 -1 0 1 0 0 0 0 ]
2343 | | | | [ 0 0 0 -1 0 0 1 0 0 ]
2346 | | | osl_relation_list_t
2348 | | | +-- osl_relation_t (READ, 32 bits)
2350 | | | | [ 0 -1 0 0 0 0 0 0 3 ]
2351 | | | | [ 0 0 -1 0 0 0 1 0 0 ]
2352 | | | | [ 0 0 0 -1 0 1 0 0 0 ]
2355 | | +-- osl_generic_t
2357 | | | +-- osl_interface_t: URI = body
2359 | | | +-- osl_strings_t: i j k
2361 | | | +-- osl_strings_t: C[i][j] = C[i][j] + A[i][k]*B[k][j];
2365 | +-- NULL interface
2371 @c @end smallexample
2374 @c %/*************************************************************************
2376 @c % *************************************************************************/
2380 The core part of the OpenScop representation embeds what is strictly
2381 necessary to build a complete source-to-source polyhedral framework.
2382 However it may not be enough. Hence, OpenScop offers a very flexible
2383 extension part. Actually, the only constraint to build an extension is
2384 to request the OpenScop maintainer for a unique extension name: its URI
2385 (ask the maintainer through the OpenScop mailing list
2386 @email{openscop-development@@googlegroups.com}).
2388 The policy to support extensions is the following and is pretty simple: an
2389 OpenScop implementation is not required to support any extension. If it
2390 is processing an OpenScop file or data structure which contains an
2391 extension which is not supported, it must (1) warn the user with the
2392 mention of the URI of the non-supported extension
2393 and (2) ignore this extension.
2395 Extensions in an OpenScop file are provided after the core part, without
2396 any specific order. Each extension is delimited using
2397 XML-like tags corresponding to its URI (e.g., if the extension has the URI
2398 @code{foo}, the begin tag is @code{<foo>} and the end tag is @code{</foo>}).
2399 There is no specification or preferred way to write the extension body.
2400 Extensions in an OpenScop data structure must be accessible through one
2401 pointer. This pointer will be stored in the @code{data} field of an
2402 @code{osl_generic_t} container (@pxref{osl_generic_t}). There must be only
2403 one extension with the same URI in an OpenScop file or data structure.
2405 Extension writers may write a short documentation about their extension to
2406 be added to this document. For consistency reason, this
2407 documentation should comply to the documentation of the
2408 @code{comment} option (@pxref{Comment Extension}). To describe the
2409 file format, it is allowed to reuse the existing rules and terminals
2410 present in the OpenScop file format description without defining them
2411 (@pxref{OpenScop File Format Specification}). By sending a
2412 documentation, you accept it to be added to this document. In
2413 particular, the sender fully accepts the license and copyright notice.
2416 * Comment Extension::
2417 * Arrays Extension::
2418 * Scatnames Extension::
2419 * Coordinates Extension::
2420 * Irregular Extension::
2423 @c ---------------------------------------------------------------------------
2425 @node Comment Extension
2426 @subsection Comment Extension
2428 @noindent @strong{Description}
2430 @item URI: @code{comment}.
2431 @item Author: C@'edric Bastoul <cedric.bastoul@@u-psud.fr>.
2432 @item Purpose: the @code{comment} extension stores a textual string.
2435 @noindent @strong{File Format}
2437 @noindent The @code{comment} extension file format respects the following
2440 Comment_generic ::= "<comment>" Comment "</comment>"
2444 @noindent An example of textual @code{comment} extension is the following:
2448 This is a comment string.
2453 @noindent @strong{Data Structure}
2455 @noindent The @code{comment} extension data structure is the following:
2458 struct osl_comment @{
2459 char * comment; /* Comment message as a 0-terminated string */
2461 typedef struct osl_comment osl_comment_t;
2462 typedef struct osl_comment * osl_comment_p;
2466 @c ---------------------------------------------------------------------------
2469 @node Scatnames Extension
2470 @subsection Scatnames Extension
2472 @noindent @strong{Description}
2474 @item URI: @code{scatnames}.
2475 @item Author: C@'edric Bastoul <cedric.bastoul@@u-psud.fr>.
2476 @item Purpose: the @code{scatnames} extension provides a list of textual
2477 scattering dimension names.
2480 @noindent @strong{File Format}
2482 @noindent The @code{scatnames} extension file format respects the following
2483 grammar. It reuses the @code{Strings} description (@pxref{Strings Generic}):
2485 Scatnames_generic ::= "<scatnames>" Scatnames "</scatnames>"
2486 Scatnames ::= Strings
2489 @noindent The list of scattering dimension names is provided on one single
2490 line. The names are separated with spaces. A possible
2491 example of such an extension is the following:
2496 # List of scattering dimension names:
2497 beta_0 i beta_1 j beta_2
2502 @noindent @strong{Data Structure}
2504 @noindent The @code{scatnames} extension data structure is the following:
2508 struct osl_scatnames @{
2509 osl_strings_p names; /* List of textual scattering dimension names. */
2511 typedef struct osl_scatnames osl_scatnames_t;
2512 typedef struct osl_scatnames * osl_scatnames_p;
2516 @noindent The order of the scattering dimension names in the list corresponds
2517 to the order of the scattering dimensions.
2519 @c ---------------------------------------------------------------------------
2522 @node Arrays Extension
2523 @subsection Arrays Extension
2525 @noindent @strong{Description}
2527 @item URI: @code{arrays}.
2528 @item Author: C@'edric Bastoul <cedric.bastoul@@u-psud.fr>.
2529 @item Purpose: the @code{arrays} extension provides a set of textual array
2530 names corresponding to the array identifiers used in the access relations.
2533 @noindent @strong{File Format}
2535 @noindent The @code{arrays} extension file format respects the following
2538 Arrays_generic ::= "<arrays>" Arrays "</arrays>"
2539 Arrays ::= Nb_items Item_list
2540 Item_List ::= Item Item_list | (void)
2541 Item ::= Identifier Name
2542 Nb_items ::= _Integer
2543 Identifier ::= _Integer
2547 @noindent The number of array names is provided on the first line,
2548 then each following line contains a couple identifier-name.
2549 For instance, the following example is a correct textual @code{arrays}
2550 extension. It corresponds to the array names of the preliminary example
2551 (@pxref{Preliminary Example}):
2556 # Number of array names:
2558 1 C # Identifier 1 corresponds to array name "C"
2559 3 B # Identifier 3 corresponds to array name "B"
2560 2 A # Identifier 2 corresponds to array name "A"
2565 @noindent @strong{Data Structure}
2567 @noindent The @code{arrays} extension data structure is the following:
2571 struct osl_arrays @{
2572 int nb_names; /* Number of names */
2573 int * id; /* Array of nb_names identifiers */
2574 char ** names; /* Array of nb_names names */
2576 typedef struct osl_arrays osl_arrays_t;
2577 typedef struct osl_arrays * osl_arrays_p;
2581 @noindent Each name has a name string and an identifier: the ith name has name
2582 string @code{names[i]} and identifier @code{id[i]}.
2585 @c ---------------------------------------------------------------------------
2587 @node Coordinates Extension
2588 @subsection Coordinates Extension
2590 @noindent @strong{Description}
2592 @item URI: @code{coordinates}.
2593 @item Author: C@'edric Bastoul <cedric.bastoul@@u-psud.fr>.
2594 @item Purpose: the @code{coordinates} extension provides the information
2595 about the SCoP location in the original code: the original file name/path,
2596 the starting and ending lines of the SCoP in this file (inclusives) and
2597 the indentation level.
2600 @noindent @strong{File Format}
2602 @noindent The @code{coordinates} extension file format respects the following
2605 Coordinates_generic ::= "<coordinates>" Coordinates "</coordinates>"
2606 Coordinates ::= File_name Start_line End_line Indentation
2607 File_name ::= _String
2608 Start_line ::= _Integer
2609 End_line ::= _Integer
2610 Indentation ::= _Integer
2613 @noindent The original file name where the SCoP has been extracted is
2614 provided on the first line, then the starting line number of the SCoP,
2615 then the ending line number of the SCoP, and lastly the indentation level
2616 (the number of spaces characters each line of the SCoP starts with).
2617 For instance, the following example is a correct textual
2618 @code{coordinates} extension:
2635 @noindent @strong{Data Structure}
2637 @noindent The @code{coordinates} extension data structure is the following:
2641 struct osl_coordinates @{
2642 char * name; /* File name */
2643 int start; /* First line of the SCoP in the source file */
2644 int end; /* Last line of the SCoP in the source file */
2645 int indent; /* Indentation */
2647 typedef struct osl_coordinates osl_coordinates_t;
2648 typedef struct osl_coordinates * osl_coordinates_p;
2653 @c ---------------------------------------------------------------------------
2655 @node Irregular Extension
2656 @subsection Irregular Extension
2659 @c ---------------------------------------------------------------------------
2664 OpenScop is a follow-up of Louis-No@"el Pouchet et al.'s ScopLib effort which
2665 was itself based on C@'edric Bastoul et al.'s Clan tool. People involved in
2666 OpenScop's genesis are:
2668 @item C@'edric Bastoul
2669 @item Uday Bondhugula
2670 @item Tobias Grosser
2671 @item Louis-No@"el Pouchet
2672 @item Sven Verdoolaege
2675 @c %/*************************************************************************
2676 @c % * OpenScop LIBRARY *
2677 @c % *************************************************************************/
2679 @node OpenScop Library
2680 @chapter OpenScop Library
2682 The OpenScop Library, or OSL for short, is an example implementation of the
2683 OpenScop specification. Its API is not part of the OpenScop specification.
2684 It offers basic functionalities to manipulate the OpenScop data structures
2685 (allocate, free, copy, dump, etc.) and file format (read, print, etc.).
2686 The OpenScop Library is @emph{not} a polyhedral library. OpenScop is an
2687 exchange format, and the OpenScop Library reflects this.
2689 It is a Free Software using the 3-clause BSD License.
2690 Programmers should feel free to use
2691 it or copy/paste its code in any project, Open Source or not@footnote{Closed
2692 source projects should consider to provide some OpenScop file input
2693 and output, so they can be incorporated to any OpenScop-based tool chain.}.
2698 * Example of OpenScop Library Utilization::
2707 The OpenScop specification does not impose a specific type for the
2708 constraint matrix elements. For a maximum flexibility, the OpenScop Library
2709 offers an hybrid precision implementation. It supports 32 bits, 64 bits and
2710 multiple precision (relying on GNU GMP) relations transparently. At relation
2711 allocation time, users have two ways to set the precision. The first way is
2712 to call an allocation function with a precision parameter. The second way is
2713 to rely on the environment variable @code{OSL_PRECISION}.
2714 The accepted values for this variable are @code{32} for 32 bits precision,
2715 @code{64} for 64 bits precision and @code{0} for multiple precision. When this
2716 variable is set, its value becomes the default precision for relation elements.
2717 For instance, to ensure the OpenScop Library will use 64 bits precision
2718 by default, the user may set:
2720 export OSL_PRECISION=64
2722 @noindent if his shell is, e.g., bash or
2724 setenv OSL_PRECISION 64
2726 @noindent if his shell is, e.g., tcsh. The user should ad this line to
2727 his .bashrc or .tcshrc (or whatever convenient file) to make this
2730 @node Base Functions
2731 @section Base Functions
2733 The OpenScop Library provides, for each OpenScop data structure,
2734 a set of functions devoted to basic manipulation, conversion
2735 from file format to data structures and from data structures to
2736 file format. The naming convention is consistent for all data
2737 structures. Hence, the function prototypes differ only with the
2738 name of the data structure. In the following, we will use the
2739 generic term of @emph{structure} to refer at any OpenScop
2740 data structure. For instance the
2741 @code{osl_}@emph{structure}@code{_malloc()} function is a
2742 generic name can be instantiated to
2743 @code{osl_relation_malloc()} or
2744 @code{osl_statement_malloc()} etc.
2746 We present in this documentation only
2747 the main functions. Many other utility functions are provided
2748 to ease OpenScop format manipulation. The reader is invited to
2749 refer at the technical documentation to learn everything about the
2764 @subsection Dumping: osl_@emph{structure}_dump and idump
2768 void osl_@emph{structure}_dump(FILE * output, osl_@emph{structure}_p s);
2769 void osl_@emph{structure}_idump(FILE * output, osl_@emph{structure}_p s, int i);
2773 @noindent Each OpenScop data structure has a dumping functions
2774 as shown above. Dumping means writing down the content of the data
2775 structure pointed by @code{s} (and its fields recursively)
2776 in a textual form to the
2777 @code{output} file (the file, possibly @code{stdout}, has to be open
2778 for writing). The textual form is not the OpenScop file format but
2779 another representation closer to the internal representation in
2780 memory and mainly intended for debugging purpose. The @code{idump}
2781 function has an additional integer parameter which corresponds to
2782 an indentation level.
2785 @subsection Printing: osl_@emph{structure}_print
2789 void osl_@emph{structure}_print(FILE * output, osl_@emph{structure}_p s);
2793 @noindent Each OpenScop data structure has a pretty printing function
2794 as shown above. It prints the content of the data
2795 structure pointed by @code{s} (and its fields recursively)
2796 according to the OpenScop file format
2797 (@pxref{OpenScop File Format Specification}) to the
2798 @code{output} file (the file, possibly @code{stdout}, has to be open
2802 @subsection Reading: osl_@emph{structure}_read
2806 osl_@emph{structure}_p osl_@emph{structure}_read(FILE * input);
2810 @noindent Each OpenScop data structure has a reading function
2811 as shown above. It reads the content of an OpenScop
2812 data structure written according to the OpenScop file format
2813 (@pxref{OpenScop File Format Specification}) from
2814 the @code{input} file (the file, possibly @code{stdin}, has to be open
2815 for reading). It returns a pointer to a freshly allocated
2816 @code{osl_@emph{structure}_t} structure containing the
2820 @subsection Allocating: osl_@emph{structure}_malloc
2824 osl_@emph{structure}_p osl_@emph{structure}_malloc();
2828 @noindent Each OpenScop data structure has a memory allocation function
2829 as shown above (except one see below). It allocates the memory to store
2830 the corresponding data structure, it initializes the pointer fields to
2831 @code{NULL} and the integer fields to @code{OSL_UNDEFINED}
2832 (@code{-1}) and it returns a pointer to the allocated space.
2834 An exception to this base description is the
2835 @code{osl_relation_malloc()} function which requires two
2836 parameters: the number of rows and columns of the constraint
2837 matrix (@pxref{Relations}):
2841 osl_relation_p osl_relation_malloc(int nb_rows, int nb_columns);
2845 @noindent The precision of the relation elements will depend on the
2846 @code{OSL_PRECISION} environment variable (@pxref{Precision}) if it is set,
2847 or the maximum available precision if it is not set. Another allocation
2848 function is provided to explicitly set a given precision:
2852 osl_relation_p osl_relation_pmalloc(int precision,
2853 int nb_rows, int nb_columns);
2857 @noindent The @code{precision} field may take the following values:
2859 @item @code{OSL_PRECISION_SP} for 32 bits precision,
2860 @item @code{OSL_PRECISION_DP} for 64 bits precision,
2861 @item @code{OSL_PRECISION_MP} for multiple precision,
2865 @subsection Deallocating: osl_@emph{structure}_free
2869 void osl_@emph{structure}_free(osl_@emph{structure}_p s);
2873 @noindent Each OpenScop data structure has a memory deallocation function
2874 as shown above. It recursively frees the memory allocated for the
2875 structure pointed by @code{s}, i.e., internal structures are also freed.
2878 @subsection Cloning: osl_@emph{structure}_clone
2882 osl_@emph{structure}_p osl_@emph{structure}_clone(osl_@emph{structure}_p s);
2886 @noindent Each OpenScop data structure has a clone function
2887 as shown above. It recursively copies the content of the
2888 structure pointed by @code{s}, i.e., internal structures are also copied.
2889 It returns a pointer to the clone of the structure pointed by @code{s}.
2892 @subsection Testing: osl_@emph{structure}_equal
2896 int osl_@emph{structure}_equal(osl_@emph{structure}_p s1, osl_@emph{structure}_p s2);
2900 @noindent Each OpenScop data structure has a testing function
2901 as shown above. It checks whether two pointers are referring to equivalent
2902 structures (either by pointing to the same structure or to different
2903 structures which contain the same information). It returns 1 if the
2904 pointed structures are equivalent, 0 otherwise. This test is
2905 @emph{content-based} and is intended for debugging purpose. It is not
2906 (and will never be) able to state, e.g., that two relations with
2907 different constraint matrices are actually representing the same relation.
2910 @node Example of OpenScop Library Utilization
2911 @section Example of OpenScop Library Utilization
2912 Here is a basic example showing how it is possible to use the
2913 OpenScop Library, assuming that a standard installation has been done.
2914 The following C program reads an OpenScop file from the standard
2915 input and dumps the content of the data structures to the standard output.
2920 # include <osl/osl.h>
2925 // Read the OpenScop file.
2926 scop = osl_scop_read(stdin);
2928 // Dump the content of the scop data structure.
2929 osl_scop_dump(stdout, scop);
2932 osl_scop_free(scop);
2938 @noindent The compilation command could be:
2940 gcc example.c -losl -o example
2942 @noindent A calling command with the input file test.scop could be:
2944 more test.scop | ./example
2948 @c % ****************************** INSTALLING ********************************
2950 @section Installation
2955 * Installation Instructions::
2956 * Optional Features::
2962 First of all, it would be very kind to refer the present document in any
2963 publication that results from the use of the OpenScop specification or library,
2964 @pxref{Bas11} (a bibtex entry is provided behind the title page of this
2965 manual, along with the copyright notice).
2966 The OpenScop Library is provided under the 3-clause BSD license:
2968 Copyright (C) 2011 University Paris-Sud 11 and INRIA
2970 Redistribution and use in source and binary forms, with or without
2971 modification, are permitted provided that the following conditions
2974 @item Redistributions of source code must retain the above copyright notice,
2975 this list of conditions and the following disclaimer.
2976 @item Redistributions in binary form must reproduce the above copyrigh
2977 notice, this list of conditions and the following disclaimer in the
2978 documentation and/or other materials provided with the distribution.
2979 @item The name of the author may not be used to endorse or promote products
2980 derived from this software without specific prior written permission
2983 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
2984 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
2985 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
2986 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
2987 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
2988 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
2989 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
2990 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
2991 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
2992 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2995 @subsection Requirements
2997 The OpenScop Library is a stand-alone library. For a basic use,
2998 it does not need any additional tool or library. Anyway, to be able to
2999 work in conjunction with other tools that manipulate multiple precision
3000 numbers, the GNU GMP library can be used as an option.
3008 @subsubsection GMP Library (optional)
3010 To be able to deal with insanely large coefficient, the user will need to
3011 install the GNU Multiple Precision Library (GMP for short) version 4.2.2
3012 or above@footnote{@code{http://www.swox.com/gmp}}.
3013 The user can compile it by typing the following commands on the GMP root
3017 @item @code{./configure}
3019 @item And as root: @code{make install}
3022 The GMP default installation is @code{/usr/local}. This directory may
3023 not be inside the user's library path. To fix the problem, the user should set
3025 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
3027 @noindent if your shell is, e.g., bash or
3029 setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib
3031 @noindent if your shell is, e.g., tcsh. Add the line to your .bashrc or .tcshrc (or
3032 whatever convenient file) to make this change permanent. Another solution
3033 is to ask GMP to install in the standard path by using the prefix
3034 option of the configure script:
3035 @samp{./configure --prefix=/usr}.
3037 The OpenScop Library has to be built using the GMP library by specifying
3038 the convenient configure script options to buid the GMP version
3039 (@pxref{Optional Features}).
3042 @node Installation Instructions
3043 @subsection Installation Instructions
3045 Once downloaded and unpacked
3046 (e.g. using the @samp{tar -zxvf openscop-@value{LIB_VERSION}.tar.gz} command),
3047 you can compile the OpenScop Library by typing the following commands
3048 on the OpenScop Library's root directory:
3051 @item @code{./autogen.sh}
3052 @item @code{./configure}
3054 @item And as root: @code{make install}
3057 The program binaries and object files can be removed from the
3058 source code directory by typing @code{make clean}. To also remove the
3059 files that the @code{configure} script created (so you can compile the
3060 package for a different kind of computer) type @code{make distclean}.
3062 @node Optional Features
3063 @subsection Optional Features
3064 The @code{configure} shell script attempts to guess correct values for
3065 various system-dependent variables and user options used during compilation.
3066 It uses those values to create the @code{Makefile}. Various user options
3067 are provided by the OpenScop Library's configure script. They are summarized in the
3068 following list and may be printed by typing @code{./configure --help} in the
3069 OpenScop Library top-level directory.
3072 @item By default, the installation directory is @code{/usr/local}:
3073 @code{make install} will install the package's files in
3074 @code{/usr/local/bin}, @code{/usr/local/lib} and @code{/usr/local/include}.
3075 The user can specify an installation prefix other than @code{/usr/local} by
3076 giving @code{configure} the option @code{--prefix=PATH}.
3078 @item By default, The OpenScop Library supports 32 bits, 64 bits and GMP if
3079 it is installed in the standard locations. Using the @code{--with-gmp} option
3080 of @code{configure} the user can specify that no GMP (@code{--with-gmp=no}),
3081 a previously installed (@code{--with-gmp=system}, the default) GMP or a
3082 build GMP (@code{--with-gmp=build}) GMP should be used.
3083 In case of an installed GMP, the installation location can be specified
3084 using the @code{--with-isl-prefix=PATH} and if different, the installation
3085 of the library can be specified using the
3086 @code{--with-isl-exec-prefix=PATH} options of @code{configure}.
3087 In the case of a build GMP, the user can also specify the build location
3088 using @code{--with-isl-builddir=PATH}.
3091 @node Uninstallation
3092 @subsection Uninstallation
3093 The user can easily remove the OpenScop Library from his system
3094 by typing (as root if necessary) from the OpenScop Library top-level
3096 @code{make uninstall}.
3098 @c % **************************** DOCUMENTATION ******************************
3100 @section Documentation
3101 The OpenScop Library distribution provides several sources of documentation.
3102 First, the source code itself is as documented as much as possible.
3103 The code comments use the Doxygen technical documentation system.
3104 The user may install
3105 Doxygen@footnote{@code{http://www.stack.nl/~dimitri/doxygen}} to automatically
3106 generate a technical documentation by typing @code{make doc} or
3107 @code{doxygen ./autoconf/Doxyfile} at the OpenScop Library
3108 top-level directory after running the configure script
3109 (@pxref{Installation Instructions}). Doxygen will generate
3110 documentation sources (in HTML, LaTeX and man) in the @code{doc/source}
3111 directory of the OpenScop Library distribution.
3113 The Texinfo source of the present document is also provided in the @code{doc}
3114 directory. The user can build it in either PDF format
3115 (by typing @code{texi2pdf openscop.texi}) or HTML format
3116 (by typing @code{makeinfo --html openscop.texi}, using @code{--no-split}
3117 option to generate a single HTML file) or info format
3118 (by typing @code{makeinfo openscop.texi}).
3120 @c % **************************** DEVELOPPING ********************************
3122 @section Development
3128 * Extension Development::
3131 @node Copyright Issue
3132 @subsection Copyright Issue
3134 The OpenScop Library is an Open Source project and you should feel free to
3135 contribute by adding functionalities (in particular extensions), correcting
3136 bugs or improving documentation. However, for painful administrative reasons,
3137 the copyright of the core part (everything except extensions) should not be
3138 impacted by your work. Hence, if you are doing a significant contribution to
3139 the main part, the OpenScop Library maintainer may ask you for an agreement
3140 about this copyright. If you plan to do such a significant contribution, it
3141 may be wise to discuss this issue with the maintainer first. Extensions
3142 may include developer's own copyright.
3145 @subsection Repository
3147 The main repository of the OpenScop Library is
3148 @url{http://repo.or.cz/w/openscop.git}. Developers may ask the OpenScop Library
3149 maintainer to open them a write access to this repository. Only the maintainer
3150 should ever change the @code{master} branch. Developers should work on their
3151 own branches. To avoid any problem developers should use the @emph{fork}
3152 functionality of the repository.
3155 @subsection Coding Style
3157 The OpenScop Library is written in C using an object oriented style. Each
3158 important data structure (e.g., @code{struct foo}) has its own header file
3159 (@code{include/osl/foo.h}) where lies the definition of
3160 the data structure, the two typedefs for the data structure (one for the
3161 structure, @code{osl_foo_t}, and one for a pointer
3162 to the structure, @code{osl_foo_p}), the prototypes of the various
3163 functions related to this data structure, all named using the
3164 prefix "@code{osl_foo_}". The source code of the functions is provided in a
3165 separated C file (@code{source/foo.c}).
3167 Utility functions independent from the main data structures may be placed in
3168 separate source files (e.g., definition in @code{include/osl/util.h}
3169 and code in @code{source/util.c}). Tool-wide preprocessor directives are
3170 placed in @code{include/osl/macros.h}, macros are prefixed with
3173 The core code itself has to be written according to the Google C++ Coding Style:
3174 @url{http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml} (for
3175 what can apply to C), plus the naming conventions discussed above with
3176 highest priority. The extension parts must only respect the naming convention,
3177 but a consistent coding style is much appreciated.
3179 @node Extension Development
3180 @subsection Extension Development
3182 It's fairly easy to integrate a new extension to OpenScop and the OpenScop
3183 Library. Developing a new extension is very much like adding a new "object":
3184 it requires writing a data structure for the extension data and the 7 base
3185 functions to manage this extension. Here is how developers should proceed
3186 to add an extension called @code{foo} (beware that the naming convention is
3190 @item Send the name @code{foo} to the maintainer to ensure it is unique and
3191 hence can be used as an URI. The name (one single
3192 word, or words separated with underscores "_") should be
3193 suggested by the extension developers to the OpenScop development
3194 mailing list @email{openscop-development@@googlegroups.com}). It
3195 should not correspond to an existing structure name
3196 (see @code{include/osl/osl.h} for the list). The
3197 maintainer will update @code{include/osl/osl.h} in the development
3198 version accordingly.
3199 @item Look at the @code{comment} extension. The @code{comment} extension
3200 (@pxref{Comment Extension}) has been written to be used as a basic
3201 example for extension developers. Having a look at
3202 @code{include/osl/extensions/comment.h} and
3203 @code{source/extensions/comment.c} will be a great help to do it right.
3204 @item Create the extension data structure: it is necessary that the
3205 extension data can be accessible through only one pointer.
3206 @item Code the 7 base functions for the extension. Any extension must provide
3207 this set of functions (naming convention apply only if the
3208 extension is planed to be integrated to the OpenScop Library
3209 default extensions):
3211 @item @code{osl_foo_idump} (@pxref{Dumping})
3212 @item @code{osl_foo_sprint} has the following prototype:
3215 char * osl_@emph{structure}_sprint(osl_@emph{structure}_p s);
3218 It corresponds to the pretty printing functions of the core
3219 data structures (@pxref{Printing}) with the
3220 difference that the OpenScop textual representation is written
3221 to a string (returned by the function) instead of a file.
3222 @item @code{osl_foo_sread} has the following prototype:
3225 osl_@emph{structure}_p osl_@emph{structure}_sread(char ** string);
3228 It corresponds to the reading functions of the core
3229 data structures (@pxref{Reading}) with the
3230 difference that the OpenScop textual representation is read
3231 from a string (provided as a parameter) instead of a file.
3232 The address of the string to read is passed as a parameter and
3233 is updated to point immediately after what has been actually read.
3234 @item @code{osl_foo_malloc} (@pxref{Allocating})
3235 @item @code{osl_foo_free} (@pxref{Deallocating})
3236 @item @code{osl_foo_clone} (@pxref{Cloning})
3237 @item @code{osl_foo_equal} (@pxref{Testing})
3239 @item Code the other functions you need!
3242 Now let us consider two scenarios.
3244 First scenario, the extension is external and is
3245 not planned to be integrated to the OpenScop Library. In this case you are
3246 all set. Simply generate an @code{osl_interface_t} for your
3247 extension and have a look at the function
3248 @code{osl_scop_register_extension()} which is devoted to register
3249 a new extension interface to an existing @code{osl_scop_t}.
3251 Second scenario, the extension will integrate the set of default
3252 OpenScop Library extensions (the best solution to share it to other
3253 potential users). In this case, a few additional
3254 things have to be done:
3256 @item Create the extension header
3257 @code{include/osl/extensions/foo.h} to store the extension
3258 structure and function prototypes and the
3259 extension source file @code{source/extensions/foo.c} for the code
3261 @item Add the documentation for the extension to the texinfo source of
3262 this document (in @code{doc/openscop.texi}), following the example
3263 of the @code{comment} documentation (@pxref{Comment Extension}).
3264 @item Integrate the extension by adding the @code{extensions/foo.c} entry
3265 to the @code{libosl_la_SOURCES} in the @code{source/Makefile.am}
3266 file, the @code{osl/foo.h} entry to the
3267 @code{nobase_pkginclude_HEADERS} and add the corresponding
3268 @code{#include <osl/extensions/foo.h>} in the
3269 @code{include/scop.h.in} file.
3270 @item Add the new extension in the
3271 @code{osl_interface_get_default_registry()} function.
3272 @item You are done! Prepare a single commit or patch corresponding to the
3273 integration of the new extension and ask the maintainer to merge it
3274 to the master branch.
3278 @c % ****************************** REFERENCES ********************************
3284 @anchor{Bas03a}[Bas03a] C. Bastoul, P. Feautrier. Improving data locality
3285 by chunking. CC'12 International Conference on Compiler Construction,
3286 LNCS 2622, pages 320-335, Warsaw, April 2003.
3289 @anchor{Bas11}[Bas11] C. Bastoul.
3290 OpenScop: A Specification and a Library for Data Exchange in Polyhedral
3291 Compilation Tools. Technical Report, Paris-Sud University, France, June 2011.
3294 @anchor{Fea92}[Fea92] P. Feautrier. Some efficient solutions to the affine
3295 scheduling problem, part II: multidimensional time.
3296 International Journal of Parallel Programming, 21(6):389--420, December 1992.
3299 @anchor{Gri04}[Gri04] M. Griebl. Automatic parallelization of loop programs
3300 for distributed memory architectures. Habilitation Thesis. Facult@"at f@"ur
3301 Mathematik und Informatik, Universit@"at Passau, 2004.
3302 @emph{http://www.infosun.fmi.uni-passau.de/cl/loopo/}
3305 @anchor{Wil93}[Wil93] Doran K. Wilde.
3306 A library for doing polyhedral operations.
3307 Technical Report 785, IRISA, Rennes, France, 1993.
3314 @c % /*************************************************************************
3315 @c % * PART VI: END OF THE DOCUMENT *
3316 @c % *************************************************************************/
3317 @c @unnumbered Index