(Version of 27-Feb-92)

The CMIF video format was invented to allow various applications to
exchange video data. The format consists of a header containing global
information (like the data format) followed by a sequence of frames,
each consisting of a header followed by the actual frame data.
All information except pixel data is encoded in ASCII. Pixel data is
\fIalways\fP encoded in Silicon Graphics order, which means that the
first pixel in the frame is the lower left one.

All ASCII data except the first line of the file is in Python format.
This means that outer parentheses may be omitted, and that the
parentheses around a tuple with one element may also be omitted, so two
lines that differ only in such parentheses have the same meaning.
To ease parsing in C programs, however, it is advised that there be no
parentheses around single items, and that there be parentheses around
lists.
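
For instance (borrowing the sample \fCyiq\fP data format line shown
further below), the lines

	\fC'yiq',(5,3,3,2,1024)\fP
	\fC('yiq',(5,3,3,2,1024))\fP

have the same meaning, and the second, fully parenthesized spelling is
the preferred one; likewise \fC25\fP is preferred over \fC(25)\fP for a
single item.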

The current version is version 3, but this document also briefly
describes what the previous formats looked like.

The header consists of three lines. The first line identifies the file
as a CMIF video file and gives the version number. All programs expect
the layout of this line to be exact, so no extra spaces etc. should be
added.

The second line specifies the data format. It is a Python tuple with
two members: the first is a string giving the format type, and the
second is a tuple containing type-specific information. The following
formats are currently understood:

\fCrgb\fP: The video data is 24-bit RGB packed into 32-bit words. R is
the least significant byte, then G, then B; the top byte is not used.
There is no type-specific information, so the complete data format line
is

	\fC('rgb', ())\fP
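
In a viewer, unpacking such a pixel is a matter of shifts and masks; a
minimal Python sketch (the function name is illustrative):

	def unpack_rgb(word):
	    # R in the least significant byte, then G, then B.
	    r = word & 0xff
	    g = (word >> 8) & 0xff
	    b = (word >> 16) & 0xff
	    return r, g, b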

\fCgrey\fP: The video data is greyscale, at most 8 bits. Data is packed
into 8-bit bytes (in the low-order bits). The type-specific information
is the number of significant bits, so an example data format line is

	\fC('grey', (8,))\fP

\fCyiq\fP: The video data is in YIQ format. This format has one
luminance component, Y, and two chrominance components, I and Q. The
luminance and chrominance components are encoded in \fItwo\fP pixel
arrays: first an array of 8-bit luminance values, followed by an array
of 16-bit chrominance values. See the section on chrominance coding
for details.

The type-specific part contains the number of bits for Y, I and Q, the
chrominance pack factor and the colormap offset. So, the sample format
line

	\fC('yiq',(5,3,3,2,1024))\fP

means that the pictures have 5-bit Y values (in the luminance array)
and 3 bits each of I and Q (in the chrominance array), that chrominance
data is packed for 2x2 pixels, and that the first colormap index used
is 1024.
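
Since header lines are Python literals, they can be parsed directly; a
minimal sketch using the sample line above (\fCast.literal_eval\fP also
accepts the spelling without the outer parentheses):

	import ast

	line = "('yiq',(5,3,3,2,1024))"
	fmt, (ybits, ibits, qbits, chrompack, offset) = ast.literal_eval(line)
	# fmt == 'yiq', ybits == 5, ibits == qbits == 3,
	# chrompack == 2, offset == 1024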

\fChls\fP: The video data is in HLS format. L is the luminance
component; H and S are the chrominance components. The data format and
type-specific information are the same as for the \fCyiq\fP format.

\fChsv\fP: The video data is in HSV format. V is the luminance
component; H and S are the chrominance components. Again, the data
format and type-specific information are the same as for the \fCyiq\fP
format.

\fCrgb8\fP: The video data is in 8-bit dithered RGB format, the format
used internally by the Indigo: bits 0-2 are green, bits 3-4 are blue
and bits 5-7 are red. Because rgb8 is treated more or less like the yiq
format internally, the type-specific information has the same shape,
with zeroes for the (unused) chrominance sizes.
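
Decoding one such pixel is again a matter of shifts and masks (the
function name is illustrative):

	def unpack_rgb8(byte):
	    g = byte & 0x07         # bits 0-2: green
	    b = (byte >> 3) & 0x03  # bits 3-4: blue
	    r = (byte >> 5) & 0x07  # bits 5-7: red
	    return r, g, b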

The third header line contains the width and height of the video image,
in pixels, and the pack factor of the picture. For compatibility, RGB
images must have a pack factor of 0 (zero), and non-RGB images must
have a pack factor of at least 1. The pack factor is the amount of
compression done on the original video signal to obtain pictures. In
other words, if only one out of every three pixels and lines is stored
(so every 9 original pixels are represented by one pixel in the data)
the pack factor is three. Width and height are the size of the
\fIoriginal\fP picture. Viewers are expected to enlarge the picture so
that it is shown at its original size. RGB videos cannot be packed.
So, the header line

	\fC(200, 200, 2)\fP

means that this was a 200x200 picture that is stored as 100x100 pixels.
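
In other words (a sketch with the sample values; the variable names are
illustrative):

	width, height, packfactor = 200, 200, 2   # the sample header line
	stored_width = width // packfactor        # 100
	stored_height = height // packfactor      # 100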

Each frame is preceded by a single header line. This line contains
timing information and optional size information. The timing
information is mandatory: it gives the time at which this frame should
be displayed, in milliseconds since the start of the film. Frames
should be stored in chronological order.

An optional second number is interpreted as the size of the luminance
data in bytes. Currently this number, if present, should always equal
\fCwidth*height/(packfactor*packfactor)\fP (times 4 for RGB data), but
this might change if we come up with variable-length encodings.

An optional third number is the size of the chrominance data in bytes.
If present, it should equal
\fCluminance_size*2/(chrompack*chrompack)\fP.
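
Putting the two formulas together (a sketch; the function name is
illustrative, and a pack factor of 0 denotes unpacked RGB data as
described above):

	def frame_sizes(width, height, packfactor, chrompack=0):
	    pf = packfactor or 1          # RGB films use pack factor 0
	    pixels = width * height // (pf * pf)
	    if packfactor == 0:           # RGB: 32-bit (4-byte) pixels
	        return pixels * 4, 0
	    lum = pixels                  # one byte per stored pixel
	    chrom = lum * 2 // (chrompack * chrompack) if chrompack else 0
	    return lum, chrom             # luminance and chrominance bytes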

For RGB films, the frame data is an array of 32-bit pixels containing
RGB data in the lower 24 bits. For greyscale films, the frame data is
an array of 8-bit pixels. For split luminance/chrominance films the
data consists of two parts: first an array of 8-bit luminance values,
followed by an array of 16-bit chrominance values.

For all data formats, the data is stored left-to-right, bottom-to-top.
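
A viewer that wants top-to-bottom scanline order therefore has to
reverse the rows; a sketch for data stored at one byte per pixel:

	def flip_rows(data, width, height):
	    rows = [data[i * width:(i + 1) * width] for i in range(height)]
	    return b''.join(reversed(rows))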

Chrominance coding

Since the human eye is apparently more sensitive to luminance changes
than to chrominance changes, we support a coding where the luminance
and chrominance components of the video image are split. The main point
of this is that it allows us to transmit chrominance data at a coarser
granularity than luminance data, for instance one chrominance pixel for
every 2x2 luminance pixels. According to the theory this should result
in an acceptable picture while reducing the amount of data by a fair
amount.
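
With 8-bit luminance values and one 16-bit chrominance word for every
2x2 pixels, for example, a frame costs 8 + 16/4 = 12 bits per pixel,
against the 32 bits per pixel of packed RGB data.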

The coding of split chrominance/luminance data is a bit tricky, to make
maximum use of the graphics hardware on the Personal Iris. Therefore,
there are the following constraints on the numbers of bits used:

- No more than 8 luminance bits.
- No more than 11 bits in total.
- The luminance bits are in the low end of the data word, and are
  stored in the file as 8-bit bytes.
- The two sets of chrominance bits are stored in 16-bit words, correctly
  positioned above the luminance bits and with the colormap offset
  already added in (see the example below).

The colormap offset is added to the chrominance data. The offset should
be at most 4096-256-2**(total number of bits). To reduce interference
with other applications the offset should be at least 1024.

So, as an example, an HLS video with 5 bits of L, 4 bits of H, 2 bits
of S and an offset of 1024 will look as follows in-core and in-file:

	 31          12 11  10   9 8    5 4      0
	+--------------+---+------+------+--------+
	|      0       | 1 |  S   |  H   |   L    |  in-core
	+--------------+---+------+------+--------+

	 15  12 11  10   9 8    5 4      0
	+------+---+------+------+--------+
	|  0   | 1 |  S   |  H   |   0    |  C-array (in file)
	+------+---+------+------+--------+
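
With this layout, the in-core pixel value can be reconstructed from the
file data with a single bitwise OR per pixel; a sketch (names are
illustrative; the mask drops the unused high bits of the luminance
byte):

	def combine(lum_byte, chrom_word, lum_bits=5):
	    return chrom_word | (lum_byte & ((1 << lum_bits) - 1))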