<TITLE>LibMPEG3</TITLE>
<CENTER>
<FONT FACE=HELVETICA SIZE=+4><B>Using LibMPEG3 for advanced MPEG
decoding</B></FONT><P>
<TABLE>
<TR>
<TD>
<CODE>
Author: Heroine Virtual Ltd. (Motion picture solutions for Linux without a warranty)<BR>
Harassment: broadcast@earthling.net<BR>
Homepage: heroinewarrior.com<P>
libmpeg3 is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.<P>
libmpeg3 is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.<P>
</CODE> </TD> </TR> </TABLE> </CENTER>
<H1>Table of contents</H1><P>
<A HREF="#INTRO">Intro</A><BR>
<A HREF="#BUILDING">Building the library</A><BR>
<A HREF="#USAGE">Usage</A><BR>
<A HREF="#TOC">Using tables of contents for editing</A><BR>
<A HREF="#SUBTITLES">Decoding subtitles</A><BR>
<A HREF="#UTILITIES">Using the utilities</A><BR>
<P>
<A NAME=INTRO>
LibMPEG3 decodes several MPEG standards into uncompressed data suitable
for editing and playback.<P>
libmpeg3 currently decodes:<P>
<BLOCKQUOTE>MPEG-2 video<BR>
MPEG-1 video<BR>
mp3 audio<BR>
mp2 audio<BR>
ac3 audio<BR>
MPEG-2 transport streams<BR>
MPEG-2 program streams<BR>
MPEG-1 program streams<BR>
IFO files<BR>
</BLOCKQUOTE><P>
The video output can be in many different color models and frame
sizes. The audio output can be in two's complement or floating point.
Frame accurate seeking, normally impossible in transport streams, is
possible in libmpeg3 through the use of a <B>table of contents</B>.
MPEG-2 video in YUV-422 colorspace is decodable. Digital TV broadcasts
and DVDs can be edited using libmpeg3. Libmpeg3 takes what is
normally a last-mile distribution format and makes it editable.<P>
Because of these and other features, libmpeg3 is not intended for
consumer applications but serves users who are interested in high
quality editing and footage acquisition.<P>
<A NAME=BUILDING>
<FONT FACE=HELVETICA SIZE=+4><B>Building the library</B></FONT><P>
libmpeg3 depends on the CFLAGS environment variable to get optimization
flags. You should set it to <P>
<TT>-O3 -march=i686 -fmessage-length=0 -funroll-all-loops
-fomit-frame-pointer -malign-loops=2 -malign-jumps=2
-malign-functions=2</TT><P>
You must run <B>make</B> to build the library and should be using
kernel 2.4.9 or later. The makefile automatically determines
appropriate parameters and puts the library in i686/libmpeg3.a.
Several utilities are also built. Install the utilities by running
<B>make install</B>.<P>
Unfortunately libmpeg3 exercises the
system more aggressively than a consumer library and this brings out
different bugs in each kernel version:<P>
2.4.9: ext3 filesystem failure<BR>
2.4.10: memory management failure when running mpeg3toc<BR>
2.4.17: memory management failure after 5 hours of decoding video<P>
As libmpeg3 is not one of the standard MPEG decoding
libraries, these utilities are unlike any you've ever seen before.
Remember a utility is only as illegal or legal as the guy who runs
it.<P>
<A NAME=USAGE>
<H1>Usage</H1><P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 1: Verifying file compatibility</B></FONT><P>
Programs using libmpeg3 must <CODE>#include "libmpeg3.h"</CODE>.<P>
Call <CODE>mpeg3_check_sig</CODE> to verify whether the file can be read by
libmpeg3. This returns 1 if the file is compatible and 0 if it isn't.<P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 2: Open the file</B></FONT><P>
You need an <CODE>mpeg3_t*</CODE> file descriptor:<P>
<CODE>
mpeg3_t* file;
</CODE><P>
Then you need to open the file:<P>
<CODE>file = mpeg3_open(char *path);</CODE><P>
<CODE>mpeg3_open</CODE> returns NULL if the file couldn't be opened
for some reason. Be sure to check this. Everything you do with
libmpeg3 requires passing the <CODE>file</CODE> pointer.<P>
Another way of opening a file is <P>
<CODE>mpeg3_open_copy(char *path, mpeg3_t *old_file)</CODE><P>
You need to open multiple copies of a file in realtime situations
because only one thread can access an mpeg3_t structure at a time. The
audio and video can't read simultaneously. The solution is not to
repeatedly call mpeg3_open but to call mpeg3_open_copy for every file
handle after the first one. This copies tables from the first file to
speed up opening.<P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 3: Set optimization strategies</B></FONT><P>
Call <CODE>mpeg3_set_cpus(mpeg3_t *file, int cpus)</CODE> to set how
many CPUs should be devoted to video decompression. LibMPEG3 can use
any number.<P>
Call <CODE>mpeg3_set_mmx(mpeg3_t *file, int use_mmx)</CODE> to set whether
MMX is used for video. Disabling MMX is mandatory for low bitrate
streams since it is very lossy. By the way, lately the compiled MMX
output has been producing corrupted video. This is a change in the way
modern compilers and CPUs handle MMX from the way it was done 4 years
ago, but since modern CPUs are so fast, you're better off not using MMX
at all.<P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 4: Query the file</B></FONT><P>
There are a number of queries for the audio components of the stream:<P>
<CODE><PRE>
int mpeg3_has_audio(mpeg3_t *file);
int mpeg3_total_astreams(mpeg3_t *file);             // Number of multiplexed audio streams
int mpeg3_audio_channels(mpeg3_t *file, int stream);
int mpeg3_sample_rate(mpeg3_t *file, int stream);
long mpeg3_audio_samples(mpeg3_t *file, int stream); // Total length
</PRE></CODE>
The audio is presented as a number of <B>streams</B> starting at 0 and
including <CODE>mpeg3_total_astreams</CODE> - 1. Each stream contains a
certain number of <B>channels</B> starting at 0 and including
<CODE>mpeg3_audio_channels</CODE> - 1.<P>
The method is to first determine whether the file has audio, then get
the number of streams in the file, then for each stream get the number
of channels, sample rate, and length.<P>
There are also queries for the video components:<P>
<CODE><PRE>
int mpeg3_has_video(mpeg3_t *file);
int mpeg3_total_vstreams(mpeg3_t *file);            // Number of multiplexed video streams
int mpeg3_video_width(mpeg3_t *file, int stream);
int mpeg3_video_height(mpeg3_t *file, int stream);
float mpeg3_frame_rate(mpeg3_t *file, int stream);  // Frames/sec
long mpeg3_video_frames(mpeg3_t *file, int stream); // Total length
int mpeg3_colormodel(mpeg3_t *file, int stream);
</PRE></CODE>
The video behavior is the same as with audio, except that video has no
subdivision under <B>streams</B>. Frame rate is a floating point
number of frames per second.<P>
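As a rough illustration of how a length query and a rate query combine, the sketch below converts a stream length into a duration in seconds. The helper name <CODE>stream_seconds</CODE> is hypothetical and not part of libmpeg3; it assumes the caller has already obtained the total and the rate from the queries above.<P>

```c
/* Hypothetical helper, not part of libmpeg3: convert a stream length
 * (samples or frames) and its rate (samples/sec or frames/sec) into
 * seconds. Returns 0.0 when either input is unusable. */
double stream_seconds(long total_units, double units_per_second)
{
    if (total_units < 0 || units_per_second <= 0.0)
        return 0.0;
    return (double)total_units / units_per_second;
}
```

For audio, pass the results of <CODE>mpeg3_audio_samples</CODE> and <CODE>mpeg3_sample_rate</CODE>; for video, pass the results of <CODE>mpeg3_video_frames</CODE> and <CODE>mpeg3_frame_rate</CODE>.<P>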
<TT>mpeg3_colormodel</TT> returns either MPEG3_YUV420P or
MPEG3_YUV422P. MPEG3_YUV422P is only encountered in high quality video
not available in any consumer distribution medium.<P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 5: Seeking to a point in the file</B></FONT><P>
Each audio stream and each video stream has a position in the file
independent of each other stream. A variety of methods are available
for specifying the position of a stream: <B>byte offset, frame,
sample</B>. Which method you use depends on whether you're seeking
audio or video and whether you have a table of contents for the
stream.<P>
The preferred seeking method if you're writing a player is:<P>
<CODE><PRE>
int mpeg3_seek_byte(mpeg3_t *file, int64_t byte);
int64_t mpeg3_tell_byte(mpeg3_t *file);
</PRE></CODE>
This seeks all tracks to an absolute byte offset in the file. The
byte offset is from 0 to the result of:<P>
<CODE><PRE>
mpeg3_get_bytes(mpeg3_t *file)
</PRE></CODE>
The alternative to byte seeking is <B>frame or sample seeking</B>.
Frame seeking is only possible if a <B>table of contents</B> exists.
The <B>mpeg3toc</B> that comes with libmpeg3 creates tables of contents
from MPEG 1 &amp; 2 streams. Sample seeking is only possible if the stream
is fixed bitrate audio. The audio seeking is handled by:<P>
<CODE><PRE>
int mpeg3_set_sample(mpeg3_t *file, long sample, int stream); // Seek
long mpeg3_get_sample(mpeg3_t *file, int stream);             // Tell current position
</PRE></CODE>
and the video seeking is handled by:<P>
<CODE><PRE>
int mpeg3_set_frame(mpeg3_t *file, long frame, int stream); // Seek
long mpeg3_get_frame(mpeg3_t *file, int stream);            // Tell current position
</PRE></CODE>
You can either perform percentage seeking or absolute byte seeking but
not both on the same file handle. Once you perform either method, the
file becomes configured for that method.<P>
If you're in byte seeking mode and you want the current time stamp in
the file, you can't use mpeg3_get_frame or mpeg3_get_sample because you
don't know the total length in the desired units. The
<CODE>mpeg3_audio_samples</CODE> and <CODE>mpeg3_video_frames</CODE>
commands don't work in percentage seeking either. Instead use:<P>
<CODE><PRE>
double mpeg3_get_time(mpeg3_t *file);
</PRE></CODE>
which gives you the last timecode read, in seconds. The MPEG standard
specifies timecodes being placed in the streams. Now you know the
absolute byte position in the file and the current time stamp, enough
to update a progress bar or a text box.<P>
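A minimal sketch of that progress display, assuming the caller has already fetched the byte position, the total byte count, and the time stamp with the calls above. The helper names <CODE>progress_fraction</CODE> and <CODE>format_timecode</CODE> are hypothetical and not part of libmpeg3.<P>

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: fraction of the file consumed, clamped to [0, 1].
 * position would come from mpeg3_tell_byte, total from mpeg3_get_bytes. */
double progress_fraction(int64_t position, int64_t total)
{
    if (total <= 0 || position <= 0)
        return 0.0;
    if (position >= total)
        return 1.0;
    return (double)position / (double)total;
}

/* Hypothetical helper: format a time in seconds (such as the value from
 * mpeg3_get_time) as H:MM:SS into buf. Fractions of a second are dropped
 * and negative times are shown as 0:00:00. */
void format_timecode(double seconds, char *buf, size_t size)
{
    long total = seconds > 0.0 ? (long)seconds : 0;
    snprintf(buf, size, "%ld:%02ld:%02ld",
             total / 3600, (total / 60) % 60, total % 60);
}
```

The fraction can drive a progress bar directly, and the formatted string can go in a text box. Clamping guards against a position that momentarily reads past the reported total.<P>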
Finally, there is a way to seek to the previous frame of video:<P>
<CODE><PRE>
int mpeg3_previous_frame(mpeg3_t *file, int stream);
</PRE></CODE>
Because MPEG 1 &amp; 2 are really hairy, the set commands won't do much
good for playing backwards. mpeg3_previous_frame does some tricks to
seek to the previous frame. You then have to call a read_frame
command to read it.<P>
<FONT FACE=HELVETICA SIZE=+4><B>STEP 6: Read the data</B></FONT><P>
<I>To read <B>audio</B> data use:</I><P>
<CODE><PRE>
int mpeg3_read_audio(mpeg3_t *file,
    float *output_f,   // Pointer to pre-allocated buffer of floats
    short *output_i,   // Pointer to pre-allocated buffer of int16's
    int channel,       // Channel to decode
    long samples,      // Number of samples to decode
    int stream);       // Stream containing the channel
</PRE></CODE>
This decodes a buffer of sequential floats or int16's for a single
channel, depending on which *output... parameter has a nonzero
argument. To get a floating point buffer pass a pre-allocated buffer
to <CODE>output_f</CODE> and NULL to <CODE>output_i</CODE>. To get an
int16 buffer pass NULL to <CODE>output_f</CODE> and a pre-allocated
buffer to <CODE>output_i</CODE>. Alternatively you can pass NULL to
both buffer arguments and the decoder won't render anything.<P>
After reading an audio buffer, the current position in the one stream
is advanced. Remember that if you're using percentage seeking you
can't call <TT>mpeg3_set_sample</TT> to rewind and read every channel.
How, then, do you read more than one channel of audio data? Use
<CODE><PRE>
mpeg3_reread_audio(mpeg3_t *file,
    float *output_f,   /* Pointer to pre-allocated buffer of floats */
    short *output_i,   /* Pointer to pre-allocated buffer of int16's */
    int channel,       /* Channel to decode */
    long samples,      /* Number of samples to decode */
    int stream);       /* Stream containing the channel */
</PRE></CODE>
to read each remaining channel after the first channel.<P>
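Since each call fills a buffer for one channel only, a player that needs interleaved output has to combine the per-channel buffers itself. The sketch below shows one way to do that for int16 data. The helper <CODE>interleave_int16</CODE> is hypothetical and not part of libmpeg3, and it assumes every channel buffer was filled with the same number of samples.<P>

```c
#include <stddef.h>

/* Hypothetical helper, not part of libmpeg3: interleave per-channel
 * int16 buffers into one buffer ordered sample by sample
 * (ch0[0], ch1[0], ch0[1], ch1[1], ...).
 * channels[c] holds `samples` values for channel c;
 * out must have room for samples * num_channels values. */
void interleave_int16(const short *const *channels, int num_channels,
                      long samples, short *out)
{
    for (long i = 0; i < samples; i++) {
        for (int c = 0; c < num_channels; c++) {
            size_t dst = (size_t)i * (size_t)num_channels + (size_t)c;
            out[dst] = channels[c][i];
        }
    }
}
```

In this arrangement channel 0 would be filled by <CODE>mpeg3_read_audio</CODE> and every further channel by <CODE>mpeg3_reread_audio</CODE> before the buffers are interleaved.<P>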
<I>To read <B>video</B> data there are two methods: RGB frames or YUV
frames. To get an RGB frame use:</I> <BR>
<CODE><PRE>
int mpeg3_read_frame(mpeg3_t *file,
    unsigned char **output_rows, // Array of pointers to the start of each output row
    int in_x,                    // Location in input frame to take picture
    int in_y,
    int in_w,
    int in_h,
    int out_w,                   // Dimensions of output_rows
    int out_h,
    int color_model,             // One of the color model #defines in libmpeg3.h
    int stream);
</PRE></CODE>
The video decoding works like a camcorder taking copies of a movie
screen. The decoder "sees" a region of the movie screen defined by
<CODE>in_x, in_y, in_w, in_h</CODE> and transfers it to the frame
buffer defined by <CODE>**output_rows</CODE>. The input values must be
within the boundaries given by <CODE>mpeg3_video_width</CODE> and
<CODE>mpeg3_video_height</CODE>. The size of the frame buffer is
defined by <CODE>out_w, out_h</CODE>. Although the input dimensions
are constrained, the frame buffer can be any size.<P>
<CODE>color_model</CODE> defines which RGB color model the picture
should be decoded to and the possible values are given in
<B>libmpeg3.h</B>. The frame buffer pointed to by
<CODE>output_rows</CODE> must have enough memory allocated to store the
color model you select.<P>
<B>You must allocate 4 extra bytes in the last output_row.</B> This is
scratch area for the MMX routines.<P>
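One way to satisfy both the row-pointer layout and the 4 extra bytes is to allocate the pixels as a single block with the padding at the end. This is a sketch under assumptions: <CODE>alloc_rows</CODE> and <CODE>free_rows</CODE> are hypothetical helpers, not libmpeg3 functions, and the caller must supply the bytes per pixel that matches the chosen color model (for example 3 for a packed 24-bit RGB model).<P>

```c
#include <stdlib.h>

/* Hypothetical helper, not part of libmpeg3: allocate an out_w x out_h
 * frame buffer as one contiguous block plus an array of row pointers.
 * bytes_per_pixel is an assumption supplied by the caller and must match
 * the color model passed to the decoder.
 * The block ends with 4 extra bytes after the last row, which serve as
 * the scratch area described above.
 * Returns NULL on bad arguments or allocation failure. */
unsigned char **alloc_rows(int out_w, int out_h, int bytes_per_pixel)
{
    if (out_w <= 0 || out_h <= 0 || bytes_per_pixel <= 0)
        return NULL;

    size_t stride = (size_t)out_w * (size_t)bytes_per_pixel;
    unsigned char *pixels = malloc(stride * (size_t)out_h + 4);
    unsigned char **rows = malloc((size_t)out_h * sizeof *rows);
    if (!pixels || !rows) {
        free(pixels);
        free(rows);
        return NULL;
    }
    for (int y = 0; y < out_h; y++)
        rows[y] = pixels + (size_t)y * stride;
    return rows;
}

/* Release a buffer from alloc_rows(); rows[0] is the start of the block. */
void free_rows(unsigned char **rows)
{
    if (rows) {
        free(rows[0]);
        free(rows);
    }
}
```

The returned array can be passed as <CODE>output_rows</CODE> together with the same <CODE>out_w</CODE> and <CODE>out_h</CODE> used for the allocation.<P>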
<CODE>mpeg3_read_frame</CODE> advances the position in the one stream by 1 frame.<P>
<I>To read YUV frames use one of two methods:</I><BR>
<CODE><PRE>
int mpeg3_read_yuvframe(mpeg3_t *file,
    char *y_output,
    char *u_output,
    char *v_output,
    int in_x,
    int in_y,
    int in_w,
    int in_h,
    int stream);
</PRE></CODE>
The behavior of in_x, in_y, in_w, in_h is identical to mpeg3_read_frame
except here you have no control over the output frame size. <B>You
must allocate in_w * in_h bytes for the y_output, and in_w * in_h / 4
bytes each for the u_output and v_output.</B> These sizes apply to
YUV 4:2:0 video.<P>
While <B>mpeg3_read_yuvframe</B> allows cropping of letterboxing, it still
requires one memcpy. A faster alternative is:<P>
<CODE><PRE>
int mpeg3_read_yuvframe_ptr(mpeg3_t *file,
    char **y_output,
    char **u_output,
    char **v_output,
    int stream);
</PRE></CODE>
This redirects a *y_output, *u_output, and *v_output pointer to the
scratch buffer that decoding took place in. Since MPEG is temporal
compression, there is always a buffer containing the last decoder
output.<P>
For professional use the library can decode YUV 4:2:2 video in addition
to YUV 4:2:0. This property is determined at encoding time and doesn't
affect the usage of <B>mpeg3_read_frame</B>, but you do need an extra
function call in order to use <B>mpeg3_read_yuvframe</B>. To determine
the encoding of the video stream use
<CODE><PRE>
mpeg3_colormodel(mpeg3_t *file, int stream)
</PRE></CODE>
This returns either <B>MPEG3_YUV420P</B> or <B>MPEG3_YUV422P</B>. The
output buffers and the YUV to RGB conversion for
<B>mpeg3_read_yuvframe</B> must be adjusted for the higher sampling of
MPEG3_YUV422P. When using <B>mpeg3_read_yuvframe_ptr</B> you don't
need to adjust any output buffers.<P>
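The buffer adjustment can be worked out from the chroma subsampling alone. The sketch below computes plane sizes for both cases. It assumes 8 bits per sample and even region dimensions; the type <CODE>yuv_plane_sizes</CODE> and the function <CODE>yuv_sizes</CODE> are hypothetical names, not part of libmpeg3.<P>

```c
#include <stddef.h>

/* Hypothetical type, not part of libmpeg3: bytes needed per plane. */
typedef struct {
    size_t y_bytes;   /* luma plane */
    size_t uv_bytes;  /* each of the two chroma planes */
} yuv_plane_sizes;

/* Hypothetical helper: plane sizes for an in_w x in_h region.
 * Pass is_422 != 0 when mpeg3_colormodel reported MPEG3_YUV422P,
 * and 0 for MPEG3_YUV420P. Assumes 8 bits per sample. */
yuv_plane_sizes yuv_sizes(int in_w, int in_h, int is_422)
{
    yuv_plane_sizes s;
    s.y_bytes = (size_t)in_w * (size_t)in_h;
    /* 4:2:0 halves chroma in both directions (1/4 of luma);
     * 4:2:2 halves it horizontally only (1/2 of luma). */
    s.uv_bytes = is_422 ? s.y_bytes / 2 : s.y_bytes / 4;
    return s;
}
```

For a 720 x 480 region this gives 345600 bytes of luma, with 86400 bytes per chroma plane in 4:2:0 and 172800 bytes per chroma plane in 4:2:2.<P>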
<FONT FACE=HELVETICA SIZE=+4><B>Synchronizing video with audio</B></FONT><P>
To synchronize video with audio in realtime you need to sometimes delay
the video and sometimes drop frames. It's easy to calculate the number
of frames to drop but if you're using percentage seeking you can't
calculate the exact percentage to seek forward by. Instead call <P>
<CODE>mpeg3_drop_frames(mpeg3_t *file, long frames, int stream);</CODE><P>
This skips <CODE>frames</CODE> frames from the current position whether
in percentage seeking or absolute seeking.<P>
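The frame count itself can be derived from how far the video lags behind the audio. The following is a sketch rather than the library's own logic: <CODE>frames_behind</CODE> is a hypothetical helper, and it assumes the player tracks the audio and video presentation times in seconds.<P>

```c
/* Hypothetical helper, not part of libmpeg3: number of whole video
 * frames to skip so that video catches up with audio.
 * Both times are in seconds. Returns 0 when video is level with or
 * ahead of audio, in which case the player would delay video instead. */
long frames_behind(double audio_seconds, double video_seconds,
                   double frame_rate)
{
    double lag = audio_seconds - video_seconds;
    if (lag <= 0.0 || frame_rate <= 0.0)
        return 0;
    return (long)(lag * frame_rate);  /* truncate to whole frames */
}
```

A nonzero result can be passed as the <CODE>frames</CODE> argument of <CODE>mpeg3_drop_frames</CODE>. Truncating rather than rounding keeps the video from overshooting the audio.<P>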
<FONT FACE=HELVETICA SIZE=+4><B>STEP 7: Close the file</B></FONT><P>
Be sure to close the file with <CODE>mpeg3_close(mpeg3_t *file)</CODE>
when you're done with it.<P>
<A NAME=TOC>
<FONT FACE=HELVETICA SIZE=+4><B>Using tables of contents for editing</B></FONT><P>
In 1985 everyone watched Smurfs but one guy watched Robotech. In 1990
everyone watched Teenage Mutant Ninja Turtles but one guy watched
Transformers. In 1995 everyone watched Pokemon but one guy watched
Behind the Scenes. Now everyone wants handheld organizers but one guy
wants MPEG editors. For the weirdos who always looked at the camera
rig instead of the celebrity, libmpeg3 supports a way of seeking to an
exact frame or sample in any kind of MPEG encapsulation format for
editing.<P>
A table of contents must be built for any footage to be edited with
libmpeg3. Run <TT>mpeg3toc &lt;mpeg stream&gt; &lt;output table of contents&gt;</TT><P>
For editing DVD footage, the mpeg stream argument should be the IFO
file belonging to the title set to be edited. This utility reads
through every file comprising the mpeg stream and records the offset of
every 65536th sample and every keyframe, so it can be pretty slow.<P>
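To see why an entry for every 65536th sample is enough for exact seeking, consider the arithmetic a reader of such a table could use. This is a hypothetical sketch and does not describe libmpeg3's actual table format or internals; <CODE>toc_lookup</CODE> and <CODE>toc_find_sample</CODE> are illustrative names only.<P>

```c
/* Assumed spacing between indexed samples, taken from the text above. */
#define TOC_SAMPLE_INTERVAL 65536L

/* Hypothetical result type, not part of libmpeg3. */
typedef struct {
    long entry;         /* index of the table entry at or before the target */
    long skip_samples;  /* samples to decode and discard after seeking there */
} toc_lookup;

/* Hypothetical helper: find the indexed entry at or before a target
 * sample and how many samples remain between it and the target.
 * Negative targets are treated as sample 0. */
toc_lookup toc_find_sample(long sample)
{
    toc_lookup r;
    if (sample < 0)
        sample = 0;
    r.entry = sample / TOC_SAMPLE_INTERVAL;
    r.skip_samples = sample % TOC_SAMPLE_INTERVAL;
    return r;
}
```

Under this scheme a seek jumps to the recorded offset for <CODE>entry</CODE> and then decodes forward by at most 65535 samples, which bounds the cost of any seek.<P>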
The resulting table of contents file should be passed to mpeg3_open
and mpeg3_open_copy just like a normal file. The only difference is
that frame seeking of video is available.<P>
<A NAME=SUBTITLES>
<FONT FACE=HELVETICA SIZE=+4><B>Decoding subtitles</B></FONT><P>
Subtitles are encoded in DVD program streams as 4-color bitmaps. They
require invoking mpeg3_open with the IFO file corresponding to the VOB
files in order to play the VOB files. The IFO file contains the color
palette.<P>
<A NAME=UTILITIES>
<FONT FACE=HELVETICA SIZE=+4><B>Using the utilities</B></FONT><P>