libmpeg3/docs/index.html

<TITLE>LibMPEG3</TITLE>

<CENTER>

<FONT FACE=HELVETICA SIZE=+4><B>Using LibMPEG3 for advanced MPEG
decoding</B></FONT><P>

<TABLE>
<TR>
<TD>
<CODE>
Author: Heroine Virtual Ltd. (Motion picture solutions for Linux without a warranty)<BR>
Harassment: broadcast@earthling.net<BR>
Homepage: heroinewarrior.com<P>

libmpeg3 is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.<P>

libmpeg3 is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.<P>

</CODE> </TD> </TR> </TABLE> </CENTER>

<H1>Table of contents</H1><P>
<A HREF="#INTRO">Intro</A><BR>
<A HREF="#BUILDING">Building the library</A><BR>
<A HREF="#USAGE">Usage</A><BR>
<A HREF="#TOC">Using tables of contents for editing</A><BR>
<A HREF="#SUBTITLES">Decoding subtitles</A><BR>
<A HREF="#UTILITIES">Using the utilities</A><BR>

<P>

<A NAME=INTRO></A>

LibMPEG3 decodes several MPEG standards into uncompressed data suitable
for editing and playback.<P>

libmpeg3 currently decodes:<P>

<BLOCKQUOTE>MPEG-2 video<BR>
MPEG-1 video<BR>
mp3 audio<BR>
mp2 audio<BR>
ac3 audio<BR>
MPEG-2 transport streams<BR>
MPEG-2 program streams<BR>
MPEG-1 program streams<BR>
IFO files<BR>
</BLOCKQUOTE><P>

The video output can be in many different color models and frame
sizes.  The audio output can be in two's complement or floating point.
Frame accurate seeking, normally impossible in transport streams, is
possible in libmpeg3 through the use of a <B>table of contents</B>.
MPEG-2 video in YUV-422 colorspace is decodable.  Digital TV broadcasts
and DVDs can be edited using libmpeg3.  Libmpeg3 takes what is
normally a last mile distribution format and makes it editable.<P>

Because of these and other features libmpeg3 is not intended for
consumer applications but serves users who are interested in high
quality editing and footage acquisition.<P>

<A NAME=BUILDING></A>
<FONT FACE=HELVETICA SIZE=+4><B>Building the library</B></FONT><P>

libmpeg3 depends on the CFLAGS environment variable to get optimization
flags.  You should set it to <P>

<TT>-O3 -march=i686 -fmessage-length=0 -funroll-all-loops
-fomit-frame-pointer -malign-loops=2 -malign-jumps=2
-malign-functions=2</TT><P>

You must run <B>make</B> to build the library and should be using
Kernel 2.4.9 or later.  The makefile automatically determines
appropriate parameters and puts the library in i686/libmpeg3.a.
Several utilities are also built.  Install the utilities by running
<B>make install</B>.<P>

Unfortunately libmpeg3 exercises the
system more aggressively than a consumer library and this brings out
different bugs in each kernel version.<P>

2.4.9: ext3 filesystem failure<BR>
2.4.10: memory management failure when running mpeg3toc<BR>
2.4.17: memory management failure after 5 hours of decoding video<P>

As libmpeg3 is not one of the standard MPEG decoding
libraries, these utilities are unlike any you've ever seen before.
Remember a utility is only as illegal or legal as the guy who runs
it.<P>

<A NAME=USAGE></A>
<H1>Usage</H1><P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 1: Verifying file compatibility</B></FONT><P>

Programs using libmpeg3 must <CODE>#include "libmpeg3.h"</CODE>.<P>

Call <CODE>mpeg3_check_sig</CODE> to verify whether the file can be read by
libmpeg3.  It returns 1 if the file is compatible and 0 if it isn't.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 2: Open the file</B></FONT><P>

You need an <CODE>mpeg3_t*</CODE> file descriptor:<P>
<CODE>
mpeg3_t* file;
</CODE>
<P>

Then you need to open the file:<P>

<CODE>file = mpeg3_open(char *path);</CODE><P>

<CODE>mpeg3_open</CODE> returns NULL if the file couldn't be opened
for some reason.  Be sure to check this.  Everything you do with
libmpeg3 requires passing the <CODE>file</CODE> pointer.<P>

Another way of opening a file is <P>

<CODE>mpeg3_open_copy(char *path, mpeg3_t *old_file)</CODE><P>

You need to open multiple copies of a file in realtime situations
because only one thread can access an mpeg3_t structure at a time.  The
audio and video can't be read simultaneously through one handle.  The
solution is not to repeatedly call mpeg3_open but to call
mpeg3_open_copy for every file handle after the first one.  This copies
tables from the first file to speed up opening.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 3: Set optimization strategies</B></FONT><P>

Call <CODE>mpeg3_set_cpus(mpeg3_t *file, int cpus)</CODE> to set how
many CPUs should be devoted to video decompression.  LibMPEG3 can use
any number.<P>

Call <CODE>mpeg3_set_mmx(mpeg3_t *file, int use_mmx)</CODE> to set whether
MMX is used for video.  Disabling MMX is mandatory for low bitrate
streams since the MMX path is very lossy.  By the way, lately the
compiled MMX output has been producing corrupted video.  This is a
change in the way modern compilers and CPUs handle MMX from the way it
was done 4 years ago, but since modern CPUs are so fast, you're better
off not using MMX at all.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 4: Query the file.</B></FONT><P>

There are a number of queries for the audio components of the stream:<P>

<CODE><PRE>
int mpeg3_has_audio(mpeg3_t *file);
int mpeg3_total_astreams(mpeg3_t *file);             // Number of multiplexed audio streams
int mpeg3_audio_channels(mpeg3_t *file, int stream);
int mpeg3_sample_rate(mpeg3_t *file, int stream);
long mpeg3_audio_samples(mpeg3_t *file, int stream); // Total length
</PRE></CODE>

The audio is presented as a number of <B>streams</B> numbered from 0
through <CODE>mpeg3_total_astreams</CODE> - 1.  Each stream contains a
certain number of <B>channels</B> numbered from 0 through
<CODE>mpeg3_audio_channels</CODE> - 1.<P>

The procedure is to first determine whether the file has audio, then get
the number of streams in the file, then for each stream get the number
of channels, sample rate, and length.<P>

There are also queries for the video components:<P>

<CODE><PRE>
int mpeg3_has_video(mpeg3_t *file);
int mpeg3_total_vstreams(mpeg3_t *file);            // Number of multiplexed video streams
int mpeg3_video_width(mpeg3_t *file, int stream);
int mpeg3_video_height(mpeg3_t *file, int stream);
float mpeg3_frame_rate(mpeg3_t *file, int stream);  // Frames/sec
long mpeg3_video_frames(mpeg3_t *file, int stream); // Total length
int mpeg3_colormodel(mpeg3_t *file, int stream);
</PRE></CODE>

The video behavior is the same as with audio, except that video has no
subdivision under <B>streams</B>.  Frame rate is a floating point
number of frames per second.<P>

<TT>mpeg3_colormodel</TT> returns either MPEG3_YUV420P or
MPEG3_YUV422P.  MPEG3_YUV422P is only encountered in high quality video
not available in any consumer distribution medium.<P>
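When the totals are known, the lengths reported by these queries convert
to seconds with a single division.  The sketch below is only an
illustration: <CODE>audio_seconds</CODE> and <CODE>video_seconds</CODE>
are hypothetical helpers, not part of libmpeg3, and they assume their
arguments came from <CODE>mpeg3_audio_samples</CODE>,
<CODE>mpeg3_sample_rate</CODE>, <CODE>mpeg3_video_frames</CODE> and
<CODE>mpeg3_frame_rate</CODE>.<P>

```c
/* Hypothetical helpers, not part of libmpeg3. */

/* Audio length in seconds from a total sample count and a sample rate in Hz.
   Returns 0.0 when the rate is not positive. */
double audio_seconds(long samples, int sample_rate)
{
    if (sample_rate <= 0)
        return 0.0;
    return (double)samples / (double)sample_rate;
}

/* Video length in seconds from a total frame count and a frame rate in
   frames per second.  Returns 0.0 when the rate is not positive. */
double video_seconds(long frames, float frame_rate)
{
    if (frame_rate <= 0.0f)
        return 0.0;
    return (double)frames / (double)frame_rate;
}
```

The guards against a non-positive rate are a defensive choice for files
whose headers could not be parsed; they are not behavior documented by
the library.<P>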

<FONT FACE=HELVETICA SIZE=+4><B>STEP 5: Seeking to a point in the file</B></FONT><P>

Each audio stream and each video stream has a position in the file
independent of each other stream.  A variety of methods are available
for specifying the position of a stream: <B>byte offset, frame,
sample</B>.  Which method you use depends on whether you're seeking
audio or video and whether you have a table of contents for the
stream.<P>

The preferred seeking method if you're writing a player is:<P>

<CODE><PRE>
int mpeg3_seek_byte(mpeg3_t *file, int64_t byte);
int64_t mpeg3_tell_byte(mpeg3_t *file);
</PRE></CODE>

This seeks all tracks to an absolute byte offset in the file.  The
byte offset is from 0 to the result of:<P>

<CODE><PRE>
mpeg3_get_bytes(mpeg3_t *file)
</PRE></CODE>

The alternative to byte seeking is <B>frame or sample seeking</B>.
Frame seeking is only possible if a <B>table of contents</B> exists.
The <B>mpeg3toc</B> utility that comes with libmpeg3 creates tables of
contents from MPEG 1 &amp; 2 streams.  Sample seeking is only possible
if the stream is fixed bitrate audio.  The audio seeking is handled
by:<P>

<CODE><PRE>
int mpeg3_set_sample(mpeg3_t *file, long sample, int stream);    // Seek
long mpeg3_get_sample(mpeg3_t *file, int stream);    // Tell current position
</PRE></CODE>

and the video seeking is handled by:<P>

<CODE><PRE>
int mpeg3_set_frame(mpeg3_t *file, long frame, int stream); // Seek
long mpeg3_get_frame(mpeg3_t *file, int stream);            // Tell current position
</PRE></CODE>

You can either perform percentage seeking or absolute byte seeking but
not both on the same file handle.  Once you perform either method, the
file becomes configured for that method.<P>

If you're in byte seeking mode and you want the current time stamp in
the file you can't use mpeg3_get_frame or mpeg3_get_sample because you
don't know the total length in the desired units.  The
<CODE>mpeg3_audio_samples</CODE> and <CODE>mpeg3_video_frames</CODE>
commands don't work in percentage seeking either.  Instead use

<CODE><PRE>
double mpeg3_get_time(mpeg3_t *file);
</PRE></CODE>

which gives you the last timecode read in seconds.  The MPEG standard
specifies that timecodes be placed in the streams.  Now you know the
absolute byte position in the file and the current time stamp, enough
to update a progress bar or a text box.<P>
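As a rough sketch of the progress bar case, the byte position and the
total byte count can be reduced to a fraction between 0 and 1.
<CODE>progress_fraction</CODE> is a hypothetical helper, not a libmpeg3
function; it assumes <CODE>position</CODE> comes from
<CODE>mpeg3_tell_byte</CODE> and <CODE>total</CODE> from
<CODE>mpeg3_get_bytes</CODE>.<P>

```c
#include <stdint.h>

/* Hypothetical helper, not part of libmpeg3.
   Maps a byte position onto the range 0.0 .. 1.0 for a progress bar.
   Out-of-range positions are clamped and an unknown total yields 0.0. */
double progress_fraction(int64_t position, int64_t total)
{
    if (total <= 0 || position <= 0)
        return 0.0;
    if (position >= total)
        return 1.0;
    return (double)position / (double)total;
}
```

The text box can be filled separately from the seconds value returned
by <CODE>mpeg3_get_time</CODE>.<P>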

Finally, there is a way to seek to the previous frame of video:

<CODE><PRE>
int mpeg3_previous_frame(mpeg3_t *file, int stream);
</PRE></CODE>

Because MPEG 1 &amp; 2 are really hairy, the set commands won't do much
good for playing backwards.  mpeg3_previous_frame does some tricks to
seek to the previous frame.  Next you have to call a read_frame
command to read it.
<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 6: Read the data</B></FONT><P>

<I>To read <B>audio</B> data use:</I><P>

<CODE><PRE>
int mpeg3_read_audio(mpeg3_t *file,
                float *output_f,      // Pointer to pre-allocated buffer of floats
                short *output_i,      // Pointer to pre-allocated buffer of int16's
                int channel,          // Channel to decode
                long samples,         // Number of samples to decode
                int stream);          // Stream containing the channel
</PRE></CODE>

This decodes a buffer of sequential floats or int16's for a single
channel, depending on which output parameter is non-NULL.  To get a
floating point buffer pass a pre-allocated buffer
to <CODE>output_f</CODE> and NULL to <CODE>output_i</CODE>. To get an
int16 buffer pass NULL to <CODE>output_f</CODE> and a pre-allocated
buffer to <CODE>output_i</CODE>.  Alternatively you can pass NULL to
both buffer arguments and the decoder won't render anything.<P>

After reading an audio buffer, the current position in that one stream
is advanced.  Remember that if you're using percentage seeking you
can't call <TT>mpeg3_set_sample</TT> to rewind and read every channel.
How then, do you read more than one channel of audio data?  Use

<CODE><PRE>
int mpeg3_reread_audio(mpeg3_t *file,
                float *output_f,      /* Pointer to pre-allocated buffer of floats */
                short *output_i,      /* Pointer to pre-allocated buffer of int16's */
                int channel,          /* Channel to decode */
                long samples,         /* Number of samples to decode */
                int stream);
</PRE></CODE>

to read each remaining channel after the first channel.<P>

<I>To read <B>video</B> data there are two methods: RGB frames or YUV
frames.  To get an RGB frame use:</I> <BR>

<CODE><PRE>
int mpeg3_read_frame(mpeg3_t *file,
                unsigned char **output_rows, // Array of pointers to the start of each output row
                int in_x,                    // Location in input frame to take picture
                int in_y,
                int in_w,
                int in_h,
                int out_w,                   // Dimensions of output_rows
                int out_h,
                int color_model,             // One of the color model #defines from libmpeg3.h
                int stream);
</PRE></CODE>

The video decoding works like a camcorder taking copies of a movie
screen.  The decoder "sees" a region of the movie screen defined by
<CODE>in_x, in_y, in_w, in_h</CODE> and transfers it to the frame
buffer defined by <CODE>**output_rows</CODE>.  The input values must be
within the boundaries given by <CODE>mpeg3_video_width</CODE> and
<CODE>mpeg3_video_height</CODE>.  The size of the frame buffer is
defined by <CODE>out_w, out_h</CODE>.  Although the input dimensions
are constrained, the frame buffer can be any size.<P>
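One way to enforce the constraint on the input region before decoding
is a small bounds check.  <CODE>region_fits</CODE> is a hypothetical
helper, not part of libmpeg3; <CODE>width</CODE> and
<CODE>height</CODE> are assumed to be the values returned by
<CODE>mpeg3_video_width</CODE> and <CODE>mpeg3_video_height</CODE>.<P>

```c
/* Hypothetical helper, not part of libmpeg3.
   Returns 1 when the region (in_x, in_y, in_w, in_h) lies entirely inside
   a width x height frame, and 0 otherwise. */
int region_fits(int in_x, int in_y, int in_w, int in_h, int width, int height)
{
    if (in_x < 0 || in_y < 0 || in_w <= 0 || in_h <= 0)
        return 0;
    if (in_w > width - in_x)    /* right edge would pass the frame edge */
        return 0;
    if (in_h > height - in_y)   /* bottom edge would pass the frame edge */
        return 0;
    return 1;
}
```

For example, cropping 60 rows of letterbox from the top and bottom of a
720x480 frame corresponds to
<CODE>region_fits(0, 60, 720, 360, 720, 480)</CODE>, which passes the
check.<P>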

<CODE>color_model</CODE> defines which RGB color model the picture
should be decoded to and the possible values are given in
<B>libmpeg3.h</B>.  The frame buffer pointed to by
<CODE>output_rows</CODE> must have enough memory allocated to store the
color model you select.<P>

<B>You must allocate 4 extra bytes in the last output_row.</B>  This is
scratch area for the MMX routines.<P>
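A minimal sketch of one way to build <CODE>output_rows</CODE> with the
4 extra bytes follows.  The helper names are hypothetical and not part
of libmpeg3, and <CODE>bytes_per_pixel</CODE> is an assumption the
caller must derive from the chosen color model (for instance 3 for a
packed 24 bit RGB model).<P>

```c
#include <stdlib.h>

/* Hypothetical helper, not part of libmpeg3.
   Allocates one contiguous pixel buffer plus an array of row pointers into
   it.  The buffer is 4 bytes longer than the picture so the last row has the
   scratch area the text asks for.  Returns NULL on bad arguments or when
   memory runs out. */
unsigned char **alloc_output_rows(int out_w, int out_h, int bytes_per_pixel)
{
    size_t row_bytes;
    unsigned char **rows;
    unsigned char *pixels;
    int i;

    if (out_w <= 0 || out_h <= 0 || bytes_per_pixel <= 0)
        return NULL;

    row_bytes = (size_t)out_w * (size_t)bytes_per_pixel;

    rows = malloc((size_t)out_h * sizeof(*rows));
    if (!rows)
        return NULL;

    pixels = malloc(row_bytes * (size_t)out_h + 4);   /* +4 scratch bytes */
    if (!pixels) {
        free(rows);
        return NULL;
    }

    for (i = 0; i < out_h; i++)
        rows[i] = pixels + (size_t)i * row_bytes;

    return rows;
}

/* Releases what alloc_output_rows returned; rows[0] is the start of the
   pixel buffer. */
void free_output_rows(unsigned char **rows)
{
    if (rows) {
        free(rows[0]);
        free(rows);
    }
}
```

Using one contiguous buffer keeps the scratch bytes directly behind the
last row, and the returned array can be passed as the
<CODE>output_rows</CODE> argument of <CODE>mpeg3_read_frame</CODE>.<P>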

<CODE>mpeg3_read_frame</CODE> advances the position in that one stream by 1 frame.<P>

<I>To read YUV frames use one of two methods:</I><BR>

<CODE><PRE>
int mpeg3_read_yuvframe(mpeg3_t *file,
                char *y_output,
                char *u_output,
                char *v_output,
                int in_x,
                int in_y,
                int in_w,
                int in_h,
                int stream);
</PRE></CODE>

The behavior of in_x, in_y, in_w, in_h is identical to mpeg3_read_frame
except here you have no control over the output frame size.  <B>You
must allocate in_w * in_h bytes for the y_output, and in_w * in_h / 4
bytes each for the u_output and v_output.</B><P>

While <B>mpeg3_read_yuvframe</B> allows cropping of letterbox it still
requires one memcpy.  A faster alternative is:<P>

<CODE><PRE>
int mpeg3_read_yuvframe_ptr(mpeg3_t *file,
                char **y_output,
                char **u_output,
                char **v_output,
                int stream);
</PRE></CODE>

This redirects the *y_output, *u_output, and *v_output pointers to the
scratch buffer that decoding took place in.  Since MPEG is temporal
compression, there is always a buffer containing the last decoder
output.<P>

For professional use the library can decode YUV 4:2:2 video in addition
to YUV 4:2:0.  The sampling is determined at encoding time and doesn't
affect the usage of <B>mpeg3_read_frame</B>, but you do need an extra
function call in order to use <B>mpeg3_read_yuvframe</B>.  To determine
the encoding of the video stream use

<CODE><PRE>
mpeg3_colormodel(mpeg3_t *file, int stream)
</PRE></CODE>

This returns either <B>MPEG3_YUV420P</B> or <B>MPEG3_YUV422P</B>.  The
output buffers and the YUV to RGB conversion for
<B>mpeg3_read_yuvframe</B> must be adjusted for the higher sampling of
MPEG3_YUV422P.  When using <B>mpeg3_read_yuvframe_ptr</B> you don't
need to adjust any output buffers.<P>
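The buffer adjustment can be sketched as a pair of size calculations.
The 4:2:0 figures are the ones stated above.  The 4:2:2 figure is an
assumption drawn from the format itself, where chroma is halved
horizontally only, so each chroma plane holds half as many bytes as the
luma plane.  Both helpers are hypothetical and not part of libmpeg3.<P>

```c
#include <stddef.h>

/* Hypothetical helpers, not part of libmpeg3. */

/* Bytes needed for the Y plane of an in_w x in_h region. */
size_t yuv_luma_bytes(int in_w, int in_h)
{
    return (size_t)in_w * (size_t)in_h;
}

/* Bytes needed for each of the U and V planes.  Pass is_422 = 1 when
   mpeg3_colormodel reported MPEG3_YUV422P and 0 for MPEG3_YUV420P.
   4:2:0 subsamples chroma in both directions (1/4 of luma); 4:2:2
   subsamples horizontally only (1/2 of luma). */
size_t yuv_chroma_bytes(int in_w, int in_h, int is_422)
{
    size_t luma = (size_t)in_w * (size_t)in_h;
    return is_422 ? luma / 2 : luma / 4;
}
```

For a full 720x480 frame this gives 345600 bytes of luma and 86400
bytes per chroma plane in 4:2:0, or 172800 bytes per chroma plane in
4:2:2.<P>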

<FONT FACE=HELVETICA SIZE=+4><B>Synchronizing video with audio</B></FONT><P>

To synchronize video with audio in realtime you need to sometimes delay
the video and sometimes drop frames.  It's easy to calculate the number
of frames to drop but if you're using percentage seeking you can't
calculate the exact percentage to seek forward by.  Instead call <P>

<CODE>mpeg3_drop_frames(mpeg3_t *file, long frames, int stream);</CODE><P>

This skips <CODE>frames</CODE> frames from the current position whether
in percentage seeking or absolute seeking.<P>
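The frame count itself might be computed along these lines.
<CODE>frames_to_drop</CODE> is a hypothetical helper, not part of
libmpeg3, and it assumes the player tracks the audio clock in seconds
and counts the video frames it has already shown.<P>

```c
/* Hypothetical helper, not part of libmpeg3.
   audio_seconds: how much audio has been played so far.
   video_frame:   index of the next video frame the player would show.
   frame_rate:    frames per second, e.g. from mpeg3_frame_rate.
   Returns how many whole frames the video lags behind the audio, or 0 when
   the video is on time or ahead. */
long frames_to_drop(double audio_seconds, long video_frame, double frame_rate)
{
    double target;
    long behind;

    if (frame_rate <= 0.0)
        return 0;

    target = audio_seconds * frame_rate;   /* frame that should be on screen */
    behind = (long)target - video_frame;
    return behind > 0 ? behind : 0;
}
```

A positive result can be handed to <CODE>mpeg3_drop_frames</CODE>; a
result of 0 means the video is on time or ahead, in which case the
player delays instead.<P>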

<FONT FACE=HELVETICA SIZE=+4><B>STEP 7: Close the file</B></FONT><P>

Be sure to close the file with <CODE>mpeg3_close(mpeg3_t *file)</CODE>
when you're done with it.

<P>
<A NAME=TOC></A>
<FONT FACE=HELVETICA SIZE=+4><B>Using tables of contents for editing</B></FONT><P>

In 1985 everyone watched Smurfs but one guy watched Robotech.  In 1990
everyone watched Teenage Mutant Ninja Turtles but one guy watched
Transformers.  In 1995 everyone watched Pokemon but one guy watched
Behind the Scenes.  Now everyone wants handheld organizers but one guy
wants MPEG editors.  For the weirdos who always looked at the camera
rig instead of the celebrity, libmpeg3 supports a way of seeking to an
exact frame or sample in any kind of MPEG encapsulation format for
editing.<P>

A table of contents must be built for any footage to be edited with
libmpeg3.  Run <TT>mpeg3toc &lt;mpeg stream&gt; &lt;output table of contents&gt;</TT><P>

For editing DVD footage, the mpeg stream argument should be the ifo
file belonging to the title set to be edited.  This utility reads
through every file comprising the mpeg stream and records the offset of
every 65536th sample and every keyframe so it can be pretty slow.<P>
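To see why an index with one entry per 65536 samples still allows
sample accurate seeking, consider how such an index could be
consulted.  This is an illustration of the idea only, not libmpeg3's
actual lookup code, and both helper names are hypothetical.<P>

```c
/* Illustration only; not taken from libmpeg3. */
#define TOC_SAMPLE_INTERVAL 65536L

/* Index of the last recorded entry at or before `sample`. */
long toc_entry_index(long sample)
{
    return sample < 0 ? 0 : sample / TOC_SAMPLE_INTERVAL;
}

/* Number of samples to decode and discard after jumping to that entry
   in order to land exactly on `sample`. */
long toc_samples_to_skip(long sample)
{
    return sample < 0 ? 0 : sample % TOC_SAMPLE_INTERVAL;
}
```

Sample 200000, for instance, falls under entry 3 (which starts at
sample 196608), leaving 3392 samples to skip.<P>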

The resulting table of contents file should be passed to mpeg3_open
and mpeg3_open_copy just like a normal file.  The only difference is
that frame seeking of video is available.<P>

<A NAME=SUBTITLES></A>
<FONT FACE=HELVETICA SIZE=+4><B>Decoding subtitles</B></FONT><P>

Subtitles are encoded in DVD program streams as 4 color bitmaps.  They
require playing the VOB files by invoking mpeg3_open with the
corresponding IFO file, because the IFO file contains the color
palette.<P>

<A NAME=UTILITIES></A>
<FONT FACE=HELVETICA SIZE=+4><B>Using the utilities</B></FONT><P>