README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by the BBC R&D Dirac team (diracinfo@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms.  Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards.  These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

Documentation can be found at
http://diracvideo.org/wiki/index.php/Main_Page#Documentation


3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --disable-mmx
        (to disable MMX optimisation which is enabled by default)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
    ./configure

    By default, both shared and static libraries are built. To build
    static libraries only, use
    ./configure --disable-shared

    To build shared libraries only, use
    ./configure --disable-static

    Then build and install:
    make
    make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).


  MSYS and Microsoft Visual C++
  -----------------------------
     Download and install the no-cost Microsoft Visual C++ 2008 Express
     Edition from
     http://msdn.microsoft.com/vstudio/express/visualc/

     Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe,
     from http://www.mingw.org/download.shtml. An MSYS icon will be available
     on the desktop.

     Click on the MSYS icon on the desktop to open an MSYS shell window.

     Create a .profile file to set up the environment variables required.
     vi .profile

     Include the following four lines in the .profile file.

     PATH=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/IDE:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/BIN:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/Tools:/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/VCPackages:$PATH

     INCLUDE=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/INCLUDE:$INCLUDE

     LIB=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIB

     LIBPATH=/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIBPATH

     (Replace /c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/ with
     the location where VC++ 2008 is installed if necessary)

     Exit from the MSYS shell and click on the MSYS icon on the desktop to open
     a new MSYS shell window for the .profile to take effect.

     Change directory to the directory where Dirac was unpacked. By default
     only the dynamic libraries are built.

     ./configure CXX=cl LD=cl --enable-debug
         (to enable extra debug compile options)
     OR
     ./configure CXX=cl LD=cl --disable-shared
         (to build static libraries)
     OR
     ./configure CXX=cl LD=cl

     Then build and install:
     make
     make install

     The INSTALL file documents arguments to ./configure such as
     --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2008
  ------------------------------
  Download and install the no-cost Microsoft Visual C++ 2008 Express
  Edition from
  http://www.microsoft.com/express/download/

  The MS VC++ 2008 solution and project files are in the win32/VisualStudio
  directory.  Double-click on the solution file, dirac.sln, in the
  win32/VisualStudio directory.  The target 'Everything' builds the codec
  libraries and utilities. Eight build types are supported:

  Debug - builds unoptimised encoder and decoder dlls with debug symbols
  Release - builds optimised encoder and decoder dlls
  Debug-mmx - builds unoptimised encoder and decoder dlls with debug symbols
              and mmx optimisations enabled.
  Release-mmx - builds optimised encoder and decoder dlls with mmx
                optimisations enabled.
  Static-Debug - builds unoptimised encoder and decoder static libraries
                 with debug symbols
  Static-Release - builds optimised encoder and decoder static libraries
  Static-Debug-mmx - builds unoptimised encoder and decoder static libraries
                     with debug symbols and mmx optimisations enabled.
  Static-Release-mmx - builds optimised encoder and decoder static libraries
                       with mmx optimisations enabled.

  Static libraries are created in the win32/VisualStudio/build/lib/<build-type>
  directory.

  Encoder and decoder dlls and import libraries, and the encoder and decoder
  apps, are created in the win32/VisualStudio/build/bin/<build-type> directory.
  The "C" public API is exported using the _declspec(dllexport) mechanism.

  Conversion utilities are created in the
  win32/VisualStudio/build/utils/conversion/<build-type> directory. Only static
  versions are built.

  The instrumentation utility is created in the
  win32/VisualStudio/build/utils/instrumentation/<build-type> directory. Only
  static versions are built.


  Older editions of Microsoft Visual C++ (e.g. 2003 and 2005)
  -----------------------------------------------------------

  NOTE: Since Visual C++ 2008 Express edition is freely available to
  download, older editions of Visual C++ are no longer supported. Users are
  advised to upgrade their VC++ environment to VC++ 2008.

4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values,
while others are boolean options that enable specific features. For example,
when running the encoder, the -qf option requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.

4.2 File formats

The example coder and decoder use raw 8-bit planar YUV data.  This means that
data is stored bytewise, with a frame of Y followed by a frame of U followed
by a frame of V, all scanned in the usual raster order. The video dimensions,
frame rate and chroma format are passed to the encoder via command line
arguments.
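
The planar layout fixes the size of each frame, which is handy for checking
that a raw file has the length you expect. The following is a minimal sketch,
not part of the Dirac distribution: the function names chroma_dims and
frame_bytes are made up for illustration, and it assumes even picture
dimensions and the usual chroma subsampling (4:2:2 halves the chroma width,
4:2:0 halves both chroma width and height).

```shell
#!/bin/sh
# Sketch only: bytes in one frame of raw 8-bit planar YUV.
# chroma_dims and frame_bytes are illustrative names, not Dirac tools.

# chroma_dims WIDTH HEIGHT CFORMAT -> prints "chroma_width chroma_height"
chroma_dims() {
    case "$3" in
        YUV444P) echo "$1 $2" ;;
        YUV422P) echo "$(( $1 / 2 )) $2" ;;
        YUV420P) echo "$(( $1 / 2 )) $(( $2 / 2 ))" ;;
        *) echo "unknown chroma format: $3" >&2; return 1 ;;
    esac
}

# frame_bytes WIDTH HEIGHT CFORMAT -> prints the size of one frame in bytes:
# one full-size Y plane followed by the U and V planes.
frame_bytes() {
    dims=$(chroma_dims "$1" "$2" "$3") || return 1
    cw=${dims% *}     # chroma width
    ch=${dims#* }     # chroma height
    echo $(( $1 * $2 + 2 * cw * ch ))
}

frame_bytes 352 288 YUV420P   # prints 152064
frame_bytes 720 576 YUV422P   # prints 829440
```

Frame n (counting from 0) then starts at byte offset n times the frame size,
and a file holding 100 frames of 352x288 4:2:0 video should be 15206400 bytes
long.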

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Example.
  Compress an image sequence of 100 frames of 352x288 video in TIFF format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.
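
  If you end up with one raw RGB file per frame, the concatenation can be
  sketched as below. This is an illustration under stated assumptions, not a
  Dirac utility: concat_frames is a made-up helper, the per-frame file names
  (frame_0000.rgb, frame_0001.rgb, ...) are hypothetical and zero-padded so
  that the shell lists them in frame order, and raw RGB is assumed to be
  8 bits per channel (3 bytes per pixel).

```shell
#!/bin/sh
# Sketch only: join per-frame raw RGB files into one file and check the size.
# concat_frames is an illustrative helper, not part of Dirac.

# concat_frames OUTFILE WIDTH HEIGHT FRAME_FILE...
concat_frames() {
    out=$1; width=$2; height=$3
    shift 3
    # Assumes 3 bytes per pixel (8-bit R, G and B).
    expected=$(( width * height * 3 * $# ))
    cat "$@" > "$out" || return 1
    actual=$(wc -c < "$out")
    actual=$(( actual ))    # drop any padding that wc adds
    if [ "$actual" -ne "$expected" ]; then
        echo "size mismatch: got $actual bytes, expected $expected" >&2
        return 1
    fi
    echo "$# frames, $actual bytes"
}

# Example call (frame file names are hypothetical):
#   concat_frames file.rgb 352 288 frame_*.rgb
```

  The size check catches a missing or truncated frame before the file is fed
  to the RGB to YUV conversion in the next step.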

  Step 2.

  Convert from RGB to the YUV format of your choice. For example, to do
  420, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  We have provided a script create_test_data.pl to help convert rgb format
  files into all the input formats supported by Dirac. The command line
  arguments it supports can be listed using

  create_test_data.pl -use

  Sample usage is

  create_test_data.pl -width=352 -height=288 -num_frames=100 file.rgb

  (This assumes that the RGBtoYUV utilities are in a directory listed in the
  PATH variable. If they are not, use the -convutildir option to tell the
  script where to find the conversion utilities.)

  The script then outputs files in all chroma formats (420, 422,
  444) supported by Dirac to the current directory.


  Step 3.

  Run the encoder. This will produce a locally decoded output in the
  same format if the locally decoded output is enabled using the -local flag.

  Step 4.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 5.

  Use your favourite conversion utility to convert to the format of your
  choice.

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format.  Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i file.avi -x rgb > file.yuv

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not play back 4:2:2 or 4:4:4 YUV data)


4.3 Encoding

The basic encoding syntax is to type

dirac_encoder [options] file_in file_out

This will compress file_in and produce an output file_out of compressed data.

A locally decoded output file_out.local-dec.yuv and instrumentation data
file_out.imt  (for debugging the encoder and of interest to developers only)
are also produced if the -local flag is enabled on the command line.

There are a large number of optional parameters that can be used to run the
encoder, all of which are listed below. To encode video, three types of
parameter need to be set:

a) quality factor or target bit rate
b) source parameters (width, height, frame rate, chroma format)
c) encoding parameters (motion compensation block sizes, preferred viewing
   distance)

In practice you don't have to set all these directly because presets can be used
to supply appropriate default values.

a) The most important parameters are the quality factor or target bit rate.

The quality factor is specified by using the option

qf     : Overall quality factor (>0)

The value must be greater than 0; the higher the number, the better
the quality. Typical high quality is 8-10, but it will vary from sequence to
sequence, sometimes higher and sometimes lower.

The target bit rate is set using the option

targetrate : Target bit rate in Kb/s

This will attempt to maintain constant bit rate over the sequence. It works
reasonably well, but actual bit rate, especially over short sequences, may be
slightly different from the target.
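
To see how close an encode came to its target, the average bit rate can be
worked out from the size of the compressed file, the number of frames and the
frame rate. The sketch below is illustrative only: avg_kbps is a made-up
helper, the frame rate is passed as a numerator and denominator to keep the
arithmetic in integers, and Kb/s is assumed here to mean 1000 bits per second
(this README does not say whether 1000 or 1024 is intended).

```shell
#!/bin/sh
# Sketch only: average bit rate of a compressed file.
# avg_kbps is an illustrative helper, not part of Dirac.

# avg_kbps BYTES FRAMES FPS_NUM FPS_DEN -> average bit rate in kb/s
# (integer, rounded down). Duration is FRAMES * FPS_DEN / FPS_NUM seconds,
# and 1 kb/s is assumed to be 1000 bits per second.
avg_kbps() {
    echo $(( $1 * 8 * $3 / ( $2 * $4 * 1000 ) ))
}

avg_kbps 1500000 100 25 1        # 100 frames at 25 fps: prints 3000
avg_kbps 500000 100 30000 1001   # 100 frames at 30000/1001 fps: prints 1198
```

For a real file the byte count can be taken from wc, for example
avg_kbps "$(wc -c < test_out.drc)" 100 25 1.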

If both -targetrate and -qf are given, -targetrate takes precedence: CBR will
still be applied, but the initial quality will be set by the given qf value.
This might help the CBR algorithm to adapt faster.

Setting -lossless overrides both -qf and -targetrate, and enforces lossless
coding.

b) Source parameters need to be set as the input is just a raw YUV file and
the encoder doesn't have any information about it.

The best way to set source parameters is to use a preset for
different video formats.

The available preset options are:
QSIF525   : width=176;  height=120;  4:2:0 format; 14.98 frames/sec
QCIF      : width=176;  height=144;  4:2:0 format; 12.5 frames/sec
SIF525    : width=352;  height=240;  4:2:0 format; 14.98 frames/sec
CIF       : width=352;  height=288;  4:2:0 format; 12.5 frames/sec
4SIF525   : width=704;  height=480;  4:2:0 format; 14.98 frames/sec
4CIF      : width=704;  height=576;  4:2:0 format; 12.5 frames/sec
SD480I60  : width=720;  height=480;  4:2:2 format; 29.97 frames/sec
SD576I50  : width=720;  height=576;  4:2:2 format; 25 frames/sec
HD720P60  : width=1280; height=720;  4:2:2 format; 60 frames/sec
HD720P50  : width=1280; height=720;  4:2:2 format; 50 frames/sec
HD1080I60 : width=1920; height=1080; 4:2:2 format; 29.97 frames/sec
HD1080I50 : width=1920; height=1080; 4:2:2 format; 25 frames/sec
HD1080P60 : width=1920; height=1080; 4:2:2 format; 59.94 frames/sec
HD1080P50 : width=1920; height=1080; 4:2:2 format; 50 frames/sec
DC2K24    : width=2048; height=1080; 4:2:2 format; 24 frames/sec
DC4K24    : width=4096; height=2160; 4:2:2 format; 24 frames/sec
UHDTV4K60 : width=3840; height=2160; 4:2:2 format; 59.94 frames/sec
UHDTV4K50 : width=3840; height=2160; 4:2:2 format; 50 frames/sec
UHDTV8K60 : width=7680; height=4320; 4:2:2 format; 59.94 frames/sec
UHDTV8K50 : width=7680; height=4320; 4:2:2 format; 50 frames/sec

The default format is CUSTOM, which has the following preset values:
width=640; height=480; 4:2:0 format; 23.97 frames/sec.

If your video is not one of these formats, you should pick the nearest preset
and override the parameters that are different.

Example 1. Simple coding example. Code a 720x576 sequence in planar 420 format
to high quality.

Solution.

  dirac_encoder -cformat YUV420P -SD576I50 -qf 9 test.yuv test_out.drc

Example 2. Code a 720x486 sequence at 29.97 frames/sec in 422 format to
medium quality.

Solution.

  dirac_encoder -SD576I50 -width 720 -height 486 -fr 29.97 -cformat YUV422P -qf 5.5 test.yuv test_out.drc

Source parameters that affect coding are:

width           : Width of video frame
height          : Height of video frame
cformat         : Chroma sampling format. Acceptable values are
                  YUV444P, YUV422P and YUV420P.
fr              : Frame rate. Can be a decimal number or a fraction. Examples
                  of acceptable values are 25, 29.97, 12.5, 30000/1001.
source_sampling : Source material type: 0 (progressive) or 1 (interlaced)

For a complete list of source parameters, refer to Annex C of the Dirac
Specification.
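
Since -fr accepts both decimals and fractions, it can help to see what a
fraction works out to. The helper below is a small sketch (fps_decimal is a
made-up name, and it relies only on standard awk). It shows, for instance,
that 30000/1001 is the rate commonly rounded to 29.97. Whether the encoder
treats 29.97 and 30000/1001 identically is not stated in this README.

```shell
#!/bin/sh
# Sketch only: print a frame rate as a decimal with three places.
# fps_decimal is an illustrative helper, not part of Dirac.

# fps_decimal RATE -> RATE may be "25", "12.5" or a fraction like "30000/1001"
fps_decimal() {
    case "$1" in
        */*) LC_ALL=C awk -v n="${1%/*}" -v d="${1#*/}" \
                 'BEGIN { printf "%.3f\n", n / d }' ;;
        *)   LC_ALL=C awk -v r="$1" 'BEGIN { printf "%.3f\n", r }' ;;
    esac
}

fps_decimal 30000/1001   # prints 29.970
fps_decimal 12.5         # prints 12.500
```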

WARNING!! If you use a preset but don't override source parameters that
are different, then Dirac will still compress, but the bit rate will be
much, much higher and there may well be serious artefacts. The encoder prints
out the parameters it's actually using before starting encoding (in verbose
mode only), so that you can abort at this point.

c) The presets ALSO set encoding parameters. That's why it's a very good idea
to use presets, as the encoding parameters are a bit obscure. They're still
supported for those who want to experiment, but use with care.

Encoding parameters are:

L1_sep        : the separation between L1 frames (frames that are predicted but
                also used as reference frames, like P frames in MPEG-2)
num_L1        : the number of L1 frames before the next intra frame
xblen         : the width of blocks used for motion compensation
yblen         : the height of blocks used for motion compensation
xbsep         : the horizontal separation between blocks. Always < xblen
ybsep         : the vertical separation between blocks. Always < yblen
cpd           : normalised viewing distance parameter, in cycles per degree.
iwlt_filter   : transform filter to use when encoding INTRA frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD9_7.
rwlt_filter   : transform filter to use when encoding INTER frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is LEGALL5_3.
wlt_depth     : transform depth, i.e. the number of times the component is
                split while applying the wavelet transform
no_spartition : Do not split a subband into coefficient blocks before
                entropy coding
multi_quants  : If subbands are split into multiple coefficient blocks before
                entropy coding, assign different quantisers to each block
                within the subband.
prefilter     : Prefilter to apply to input video before encoding. The name of
                the filter to be used and the filter strength have to be
                supplied. Valid filter names are NO_PF, CWM, RECTLP and
                DIAGLP. Filter strength should be in the range 0-10.
                (Note that PSNR statistics will be computed relative to the
                filtered video if -local is enabled.)
lossless      : Lossless coding (overrides -qf and -targetrate)
mv_prec       : Motion vector precision. Valid values are 1 (pixel precision),
                1/2 (half-pixel precision), 1/4 (quarter-pixel precision,
                which is the default), 1/8 (eighth-pixel precision).
full_search   : Use full search motion estimation
field_coding  : Code the input video as fields instead of frames.
                Default coding is frames.
use_vlc       : Use VLC for entropy coding of coefficients instead of arithmetic
                coding.

Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting L1_sep to
1 will do IP-only coding (no B-pictures). P-only coding isn't possible, but
num_L1=very large and L1_sep=1 will approximate it.
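
To get a feel for how these two parameters shape a GOP, the sketch below
prints one letter per picture in display order. It is an illustration built
on assumptions, not taken from the encoder source: gop_pattern is a made-up
helper, the GOP length is assumed to be (num_L1 + 1) * L1_sep, and the MPEG-2
style letters I, P and B stand for the intra picture, the L1 pictures and the
pictures in between.

```shell
#!/bin/sh
# Sketch only: picture types in one GOP, in display order.
# gop_pattern is an illustrative helper; the GOP length formula is an
# assumption inferred from the parameter descriptions, not from the encoder.

# gop_pattern L1_SEP NUM_L1
#   I = intra picture, P = L1 picture, B = picture between references
gop_pattern() {
    l1_sep=$1; num_l1=$2
    if [ "$num_l1" -eq 0 ]; then
        echo "I"                             # I-frame only coding
        return 0
    fi
    gop_len=$(( (num_l1 + 1) * l1_sep ))     # assumed GOP length
    pattern=""
    i=0
    while [ "$i" -lt "$gop_len" ]; do
        if [ "$i" -eq 0 ]; then
            pattern="${pattern}I"
        elif [ $(( i % l1_sep )) -eq 0 ]; then
            pattern="${pattern}P"
        else
            pattern="${pattern}B"
        fi
        i=$(( i + 1 ))
    done
    echo "$pattern"
}

gop_pattern 3 3   # prints IBBPBBPBBPBB
gop_pattern 1 4   # prints IPPPP (IP-only coding)
gop_pattern 3 0   # prints I (I-frame only coding)
```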

Modifying the block parameters is strongly discouraged: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed which may be more (or
less) suitable for your application. Setting cpd equal to zero turns off
perceptual weighting altogether.
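
If you do experiment with block parameters, the one constraint spelled out in
this README (separation less than length in each direction) can be checked
before running the encoder. The sketch below covers only that constraint, so
passing it does not guarantee the encoder will accept the values. check_blocks
is a made-up helper and the sizes in the example call are purely illustrative.

```shell
#!/bin/sh
# Sketch only: check that block separation is less than block length.
# check_blocks is an illustrative helper; other encoder constraints on
# block sizes are NOT checked here.

# check_blocks XBLEN YBLEN XBSEP YBSEP
check_blocks() {
    if [ "$3" -ge "$1" ]; then
        echo "xbsep ($3) must be less than xblen ($1)" >&2
        return 1
    fi
    if [ "$4" -ge "$2" ]; then
        echo "ybsep ($4) must be less than yblen ($2)" >&2
        return 1
    fi
    # The difference is how far neighbouring blocks overlap.
    echo "overlap: $(( $1 - $3 )) horizontal, $(( $2 - $4 )) vertical"
}

check_blocks 12 12 8 8   # prints "overlap: 4 horizontal, 4 vertical"
```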

For more information, see the algorithm documentation on the website:
http://diracvideo.org/wiki/index.php/Dirac_Algorithm

Other options. The encoder also supports some other options, which are:

verbose   : turn on verbosity (if you don't, you won't see the final bitrate!)
start     : code from this frame number
stop      : code up until this frame number
local     : Generate diagnostics and locally decoded output (to avoid running a
            decoder to see your video)

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

If the -local flag is present in the command line, the encoder produces
diagnostic information about motion vectors that can be used to debug the
encoder algorithm. It also produces a locally decoded picture so that you
don't have to run the decoder to see what the pictures are like.

4.4 Decoding

Decoding is much simpler. Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode test_enc into test_dec with running commentary.
 468
 469 will decode test_enc into test_dec with running commentary.