README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by the BBC R&D Dirac team (dirac@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms.  Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards.  These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

A user guide and a guide to the software are in progress. More details on
running the codec can be found at http://dirac.sourceforge.net/


3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --disable-mmx
        (to disable MMX optimisation which is enabled by default)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
     ./configure

     By default, both shared and static libraries are built. To build all-static
     libraries use
     ./configure --disable-shared

     To build shared libraries only use
     ./configure --disable-static

     make
     make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).

  
  MSYS and Microsoft Visual C++
  -----------------------------
     Download and install the no-cost Microsoft C++ compiler from
     http://msdn.microsoft.com/visualc/vctoolkit2003/

     Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe, 
     from http://www.mingw.org/download.shtml. An MSYS icon will be available
     on the desktop.

     Click on the MSYS icon on the desktop to open an MSYS shell window.

     Create a .profile file to set up the environment variables required. 
     vi .profile

     Include the following three lines in the .profile file.

         export PATH=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/bin:$PATH
         export INCLUDE=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/include
         export LIB=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/lib

         (Replace /c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/ with
         the location where VC++ 2003 is installed if necessary)


     Exit from the MSYS shell and click on the MSYS icon on the desktop to open 
     a new MSYS shell window for the .profile to take effect.

     Change directory to the directory where Dirac was unpacked. By default 
     only the dynamic libraries are built.

     ./configure CXX=cl --enable-debug
         (to enable extra debug compile options)
     OR
     ./configure CXX=cl --disable-shared
         (to build static libraries)
     OR
     ./configure CXX=cl
     make
     make install

     The INSTALL file documents arguments to ./configure such as
     --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2003
  ------------------------------
  The MS VC++ .NET 2003 solution and project files are in the win32/VS2003
  directory.  Double-click on the solution file, dirac.sln, to open it.
  The target 'Everything' builds the codec libraries and utilities. Four
  build-types are supported:

  Debug - builds unoptimised encoder and decoder dlls with debug symbols
  Release - builds optimised encoder and decoder dlls
  Static-Debug - builds unoptimised encoder and decoder static libraries
                 with debug symbols
  Static-Release - builds optimised encoder and decoder static libraries
 
  Static libraries are created in the win32/VS2003/lib/<build-type> directory.

  Encoder and Decoder dlls and import libraries, encoder and decoder apps are 
  created in the win32/VS2003/bin/<build-type> directory. The "C" public API
  is exported using the _declspec(dllexport) mechanism.

  Conversion utilities are created in the 
  win32/VS2003/utils/conversion/<build-type> directory. Only static versions
  are built.

  The instrumentation utility is created in the 
  win32/VS2003/utils/instrumentation/<build-type> directory. Only a static
  version is built.


4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is 
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values, 
while others are boolean options that enable specific features. For example,
when running the encoder, the -qf option requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.

4.2 File formats

The example coder and decoder use raw 8-bit planar YUV data.  This means that
data is stored bytewise, with a frame of Y followed by a frame of U followed
by a frame of V, all scanned in the usual raster order. The video dimensions,
frame rate and chroma format are passed to the encoder via command-line
arguments.
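
As a rough illustration of this layout, the sketch below computes the size in
bytes of one frame for each chroma format. The function name yuv_frame_bytes
and the format labels are invented for this example and are not part of Dirac.
It assumes standard chroma subsampling, 8 bits per sample, dimensions that
divide evenly, and that Yonly files carry no chroma planes.

```shell
# Size in bytes of one 8-bit planar YUV frame (hypothetical helper).
# Usage: yuv_frame_bytes WIDTH HEIGHT CHROMA
# CHROMA is one of: Yonly, 411, 420, 422, 444
yuv_frame_bytes() {
    width=$1; height=$2; chroma=$3
    luma=$((width * height))            # Y plane: one byte per pixel
    case "$chroma" in
        Yonly)   chroma_plane=0 ;;                # assumed: no U or V plane
        444)     chroma_plane=$luma ;;            # U, V at full resolution
        422)     chroma_plane=$((luma / 2)) ;;    # half horizontal resolution
        420|411) chroma_plane=$((luma / 4)) ;;    # quarter of the luma samples
        *) echo "unknown chroma format: $chroma" >&2; return 1 ;;
    esac
    # One Y plane followed by one U plane and one V plane
    echo $((luma + 2 * chroma_plane))
}

yuv_frame_bytes 352 288 420                        # prints 152064
echo $(( $(yuv_frame_bytes 352 288 420) * 100 ))   # 100 frames: prints 15206400
```

The 4:2:0 figure for 352x288 is the same per-frame size that appears as the
size= value in the MPlayer command later in this section.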

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Example.
  Compress an image sequence of 100 frames of 352x288 video in TIFF format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of 
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.
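
  If the frames were converted one at a time, the concatenation can be
  sketched as below. The helper names and the frame_*.rgb naming are
  assumptions made for this example. It also assumes 8-bit raw RGB (3 bytes
  per pixel) and zero-padded frame numbers, so that the shell glob sorts the
  frames in order.

```shell
# Join per-frame raw RGB files into one file (hypothetical helper).
# Usage: concat_frames OUTPUT FRAME_FILE...
concat_frames() {
    out=$1; shift
    cat "$@" > "$out"
}

# Expected size in bytes of WIDTH x HEIGHT raw RGB video with NUM_FRAMES frames.
expected_rgb_bytes() {
    echo $(( $1 * $2 * 3 * $3 ))
}

# Example (file names are placeholders):
#   concat_frames file.rgb frame_*.rgb
#   wc -c < file.rgb          # should equal the figure below
expected_rgb_bytes 352 288 100    # prints 30412800
```

  Comparing the size of the concatenated file with the expected figure is a
  quick way to catch a missing frame before converting to YUV.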

  Step 2.

  Convert from RGB to the YUV format of your choice. For example, to
  produce 4:2:0, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  We have provided a script, create_test_data.pl, to help convert RGB format 
  files into all the input formats supported by Dirac. The command line
  arguments it supports can be listed using
  
  create_test_data.pl -use

  Sample usage is
  
  create_test_data.pl -width=352 -height=288 -num_frames=100 file.rgb
 
  (This assumes that the RGBtoYUV utilities are in a directory listed in the
  PATH variable. If they are not, use the -convutildir option to tell the
  script where to find the conversion utilities.)

  The script then outputs files in all chroma formats (Yonly, 411, 420, 422,
  444) supported by Dirac to the current directory.


  Step 3.

  Run the encoder. This will produce a locally decoded output in the
  same format.

  Step 4.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 5.

  Use your favourite conversion utility to convert to the format of your
  choice.

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format.  Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i file.avi -x rgb > file.yuv

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following 
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not play back 4:2:2 or 4:4:4 YUV data)
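
  To look at one particular frame rather than paging through the file, a
  single frame can be cut out with dd. This is a sketch only: extract_frame
  is a hypothetical helper, it assumes 8-bit 4:2:0 data with even dimensions,
  and frames are counted from 0.

```shell
# Copy frame number INDEX (counting from 0) out of a raw 4:2:0 YUV file.
# Usage: extract_frame INPUT WIDTH HEIGHT INDEX OUTPUT
extract_frame() {
    src=$1; width=$2; height=$3; index=$4; dst=$5
    # Y plane plus two quarter-size chroma planes = 1.5 bytes per pixel
    frame_bytes=$((width * height * 3 / 2))
    # One dd block per frame: skip INDEX frames, then copy exactly one
    dd if="$src" of="$dst" bs="$frame_bytes" skip="$index" count=1 2>/dev/null
}

# Example (file names are placeholders):
#   extract_frame test.yuv 352 288 10 frame10.yuv
```

  The extracted frame can then be shown with the same display command as
  above, for example display -size 352x288 frame10.yuv.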


4.3 Encoding

The basic encoding syntax is to type

dirac_encoder [options] file_in file_out

This will compress file_in.yuv and produce an output file_out.drc of compressed
data. 

By default it will also produce a locally decoded output file_out.yuv and
instrumentation data file_out.imt (for debugging the encoder and of interest to
developers only).

The encoder accepts a large number of optional parameters, all of which are
listed below. To encode video, three types of parameter need to be set:

a) quality factor
b) sequence parameters (width, height, frame rate, chroma format)
c) encoding parameters (motion compensation block sizes, preferred viewing
   distance)

In practice you don't have to set all these directly because presets can be used
to supply appropriate default values.

a) The most important parameter is the quality factor. This is specified by
using the option

qf        : Overall quality factor (>0)

The value must be greater than 0; the higher the number, the better
the quality. Typical high quality is 8-10, but it will vary from sequence to 
sequence, sometimes higher and sometimes lower.

b) Sequence parameters need to be set because the input is just a raw YUV file
and the encoder doesn't have any information about it.

The best way to set sequence parameters is to use a preset for
different video formats. 

The available preset options are:

CIF      : width=352; height=288; 4:2:0 format; 12.5 frames/sec
SD576    : width=720; height=576; 4:2:0 format; 25 frames/sec
HD720    : width=1280; height=720; 4:2:0 format; 50 frames/sec
HD1080   : width=1920; height=1080; 4:2:0 format; 25 frames/sec

If your video is not one of these formats, you should pick the nearest preset
and override the parameters that are different.

Example 1. Simple coding example. Code a 720x576 sequence to high quality.

Solution.

  dirac_encoder -SD576 -qf 9 test test_out

Example 2. Code a 720x480 sequence at 29.97 frames/sec in 422 format to 
medium quality.

Solution.

  dirac_encoder -SD576 -width 720 -height 480 -fr 29.97 -cformat 1 -qf 7 test test_out

The complete list of sequence parameters is:

width     : Width of video frame
height    : Height of video frame
cformat   : Chroma format. It accepts a number between 0 and 4.
            0 - Yonly, 1 - 4:2:2, 2 - 4:4:4, 3 - 4:2:0 and 4 - 4:1:1
fr        : Frame rate. Can be a decimal number or a fraction. Examples
            of acceptable values are 25, 29.97, 12.5, 30000/1001.
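
The fractional form is convenient for rates such as 30000/1001, which has no
exact decimal representation. The sketch below converts either notation to a
decimal value using awk. The name frame_rate_decimal is made up for this
example, and the code says nothing about how the encoder itself parses -fr.

```shell
# Print a frame rate given as "N" or "N/D" as a decimal number with
# three places (hypothetical helper; assumes a non-zero denominator).
frame_rate_decimal() {
    awk -v fr="$1" 'BEGIN {
        n = split(fr, part, "/")          # n is 2 for the "N/D" form
        rate = (n == 2) ? part[1] / part[2] : part[1]
        printf "%.3f\n", rate
    }'
}

frame_rate_decimal 30000/1001    # prints 29.970
frame_rate_decimal 12.5          # prints 12.500
```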

WARNING!! If you use a preset but don't override sequence parameters that
are different, then Dirac will still compress, but the bit rate will be
much, much higher and there may well be serious artefacts. The encoder prints
out the parameters it's actually using before starting encoding, so that
you can abort at this point.

c) The presets ALSO set encoding parameters. That's why it's a very good idea
to use presets, as the encoding parameters are a bit obscure. They're still 
supported for those who want to experiment, but use with care.

Encoding parameters are:

L1_sep    : the separation between L1 frames (frames that are predicted but 
            also used as reference frames, like P frames in MPEG-2)
num_L1    : the number of L1 frames before the next intra frame
xblen     : the width of blocks used for motion compensation
yblen     : the height of blocks used for motion compensation
xbsep     : the horizontal separation between blocks. Always <xblen
ybsep     : the vertical separation between blocks. Always <yblen
cpd       : normalised viewing distance parameter, in cycles per degree.

Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting num_L1<0
will produce a sequence with infinitely many L1 frames i.e. with a single I
frame at the beginning of the sequence. 
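
The resulting frame patterns can be sketched as follows. This is an
illustration under assumptions rather than the encoder's actual logic: it
assumes frames are listed in display order, that an intra frame recurs every
(num_L1 + 1) * L1_sep frames when num_L1 > 0, and that L1_sep is at least 1.
I marks intra frames, L marks L1 frames and a dot marks the remaining
predicted frames.

```shell
# Print the assumed frame-type pattern for the first NFRAMES frames.
# Usage: gop_pattern L1_SEP NUM_L1 NFRAMES   (hypothetical helper)
gop_pattern() {
    l1_sep=$1; num_l1=$2; nframes=$3
    pattern=""
    i=0
    while [ "$i" -lt "$nframes" ]; do
        if [ "$num_l1" -eq 0 ]; then
            t="I"                              # I-frame only coding
        elif [ "$num_l1" -gt 0 ] &&
             [ $(( i % ((num_l1 + 1) * l1_sep) )) -eq 0 ]; then
            t="I"                              # start of a new GOP
        elif [ "$i" -eq 0 ]; then
            t="I"                              # num_L1 < 0: single I frame
        elif [ $(( i % l1_sep )) -eq 0 ]; then
            t="L"                              # L1 frame
        else
            t="."                              # other predicted frame
        fi
        pattern="$pattern$t"
        i=$((i + 1))
    done
    echo "$pattern"
}

gop_pattern 3 2 10     # prints I..L..L..I
gop_pattern 3 0 4      # prints IIII
gop_pattern 2 -1 5     # prints I.L.L
```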

Modifying the block parameters is strongly deprecated: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed which may be more (or
less) suitable for your application. Setting cpd equal to zero turns off
perceptual weighting altogether.

For more information, see the algorithm documentation on the website:
http://dirac.sourceforge.net

Other options. The encoder also supports some other options, which are

verbose   : turn on verbosity (if you don't, you won't see the final bitrate!)
start     : code from this frame number
stop      : code up until this frame number
nolocal   : do not generate diagnostics and locally decoded output

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

By default the encoder produces diagnostic information about motion vectors
that can be used to debug the encoder algorithm. It also produces a locally
decoded picture so that you don't have to run the decoder to see what the
pictures are like. -nolocal turns both these features off.

4.4 Decoding

Decoding is much simpler. Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode test_enc.drc into test_dec.yuv with running commentary.
