Creating a Master Playlist with ffmpeg

In this post you’ll see how to create an HLS master playlist with ffmpeg for video on-demand.

A master playlist contains references to different variant streams (typically encoded at different bit rates) and can also include links to alternative audio tracks and audio-only renditions. It allows a client device to choose the most appropriate stream based on factors such as the capabilities of the device, available bandwidth, and so on. This is known as adaptive streaming.

The examples below use the 1080p version of the Sintel trailer which you can download here. You can of course use your own videos. Apple publishes guidelines for authoring HLS video streams that include, among other things, recommended video and audio bit rates. We’ll create two variants: a 540p and a 360p version. The video bit rates will be 2000 kb/s and 365 kb/s respectively (as per Apple’s guidelines).

The first thing to do is determine what streams are in the video, which we can do with the following command:

$ ffmpeg -i sintel_trailer-1080p.mp4 -hide_banner
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'sintel_trailer-1080p.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    creation_time   : 1970-01-01T00:00:00.000000Z
    title           : Sintel Trailer
    artist          : Durian Open Movie Team
    encoder         : Lavf52.62.0
    copyright       : (c) copyright Blender Foundation | durian.blender.org
    description     : Trailer for the Sintel open movie project
  Duration: 00:00:52.21, start: 0.000000, bitrate: 2240 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 2108 kb/s, 24 fps, 24 tbr, 24 tbn, 48 tbc (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 126 kb/s (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : SoundHandler
At least one output file must be specified

Ignore the error about the missing output file; we only want the stream information here.

We can see there are two available streams: a video stream (0:0) and an audio stream (0:1). We’ll need these stream references shortly. We’ll need to encode the video stream for each variant but we can just copy the audio as is because it’s already encoded as AAC and the bit rate lies within the recommended range (32 to 160 kb/s).
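If you only want the stream references, you can filter them out of ffmpeg’s report. The following is a minimal sketch; list_streams is a hypothetical helper name, and it assumes the "Stream #…" line format shown in the output above.

```shell
#!/bin/sh
# list_streams (hypothetical helper): reads ffmpeg's "-i" report on stdin and
# prints "<stream reference> <type>" for every "Stream #..." line it finds.
list_streams() {
  sed -n 's/^ *Stream #\([0-9]*:[0-9]*\)[^:]*: \([A-Za-z]*\):.*/\1 \2/p'
}

# Demo using the two stream lines from the report above (shortened).
list_streams <<'EOF'
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 2108 kb/s
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 126 kb/s
EOF
# prints:
# 0:0 Video
# 0:1 Audio
```

With a real file, ffmpeg writes the report to stderr, so you would pipe it like this: ffmpeg -hide_banner -i sintel_trailer-1080p.mp4 2>&1 | list_streams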

Here’s the ffmpeg command to generate the variants and the master playlist:

ffmpeg -y -i sintel_trailer-1080p.mp4 \
  -preset slow -g 48 -sc_threshold 0 \
  -map 0:0 -map 0:1 -map 0:0 -map 0:1 \
  -s:v:0 640x360 -c:v:0 libx264 -b:v:0 365k \
  -s:v:1 960x540 -c:v:1 libx264 -b:v:1 2000k \
  -c:a copy \
  -var_stream_map "v:0,a:0 v:1,a:1" \
  -master_pl_name master.m3u8 \
  -f hls -hls_time 6 -hls_list_size 0 \
  -hls_segment_filename "v%v/fileSequence%d.ts" \
  v%v/prog_index.m3u8

Let’s break it down. First, we specify the input file (-i).

This is followed by a number of general encoding options. The preset is set to slow; the default is medium. A preset is a collection of values that determines the “quality” of the encoding. A slower preset will achieve a better compression ratio but will take longer to run. You may want to play around with the different values to see what works best for you. Apple’s authoring guidelines specify that a key frame should be present every 2 seconds. We achieve this by setting the GOP size (-g) to 48, or twice the frame rate of the video input. Lastly, scene change detection is disabled by setting the threshold to zero.
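The GOP size is just the frame rate multiplied by the key frame interval, so for other inputs it can be computed rather than hard-coded. A small sketch (gop_size is a hypothetical helper name, and it assumes an integer frame rate):

```shell
#!/bin/sh
# gop_size (hypothetical helper): frames per GOP for a given frame rate and
# key frame interval in seconds. Assumes an integer frame rate.
gop_size() {
  fps=$1
  interval=$2
  echo $((fps * interval))
}

# The Sintel trailer is 24 fps and we want a key frame every 2 seconds.
echo "-g $(gop_size 24 2)"
# prints: -g 48
```

For fractional frame rates such as 29.97 fps you would need to round to a whole number of frames, which this sketch does not handle.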

Next, we need to specify the input streams that should be included in the output, which we can do with the -map option. (-map means “include this stream in the output file”.) In this instance, 0:0 refers to the video stream and 0:1 to the audio stream in the input file. The first pair of -map options represents the video and audio stream of the first variant and will be referred to subsequently in the output as v:0 and a:0; the next pair represents the second variant and will be labelled as v:1 and a:1, and so on. (If we wanted to generate three variant streams, we would need six -map options, assuming that each variant has a video and audio stream.)
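Since the pattern is mechanical, the -map options and the matching stream map string can be generated for any number of variants. This is only a sketch: map_args and var_stream_map are hypothetical helper names, and it assumes every variant uses input streams 0:0 (video) and 0:1 (audio) as in this post.

```shell
#!/bin/sh
# map_args (hypothetical helper): one "-map 0:0 -map 0:1" pair per variant.
map_args() {
  n=$1
  args=""
  i=0
  while [ "$i" -lt "$n" ]; do
    args="$args -map 0:0 -map 0:1"
    i=$((i + 1))
  done
  echo "${args# }"   # drop the leading space
}

# var_stream_map (hypothetical helper): the value for -var_stream_map,
# pairing the i-th output video stream with the i-th output audio stream.
var_stream_map() {
  n=$1
  pairs=""
  i=0
  while [ "$i" -lt "$n" ]; do
    pairs="$pairs v:$i,a:$i"
    i=$((i + 1))
  done
  echo "${pairs# }"
}

map_args 3
var_stream_map 3
# prints:
# -map 0:0 -map 0:1 -map 0:0 -map 0:1 -map 0:0 -map 0:1
# v:0,a:0 v:1,a:1 v:2,a:2
```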

The next two lines specify how each video stream should be encoded. We don’t need to change the audio so it can be copied directly to the output using -c:a copy which means copy all audio streams from the input(s). The video and audio streams for each output variant are then specified using the -var_stream_map option. The -master_pl_name option sets the name of the master playlist. Finally, we set a number of HLS options including the segment filename and the duration.

Now according to Apple’s guidelines, a video on-demand playlist must have the EXT-X-PLAYLIST-TYPE tag set to VOD. In theory, this can be done with ffmpeg by setting the -hls_playlist_type option to vod. However, running the command above with this option set using the latest release of ffmpeg at the time of writing (4.1.3) causes a segmentation fault. Turns out it’s a bug. I downloaded a recent version of the source code and compiled ffmpeg from scratch, and it works, so the problem has been fixed. For the time being, you can either do what I did and compile the latest source code or just add the tag to each variant playlist afterwards.
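If you go the manual route, adding the tag can be scripted. Here is a rough sketch: add_vod_tag is a hypothetical helper name, and it assumes each variant playlist contains an #EXT-X-VERSION line to insert the tag after (check your own playlists first).

```shell
#!/bin/sh
# add_vod_tag (hypothetical helper): inserts "#EXT-X-PLAYLIST-TYPE:VOD" after
# the #EXT-X-VERSION line of a variant playlist. Does nothing if the playlist
# already has an EXT-X-PLAYLIST-TYPE tag, so it is safe to run twice.
add_vod_tag() {
  playlist=$1
  if grep -q '^#EXT-X-PLAYLIST-TYPE:' "$playlist"; then
    return 0
  fi
  tmp="$playlist.tmp"
  awk '{ print } /^#EXT-X-VERSION/ { print "#EXT-X-PLAYLIST-TYPE:VOD" }' \
    "$playlist" > "$tmp" && mv "$tmp" "$playlist"
}

# Apply it to every variant playlist produced by the command above.
for p in v*/prog_index.m3u8; do
  if [ -f "$p" ]; then
    add_vod_tag "$p"
  fi
done
```

The tag goes in the variant playlists (v0/prog_index.m3u8 and so on), not in master.m3u8.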

Run the command. When it completes, you should find master.m3u8 and the directories v0 and v1 in the working directory.

Open the master playlist (master.m3u8) in a text editor. It should look like this:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=540863,RESOLUTION=640x360,CODECS="avc1.64001e,mp4a.40.2"
v0/prog_index.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=2339363,RESOLUTION=960x540,CODECS="avc1.64001f,mp4a.40.2"
v1/prog_index.m3u8

You can see the references to the alternative streams. In the directories v0 and v1 you will find the playlist and segments for each variant stream.
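As a quick sanity check, you can list each variant with its resolution and bandwidth straight from the master playlist. A minimal sketch follows; list_variants is a hypothetical helper name, and it assumes the layout ffmpeg produced above, where each #EXT-X-STREAM-INF line is followed by the variant’s playlist path.

```shell
#!/bin/sh
# list_variants (hypothetical helper): prints "<playlist> <resolution> <bandwidth>"
# for every variant in a master playlist. Assumes BANDWIDTH is the first
# attribute containing that word (no AVERAGE-BANDWIDTH before it).
list_variants() {
  awk '
    /^#EXT-X-STREAM-INF:/ {
      bw = ""; res = ""
      if (match($0, /BANDWIDTH=[0-9]+/))
        bw = substr($0, RSTART + 10, RLENGTH - 10)    # skip "BANDWIDTH="
      if (match($0, /RESOLUTION=[0-9]+x[0-9]+/))
        res = substr($0, RSTART + 11, RLENGTH - 11)   # skip "RESOLUTION="
      pending = 1
      next
    }
    # The first non-blank, non-tag line after STREAM-INF is the variant URI.
    pending && /^[^#]/ { print $0, res, bw; pending = 0 }
  ' "$1"
}

if [ -f master.m3u8 ]; then
  list_variants master.m3u8
fi
```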
