Flussonic Media Server documentation

Transcoding

Satellite video is transmitted in either MPEG-2 or H.264 (also known as AVC or MPEG-4 Part 10). MPEG-4 Part 10 is often shortened to simply MPEG-4, but it is important not to confuse it with MPEG-4 Part 2, a completely incompatible codec that has nothing in common with H.264 and was used in older IP cameras.

Audio is transmitted in MPEG Audio Layer 2 (abbreviated MP2) or in AC-3 (A/52).

It is important to understand that satellite H264 is usually compressed with intra-refresh, i.e., the video stream contains no key frames (IDR frames); instead, intra-coded data is spread across many frames. This compression method makes it possible to smooth out bitrate surges, because no single frame has to carry a full picture.

As a result, none of the audio or video formats transmitted via satellite can be played on an iPhone, and a browser would be able to play only the H264 video.

For transmission via the Internet, MPEG-2 video can usually be safely transcoded to H264 with a roughly threefold reduction in traffic.
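To put rough numbers on that threefold reduction, here is a back-of-the-envelope sketch; the 12 Mbps MPEG-2 bitrate is an assumed typical value, not a figure from this article.

```python
# Illustrative calculation of the traffic savings from transcoding
# MPEG-2 to H264. The bitrates are assumptions, not measurements.

MPEG2_BITRATE_MBPS = 12.0                    # assumed typical satellite MPEG-2 bitrate
H264_BITRATE_MBPS = MPEG2_BITRATE_MBPS / 3   # roughly threefold reduction

def monthly_traffic_gb(bitrate_mbps: float, hours: float = 24 * 30) -> float:
    """Traffic in gigabytes for continuous streaming at the given bitrate."""
    return bitrate_mbps / 8 * 3600 * hours / 1024

# One MPEG-2 channel: ~3797 GB/month; the same channel in H264: ~1266 GB/month.
savings = monthly_traffic_gb(MPEG2_BITRATE_MBPS) - monthly_traffic_gb(H264_BITRATE_MBPS)
```

Per viewer, the same ratio applies to delivery traffic, which is why the transcoding cost usually pays for itself.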

When transmitting HD channels via the Internet, today the stream has to be compressed into several qualities, from HD at the best quality down to standard SD, to compensate for overloaded channels.
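Such a set of qualities can be sketched as a simple ladder; the specific renditions and bitrates below are illustrative assumptions, not values from this article.

```python
# A sketch of a multi-quality ladder for one HD channel, from full HD
# quality down to a low rendition for overloaded channels. All numbers
# are assumptions for illustration.

LADDER = [
    {"name": "hd",  "size": (1280, 720), "video_kbps": 2500, "audio_kbps": 128},
    {"name": "sd",  "size": (720, 576),  "video_kbps": 1200, "audio_kbps": 96},
    {"name": "low", "size": (480, 270),  "video_kbps": 500,  "audio_kbps": 64},
]

def total_uplink_kbps(ladder):
    """Bandwidth needed to deliver all qualities of one channel at once."""
    return sum(r["video_kbps"] + r["audio_kbps"] for r in ladder)
```

A player then switches between renditions depending on how much bandwidth the viewer actually has.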

In the end, in order to provide high-quality OTT service, the video from the satellite should be transcoded into other codecs and qualities.

It is important not to confuse transcoding with repackaging. Transcoding is a very resource-intensive operation that includes:

  • unpacking the transport stream to encoded video/audio
  • decoding to raw video/audio
  • changing the size and other parameters
  • encoding back to compressed video/audio
  • packing into the output transport stream

Packing and unpacking are relatively cheap operations: a streaming server can handle up to 1,000 channels on a single computer. The same computer can transcode only 1 to 30 channels, depending on the channel resolutions and the capacity of the computer.

Transcoding can be done on specialized dedicated hardware, on a CPU, or on a video card, either external or integrated into the processor.

We will not consider specialized devices: en masse they are either ordinary computers with some application installed, or extremely expensive and very specialized equipment, or unreasonably expensive devices that are sold exclusively through the manufacturer's marketing efforts and do not deliver any significant results.

H264

There are several applications for video processing on the CPU, but today only two libraries can reasonably be used for compressing into the H264 codec on the CPU: the free libx264 and the proprietary MainConcept. Everything else is either worse or much worse, both in the resulting quality and in resource usage.

Working with MainConcept is not covered in this article; only libx264 will be discussed.

Today, the H264 codec is de facto the standard for video, since it is supported by all modern devices, except perhaps for some devices from Google.

There are virtually no alternatives to it. H265 is growing and already has considerable support, but for now working with it is an investment in the future rather than a practical necessity.

Google's codecs, VP8 and VP9, reflect Google's desire to pull the market its way more than anything actually useful: the resulting quality is worse, and there is no widespread hardware decoding support, which drives up the price of devices.

When encoding video, one should understand that a balance must be kept between the following parameters:

  • delay inside the encoder in frames
  • CPU usage (the number of milliseconds required for compressing a single frame)
  • output image quality (sharpness and color)
  • output bitrate

For all kinds of live broadcasts, CPU usage is absolutely critical. If the encoder settings require more CPU than is available, the video cannot be encoded in real time, and the streaming nature of the video is lost.

VOD does not have such tight restrictions: a one-hour movie may be encoded for three hours if you want a lower bitrate. Even for live video, the full CPU capacity is usually not used, so that the same computer can process 10 channels rather than 4.
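The real-time constraint can be expressed as simple arithmetic: each frame must be compressed faster than frames arrive. The frame rate, per-frame encode time, and headroom below are illustrative assumptions.

```python
# How many channels one machine can encode in real time. Numbers are
# illustrative assumptions, not benchmarks.

FPS = 25                      # PAL frame rate
FRAME_BUDGET_MS = 1000 / FPS  # 40 ms available per frame in real time
ENCODE_MS_PER_FRAME = 15      # assumed cost of compressing one frame
CORES = 8

def max_realtime_channels(cores, encode_ms, budget_ms, headroom=0.7):
    """Channels that fit while leaving spare CPU (headroom < 1.0)."""
    per_core = budget_ms / encode_ms  # ~2.6 channels per core here
    return int(cores * per_core * headroom)

channels = max_realtime_channels(CORES, ENCODE_MS_PER_FRAME, FRAME_BUDGET_MS)  # 14
```

Slower (more efficient) encoder presets raise the per-frame cost and directly reduce how many channels fit on the machine.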

As for the delay inside the encoder, it is critical for video conferencing but not at all critical for IPTV: even a 5-second delay in TV broadcasting does not change the quality of service.

There is a clear relation between the bitrate and the picture quality: the more information about the picture is transmitted, the better it will be displayed. The bitrate may be reduced without losing quality by selecting more efficient compression tools, which in turn require a greater delay and more CPU cycles.

Understanding this complex relationship helps when evaluating claims that "our encoder is the best encoder in the world." A comparison should cover at least these 4 parameters, but in the end it all boils down to the one-time and monthly price of transcoding a single channel at the desired quality and output bitrate.

VLC and ffmpeg

The most frequently used tools for transcoding satellite video into a usable OTT stream are VLC and ffmpeg.

Both applications use the same libx264 library for compressing H264 and the same libavcodec library for decoding H264 to raw video, but they use different code for unpacking MPEG-TS and different code for decoding MPEG-2 video.

In practice they differ in that VLC can act as an HTTP MPEG-TS server, while ffmpeg can only publish video via HTTP MPEG-TS or RTMP to a server.
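As a sketch of the publishing variant, an ffmpeg invocation of the kind described above might be assembled as follows. The addresses and bitrate are placeholders; the flags used are standard ffmpeg options, but should be checked against your ffmpeg version.

```python
# Build the argument list for a hypothetical ffmpeg run that reads
# MPEG-TS over UDP, compresses with libx264, and publishes via RTMP.
# The source address, RTMP URL, bitrate, and preset are placeholders.

def ffmpeg_publish_args(src_udp: str, rtmp_url: str, vbitrate: str = "1500k"):
    return [
        "ffmpeg",
        "-i", src_udp,          # e.g. a multicast MPEG-TS source
        "-c:v", "libx264",      # encode video with libx264
        "-preset", "veryfast",  # trade compression efficiency for CPU time
        "-b:v", vbitrate,       # target video bitrate
        "-c:a", "aac",          # re-encode MP2/AC-3 audio to AAC
        "-f", "flv",            # RTMP carries FLV
        rtmp_url,
    ]

args = ffmpeg_publish_args("udp://239.255.0.1:1234", "rtmp://streamer/live/ch1")
# On a host with ffmpeg installed, the list can be passed to subprocess.run(args).
```

The same command line, run directly in a shell, is the usual starting point before moving to a managed transcoder.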

This difference influences how the video processing network is organized. When there are more than 150 channels and the total number of computers exceeds 25-30, it becomes hard, and in fact unnecessary, to remember where everything is located.

Therefore, explicitly specifying on the streaming server where to take each stream from is usually more convenient than figuring out where the video is published from.

Also, when the transcoder is organized as a server, it is easier to build a reliable system: if one source does not respond, the streamer switches to the next one.

Flussonic for transcoding

Flussonic has a separate transcoder package based on ffmpeg. The package is called flussonic-ffmpeg and comes with full source code.

Flussonic can take video from UDP/HTTP MPEG-TS and RTMP sources and encode it to multiple qualities and sizes.
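As a rough sketch, a stream entry with a transcoder directive in the Flussonic configuration file might look like the fragment below. The option names and syntax here are an approximation and should be verified against the Flussonic documentation for your version; the addresses and bitrates are placeholders.

```
# Hypothetical fragment of /etc/flussonic/flussonic.conf
# (verify option names against the Flussonic docs):
stream example_channel {
  url udp://239.255.0.1:1234;
  transcoder vb=2048k size=1280x720 ab=128k;
}
```

The point is that the whole transcoding setup lives in one declarative file rather than in a collection of per-channel scripts.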

This feature is useful when the video needs to be played not only on set-top boxes but also on tablets, where the choice of supported codecs is much narrower than on set-top boxes.

It should be noted that even the H264 from a satellite has to be transcoded to play on an iPhone: to manage variable bitrate, satellite broadcasters usually use the intra-refresh coding mode, which produces video that iPhones cannot play back.

Flussonic is more convenient than VLC or other ways of organizing transcoding, since it is controlled by a single configuration file and monitors the status of transcoding automatically, whereas VLC requires writing many monitoring scripts to track the transcoding status.

Another important transcoding feature of Flussonic is automatic rebalancing of streams if one of the servers goes down. If one of 20 transcoders fails at night, the remaining transcoders can be configured to automatically pick up its streams for transcoding, and the streamer will itself fetch the streams from the backup transcoders.