Using WebRTC protocol

WebRTC (Web Real-Time Communication) is a P2P (peer-to-peer) protocol for secure, bidirectional, real-time communication between two clients.

Let's break it down. P2P (peer-to-peer) means that two agents (clients) interact directly with each other, without a third party in between. Bidirectional means that data flows in both directions. Secure means that the connection is encrypted by one or more security protocols. Real-time communication means nearly instant information exchange, with no delay or a negligibly low one.

The WebRTC protocol is used for data exchange between two browsers (no plugins or extensions needed) or between applications that support the protocol, via a "one-to-one" or "client-client" type of communication. For example, two browsers visiting the same website can interact with each other over WebRTC. That is why it is a good fit for voice and video calling (for example, Google Meet), content sharing and more.

WebRTC has its advantages:

  • ultra-low latency (less than a second),
  • obligatory encryption with DTLS and SRTP protocols,
  • open-source standard,
  • no need to install any additional plugins or applications, etc.

For the clients, or agents, to start communicating with each other using WebRTC, four sequential steps are taken:

1. Signaling

WebRTC uses SDP (Session Description Protocol) to negotiate media parameters. One WebRTC agent makes an "offer" to start a call, and the other agent replies with an "answer" if it agrees to accept the "offer". This is called the "SDP offer/answer" model. This is also the moment when the responder declines any codecs it does not support.

The SDP "messages" contain details such as the IPs and ports that the agent is reachable on (ICE candidates), the number of video and audio tracks it is willing to send, the audio and video codecs each agent supports, etc.
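As a rough illustration of the offer/answer exchange described above, the Python sketch below pulls codec names out of the `a=rtpmap` lines of an SDP offer and keeps only those the responder supports. The trimmed SDP fragment, the `SUPPORTED` set and the function names are hypothetical; a real answer is a complete SDP document produced by the WebRTC stack, not assembled by hand.

```python
import re

# Hypothetical, heavily trimmed SDP offer: one audio and one video section.
# PCMA is the A-law variant of G.711.
OFFER_SDP = "\r\n".join([
    "v=0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111 8",
    "a=rtpmap:111 opus/48000/2",
    "a=rtpmap:8 PCMA/8000",
    "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
    "a=rtpmap:96 VP8/90000",
    "a=rtpmap:102 H264/90000",
])

# a=rtpmap:<payload type> <codec name>/<clock rate>[/<channels>]
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/]+)/(\d+)")


def offered_codecs(sdp):
    """Return {payload_type: codec_name} for every a=rtpmap line in the SDP."""
    codecs = {}
    for line in sdp.splitlines():
        match = RTPMAP.match(line)
        if match:
            codecs[int(match.group(1))] = match.group(2)
    return codecs


def accepted_codecs(sdp, supported):
    """Keep only the offered codecs that the responder supports (case-insensitive)."""
    wanted = {name.lower() for name in supported}
    return {pt: name for pt, name in offered_codecs(sdp).items()
            if name.lower() in wanted}


# Assumed capabilities of the answering agent.
SUPPORTED = {"opus", "H264"}
answer = accepted_codecs(OFFER_SDP, SUPPORTED)
print(answer)
```

In this sketch the responder silently drops PCMA and VP8, which corresponds to declining unsupported codecs in the answer.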

Now the agents are ready to attempt to connect.

2. Connecting

The ICE (Interactive Connectivity Establishment) protocol is used to find the best way to pair the agents using candidates. ICE candidates are simply combinations of IP address, port and transport protocol on which an agent can be reached. ICE tests pairs of candidates from both agents and selects the best pair that works, which makes it possible to establish a connection despite NAT (Network Address Translation) and firewalls. A STUN (Session Traversal Utilities for NAT) server lets an agent discover its public address, and a TURN (Traversal Using Relays around NAT) server relays the traffic when a direct connection cannot be established.
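To make the notion of a candidate more concrete, here is a minimal Python sketch that parses candidate lines and ranks them with the priority formula recommended in RFC 8445. The sample addresses, the `Candidate` class and the helper names are made up for illustration; real ICE agents also form and test candidate pairs, which this sketch does not attempt.

```python
from dataclasses import dataclass

# Type preferences recommended by RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


@dataclass
class Candidate:
    foundation: str
    component: int
    transport: str
    priority: int
    ip: str
    port: int
    kind: str  # host, srflx, prflx or relay


def parse_candidate(line):
    """Parse 'candidate:<foundation> <component> <transport> <priority> <ip> <port> typ <type> ...'."""
    fields = line.split()
    if len(fields) < 8 or not fields[0].startswith("candidate:") or fields[6] != "typ":
        raise ValueError("not an ICE candidate line: " + line)
    return Candidate(
        foundation=fields[0].split(":", 1)[1],
        component=int(fields[1]),
        transport=fields[2].lower(),
        priority=int(fields[3]),
        ip=fields[4],
        port=int(fields[5]),
        kind=fields[7],
    )


def compute_priority(kind, component, local_preference=65535):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[kind] << 24) + (local_preference << 8) + (256 - component)


# Hypothetical candidates: relayed (TURN), local (host) and server-reflexive (STUN).
LINES = [
    "candidate:3 1 udp 16777215 198.51.100.20 3478 typ relay raddr 203.0.113.7 rport 50000",
    "candidate:1 1 udp 2130706431 192.168.1.10 50000 typ host",
    "candidate:2 1 udp 1694498815 203.0.113.7 50000 typ srflx raddr 192.168.1.10 rport 50000",
]

candidates = sorted((parse_candidate(line) for line in LINES),
                    key=lambda c: c.priority, reverse=True)
for c in candidates:
    print(c.kind, c.ip, c.port, c.priority)
```

Sorting by priority puts the direct host candidate first and the TURN relay last, so a relay is only preferred when nothing more direct works.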

3. Securing

WebRTC uses DTLS and SRTP security protocols to encrypt the messages and secure the connection. Every connection is secured and authenticated.
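One part of the authentication can be sketched in a few lines: each agent announces a hash of its DTLS certificate in an `a=fingerprint` line of its SDP, and the peer compares it with the certificate actually presented in the DTLS handshake. The Python sketch below shows only that comparison. The certificate bytes are a placeholder rather than a real certificate, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_fingerprint(cert_der):
    """Format the SHA-256 digest of the certificate bytes as colon-separated uppercase hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def certificate_matches(fingerprint_line, cert_der):
    """Check an 'a=fingerprint:sha-256 <hex>' SDP line against the presented certificate."""
    prefix = "a=fingerprint:sha-256 "
    if not fingerprint_line.startswith(prefix):
        raise ValueError("only sha-256 fingerprints are handled in this sketch")
    announced = fingerprint_line[len(prefix):].strip().upper()
    return hmac.compare_digest(announced, sha256_fingerprint(cert_der))


# Placeholder bytes standing in for a DER-encoded certificate.
cert = b"placeholder certificate bytes"
sdp_line = "a=fingerprint:sha-256 " + sha256_fingerprint(cert)

print(certificate_matches(sdp_line, cert))
print(certificate_matches(sdp_line, b"some other certificate"))
```

If the fingerprints do not match, the agent must not trust the connection, because the party on the other end of the DTLS handshake is not the one that sent the SDP.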

Now that the coast is clear and we have two agents with a secure bidirectional connection, we can start communicating.

4. Communicating

Here the data exchange begins. Both WebRTC agents start exchanging data and media. WebRTC-compliant browsers must support at least two audio codecs (Opus and G.711) and two video codecs (VP8 and H.264); many implementations support additional ones.

But what if we want to configure a data exchange between 3 or more browsers or applications that support WebRTC? For example, start a video/audio conference?

A so-called signaling server is used for this purpose. A signaling server is a mediator that manages the connections between devices. Thus the "client-client" type of connection becomes "client-server-client". As a result, "one-to-one" becomes "one-to-many", offering more functionality to the user.
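A quick back-of-the-envelope calculation shows why a server in the middle helps once there are more than two participants. The Python sketch below assumes that every participant sends media to every other participant; the function names are made up for illustration.

```python
def mesh_connections(clients):
    """Direct client-client links needed when everyone connects to everyone."""
    return clients * (clients - 1) // 2


def mesh_uploads_per_client(clients):
    """Copies of its own stream each client uploads in a full mesh."""
    return clients - 1


def server_connections(clients):
    """Links needed when every client connects only to the server."""
    return clients


for n in (2, 3, 5, 10):
    print(n, mesh_connections(n), mesh_uploads_per_client(n), server_connections(n))
```

With 10 participants a full mesh needs 45 links and 9 uploads per client, while the "client-server-client" layout needs 10 links and a single upload per client.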

About publication through WebRTC to Flussonic

Flussonic Media Server uses WebRTC for publishing a media stream from a client device or app (the source) to Flussonic (the recipient). Then Flussonic becomes the source in order to play the stream on another client (the recipient). In both cases, Flussonic also acts as the signaling server to exchange the data about the connection.

Why do we use WebRTC to send media data between clients? Because with the WebRTC mechanism we can provide ultra-low latency.

Therefore, the exchange of video via Flussonic cannot be called peer-to-peer; rather, we call it video publication to Flussonic Media Server through WebRTC and video playback through WebRTC.

The diagram shows the process of initiating the connection between Flussonic and a client device, for publication:

WebRTC

The connection to Flussonic Media Server for publishing a media stream through WebRTC is established in a similar way as for video playback.

The principle here stays the same – parties should exchange SDPs via the mediator (signaling server - Flussonic), and then start the direct data transfer. In the case of video publishing, it's the client that initiates the process and sends an SDP offer.

The connection is established via WebSocket, and then the video is transferred via RTP.
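For readers unfamiliar with RTP, the following Python sketch decodes the fixed 12-byte header that starts every RTP packet. The sample packet is synthetic and the payload type 96 is just a typical dynamic value, not something specific to Flussonic. In WebRTC the packets travel as SRTP, which encrypts the payload and leaves this header readable.

```python
import struct


def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header (CSRC list and extensions are ignored)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet is at least 12 bytes long")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # always 2 for RTP
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),    # for video, usually the last packet of a frame
        "payload_type": second & 0x7F,    # mapped to a codec by a=rtpmap in the SDP
        "sequence_number": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,                     # identifies the media source
    }


# Synthetic packet: version 2, marker set, payload type 96, followed by dummy payload bytes.
sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1000, 90000, 0x1234ABCD) + b"payload"
header = parse_rtp_header(sample)
print(header)
```

The sequence number lets the receiver detect loss and reordering, and the timestamp drives playback timing, which is what makes low-latency delivery over UDP workable.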

For details, see WebRTC Publishing.

About playback via WebRTC from Flussonic

Flussonic Media Server uses WebRTC to play back a media stream from Flussonic (the source) on a client device or app (the recipient). Flussonic also acts as the signaling server during connection establishment to exchange data about the connection.

Why do we use WebRTC to send media data between clients? Because with the WebRTC mechanism we can provide ultra-low latency.

Therefore, the exchange of video via Flussonic cannot be called peer-to-peer; rather, we call it video publication to Flussonic Media Server via WebRTC and video playback via WebRTC.

The diagram shows the process of initiating the connection between Flussonic and a client device, for playback:

WebRTC Playback

Parties should exchange SDPs via the mediator (signaling server - Flussonic), and then start the direct data transfer. In the case of video playback, it's the Flussonic server (video source) that initiates the process and sends an SDP offer.

The connection is established via WebSocket, and then the video is transferred via RTP.
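The difference between publishing and playback comes down to which side sends the SDP offer. The Python sketch below simulates the order of signaling messages for both modes. The JSON shape (`type` plus `sdp`) mirrors how browsers serialize a session description and is an assumption here, not Flussonic's documented wire format; the function name and the SDP placeholders are likewise hypothetical.

```python
import json


def signaling_exchange(mode):
    """Return the ordered (sender, message) pairs exchanged over the signaling channel."""
    if mode == "publish":
        offerer, answerer = "client", "flussonic"   # the client initiates publishing
    elif mode == "play":
        offerer, answerer = "flussonic", "client"   # the server initiates playback
    else:
        raise ValueError("mode must be 'publish' or 'play'")
    offer = json.dumps({"type": "offer", "sdp": "<SDP from %s>" % offerer})
    answer = json.dumps({"type": "answer", "sdp": "<SDP from %s>" % answerer})
    return [(offerer, offer), (answerer, answer)]


for mode in ("publish", "play"):
    for sender, message in signaling_exchange(mode):
        print(mode, sender, json.loads(message)["type"])
```

In both modes the media itself starts flowing over RTP only after the answer has been received.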

For details, see WebRTC Playback.