Web Real-Time Communication

WebRTC Journal

Blog Post

WebRTC Interoperability - Are We There Yet?

WebRTC adoption has increased in the past few years and is expected to continue growing in the near future. Gartner expects that by 2019, WebRTC will be used for 15% of enterprise voice and video communication [1]. By the end of 2015 more than 850 vendors and projects were using it [2], more than 100% growth in two years, which is a good indication that the technology is booming. We can already see a wave of creative uses of WebRTC for communications and collaboration through websites, sales apps, contact centers, customer care and business applications, to name a few.

We believe that one of the reasons for this trend is that real time video enables a live experience in multiple different industries at a fraction of the cost of running branches, schools, clinics, and so on. It recaptures the human element that was lost through the "optimization" of customer support. Trying to cut costs by making everything online and asynchronous just made us (the consumers) unhappy. Now live customer experiences can be optimized to save the company money and to keep the customer engaged and happy.

We have been implementing WebRTC-related projects over the past 4 years for organizations of all sizes, from startup companies to large carriers. Among the challenges and lessons learned, one stands out: interoperability.

The Interoperability Challenge

To give you some perspective, our very first project, more than 4 years ago, consisted of developing a gateway to allow WebRTC to interoperate with existing communications infrastructure. Exactly the same scope came up for a more recent project, which leads us to believe the recurring need for interoperability is not a coincidence.

And it leads us to ask, why is interoperability such a big deal?

Of course, for those implementing new apps with no integration to legacy communications infrastructure, there is no need to worry. However, if you are planning to interop your WebRTC-enabled app with an existing communications solution or network, you should be aware that some complications may arise.

By implementing WebRTC-related projects for Telecom, Call Center and Enterprise Collaboration scenarios (and even enabling remote video calls to prison inmates), we identified some patterns, mainly involving five aspects of interoperability:

  • Signaling
  • Call control
  • Transcoding
  • Identity management
  • Security


Signaling

Signaling is the process of coordinating communication, including controlling the session [3], error messages, codec settings, security information and network data, such as IP addresses and ports.

WebRTC does not specify signaling, in order to maximize compatibility with existing technologies. As a result, a number of different protocols have been used to implement signaling for WebRTC, while legacy systems typically use SIP (Session Initiation Protocol) or H.323.

However, even when both the WebRTC endpoint and the legacy system use the same signaling protocol, such as SIP, there may be caveats. One of them is that SIP "is an application layer protocol designed to be independent of the underlying transport layer" [4]. In other words, it relies on underlying protocols that may be different at each endpoint. It is common for us to see SIP over WebSocket on the WebRTC side and SIP over TCP/UDP on the legacy side, which creates the need for conversion.

The solution for this and other interoperability challenges is the use of gateways and proxies that are able to translate protocols efficiently. We mention some of these tools below.
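To make the transport conversion concrete, here is a minimal sketch of one thing a SIP proxy does when it forwards a request received over WebSocket onto a UDP leg: it inserts its own Via header, naming the outgoing transport, above the existing ones. The function name, host names and branch values are hypothetical, and a real proxy (for example one of the tools listed later) handles far more than this.

```python
def add_via(request: str, transport: str, sent_by: str, branch: str) -> str:
    """Return the SIP request with a new Via header placed above existing ones.

    Illustrative only: real proxies also handle Record-Route, Max-Forwards,
    loop detection and more.
    """
    head, sep, body = request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    new_via = f"Via: SIP/2.0/{transport} {sent_by};branch={branch}"
    # Line 0 is the request line; look for the first Via header after it
    # ("v" is the compact form of the Via header name).
    for i, line in enumerate(lines[1:], start=1):
        if line.lower().startswith(("via:", "v:")):
            lines.insert(i, new_via)
            break
    else:
        # No Via header found: place ours directly after the request line.
        lines.insert(1, new_via)
    return "\r\n".join(lines) + sep + body


# Hypothetical request as it might arrive from a browser over secure WebSocket.
incoming = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/WSS client.invalid;branch=z9hG4bKabc\r\n"
    "To: <sip:bob@example.com>\r\n"
    "\r\n"
)
forwarded = add_via(incoming, "UDP", "proxy.example.com:5060", "z9hG4bKxyz")
print(forwarded)
```

The SIP message itself is unchanged; only the hop-by-hop transport recorded in the Via headers differs between the two legs.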

Call Control

Legacy communication platforms usually support multiple call control features such as hold, park and transfer. How can these features be made available on the WebRTC side? Solving this requires development that enables efficient management of voice and video, so that when the conversation is put on hold, for example, video content is not sent, which saves bandwidth. That involves renegotiation between the peers, and therefore custom coding.
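As an illustration of what that renegotiation can look like, the sketch below rewrites the direction attributes of an SDP offer to signal hold, following the convention described in RFC 3264 (sendrecv becomes sendonly, recvonly becomes inactive). The function name and the sample SDP are hypothetical, and the sketch assumes every media section carries an explicit direction attribute.

```python
# Direction changes used to place a stream on hold (RFC 3264, section 8.4).
HOLD_DIRECTION = {
    "a=sendrecv": "a=sendonly",
    "a=recvonly": "a=inactive",
}


def hold_offer(sdp: str) -> str:
    """Return a copy of the SDP with each direction attribute switched to hold.

    Assumption: directions are explicit. A media section without a direction
    attribute defaults to sendrecv and is NOT handled by this sketch.
    """
    lines = sdp.split("\r\n")
    return "\r\n".join(HOLD_DIRECTION.get(line, line) for line in lines)


# Hypothetical, heavily shortened SDP with one audio and one video section.
active_sdp = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=sendrecv\r\n"
    "m=video 51372 RTP/AVP 96\r\n"
    "a=sendrecv\r\n"
)
print(hold_offer(active_sdp))
```

The held party answers with the complementary direction and stops sending media, which is where the bandwidth saving comes from.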


Transcoding

WebRTC uses modern video codecs, such as VP8 and VP9, whereas legacy networks use several different, usually older codecs, which may require conversion. A common example of a video codec requiring conversion is H.264, although WebRTC clients are likely to support it in the future, given that it is now a specification requirement. Legacy networks also commonly use audio codecs such as AMR, AMR-WB and G.722.1, which WebRTC does not currently support and which may therefore require transcoding. There are discussions on whether some of them should be included in the WebRTC specification in the future, as a recent memo from the IETF (Internet Engineering Task Force) suggests [5].

Transcoding is also needed when the endpoints use variants of the same codec. For example, Microsoft created H.264UC, which does not interoperate with standard H.264 AVC.

Transcoding video is undesirable because it is CPU-intensive, but it is sometimes unavoidable if the objective is to achieve interoperability. One of our clients, for example, uses an array of 60+ core computers only for that task.
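Whether transcoding can be avoided comes down to whether both sides share at least one codec. The following sketch shows that check in its simplest form; the function name and the codec lists are illustrative assumptions, and real negotiation also compares profiles, clock rates and format parameters.

```python
from typing import Optional, Sequence


def pick_common_codec(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first offered codec the other side supports, or None.

    None means no codec is shared, so a transcoder is needed between the peers.
    Matching is by name only, which is a simplification.
    """
    supported_names = {name.lower() for name in supported}
    for codec in offered:
        if codec.lower() in supported_names:
            return codec
    return None


# Hypothetical codec lists for a browser and a legacy video endpoint.
browser_codecs = ["VP8", "VP9"]
legacy_codecs = ["H264", "H263"]

shared = pick_common_codec(browser_codecs, legacy_codecs)
print("transcoding needed" if shared is None else f"use {shared} end to end")
```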

Conferencing may also introduce the need for transcoding in WebRTC, independent of interoperability. In many-to-many communication, a centralized media server may be needed to mix the video from different endpoints and to decide dynamically on the most efficient broadcast model, for example which participants to display to the others. Although there are many discrete aspects to transcoding, the point here is that all of these challenges can be solved so that WebRTC interoperates with existing infrastructure.

Identity Management

Legacy communication platforms usually control subscriber identity, if not for billing, at least for security reasons. There are use cases where WebRTC clients need to register with these legacy systems, which introduces a need for unified identification of users that must be handled by the applications.


Security

When it comes to security, WebRTC media is always encrypted, whereas legacy communication infrastructure usually is not. Sometimes the only feasible way for endpoints on both sides to interop is to encrypt or decrypt media at the boundary. For example, a WebRTC gateway may convert media from RTP (Real-time Transport Protocol), a network protocol for delivering audio and video over IP networks, to its secure counterpart, SRTP or DTLS-SRTP.
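One way a gateway can tell that such a conversion is needed is by comparing the transport profiles on the SDP media lines of each side. The sketch below classifies media lines as encrypted or plain using the standard profile names; the function names and sample SDP fragments are hypothetical.

```python
from typing import List, Tuple

# Standard SDP transport profiles: SRTP variants versus plain RTP variants.
SECURE_PROFILES = {"RTP/SAVP", "RTP/SAVPF", "UDP/TLS/RTP/SAVP", "UDP/TLS/RTP/SAVPF"}


def classify_media(sdp: str) -> List[Tuple[str, str, bool]]:
    """Return (media type, profile, is_encrypted) for each m= line in the SDP."""
    result = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # Format: m=<media> <port> <proto> <fmt> ...
            media, _port, proto = line[2:].split()[:3]
            result.append((media, proto, proto in SECURE_PROFILES))
    return result


def needs_media_bridge(sdp_a: str, sdp_b: str) -> bool:
    """True when the two sides do not agree on whether media is encrypted."""
    enc_a = {enc for _, _, enc in classify_media(sdp_a)}
    enc_b = {enc for _, _, enc in classify_media(sdp_b)}
    return enc_a != enc_b


# Hypothetical fragments: a browser offering DTLS-SRTP and a legacy plain-RTP endpoint.
webrtc_sdp = "v=0\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111\r\nm=video 9 UDP/TLS/RTP/SAVPF 96\r\n"
legacy_sdp = "v=0\r\nm=audio 49170 RTP/AVP 0 8\r\n"

print(classify_media(webrtc_sdp))
print(needs_media_bridge(webrtc_sdp, legacy_sdp))
```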

Other P2P Challenges

Legacy networks sit behind firewalls and other security mechanisms. To provide peer-to-peer communication through these barriers as far as possible, a technique known as ICE (Interactive Connectivity Establishment) is usually used to find the best path to connect peers. Although it is not a major challenge in WebRTC interop initiatives, it is important to consider because it leverages two other technologies that may be components of an interop architecture: STUN and TURN.

The basic steps in this mechanism are:

  • ICE tries to obtain a direct connection between peers.
  • If that fails (for example when peers are behind NATs [6]), ICE obtains an external address using a STUN (Session Traversal Utilities for NAT) server.
  • If that also fails, ICE falls back to a TURN (Traversal Using Relays around NAT) server, an intermediary server which improves call success rate, but also increases bandwidth consumption.
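The preference order in these steps (direct, then STUN-derived, then TURN relay) is reflected in how ICE ranks candidates. As a hedged illustration, the sketch below computes candidate priorities with the formula and recommended type preferences from RFC 8445; the candidate addresses are made up, and a full ICE agent pairs and tests candidates rather than simply sorting them.

```python
# Recommended type preferences from RFC 8445: direct (host) candidates rank
# highest, STUN-derived (server reflexive) next, TURN relays last.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_preference: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_preference << 8) + (256 - component_id)


# Hypothetical candidates gathered for one media stream: (type, address).
candidates = [
    ("relay", "203.0.113.7:50000"),   # allocated on a TURN server
    ("host", "192.168.1.20:54321"),   # local interface address
    ("srflx", "198.51.100.4:61000"),  # public address learned via STUN
]

ranked = sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)
for cand_type, address in ranked:
    print(cand_type, address, candidate_priority(cand_type))
```

Sorting by priority puts the direct path first and the relay last, matching the fallback order above.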

Legacy network endpoints usually do not take advantage of ICE, so gateways that leverage this technology may be used to allow for external media to interop with these endpoints deep into legacy infrastructure.


Luckily enough, many open source tools are available to help overcome interoperability challenges, and most are a good fit for WebRTC and legacy infrastructures. Here are a few we have experience using, along with a short description from their websites:

  • Kamailio: a high performance open source SIP server, able to handle thousands of call setups per second
  • JsSIP: a JavaScript library that provides a fully featured SIP endpoint in any website
  • FreeSWITCH: an open-source telephony platform designed to facilitate the creation of voice and chat driven products
  • reSIProcate: a set of components including a SIP stack implementation and a few related protocols
  • webrtc2sip: a gateway that allows a web browser to make and receive calls from/to any SIP-legacy network or PSTN
  • libnice: an implementation of the Interactive Connectivity Establishment (ICE) standard and the Session Traversal Utilities for NAT (STUN) standard. It automates the process of traversing NATs and provides security against some attacks. It also allows applications to create reliable streams using a TCP over UDP layer.
  • PJSIP: an open source SIP, media, and NAT traversal library implementing standard based protocols such as SIP, SDP, RTP, STUN, TURN, and ICE

Our experience shows that success requires choosing the right pieces for each scenario, and good engineering practices to build a scalable, reliable architecture out of the chosen options. Although the tools themselves are great, they will be part of a multi-component architecture, which will need to integrate with large, mission-critical communication infrastructures with challenging availability requirements. So, yes, it can be complicated, but solving interoperability challenges is very doable.

On the Horizon for WebRTC

Several emerging technologies, protocols and vendor solutions may change the WebRTC interoperability scenario in the short term and are important to watch. Among them we highlight:

  • Next generation of audio and video codecs
      • VP9 and H.265 for video
      • iSAC, iLBC for audio
  • New Enterprise SBC (Session Border Controller) [7] features
      • E-SBCs are devices used between Enterprise networks and Session Initiation Protocol (SIP) trunking providers, as well as between different enterprise unified communications (UC) platforms, and between UC endpoints and the associated UC platforms
      • Some E-SBC vendors are starting to include gateways to enable WebRTC endpoints to connect to non-WebRTC devices, such as to a phone connected through the PSTN, which may facilitate interoperability
  • Microsoft has launched a variant of WebRTC called ORTC (Object Real-Time Communications), which is currently not interoperable with WebRTC out of the box


WebRTC is gaining momentum in multiple industries, and some use cases still require interoperability with legacy systems. Challenges for WebRTC interoperability primarily revolve around signaling, call control, transcoding, identity management and security. There are a number of open source tools that help address these challenges. However, as these components are added to your infrastructure, new concerns will arise, such as how to scale, monitor and manage all of them, which may involve a learning curve and careful engineering. Additionally, as new codecs and protocols emerge, we believe interoperability will continue to be a challenge. It is likely to become simpler in the future as mainstream adoption grows and as infrastructure equipment vendors and web browsers increase native support for the same emerging protocols.

WebRTC Interoperability Case Studies

Now we would like to share some of our experience with the most common interoperability challenges, as well as our approach in cases for the Telecom, Call Center and Enterprise Collaboration industries.

Case Study 1: WebRTC Interop with Telecom IMS for RCS and VoLTE

Before we get into the case study itself, let's make sure we don't get lost in the acronyms:

  • IMS stands for IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS), which is an architectural framework for delivering IP multimedia services in the telecom industry
  • RCS stands for Rich Communication Services, which allows for inter-operator communication services based on IP Multimedia Subsystem (IMS)
  • VoLTE, or Voice over LTE, is a standard for delivering voice calls over LTE, the high-speed wireless communication standard for mobile phones and data terminals. It is based on the IP Multimedia Subsystem (IMS) network.

Project Requirements

The basic requirements in this case were:

  • A WebRTC client user should be able to establish a point-to-point audio/video call with a mobile user using a mobile RCS client
  • And vice-versa
  • A WebRTC client user should be able to register as a subscriber in the IMS Network.

Solving Signaling and Identity Management

We start with Signaling and Identity Management. SIP was the signaling protocol throughout the infrastructure, all the way into our client's IMS network. However, part of the SIP path was implemented over WebSocket, and part over other transports, which could be TCP, UDP or TLS.

A SIP proxy was put in place in order to do the conversion. Because it is a new component in the infrastructure, one should always be concerned with aspects such as scalability, failover, and so on, which adds to the solution complexity.

A particular challenge in this case was that the SIP headers implemented in the IMS network used optional, undocumented parameters that needed to be discovered. SIP is a flexible protocol, which is good, but a lack of documentation on the specific design decisions behind an implementation may prevent other teams from easily interoperating with it in the future.
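A simple aid for this kind of discovery is to scan captured SIP headers for parameters that are not on a list of known ones. The sketch below shows the idea; the parser is deliberately naive (it ignores quoted strings containing semicolons), and the header value, the known-parameter set and the `x-vendor-flag` parameter are all invented for illustration, not taken from the project described here.

```python
from typing import Dict, List, Set


def header_params(value: str) -> Dict[str, str]:
    """Extract ;name=value parameters from a SIP header value.

    Parameters inside a <sip:...> URI are skipped by only looking at the
    text after the closing angle bracket, when there is one.
    """
    tail = value.rsplit(">", 1)[-1] if ">" in value else value
    params = {}
    for part in tail.split(";")[1:]:
        name, _, val = part.strip().partition("=")
        if name:
            params[name.lower()] = val.strip('"')
    return params


def unknown_params(value: str, known: Set[str]) -> List[str]:
    """Return the parameter names that are not in the known set, sorted."""
    return sorted(set(header_params(value)) - known)


# Hypothetical Contact header value captured from a network trace.
contact = "<sip:alice@example.com;transport=ws>;expires=600;x-vendor-flag=1"
known_contact_params = {"expires", "q"}

print(unknown_params(contact, known_contact_params))
```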

Addressing Security Aspects

In terms of security, as explained in the previous section, legacy networks usually still need unencrypted media. Because WebRTC is encrypted, a gateway is needed to decrypt the media before it can interop with the legacy side. In this case we implemented a WebRTC gateway which communicates with the video transcoder and with our client's legacy network.

Incorporating an Administration Layer to Monitor

As we architected this project, we designed an administrative layer to help manage the other components, such as the Web Portal, proxies, gateways and transcoders mentioned above. The concern was that we were introducing multiple new points of failure that could affect scalability and put the high availability of the service at risk. By adding the administration layer, we could ensure quality of service through continuous monitoring and administration. We also enabled the solution to persist session and management data in high availability databases, in order to provide fault tolerance.

Building a Developer Portal

One special requirement in this case was a Developer's Portal that would enable internal developers to continue to leverage the new services in different ways. For those extra components, the Web Portal and Administrative/Monitoring functions, we used standard REST APIs over HTTP or HTTPS. The simplified final logical architecture is depicted below:

Figure 1 - WebRTC interoperability with an IMS Telecom infrastructure


Case Study 2: WebRTC Interop with Contact Center infrastructure

In this second case study, a WebRTC-ready PaaS Communication Service Provider wanted to interop with a typical Call Center Communications infrastructure.

Multiple Interop Challenges

In this case, we experienced multiple challenges related to signaling protocols, call feature access and security. Each one drove specific requirements that combined into a single solution architecture.

  • The first interop challenge was their proprietary signaling protocol, which needed to be converted to SIP by a gateway.
  • The second was making features from a Call Manager available to WebRTC clients. A Call Manager is an application for governing the routing of inbound telephone calls through a network, including call distribution and management of call features such as call queues, IVR menus, recorded announcements and more.

The omnipresent security aspect was also a challenge in this case, because Call Center Unified Communications systems might not allow for media encryption, so RTP/SRTP conversion was a requirement.

Architecting the Total Solution

The final architecture involved the connection with a PSTN/PLMN network which is part of the Call Center default infrastructure, and a Web Server for administration. The simplified final architecture for case 2 is depicted below:

Figure 2 - WebRTC interoperability with a Legacy Contact Center infrastructure

Case Study 3: WebRTC Interop with Legacy Enterprise Collaboration Tools

The requirement for our third case was to allow a Cloud Video Conferencing solution to interop with WebRTC, Skype, Polycom and Cisco clients, among other H.323 clients. H.323 is a protocol that provides audio-visual communication sessions on any packet network.

Again, SIP over WebSocket was translated to other underlying protocols. In this case we also developed the other gateways, such as those for Skype, Skype for Business and H.323. We also implemented video transcoding between the VP8 and H.264 codecs.

In terms of security, we encountered the RTP/SRTP conversion requirement. In addition, we had NAT/firewall issues to deal with, so ICE was used for the connection establishment.

Architecting this Enterprise Collaboration Solution

As with each case study, all other components in the architecture were connected, prepared for scalability and included an administration interface for monitoring. The simplified final architecture for case 3 is depicted below:

Figure 3 - WebRTC interoperability with a Legacy Enterprise Collaboration infrastructure


WebRTC is gaining momentum in multiple industries, and some use cases still require interoperability with legacy systems. Challenges for WebRTC interoperability usually involve signaling, call control, transcoding, identity management and security.

In this article, we illustrated such challenges by exploring their existence in selected WebRTC projects from our customers in the Telecom, Call Center and Enterprise Collaboration industries.

We used a number of open source tools that help address these challenges. However, as these components are added to the infrastructure, new concerns arise, such as how to scale, monitor and manage all of them, which may involve a learning curve and careful engineering.

As new codecs and protocols emerge, we believe interoperability will continue to be a challenge for a while, although we expect it to become simpler in the future.

More Stories By Graham Holt

Mr. Holt, CTO of Daitan Group, brings two decades of product and technical expertise in information technologies including big data, analytics, cloud computing, social enterprise collaboration, unified communications, IT service management (ITSM) and data center infrastructure management (DCIM). Daitan Group specializes in helping customers use Agile Application Lifecycle Management with a DevOps culture to digitally transform. His Engineering and Computer Science degrees are from Leeds University.