Marquette University BIEN 167 Module 3: Telerehabilitation

Conferencing Over IP: H.323 & SIP

IP Videoconferencing: ITU's H.323 and IETF's SIP

Conferencing over packet-based networks, often called Internet Telephony, refers to real-time transport of multimedia telephone calls over the Internet. When only voice is transferred, it is often called Voice over IP (VoIP). VoIP is currently a very big deal as we gradually transition toward an Internet-based phone system. Advantages of IP-based conferencing include low cost (e.g., webcams are as cheap as $20, and Microsoft's client-side H.323-based Netmeeting and MSN Messenger packages, as well as its SIP-based Windows Messenger product and Real Time Communication software suite within the .Net Framework, are free). Another advantage of the complex, multi-faceted H.323 standard is its support for multi-point conferences and for interfaces to H.320 systems. But perhaps the biggest advantage is the integration with computer-based data and application sharing.

While IP-based conferencing is an attractive alternative to circuit-based approaches such as those using H.320 or H.324, it has two fundamental disadvantages:

  • Quality of Service: Sending information by asynchronous IP packets is like sending a letter as a collection of small postcards that take different paths and are reassembled in order at the final destination. There is no guarantee that every postcard arrives on time, especially if the Internet is overloaded, as can happen during certain times of the day. Specifically, CODECs multiplex audio, video and/or data information, process and compress it, and send it as a series of packets that travel to the destination address over varying pathways in the network. Upon reaching their destination, the packets are reassembled, decompressed and displayed. While there is some built-in redundancy to help with lost or delayed packets, the end result is that quality and reliability of service are not guaranteed.
  • Security/Confidentiality: Information sent over IP is much easier to intercept than information sent over a dedicated circuit connection, i.e., it is a less secure medium. Thus we have firewalls, encryption approaches, etc., which are not as critical an issue for telephone lines.
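
To make the postcard analogy in the Quality of Service item concrete, here is a minimal Python sketch of reassembly at the destination. It is a toy under stated assumptions, not an implementation of RTP or of any CODEC: the `Packet` class, the `reassemble` function and the single-letter payloads are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Packet:
    seq: int       # sequence number assigned by the sender (hypothetical field)
    payload: str   # stand-in for a chunk of compressed audio/video


def reassemble(arrived: Iterable[Packet], expected_count: int) -> List[Optional[str]]:
    """Put packets back in sender order; None marks a packet that never arrived."""
    slots: List[Optional[str]] = [None] * expected_count
    for pkt in arrived:
        # Ignore duplicates and sequence numbers outside the expected range.
        if 0 <= pkt.seq < expected_count and slots[pkt.seq] is None:
            slots[pkt.seq] = pkt.payload
    return slots


# Packets 0..4 were sent; packet 3 was lost and the rest arrived out of order.
arrived = [Packet(2, "c"), Packet(0, "a"), Packet(4, "e"), Packet(1, "b")]
frames = reassemble(arrived, expected_count=5)
print(frames)  # ['a', 'b', 'c', None, 'e']
```

The `None` in the result is the quality-of-service problem in miniature: the receiver must either wait for the missing packet, which adds delay, or play on without it, which degrades the audio or video.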

Currently, the most common way to implement such calls is via the H.323 protocol, and perhaps 90% of VoIP calls use this detailed ITU-T standard that comes out of the telecommunications community. A popular alternative approach, the IETF's Session Initiation Protocol (SIP), comes out of the internet software engineering community. We will discuss both.

H.323 "describes terminals and other entities that provide multimedia communications services over Packet Based Networks (PBN) which may not provide a guaranteed Quality of Service. H.323 may provide real-time audio, video and/or data communication" (from ITU-T Recommendation H.323 V4). Notice the explicit mention of a lack of guaranteed quality of service. In fact, H.323 serves as an umbrella for a collection of other "best practice" standards.

H.323 entities are:

  • a terminal endpoint on a LAN that must support signaling/control (H.245 and H.225 (includes Q.931 and RAS)), real-time 2-way transfer and communication protocols (often called RTP/RTCP), and audio codecs (including at least G.711 audio, and usually at least one other), with optional support for video (H.261, H.263) and data (T.120) standards,
  • a gateway that serves as an interface between a LAN and circuit-switched network (e.g., IP/PSTN), and can translate communication procedures and protocols between these networks,
  • a gatekeeper that manages a zone (a collection of H.323 devices), with certain required functionality (address translation, admission control, bandwidth control) and certain optional functionality related to other call control and call management features,
  • a multi-point control unit (MCU) to support 3 or more endpoints.
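
The three required gatekeeper functions listed above (address translation, admission control, bandwidth control) can be illustrated with a small Python sketch. This is a hypothetical in-memory model only: the `ToyGatekeeper` class, its method names, the aliases and the addresses are all assumptions made for illustration, and it does not model the actual H.225 RAS message exchange.

```python
from typing import Dict, Optional


class ToyGatekeeper:
    """Hypothetical stand-in for the required gatekeeper roles in one zone."""

    def __init__(self, zone_bandwidth_kbps: int) -> None:
        self.available_kbps = zone_bandwidth_kbps
        self.directory: Dict[str, str] = {}  # alias -> network address

    def register(self, alias: str, address: str) -> None:
        """Record where an endpoint in the zone can be reached."""
        self.directory[alias] = address

    def translate(self, alias: str) -> Optional[str]:
        """Address translation: map a human-friendly alias to an address."""
        return self.directory.get(alias)

    def admit(self, callee_alias: str, requested_kbps: int) -> Optional[str]:
        """Admission and bandwidth control: admit the call only if the callee
        is known and the zone still has enough bandwidth left."""
        address = self.translate(callee_alias)
        if address is None or requested_kbps > self.available_kbps:
            return None
        self.available_kbps -= requested_kbps
        return address

    def release(self, kbps: int) -> None:
        """Return bandwidth to the zone when a call ends."""
        self.available_kbps += kbps


gk = ToyGatekeeper(zone_bandwidth_kbps=512)
gk.register("clinic-room-1", "10.0.0.12")
print(gk.admit("clinic-room-1", 384))  # 10.0.0.12
print(gk.admit("clinic-room-1", 384))  # None (only 128 kbps left in the zone)
```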

An important subset of H.323 is the “voice and data” mode (i.e., video is not required for compliance, but if available must meet certain standards).

In the late 1990s, Microsoft's freely available Netmeeting package became a de facto "gold standard" for low-end videoconferencing over IP. It was one of 9 IP-based packages that Donal Lauderdale and I evaluated in 1998. In addition to support for H.323, including the T.120 standard (e.g., platform-independent support for chat, file transfer, and a shared white board), it provided support for application sharing on the Windows platform. For the most part, we found the various products we evaluated to be interoperable, but typically not without some effort. In about 1998 Microsoft also embedded Netmeeting within its MSN Messenger product. While the addition of video (or applications such as PowerPoint) to voice was nice, a key problem with Netmeeting was an audio time delay of about a quarter of a second. This was in part built into Microsoft's implementation of H.323, perhaps because of concerns about unpredictable packet delays. A summary of Microsoft's implementation of T.120 is available. The quality of video for Netmeeting and similar products depended strongly on bandwidth, and was pretty decent for connections involving two sites on a LAN. Because Netmeeting added value to H.323 via application sharing on Windows platforms, and also had a "minimal" implementation with an available SDK, many third-party companies provided added value to the H.323/Netmeeting protocol through features such as multi-point conferencing and more convenient phone-like calling options.

Importantly, current state-of-the-art H.323-based VoIP products do not have significant audio time delays, and there are many H.323 vendors. While most H.323 implementations are proprietary, there is also an open source forum (www.openh323.org) in which our group used to participate. The core of the H.323 standard can be accessed from the ITU site. A good source, one of many available on the web, is the H.323 Forum. Competition with the alternative SIP protocol has helped push improvements into the often-updated H.323 standard, which is now up to Version 4 (H.323-V4); many H.323 V4 products are on the market, with more coming.

With Windows XP, Microsoft dropped continued development of H.323-based Netmeeting in favor of the SIP-based collection of standards, discussed below. SIP is used for its Windows Messenger product (versus H.323 for its MSN Messenger product), and Windows Messenger has much shorter audio time delays and more features. Microsoft has also explicitly embedded SIP within its new .Net Framework (essentially a library), so SIP becomes one of many tools available to Windows developers for platforms ranging from mobile (e.g., PocketPC) to desktop.

SIP is "an application layer signaling protocol that defines initiation, modification and termination of interactive, multimedia communication sessions between users." (IETF RFC 2543 SIP). It consists of:

  • a user agent that initiates, receives and terminates calls (with a client/server model)
  • a proxy server that relays call signaling, a SIP redirect server, and a SIP registrar that accepts registration requests and keeps track of users' whereabouts

SIP (IETF RFC 2543) is thus a simpler control protocol that uses a textual (ASCII command) client-server model to create, maintain, modify or terminate multimedia sessions with one or more participants. It is intended to be used as a software module for managing sessions, similar to the strategy of the HTTP protocol for the web or the SMTP protocol for email. To the programmer, SIP is a toolbox that turns a telephony or multimedia session into a web application that can integrate with other Internet services. It considers user location, user capabilities, user availability, call setup and call handling. A nice summary is available from the SIPCenter, and to get a sense of the degree of commercial activity see the SIPCenter main page.
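
Because SIP is textual, a request looks much like an HTTP request: a start line, then header lines, then a blank line. The Python sketch below builds and reads back a simplified INVITE. It is illustrative only: `build_sip_request` and `parse_start_line` are hypothetical helper names, the addresses and Call-ID are made up, and the message deliberately omits headers that a real user agent would need (such as Via) as well as any session description in the body.

```python
from typing import Tuple


def build_sip_request(method: str, target_uri: str, from_uri: str,
                      to_uri: str, call_id: str, cseq: int) -> str:
    """Assemble a simplified, text-based SIP request (not a complete one)."""
    lines = [
        f"{method} {target_uri} SIP/2.0",  # start line: method, target, version
        f"From: <{from_uri}>",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",
        "Content-Length: 0",               # no message body in this sketch
    ]
    # Lines end with CRLF, and an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_start_line(message: str) -> Tuple[str, str, str]:
    """Split the first line of a request into (method, target URI, version)."""
    first_line = message.split("\r\n", 1)[0]
    method, uri, version = first_line.split(" ")
    return method, uri, version


invite = build_sip_request(
    method="INVITE",
    target_uri="sip:therapist@example.org",   # hypothetical callee
    from_uri="sip:patient@example.org",       # hypothetical caller
    to_uri="sip:therapist@example.org",
    call_id="abc123@example.org",             # hypothetical identifier
    cseq=1,
)
print(invite)
print(parse_start_line(invite))  # ('INVITE', 'sip:therapist@example.org', 'SIP/2.0')
```

Since the whole exchange is plain text, it can be logged, inspected and generated with ordinary string handling, which is a large part of why SIP feels like a web technology to programmers.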

SIP differs from H.323 at a fundamental level: in SIP, the "intelligence" is pushed out to the clients (i.e., their computers) in a more distributed architecture, as opposed to the H.323 model of an intelligent central coordinating site surrounded by "dumb" terminals.

Collectively, H.323 and SIP provide a suite of standards (see also the VoIP standards Reference page at protocols.com). The current "killer application" is not videoconferencing but VoIP, where billions of dollars rest on who and what coordinates a phone connection. Many believe that there will be a gradual shift away from the standard circuit-based PSTN toward a packet-based Internet system once consumers are convinced that quality-of-service issues can be addressed. This is already happening. Indeed, Marquette's telephone backbone is typical: we depend on products from Cisco and Siemens, and both of these companies have products for both the H.323 and SIP standards, as well as traditional digital/analog circuit-based phones. Packet-based approaches are starting to do well at large companies with their own controlled LAN environment. Siemens in particular has come to campus to try to convince Marquette to upgrade its phone network to these new hybrid, multi-standard phones.

Of note is that while most H.323 and SIP systems make use of computers, there are "phone" systems that do not require the user to own a computer (e.g., the mm146 from Motion Media for H.323 over cable/DSL, and the DV325 from 8x8 for SIP). In our own lab R&D, where we've wanted to integrate our Intelligent Telerehabilitation Assistant (ITA) with videoconferencing capabilities, including on mobile (PocketPC) devices, we originally spent some time understanding the H.323 standard, then switched to SIP.

What does this mean to healthcare? There is no question that H.323/SIP videoconferencing can be integrated into electronic health records (EHRs) more transparently than is possible for circuit-based approaches. Furthermore, these approaches can be used for both desktop and mobile computing environments, including wings of hospitals that are now wireless (e.g., Zablocki VAMC). Yet despite all the movement toward IP-based conferencing, there are barriers such as the lack of guaranteed quality of service across the Internet, security issues in this era of HIPAA, and the strong bias within the medical device industry toward proprietary systems. Also, for telehomecare, DSL and cable modems conventionally allocate bandwidth asymmetrically, with typically about 5 times more bandwidth for downstream transmission to the consumer than for upstream transmission from the consumer. This hurts interactive conferencing, since what matters is the lowest two-way allocation common to the participating nodes.
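
The effect of asymmetric bandwidth on a two-way call can be worked through with a few lines of Python. The numbers are hypothetical, chosen only to reflect the roughly 5-to-1 downstream-to-upstream ratio mentioned above, and the `Link` class and `two_way_rate_kbps` function are names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    down_kbps: int  # bandwidth toward this site
    up_kbps: int    # bandwidth away from this site


def two_way_rate_kbps(a: Link, b: Link) -> int:
    """Rate that both directions of a call can sustain at the same time."""
    a_to_b = min(a.up_kbps, b.down_kbps)  # limited by A's upload and B's download
    b_to_a = min(b.up_kbps, a.down_kbps)  # limited by B's upload and A's download
    return min(a_to_b, b_to_a)


home = Link(down_kbps=1500, up_kbps=300)        # hypothetical home line, 5:1 ratio
clinic = Link(down_kbps=10000, up_kbps=10000)   # hypothetical symmetric clinic LAN
print(two_way_rate_kbps(home, clinic))  # 300
```

Even though the home can receive at 1500 kbps, the clinician sees the patient at no more than 300 kbps, so the home's upstream allocation sets the quality of the interactive session.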

©2003-2004 Jack Winters ... BIEN 167 Home