Internet Videoconferencing

Extracted and modified from my Master's thesis, entitled "Improved Encoding and Control Algorithms for the IVS Videoconferencing Software".

This page is an introduction to Internet videoconferencing technology. It discusses the problems this recent technology faces and the solutions that have been developed for them.

(I) WHAT IS INTERNET VIDEOCONFERENCING?

Videoconferencing can be defined as "the combination of dedicated audio, video and communications networking technology for real-time interaction, used by groups of people who gather in specific settings to communicate with other groups of people" [1]. It provides two-way audio and video communications, allowing two or more people in different locations to see and talk to each other in real time. More broadly, videoconferencing can be used to support other applications, such as distributed meetings, conference broadcasts, distance learning, and telemedicine.

The Internet is a global collection of interconnected computers and computer networks extending throughout the world and running on the Internet Protocol (IP) and a variety of underlying technologies. The Internet can also be viewed as a network of networks.

Internet videoconferencing, as the combination of the two terms suggests, can thus be regarded as real-time audio and video communications between two or more people over the Internet.

(II) INTERNET VIDEOCONFERENCING TECHNOLOGY

Videoconferencing is a rather recent technology, making its first appearance in the 1980s. Before then, videoconferencing was not practical because semiconductor, computer and network technologies were still immature.

The early forms of videoconferencing were ‘room based videoconferencing’, because they were restricted to specially equipped conference rooms. Back then, videoconferencing was a very demanding application, requiring expensive and bulky hardware, extremely high transmission bandwidths (on the order of Mbps), and network performance guarantees (such as throughput and delay guarantees). Therefore, videoconferencing had to be conducted in special rooms connected by dedicated leased lines. Only large organizations could afford these videoconferencing systems.

In the early 1990s came the appearance of ‘desktop videoconferencing’. Desktop videoconferencing is conducted over desktop computers and workstations. This was made possible through recent advances in computers, networks and data compression. Desktop videoconferencing is rather inexpensive, requiring only minimal upgrades to a typical computer. Desktop videoconferencing is the type discussed in this thesis.

Within desktop videoconferencing, video data can be transmitted via circuit-switched networks or packet-switched networks. Circuit-switched networks consist of point-to-point connections between two end-systems. Circuit-switched networks provide network performance guarantees, with constant bit rates. The disadvantage of circuit-switched networks is that during a connection, the line is dedicated and cannot be used by other parties, even if no data is being transmitted. Therefore, the bandwidth of the line is wasted if not fully utilized. Examples of circuit-switched networks are ISDN and Switched 56.

Packet-switched networks are the trend in today’s networks. In packet-switched networks, data is segmented into smaller units called ‘packets’. The major advantage of packet-switched networks is their ability to multiplex data from different sources, which results in higher utilization of the network. Their disadvantage is that they do not provide network performance guarantees: the available bandwidth and the packet delay variation are functions of the load on the network. Examples of packet-switched networks are Frame Relay and the Internet. The focus of this thesis is on videoconferencing over the Internet, and in general over packet-switched networks.

The concerns in Internet videoconferencing stem primarily from two factors, namely the enormous amount of data required for video, and the harsh unpredictable environment of the Internet.

Uncompressed full-motion video requires the rapid transmission of a massive amount of data, far too much for storage and transmission over most networks. For instance, transmitting uncompressed NTSC quality video requires a bandwidth of about 166 Mbps [2]. This exceeds the bandwidth of all but the fastest networks available today, and the cost of transmission is prohibitive. Clearly, transmitting uncompressed video is impractical. Video data must be reduced significantly before it can be transmitted over most networks.
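As a rough check on the 166 Mbps figure, the raw bit rate of a video stream is simply the frame size times the bits per pixel times the frame rate. The sketch below assumes a 720 x 480 frame, 16 bits per pixel and 30 frames per second. These parameters are illustrative assumptions, not values given in [2], although they happen to reproduce a figure of about 166 Mbps.

```python
def raw_video_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Bit rate of uncompressed video, in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed parameters (not taken from the source): 720x480 pixels,
# 16 bits per pixel, 30 frames per second.
ntsc_bps = raw_video_bitrate_bps(720, 480, 16, 30)
ntsc_mbps = ntsc_bps / 1e6

print(f"{ntsc_mbps:.1f} Mbps")  # 165.9 Mbps
```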

Data compression reduces the amount of data required to represent information, allowing the information to occupy less storage space on storage devices and consume less bandwidth for transmission. Compression techniques can be divided into two categories, namely redundancy reduction techniques and entropy reduction techniques. Redundancy reduction techniques produce lossless compression by only removing redundant information. Examples of redundancy reduction techniques include Huffman encoding and run-length encoding. On the other hand, entropy reduction techniques produce lossy compression by reducing the information content. As such, entropy reduction techniques are irreversible, meaning that the original information cannot be exactly recovered. An example is quantization. Redundancy reduction techniques generally produce much lower compression ratios than entropy reduction techniques. The low compression ratios of redundancy reduction techniques make them unsuitable for use alone in video coding. Thus, entropy reduction techniques with their higher compression ratios must be used.
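The difference between the two categories can be illustrated with a minimal sketch. It uses run-length encoding as the redundancy reduction technique and uniform quantization as the entropy reduction technique. The sample values, the step size of 8 and all function names are hypothetical, and real video coders apply these ideas to transform coefficients rather than to raw pixel values.

```python
def rle_encode(data):
    """Run-length encode a sequence into (value, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Invert rle_encode exactly (lossless)."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def quantize(samples, step):
    """Uniform quantization: keep only the index of each sample's bin."""
    return [s // step for s in samples]


def dequantize(levels, step):
    """Reconstruct each sample as the midpoint of its bin (lossy)."""
    return [q * step + step // 2 for q in levels]


pixels = [12, 12, 12, 13, 13, 200, 200, 200, 200, 201]

# Redundancy reduction: the original is recovered exactly.
runs = rle_encode(pixels)
assert rle_decode(runs) == pixels

# Entropy reduction: small differences are discarded for good.
levels = quantize(pixels, 8)
approx = dequantize(levels, 8)
assert approx != pixels

print(len(runs), len(rle_encode(levels)))  # 4 2
```

In this example quantization also makes the data more repetitive, so the run-length coder needs two runs instead of four. This is one reason the two kinds of technique are usually combined.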

In order to maximize the amount of compression on video data, most video coding standards utilize some combination of both redundancy reduction and entropy reduction techniques. Since these standards include entropy reduction techniques, video compression is lossy and comes at the expense of degraded video image quality. The higher the compression ratio, the greater the image degradation. Since a higher compression ratio also means lower bandwidth consumption, there is a tradeoff between video image quality and bandwidth requirements. Furthermore, efficient compression often requires significantly longer processing (CPU) time, which results in a second tradeoff between processing time and bandwidth requirements.

Significant data reduction can also be achieved by reducing the quality of the videoconference. This includes lowering the number of video frames per second, coding video in grayscale, and reducing the size of the viewing window.
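The effect of these three measures multiplies, as the following sketch shows. The two settings being compared (a 320 x 240 colour window at 30 frames per second against a 160 x 120 grayscale window at 10 frames per second) are hypothetical values chosen only for illustration, and the figures are raw bit rates before any compression.

```python
def raw_bitrate_kbps(width, height, bits_per_pixel, fps):
    """Uncompressed bit rate in kbps (1 kbps = 1000 bits per second)."""
    return width * height * bits_per_pixel * fps / 1000


# Hypothetical settings, chosen only for illustration.
full = raw_bitrate_kbps(320, 240, 24, 30)    # 24-bit colour, 30 frames/s
reduced = raw_bitrate_kbps(160, 120, 8, 10)  # 8-bit grayscale, 10 frames/s

# A quarter of the pixels, a third of the bits per pixel and a third of
# the frame rate give a combined reduction factor of 4 * 3 * 3 = 36.
factor = full / reduced
print(full, reduced, factor)  # 55296.0 1536.0 36.0
```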

Videoconferencing requires reliable and timely delivery of data, because the quality of perceived audio and video signals is sensitive to data loss and delays. The Internet is a difficult environment for videoconferencing because it does not offer any performance guarantees. There is neither a guaranteed minimum bandwidth nor a guaranteed maximum delay. Instead, the available bandwidth and packet delay time fluctuate depending on the state of the Internet. The available bandwidth decreases with increasing load in the Internet, whereas the packet delay time increases with increasing load in the Internet. Furthermore, the current Internet typically only provides low to medium speed transmission (less than 100 kbps) between hosts. For these reasons, the quality of Internet videoconferencing is often mediocre. The problems frequently encountered in Internet videoconferencing are:

Low quality video images: Image quality is degraded to satisfy bandwidth limitations.
Jerky movements: Movements are jerky and disjointed due to low video frame rates.
Poor synchronization between audio and video: Voice is not well synchronized with lip movements.

Multiparty videoconferencing requires identical video data to be delivered to each destination. Standard IP routing is point-to-point (unicast). Therefore, to deliver identical data to multiple destinations, separate copies of the data have to be sent over multiple point-to-point connections. In most cases this is inefficient and wastes network bandwidth, because the same data is transmitted more than once over the network segments that the connections share.

Multicasting is an efficient way to deliver identical data to multiple destinations on the Internet. Multicasting consists of transmitting packets to a host group identified by a multicast (class D) IP address. Multicast IP addresses range from 224.0.0.0 to 239.255.255.255. Multicast IP addresses are differentiated from standard IP addresses by their first four bits, which are always binary 1110. The remaining 28 bits of the multicast IP address identify the host group.
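The address test described above can be sketched in a few lines using Python's standard ipaddress module. The helper names is_multicast_ipv4 and host_group_bits are hypothetical and introduced only for this illustration.

```python
import ipaddress


def is_multicast_ipv4(address):
    """True if the first four bits of the IPv4 address are binary 1110."""
    value = int(ipaddress.IPv4Address(address))
    return (value >> 28) == 0b1110


def host_group_bits(address):
    """The low 28 bits, which identify the host group."""
    return int(ipaddress.IPv4Address(address)) & 0x0FFFFFFF


for addr in ["223.255.255.255", "224.0.0.0", "239.255.255.255", "240.0.0.0"]:
    print(addr, is_multicast_ipv4(addr))
# 223.255.255.255 False
# 224.0.0.0 True
# 239.255.255.255 True
# 240.0.0.0 False
```

The standard library offers the same check directly as ipaddress.IPv4Address(addr).is_multicast. The bit-level version is shown only to mirror the description of the address format.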

The Multicast Backbone (MBone) is a virtual network layered on top of portions of the Internet, which uses the IP multicast protocols to provide multicasting services across the Internet. It originated in 1992 from an effort to multicast audio and video from Internet Engineering Task Force (IETF) meetings. The MBone is composed of networks that directly support IP multicasting (called ‘islands’), linked by virtual point-to-point links (called ‘tunnels’) [3]. The tunnel endpoints are typically workstations or routers with support for multicasting [4].

In multicasting, a source transmits to multiple destinations by sending only one copy of the data to the host group. The MBone replicates the data only at the points where the paths to different destinations diverge. This ensures that no segment of the network carries duplicate data, thus minimizing network load. Figure 1 illustrates multicast transmission.

Figure 1: Multicasting transmission
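The saving can be made concrete by counting how many copies of each packet cross each link. The sketch below is a simplified model, not a description of how MBone routers are implemented. The small tree topology (source S, routers R1 to R3, receivers A, B and C) and all names are hypothetical.

```python
from collections import Counter

# Hypothetical delivery tree, stored as child -> parent, rooted at source "S".
PARENT = {"R1": "S", "R2": "R1", "R3": "R1", "A": "R2", "B": "R2", "C": "R3"}


def path_links(node):
    """Links (parent, child) on the path between the source and `node`."""
    links = []
    while node in PARENT:
        links.append((PARENT[node], node))
        node = PARENT[node]
    return links


def copies_per_link(receivers, multicast):
    """Number of copies of one packet carried by each link."""
    counts = Counter()
    for receiver in receivers:
        for link in path_links(receiver):
            counts[link] += 1  # unicast: one copy per receiver on this link
    if multicast:
        # Multicast: every link that is used carries exactly one copy.
        return {link: 1 for link in counts}
    return dict(counts)


receivers = ["A", "B", "C"]
unicast_load = copies_per_link(receivers, multicast=False)
multicast_load = copies_per_link(receivers, multicast=True)

print(unicast_load[("S", "R1")], multicast_load[("S", "R1")])     # 3 1
print(sum(unicast_load.values()), sum(multicast_load.values()))  # 9 6
```

With unicast, the link leaving the source carries three copies of every packet, one per receiver. With multicast it carries one, and replication happens only at R1 and R2, where the paths diverge.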

(III) REFERENCES

[1] "Videoconferencing FAQ", < www.bitscout.com/faqtoc.htm >
[2] Kanakia, H., Mishra, P. P., Reibman, A., "An Adaptive Congestion Control Scheme for Real-Time Packet Video Transport", Proceedings of SIGCOMM '93, San Francisco, CA, Sep. 1993, pp. 20-31.
[3] Macedonia, M. R., Brutzman, D. P., "MBone provides audio and video across the Internet", IEEE Computer, Apr. 1994, pp. 30-36.
[4] "Frequently Asked Questions (FAQ) on the Multicast Backbone (MBone)", < ftp://isi.edu/mbone/faq.txt >