Traditional uses of audio on the Internet depend mostly on conventional transfer techniques: a file must be downloaded from a server, usually via FTP, Gopher, or electronic mail, before it can be heard. Because of this delay, audio content providers have used Internet servers mainly as centrally located storage devices: archives that can be accessed worldwide. An excellent example of this use is the Time of The Day function at Yale University.
Forms of audio common to non-computer media, such as telephony, radio broadcasts, and tele/videoconferencing, continue on the Internet. Gradually, computers have incorporated the functions performed by separate electronic devices around the home and business: facsimile machines, CD players, telephones/answering machines, televisions/VCRs, radios, and so on. The functions that require audio input/output are easily handled by a computer with some additional audio circuitry, a microphone, speaker(s), and inputs for audio media, such as a CD player. The audio functions that work across networks, such as telephony, broadcast audio/video, and teleconferencing, also depend on sufficient bandwidth.
Because telephone audio is of relatively low quality (approximately that of AM radio), it occupies little bandwidth. Telephony applications, such as DigiPhone, are able to use the bandwidth of a modem, 14,400 bits per second, to achieve full-duplex audio, functionally replacing the telephone.
Telephony applications have practical obstacles to overcome when trying to replace the phone. First, the computer must be kept on whenever incoming calls are desired. Second, the other party must be using the same system (there is as yet no computer industry standard, much less an Internet-wide standard, for telephony). Third, a familiar physical user interface, analogous to a handset, is not part of the standard computer configuration. Speakerphone and handset solutions have thus far been implemented rather clumsily because they must coordinate several audio inputs and outputs, but they have potential for future success.
Progressive Networks Inc.’s RealAudio, an audio client-server system that approximates real-time delivery, is already being used to deliver news broadcasts. The power of the spoken word can be found recorded in many places, such as the introductory message by Bill Clinton at the White House WWW page. While the client is designed to play an audio file as it is downloaded, the server is optimized for sending many simultaneous streams of audio. By integrating the player with the Web browser, RealAudio and similar systems take advantage of established distribution, authoring, and security protocols.
Future plans for the server include live audio feeds in real time, broadcasting text along with the audio stream, and increasing capacity from hundreds to thousands of simultaneous streams.
The RealAudio compression algorithm and technical specifications are kept confidential, although others are developing similar methods (see below). The sound quality approximates an 8-bit, 11 kHz sample, playable over a 14,400 bps modem connection with modest processing requirements. Since the playback method is not scalable, computers with faster connections and greater processing power get the same quality.
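To put those figures in perspective, the arithmetic can be sketched as follows. Because the RealAudio algorithm is confidential, this is only a rough estimate under stated assumptions: the audio is taken to be mono, the "11 kHz" rate is taken to be 11,025 Hz, and the modem's full 14,400 bps is assumed to be available for audio (in practice, protocol overhead would leave less). The function names are illustrative only.

```python
def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def required_compression_ratio(raw_bps, link_bps):
    """Minimum compression ratio needed to play audio as fast as it arrives."""
    return raw_bps / link_bps


# Assumptions: mono audio, 11,025 Hz sampling, whole modem link used for audio.
raw = raw_bitrate_bps(11_025, 8)                 # 88,200 bits per second
ratio = required_compression_ratio(raw, 14_400)  # 6.125

print(f"uncompressed: {raw} bps, needs roughly {ratio:.1f}:1 compression")
```

Under these assumptions, uncompressed audio of the quoted quality would need about six times the bandwidth the modem provides, which is why near-real-time delivery depends on compression.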
The DSP Group, Inc. has developed a similar product called TrueSpeech. TrueSpeech concentrates on compression algorithms tuned specifically for speech. The DSP Group has decided to offer more than one compression scheme in the TrueSpeech player, offering options in sound quality, file size, and transmission speed. Additionally, the DSP Group is working toward standardizing its compression algorithms for cross-platform use in computer and telephony products.
TrueSpeech 8.5, using a 15:1 compression ratio, is a part of the Windows 95 Internet Explorer Web browser, ensuring widespread use. The DSP Group plans to develop a TrueSpeech server to be used much like the RealAudio server. For now, Windows 95 users are able to encode audio and save it as WAV files. Presumably, unlike files in the proprietary RealAudio format, these files can be played either in a TrueSpeech player or, not in real time, with a conventional sound utility.
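As a rough illustration of what a 15:1 ratio means for file size, the sketch below assumes the uncompressed source is 8 kHz, 16-bit mono PCM. That source format is an assumption, not something stated above; it is chosen because it yields a compressed rate of about 8.5 kbps, which would be consistent with the product name. The function name is hypothetical.

```python
def file_size_bytes(raw_bps, seconds, compression_ratio=1):
    """Size in bytes of an audio clip at a given raw bit rate and compression ratio."""
    return raw_bps * seconds / (8 * compression_ratio)


# Assumption: 8 kHz, 16-bit, mono PCM source (128,000 bits per second).
RAW_BPS = 8_000 * 16

compressed_bps = RAW_BPS / 15                         # about 8,533 bps
one_minute_raw = file_size_bytes(RAW_BPS, 60)         # 960,000 bytes
one_minute_15to1 = file_size_bytes(RAW_BPS, 60, 15)   # 64,000 bytes

print(f"compressed rate: about {compressed_bps:.0f} bps")
print(f"one minute: {one_minute_raw:.0f} bytes raw, {one_minute_15to1:.0f} bytes at 15:1")
```

On these assumptions, a minute of speech shrinks from just under a megabyte to 64,000 bytes, and the compressed rate fits within a 14,400 bps modem connection.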
The Xing Technology Corporation has released a similar product, StreamWorks. The major differences are its capacity for live broadcast feeds and its use of the international standard MPEG compression. Also, it is designed to operate over TCP/IP networks of varying topologies, including Ethernet, ATM, FDDI, ISDN, T1, and Frame Relay. For this reason, it has uses beyond the Internet, such as in corporate and educational Wide Area Networks.
More advanced uses of audio on the Internet include teleconferencing. The great advantage of conferencing over traditional telephony is the ability of several parties to simultaneously communicate with each other.
Developers have bypassed audio teleconferencing systems, finding enough bandwidth in the typical modem connection to transmit both audio and video together. CU-SeeMe, a personal computer-based videoconferencing client-server system, manages to work audio material into its compression scheme along with video. It achieves such excellent efficiency by using four compression algorithms with a choice of two different audio sampling rates. All compression is performed in software, with modest microprocessor requirements.
CU-SeeMe operates in a client-server configuration: clients transmit signals to “reflector sites,” UNIX-based servers which receive and re-transmit the signals. This configuration, rather than a client-client system, allows broadcasting from one point to many, as well as true multi-point conferencing.
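The reflector arrangement described above can be modeled in a few lines. This is a minimal in-memory sketch of the general relay idea, not the CU-SeeMe protocol itself, whose details are not given here; the class, method, and function names are all hypothetical.

```python
class Reflector:
    """Toy model of a reflector: re-sends each packet to every other client."""

    def __init__(self):
        self.clients = set()

    def join(self, client_id):
        """Register a client so it receives packets from the others."""
        self.clients.add(client_id)

    def relay(self, sender, packet):
        """Return (recipient, packet) pairs for every client except the sender."""
        return [(c, packet) for c in sorted(self.clients) if c != sender]


def uplink_streams_client_to_client(n):
    """Streams each client must send when every client talks to every other."""
    return max(n - 1, 0)


def uplink_streams_with_reflector(n):
    """Streams each client must send when a reflector does the fan-out."""
    return 1 if n > 1 else 0


reflector = Reflector()
for name in ("alice", "bob", "carol"):
    reflector.join(name)

# One packet sent upstream by alice is fanned out to bob and carol.
print(reflector.relay("alice", b"frame-1"))
print(uplink_streams_client_to_client(5), uplink_streams_with_reflector(5))
```

The point of the comparison is that in a five-party conference each client would have to send four copies of its signal in a client-client arrangement, but only one when a reflector handles the re-transmission, which matters on a modem-speed connection.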
While the ability to videoconference over TCP/IP connections at no additional cost is attractive, the CU-SeeMe protocol is unfortunately not compatible with current videoconferencing standards. Conversion between standard videoconferencing protocols and the CU-SeeMe protocol would enable almost anyone on the Internet to exchange real-time audio and video with the many installed conventional videoconferencing systems.