Sound, Hearing and Accommodation Strategies
Outline:
1. Gift of Hearing
2. From Sound Waves to the Inner Ear
3. Sensing to Hearing Within the Brain
4. Speaking
5. Audio Codecs
6. Access/Accommodation Strategies for Hearing Disability
1. Gift of Hearing
There are many aspects to the process of hearing sound. First, it must be sensed. This involves a structure within the inner ear and the sensory neurons that map to it. Next, it must be interpreted at a basic level, which essentially involves both a filtering of the incoming sensory information stream and an active recognition process that includes integration with memory. There are then advanced processes, such as augmenting sounds of interest ("signal") and minimizing analysis of other sounds that the listener considers to be "noise." There are also active compensation processes, such as dealing with self-produced sounds. The ability to interpret sounds is a function of their quality, and we will also consider strategies for audio compression-decompression algorithms (codecs). This section briefly examines some of these aspects of the process of hearing, and how hearing impairments can influence the interpretation of sound.
2. Sound Transduction and the Ear
This topic is well covered in many sources, including standard physiology textbooks. Good web resources on anatomy include the HearingCenterOnline and ASHA's site. We briefly highlight key aspects of the signal transduction process from a systems perspective. From that perspective, for sound to be reliably coded to the brainstem, every part of the ear (outer, middle and inner) must function well. Problems in any part of the ear can produce impairments that lead to disabilities. In some cases there are well-defined solutions, but in others the solutions are limited. To understand these disabilities, we need to understand basic aspects of the sensory transduction and coding process.
2.1 Sound Waves to the Ear
The physics of sound and the anatomy of the ear are well covered in many sources, so this section will be brief.
Sound is transmitted via longitudinal waves that travel through air at about 344 m/s (770 miles/hr). It therefore covers about 1 meter in 3 ms, and traverses the distance between the two ears in about half a millisecond. The speed through air increases with temperature (and so generally decreases at higher altitudes, where the air is colder). Sound pressure waves travel over 4 times faster through water than through air, and even faster in salt water. Sound travels fastest in solids (due to the packed structure of atoms within solids). These waves can come from various sources, and thus from various distances and directions relative to the outer ear. A wave from a given source has a magnitude (which correlates with loudness) and a frequency (which correlates with pitch). Sound dissipates as it travels through air, and its magnitude also weakens when the pressure waves have to go around or through physical barriers such as a wall or a person's head. Sound sources are almost never perfect sine waves, and musical sounds commonly have repeating patterns that include a primary pitch and harmonics, which help give a sound its timbre. Often, as when hearing the speech of others with music in the background, the actual signal is a constantly changing collection of frequencies at different magnitudes (i.e., maximum pressure changes, or amplitudes). And of course, sounds from many sources may be converging on each ear. But ultimately, what matters is the magnitude and frequency content that actually reaches each ear. This is the "signal" over time that the ear must sense.
2.2 Anatomy of Signal Transduction from the Ear to the Cochlear Nuclei
The input signal that arrives at the ear is this complex, constantly changing sound waveform. There are many good ear anatomy sources, such as http://en.wikipedia.org/wiki/Ear or http://www.wisc-online.com/objects/index_tj.asp?objid=AP1502 or http://webschoolsolutions.com/patts/systems/ear.htm - see http://www.audiologynet.com/anatomy-of-the-ear.html for a good review of sites.
The outer ear helps deflect sound into the air-filled external auditory canal and on to the eardrum (tympanic membrane), which then deflects. The ear canal also helps protect the eardrum from damage, since the eardrum is hidden a few centimeters in from the outer ear. The eardrum functions as a resonator that reliably transduces the vibrations of the pressure waves over a very wide frequency bandwidth. The eardrum is kept in a position of equilibrium by the balance between the outer air pressure and the pressure in the middle ear, which is equalized through the Eustachian tube connecting the middle ear (behind the eardrum) to the back of the nose and throat. If these pressures are not the same on both sides, one generally experiences a sort of ringing in the ears. This is commonly experienced by someone who has a bad cold or who suffers from sinusitis and similar conditions. The vibration of the eardrum is transmitted inward via three auditory ossicles that serve as a mechanical transmission system: the eardrum is mechanically coupled to the malleus, which transmits through the incus to the stapes. The stapes couples to the oval window of the inner ear (labyrinth). Movements of the foot plate of the stapes generate traveling waves along the basilar membrane of the cochlea. The distance from the stapes to the point of maximum wave height is a function of the pitch frequency, with high-pitch sounds producing maximum basilar membrane displacement at shorter distances (e.g., about 1 cm along the length for 2 kHz) and lower-pitch sounds at longer distances (e.g., about 3 cm for 100 Hz). There are two middle ear muscles (tensor tympani and stapedius) that work to decrease sound transmission gain, especially for loud sounds. This is often called the tympanic reflex, and it functions as a protective mechanism for loud sounds that are sustained for longer than its roughly 50 ms latency. The cochlear portion of the labyrinth is a coiled fluid-filled tube that makes about 2 3/4 turns and has a length of about 35 mm.
Located on its basilar membrane is the organ of Corti, the structure that contains the hair cells. These are the sensory receptors, and they extend from the base to the apex. Each hair cell has a resting membrane potential of about -60 mV, which with displacement goes to about -50 mV in one direction and to greater hyperpolarization in the other. There are 4 rows of cells along its length: three outer rows (about 20,000 hair cells) and one inner row (about 3,500 hair cells, which receive 90-95% of the afferent innervation). Each auditory nerve has about 28,000 afferent and efferent fibers running to the cochlear nuclei in the medulla oblongata of the brain stem. Within this bundle, pitch is coded by where each fiber's associated hair cell is located along the length of the cochlea. Thus the transduction mechanism is such that "which hair cells" along the basilar membrane respond signals pitch, and "how much deflection" occurs in these cells signals magnitude. Finally, this magnitude is coded by action potentials in the auditory nerve fibers, with a higher firing frequency correlating with greater loudness. It's a great design.
2.3 Ranges of Sensation: Frequencies and Magnitudes
The range of frequencies that the typical human auditory system can hear is from about 20 Hz (low pitch) to 20 kHz (high pitch). For practical purposes, the key range that easily encompasses speech and includes a reasonable representation of the range of music (at least AM radio quality) is about 50 Hz to 8 kHz. The average pitch for the adult male in regular conversation is about 120 Hz, and for the adult female about 250 Hz. Sounds of many pitches add together, depending on their magnitude. For instance, low-frequency sounds such as an airplane or a Harley-Davidson motorcycle sound different live versus on AM radio because for the latter, some of the richness of the lower frequencies is lost. Sound is normally measured on a relative scale in units called decibels (dB).
A level in decibels is defined by the equation 10 log10(s2/s1), where the logarithm is to base 10 and s2 and s1 are the magnitudes (strictly, the powers) of two sound signals. Sound is usually measured with microphones, which respond (approximately) in proportion to the sound pressure, p. The power in a sound wave, all else equal, goes as the square of the pressure. The log of the square of x is 2 log x, which introduces a factor of 2 when we convert to decibels for pressures. Thus for pressures the level in decibels is defined as 20 log10(p/p0), where p is the pressure and p0 is a reference value. This reference is an estimate of the typical human auditory threshold in a young person. Its value is assumed to be 0.000204 dyne/cm2 (about 20 micropascals). (This is very low, about two ten-billionths of an atmosphere.)
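The two decibel formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, and the reference pressure is rounded to 20 micropascals as an assumption. It shows that squaring the pressures and using the 10 log form gives the same level as using the 20 log form directly.

```python
import math

# Assumed reference pressure: about 20 micropascals (rounded threshold of hearing).
P_REF_PA = 20e-6


def power_level_db(s2: float, s1: float) -> float:
    """Level difference in dB between two power-like magnitudes: 10*log10(s2/s1)."""
    return 10.0 * math.log10(s2 / s1)


def pressure_level_db(p: float, p_ref: float = P_REF_PA) -> float:
    """Sound pressure level in dB relative to p_ref: 20*log10(p/p_ref)."""
    return 20.0 * math.log10(p / p_ref)


def pressure_ratio_from_db(level_db: float) -> float:
    """Invert the pressure formula: the ratio p/p_ref for a given level in dB."""
    return 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    p = 0.02  # pascals; 1000 times the assumed reference pressure
    # Both routes give about 60 dB, because power goes as pressure squared.
    print(pressure_level_db(p))
    print(power_level_db(p ** 2, P_REF_PA ** 2))
    # A 100 dB span corresponds to a pressure ratio of 10**5.
    print(pressure_ratio_from_db(100.0))
```

Working in pressure ratios rather than absolute pressures keeps the sketch independent of the unit system (dyne/cm2 or pascals), as long as p and the reference use the same unit.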
Normal conversation is around 60 dB, live rock music is about 100 dB, and painful sound is about 140 dB (of course, some might find certain 100 dB sounds "painful"). Thus the range of useful sensing of sound spans a factor of 100,000 in sound pressure (i.e., 100 dB), which is a remarkable range.
2.4 Audiogram: Basic Testing of Sensing Sound
Hearing is usually tested by simple tuning fork tests followed by a detailed audiogram. A tuning fork test involves striking a tuning fork and having the subject confirm whether they hear the sound produced. Various parts of the upper body (the forehead, jaws, and outer ear) are used in tuning fork tests. Tuning fork tests check general hearing and cannot confirm whether the subject has a more serious problem. Often another method, air pressure testing (tympanometry), is performed in the hospital or testing facility to confirm the integrity of the ear canal and tympanic membrane. The test is given with an ear probe inserted into the outer ear like an ear plug, and the ear is then tested with different air pressures. If the eardrum responds well to the pressure changes, we can conclude that it has no serious problems. During the test the subject experiences some ringing. An audiogram can test the complete human range of frequencies. When all these tests are complete, an audiologist or ear-nose-throat (ENT) surgeon should be able to reach a sound conclusion about the defect, if any. Students in MU's speech/language program receive considerable training in these techniques.
3. From Sensing to Hearing (Within the Brain)
In the previous section on sound and anatomy, we reviewed the sensory transduction process, starting with the arrival of sound waves and ending with action potentials on nerve fibers that signal frequency/pitch by "which neurons" are firing and magnitude/loudness by the firing rate of these neurons.
In this section we briefly review signal processing in the auditory cortex, and develop some understanding of the rich infrastructure of research and analysis related to hearing.
3.1 Sound and Sound Pattern Recognition
Basic spatial mapping along auditory cortical tissue primarily codes pitch. Injury to the auditory cortex, located in the temporal lobe above the ear, does not destroy the sensation of hearing, but rather the ability to recognize and interpret tonal patterns, including "remembered" sounds (both short-term and long-term mechanisms). Sound recognition applies to many sounds, and thus one remembers old songs, voices, etc. It is a remarkable capability, one that is still not that well understood.
3.2 Sound Localization
We use two mechanisms to help us localize sound.
4. Speaking
The generation of human speech involves a remarkably complex process. There are often considered to be two principal stages in the process of speech production:
We will not go through the anatomy of speech production in detail. It involves the coordinated action of muscles and structures that include the diaphragm, the extrinsic and intrinsic muscles of the larynx ("voice box"), the musculature of the tongue, muscles affecting the size of the opening at the mouth, and muscles affecting the shape of the lips. There are many good sources on the web. The larynx is a structure connecting the pharynx and trachea that is made up mostly of cartilage (including the vocal cords) and has muscle attachments. Its chief function is phonation, with the pitch of sound determined by the shape and tension of the vocal cords (long, lax cords for lower-pitch tones; short, tense cords for higher tones). Given the challenge of creating skilled motor patterns that involve the larynx, lips, mouth, respiratory system and other accessory muscles of speech, it is in some ways remarkable that speech is so effective. The facial and laryngeal regions of the motor cortex (near the temporal lobe), Wernicke's area (more sensory), Broca's area (more motor) and various others all participate in controlling the sequences and intensities of muscle contractions. A common "information flow" during conversation is from the auditory cortex to Wernicke's area for language interpretation, to Broca's area for formation of words, to the motor cortex for control of the speech muscles. Normal speech can reach 150 to 175 words/min, and do so at many different pitches, volumes and intonations. It is not surprising that many types of dysfunctions can occur (e.g., see "problems" under http://en.wikipedia.org/wiki/Speech). Such functional impairment can occur at various stages of these processes, and affects between 0.2% and 0.6% of the total world population (e.g., about a million persons in the U.S.). The following terms are used to describe some of these disorders:
A key part of the training of Speech-Language Pathologists involves understanding such disorders and their sources, and then developing and implementing therapeutic intervention plans. Often computers are now used to help implement such plans. There are also a variety of alternative and augmentative communication (AAC) approaches available to help individuals with speech impairment to communicate more effectively. This may include:
These speech synthesis systems continue to improve, both in terms of quality of speech and of memory/computation requirements.
5. Audio Codecs
As with video codecs, the aim of audio codecs is to compress a signal that will be sensed (sound) for transmission or storage. It is not surprising that there are a variety of compression algorithms, given that there are so many different aims for audio. For instance, one would expect different approaches for stereo music than for cell phones. Audio codecs can be roughly separated into 2 types: those targeting real-time transmission of speech and its reception/recognition, and those for multimedia sound that includes both voice and music. For the former, the most common transmission media are plain old telephone system (POTS) lines, cell phones, and the Internet (Voice over IP, or VoIP). As we've seen, for human conversation frequencies up to 8 kHz need to be included. For conventional 8 kHz mono audio at the norm of 16-bit pulse code modulation, the uncompressed rate is 128 Kbits/sec (8,000 samples/sec x 16 bits). Compression lowers this rate, for example by 4:1 for MPEG (MP3) at 32 Kbits/sec (MPEG can also be 16, 22 and 24 Kbits/sec). A classic cell phone standard, GSM, uses about 13 Kbits/sec. Other cell phone rates are lower, down to about 8 Kbits/sec, but most are 11 Kbits/sec or so. For stereo sound, of course, the required rate doubles, and a typical bandwidth for quality stereo is 64 Kbits/sec. For one of many surveys of technical coding approaches on the web, see G. Stephen Kinnear's site. There are many coding approaches, which can be roughly broken into two types: waveform codecs and vocoders (voice coders). The latter are intriguing in that they are based explicitly on recognizing human speech patterns, and use knowledge of speech to compress sound more effectively. There are several dozen universities in the U.S. alone that target speech analysis, many based in electrical engineering departments.
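The bit-rate arithmetic earlier in this section can be reproduced with a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the audio is treated as uncompressed PCM, and "K" is taken to mean 1,000 bits.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Uncompressed PCM bit rate in bits/sec: samples/sec x bits/sample x channels."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(raw_bits_per_sec: float, coded_bits_per_sec: float) -> float:
    """How many times smaller the coded stream is than the raw PCM stream."""
    return raw_bits_per_sec / coded_bits_per_sec


if __name__ == "__main__":
    raw_mono = pcm_bit_rate(8_000, 16)                 # 8 kHz, 16-bit, mono
    raw_stereo = pcm_bit_rate(8_000, 16, channels=2)   # the rate doubles for stereo

    print(raw_mono)                             # 128000 bits/sec, i.e. 128 Kbits/sec
    print(raw_stereo)                           # 256000 bits/sec
    print(compression_ratio(raw_mono, 32_000))  # 4.0, the 4:1 case at 32 Kbits/sec
    print(compression_ratio(raw_mono, 8_000))   # 16.0, for an 8 Kbits/sec speech codec
```

The same two functions cover the other rates quoted in the text; only the coded bit rate passed to `compression_ratio` changes.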
Interestingly, these approaches build on much of the original speech recognition research and many of the systems that came out of the disability research community. But with the excitement and economic incentives surrounding mass-market Voice over IP (VoIP, or digital phones), leadership is now outside of the disability community. The challenge in VoIP and cell phone technologies is targeted more at the low-bandwidth end, with aggressive codecs that have efficient processing algorithms, since consumers find time delays irritating. There is also considerable effort on VoIP over phone lines and within LANs. A rich variety of technical approaches are being tried, with most of the VoIP codecs being proprietary despite considerable efforts at standardization and a collection of common conferences for talking out such issues. From a consumer perspective, the key point is that audio codecs can be expected to continue to improve for all modes of telecommunication, and the quality of VoIP is reaching the point where major companies are starting to choose VoIP as the primary mode of telecommunications within their enterprise. The disability community will be able to piggyback on this wave of technical innovation. On the medium-to-higher bandwidth and web-oriented end, where music-related digital sound quality and a wider range of pitches and tonal quality matter, there is also considerable development activity. While there are standard file formats such as *.wav, the most popular format for compressing music files is MP3, which is actually the standard known as MPEG-1 (and MPEG-2) Audio Layer 3. Pushing the envelope is Microsoft (and the technologies it acquires), with its proprietary collection of codecs within the newer Windows Media Audio (WMA) 8.0 & 9.0 (both require Windows Media Player 9, which can be downloaded at http://windowsmedia.com/9series/Home.asp).
The encoder package is broadly considered the state of the art, and is inherently scalable, allowing a range of capabilities within one package. It supports WMA and MP3. Version 9 includes many new features, including a new 24-bit, 96 kHz "surround sound" capability for users who have appropriate peripherals. Importantly, Windows Media has moved towards a consumer-centered, interactive "try alternatives and pick what meets your needs" approach to codec selection, which should work well for the disability research and consumer communities. To summarize, audio codecs remain somewhat of a delightful mess right now, but as the saga unfolds the winner will be the consumer.
6. Access/Accommodation Strategies for Hearing Loss
[Most of this section is contributed by Dr. Sarma Danturthi]
6.1 Reasons for Hearing Disability
Hearing disability can start at any age and includes the following types:
It is important to note that being "deaf" does not mean the person is stone deaf and cannot hear anything. For instance, if a person is exposed repeatedly to a particular sound, (s)he may become deaf to that primary frequency. Thus a person with high-frequency hearing loss can actually hear some sounds, especially those of lower frequency, but cannot hear others. The audiogram helps establish the degree and nature of deafness. In general the hearing-disabled population is classified as “deaf” and “hard of hearing.” Deaf culture tends to discourage the use of the word “impaired,” since it implies something a person cannot do. The word “handicap” is also discouraged, since the person is not handicapped but might be in a handicapping situation, and many people do not understand the distinction. Usually the terms “disabled” or "deficit" are used.
6.2 Hearing Aid as an Assistive Technology
The main purpose of a hearing aid is to amplify sounds. The basic hearing aid does not differentiate a human voice from the buzzing of a bee or the sound of a passing car. Everything the aid receives is amplified and delivered to the eardrum. This is the best option for age-induced deafness that needs amplification. There are many cases of hard of hearing people becoming stone deaf with continuous use of a hearing aid because the eardrum is damaged by the excessive noise. Then there are hearing aids that are programmed for particular users. Each person has a unique hearing loss, and it may fall in a different range of frequencies. Only a certain group of people benefit from a hearing aid. Even if everything fits and works, a social question arises as to whether the person with a hearing disability wants to use the hearing aid in public (e.g., attracting attention, acceptance of this visible change). Hearing aids are still expensive, running anywhere from two hundred to several thousand dollars depending on the type.
Continuous battery replacements and repairs are other costs to consider.
6.3 Mechanical Dysfunction in the Transduction Apparatus
6.4 Accessibility Strategies
Here are some of the communication or access methods for the deaf:
There are some problems involved with the above (by Dr. Sarma Danturthi):
6.5 Emergency Devices for the Deaf
6.6 Communication for the Deaf-Blind
Deaf blind people can also have an additional disability. If the deaf blind person was born deaf, (s)he would also have loss of speech. Triple disabilities such as these are difficult to manage, since those persons depend on smell and touch. Strange as it may sound, these people recognize your presence by your smell and your touch. One could use Braille readers or age-old ASL. The deaf blind person can touch and feel your fingers and understand the ASL very well, without missing a single letter of your conversation. There are also some browsers that convert every word of the web page to text and show it on a deaf person’s computer screen. But there are other problems, such as an inability to trace the cursor position, follow the frames, etc. A deaf blind person can also benefit from the Braille reader keyboard. This keyboard takes the character at the cursor position and pops up the corresponding character key on the special keyboard. Thus the deaf blind person “reads” the character by finger touch and goes on to the next character. In this way, web page content is read character by character. It is very sluggish at first, but with repeated use it becomes easy and fast for the user. One major problem is the cost of the Braille keyboard, which can be $10K and upwards. There are also multiple gadgets to aid the deaf-blind. One particular electronic gadget can detect a door bell, a dog bark, a phone ring and a fire in its vicinity. The deaf blind person can wear it on the waist belt and live independently. Depending on which signal is active, the user can tell by touch which event has occurred, such as whether someone is at the door, whether a fire has broken out in the house, or what is causing the dog to bark.
6.7 Guide Dogs
These are gifts of God to the blind and the deaf blind. These animals are trained for years before being given to the users. These dogs are the most gentle and lovable and never harm even an ant.
For training, only select breeds such as the German Shepherd are chosen. But the cost of grooming a guide dog can run into thousands of dollars, typically $10K and up.
6.8 Text Telephones (TTY)
Are there any public TTYs for deaf people? U.S. law requires accessibility for all in public places such as airports and shopping malls. Since the percentage of the population using TTYs is small, the TTY is usually placed in a remote corner, attached to a regular phone that can be used by a hearing person as well. But since the majority of the population is not familiar with TTYs, hearing people sometimes choose this particular TTY phone to make their voice calls. If a deaf person wants to make a call at that time, (s)he has to wait, even though other phones are readily available to hearing people. The biggest advantage of TTY phones is that the lines are not congested. If you are trying to call an airline for a reservation and the line is busy, try calling their TTY line (but you need to have a TTY first!) and you can talk to the agent right away. Some people are rude with relay calls (personal experience with HR). Is this perhaps due to a bad mood on a particular day? What if the caller is inquiring about a particular job? Actually, this varies from person to person and state to state. With Gallaudet University, the only university for the deaf in the entire world, located in Washington DC, deaf awareness is pretty high in the DC area. Whether or not an individual with a hearing disability uses any mechanism to augment communication, the whole point of this topic is to make a deaf person’s life as independent as possible. So the research has to go on. But ultimately everything comes down to the cost of living and the cost of equipment. For a particular deaf blind person I am aware of, the average start-up cost is about $25K to equip him with a computer, Braille reader, gadget and TTY for individual apartment life, along with a guide dog, etc.