Source: [IEEE Signal Processing Magazine]
Signal Processing Supports a New Wave of Audio Research
Immersive technology makes audio more realistic and engaging, yet for some listeners it can become a bit too immersive. At Columbia University in New York, a professor is leading the design of an intelligent headphone system that will warn distracted pedestrians of imminent dangers posed by oncoming vehicles.
“With input from miniature microphones, the system uses intelligent signal processing to detect the sounds of approaching vehicles. If a hazard appears imminent, the system sends an audio alert, a beep, to the headphones, interrupting the audio stream and alerting the wearer that danger is rapidly approaching,” explains Fred Jiang, a professor in the university’s Electrical Engineering Department. The threat’s direction and distance are represented by the relative loudness, in stereo, of the beep. “The beep will happen while the car is still 50 meters away, giving the pedestrian plenty of time to react and avoid a potentially fatal encounter,” he notes.
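As a rough illustration of how direction and distance could be encoded in the stereo loudness of a beep, the sketch below uses constant-power panning for direction and a linear loudness ramp for distance. The function name `beep_gains`, the azimuth convention, and the 0.2 loudness floor are assumptions made for this example; the article does not describe the mapping the Columbia team actually uses. Only the 50-meter range comes from the text.

```python
import math


def beep_gains(azimuth_deg, distance_m, max_distance_m=50.0):
    """Return (left_gain, right_gain) for an alert beep.

    azimuth_deg: hypothetical vehicle bearing, -90 (hard left) to +90 (hard right).
    distance_m: hypothetical estimated distance to the vehicle, in meters.
    max_distance_m: range at which the beep is quietest (50 m, from the article).
    """
    # Clamp inputs to the supported range.
    azimuth = max(-90.0, min(90.0, azimuth_deg))
    distance = max(0.0, min(max_distance_m, distance_m))

    # Constant-power pan: map the bearing onto an angle in [0, pi/2],
    # so that left**2 + right**2 == 1 for every direction.
    pan = (azimuth + 90.0) / 180.0 * (math.pi / 2.0)
    left, right = math.cos(pan), math.sin(pan)

    # Assumed loudness law: 1.0 at 0 m, falling linearly to 0.2 at max range.
    loudness = 1.0 - 0.8 * (distance / max_distance_m)
    return left * loudness, right * loudness


# A vehicle far to the right at 50 m yields a quiet beep in the right ear only.
left_gain, right_gain = beep_gains(90.0, 50.0)
```

Under these assumptions, a vehicle directly ahead produces equal gains in both ears, and the beep grows louder as the estimated distance shrinks.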
While the technology’s main goal is warning pedestrians and joggers about approaching vehicles, there are also other potential applications. Construction workers, for instance, often wear headsets to muffle the sounds of heavy machinery. Yet conventional hearing protection products also prevent them from detecting sounds that might indicate an imminent threat. “Our system can be adapted to alert danger for them,” Jiang says. The system can also be adapted to fit on bike helmets and backpacks.
The four-year research project is supported by a grant from the U.S. National Science Foundation. The research team includes Peter Kinget, chair of Columbia’s Electrical Engineering Department; Shahriar Nirjon, a computer science professor at the University of North Carolina at Chapel Hill; and Joshua New, a psychology professor at Barnard College in New York. Graduate students from both Columbia and the University of North Carolina are also working on the project. The technology involves embedding multiple miniature microphones into the headset as well as developing a data pipeline to process nearby sounds and extract the correct cues that signal impending danger. The system features a new, ultralow-power custom integrated circuit (Figure 3) that extracts the relevant features from the sounds while using very little battery power.
Jiang says a prototype system offers promising early results (Figure 4). “The system includes a modified headset with embedded sensors, signal processing algorithms, and machine-learning (ML) classifiers in a smartphone application,” he says.
“Signal processing is integral to our research, since our entire system is all about processing the sound and ultrasound signals from vehicles into useful alerts,” Jiang says. “The signals we use include the sounds of tires against gravel, the sound of air displacement as vehicles approach, engine sounds, and also honks.”
Figure 3. An ultralow-power custom integrated circuit for computing multichannel audio delays. (Credit: Columbia University)
IEEE Signal Processing Magazine | March 2018
The signal processing pipeline starts in the analog domain with four channels of microelectromechanical system (MEMS) microphones. Signals from the analog sensors are amplified, sampled, filtered, and used to compute a number of useful features, such as cross-correlation, zero-crossing, and signal power. On the digital side, frequency domain features relating to sound are computed using fast Fourier transforms, cosine transforms, and other techniques. “In particular, we created a set of frequency domain features specific to cars,” Jiang explains.
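The time-domain features named above can be sketched in a few lines of plain Python. This is a minimal software illustration, not the team's circuit or code: the helper names (`signal_power`, `zero_crossings`, `best_lag`) and the synthetic test signal are assumptions for this example. The lag that maximizes the cross-correlation of two microphone channels is one common way to estimate the delay between them.

```python
import math


def signal_power(x):
    """Mean squared amplitude of a frame."""
    return sum(v * v for v in x) / len(x)


def zero_crossings(x):
    """Count sign changes between consecutive samples (0 counts as non-negative)."""
    return sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))


def best_lag(x, y, max_lag):
    """Lag in samples that maximizes the cross-correlation of x and y.

    A positive result means y is a delayed copy of x.
    """
    n = len(x)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += x[i] * y[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Synthetic example: a decaying tone reaches the second microphone 3 samples late.
mic_a = [math.sin(2 * math.pi * i / 16) * math.exp(-i / 20) for i in range(64)]
mic_b = [0.0] * 3 + mic_a[:-3]
delay_samples = best_lag(mic_a, mic_b, max_lag=8)
```

The brute-force loop costs on the order of `len(x) * max_lag` multiplications per frame, which hints at why offloading this step to dedicated low-power hardware is attractive.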
ML classifiers are used to produce various outputs, such as the directions and distances of approaching vehicles. These outputs are then converted into signals that generate the sounds—beeps—directed at the headset’s wearer.
The signal processing pipeline is divided into two parts: the headset and the smartphone. “We decided to locally process the multichannel audio data inside the headset and generate audio features that are much smaller in size but still able to carry sufficient information for the ML algorithms inside the smartphone to classify locations of vehicles,” Jiang says. “We decided to build a custom integrated circuit to perform cross-correlation to further reduce the power consumption on the embedded side.”
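To give a feel for why sending features instead of raw audio shrinks the headset-to-smartphone data stream, the back-of-the-envelope sketch below compares the two data rates. Only the four microphone channels come from the article; the sample rate, sample width, frame rate, and feature count are illustrative assumptions, not figures reported by the researchers.

```python
def raw_bytes_per_second(channels, sample_rate_hz, bytes_per_sample):
    """Data rate of streaming every raw audio sample."""
    return channels * sample_rate_hz * bytes_per_sample


def feature_bytes_per_second(frames_per_second, features_per_frame, bytes_per_feature):
    """Data rate of streaming one feature vector per analysis frame."""
    return frames_per_second * features_per_frame * bytes_per_feature


# Four MEMS channels (from the article); 16 kHz and 16-bit samples are assumed.
raw = raw_bytes_per_second(channels=4, sample_rate_hz=16_000, bytes_per_sample=2)

# Assumed: 10 feature vectors per second, 32 features each, 4 bytes per feature.
features = feature_bytes_per_second(
    frames_per_second=10, features_per_frame=32, bytes_per_feature=4
)

reduction = raw / features
print(f"raw: {raw} B/s, features: {features} B/s, reduction: {reduction:.0f}x")
```

With these assumed numbers the feature stream is two orders of magnitude smaller than the raw audio; the real ratio depends on design parameters the article does not give.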
Design decisions, one of the project’s biggest technical challenges, center on a tradeoff among accuracy, latency, and power. “In terms of which sensors to use, we decided to go with MEMS microphones instead of other sensors, such as a camera or light detection and ranging, because of the extremely limited energy budget on a smartphone/headset,” Jiang says.
The biggest challenge the team currently faces is improving the system’s accuracy in various environments. “Since part of our system relies on training models, sometimes it works in a particular type of environment, but does not work as well in another,” Jiang says. “We need to work on making our system work well in various environments, from quiet rural streets to cities with tall buildings to the sides of highways.”
The researchers are also looking to make the system more compact and less power hungry. “Ultimately, we want this to look no different from any existing headphones on the market, yet provide the additional benefit of danger alerts,” Jiang says.
—by John Edwards (email@example.com)