Abstract:
With the prevalence of smartphones, pedestrians and joggers today often walk or run while listening to music. Because the music masks the auditory cues that would otherwise alert them to danger, they are at a much greater risk of being hit by cars or other vehicles. PAWS/SEUS is a wearable system that uses multi-channel audio sensors embedded in a headset to detect and locate cars from their honks, engine noise, and tire noise, and to warn pedestrians of the imminent danger of approaching cars. We demonstrate that a three-stage architecture and implementation, consisting of headset-mounted audio sensors, an embedded front end that performs signal processing and feature extraction, and machine-learning-based classification on a smartphone, provides early danger detection in real time, from up to 80m away, with up to 99% precision and 97% recall, and alerts the user in time (about 6s in advance for a car travelling at 30mph).
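The "about 6s" figure follows from the 80m detection range and the 30mph speed quoted above. The short sketch below reproduces that arithmetic. It assumes the car approaches in a straight line at constant speed and is detected at the maximum range, and it ignores any processing latency; the function name `warning_time_s` is only illustrative and is not part of the PAWS/SEUS code.

```python
# Illustrative arithmetic only; assumes constant speed and detection at full range.
MPH_TO_MPS = 1609.344 / 3600.0  # one international mile is exactly 1609.344 m


def warning_time_s(detection_range_m: float, speed_mph: float) -> float:
    """Seconds a vehicle at constant speed needs to cover the detection range."""
    if speed_mph <= 0:
        raise ValueError("speed must be positive")
    return detection_range_m / (speed_mph * MPH_TO_MPS)


# Values taken from the abstract: 80 m range, 30 mph car.
print(f"{warning_time_s(80.0, 30.0):.2f} s")
```

Under the same assumptions, a faster car or a shorter detection range reduces the warning time proportionally.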
Working Principle:
To illustrate how PAWS/SEUS is able to identify and locate approaching vehicles, we ask you to listen to the following track with headphones or earbuds on:
Remember to use your headphones!
Based only on the sound of this track, you were probably able to guess that a vehicle located on your right honked, accelerated towards you, and crossed to your left. How can you infer so much information from this audio clip? All the sounds captured in the recording are present in your daily life, and your brain has already learned how important it is to identify them. In the spectrograms shown below, we can see that each event in the clip has a particular signature.
And what about the spatial cues? Why are you able to perceive them? As shown in the figure to the right, we used a binaural recording prototype to generate this track. By placing microphones in the ears of a mannequin, we capture the effects produced by the relative position of the sound source and the listener. As with identifying the sound source, your brain is trained to detect these cues and estimate where the stimulus is coming from. The main effects that enable binaural spatial perception are the interaural time difference (ITD) and the interaural level difference (ILD). With only two microphones, however, ITD and ILD alone provide no more than the azimuth of the incident sound: they constrain the source to what is called the cone of confusion. PAWS/SEUS enhances these perception techniques; it combines redundant inputs with advanced classification algorithms to artificially infer the presence and location of the desired sounds.
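To make the ITD cue concrete, here is a minimal sketch that estimates the time difference between two channels by brute-force cross-correlation and converts it to an azimuth with a simple far-field model, ITD = d·sin(θ)/c. This is not the PAWS/SEUS algorithm, which is not described here. The microphone spacing (0.18 m), speed of sound (343 m/s), sampling rate (48 kHz), and the synthetic test pulse are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 °C)
MIC_SPACING = 0.18      # m, assumed ear-to-ear distance
FS = 48_000             # Hz, assumed sampling rate


def estimate_itd(left, right, fs, max_lag):
    """Delay of `right` relative to `left`, in seconds.

    A positive value means the sound reached the left microphone first.
    """
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag, skipping samples that fall off the ends.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


def itd_to_azimuth_deg(itd_s, spacing_m=MIC_SPACING, c=SPEED_OF_SOUND):
    """Far-field azimuth in degrees; positive angles point toward the left ear."""
    ratio = max(-1.0, min(1.0, c * itd_s / spacing_m))  # clamp against noise
    return math.degrees(math.asin(ratio))


# Synthetic example: a Gaussian pulse that reaches the right channel 10 samples late.
n, delay = 256, 10
left = [math.exp(-((i - 100) / 8.0) ** 2) for i in range(n)]
right = [0.0] * delay + left[: n - delay]

itd = estimate_itd(left, right, FS, max_lag=30)
azimuth = itd_to_azimuth_deg(itd)
print(f"ITD = {itd * 1e6:.1f} us, azimuth = {azimuth:.1f} deg")
```

Note that the estimate yields a single angle only: a source in front of, behind, above, or below the listener at the same angle to the interaural axis produces the same ITD, which is the cone of confusion mentioned above.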
Experimental Database:
All the recordings made during the experiments with the mannequin are shared in the Google Drive folder below. You must be logged in to your Google account to access the files. To view the folders, click “Request access”.
The files are in a MATLAB data format. Each file contains the synchronous recording of 8 microphones distributed over the mannequin.
For any further details feel free to contact:
Daniel @ dd2697@columbia.edu
Code Repository:
Front End: https://bitbucket.org/columbia-icsl/frontend/src/master/
Smartphone Application: https://bitbucket.org/columbia-icsl/smartphone/src/master/
System Overview:
More details on the system will be released upon official publication of the work.