In January 2018, some friends and I created a Raspberry Pi 3 app that transforms image input into soundscapes. This could hypothetically help blind people perceive their surroundings. How does it work?

Introduction

Every clip of audio is made up of different sound frequencies with different amplitudes. We use a Hilbert curve to map this frequency domain (which is one-dimensional) onto the two-dimensional image, so each pixel in the image corresponds to a unique frequency. We then simply make the amplitude of that frequency correspond to the pixel's greyscale intensity. Watch a YouTube video that explains the concepts better.
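To make the idea concrete, here is a minimal, self-contained sketch of the mapping (not the repository's actual code): hilbert_d2xy and image_to_signal are illustrative names, and the 64x64 resolution and 44.1 kHz sample rate are assumptions.

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Convert a distance d along a Hilbert curve filling a 2**order grid to (x, y)."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate/reflect the quadrant as needed
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def image_to_signal(image, order=6, signal_length=0.8, sample_rate=44100):
    """Map a (2**order x 2**order) greyscale image to an audio signal."""
    side = 1 << order
    n_bins = side * side
    spectrum = np.zeros(n_bins, dtype=float)
    for d in range(n_bins):                  # one frequency bin per pixel
        x, y = hilbert_d2xy(order, d)
        spectrum[d] = image[y, x] / 255.0    # amplitude = pixel intensity
    # The inverse real FFT turns the frequency-domain amplitudes into audio samples.
    return np.fft.irfft(spectrum, n=int(signal_length * sample_rate))

# Example: a random 64x64 "image" produces ~0.8 s of audio at 44.1 kHz.
demo_image = np.random.randint(0, 256, size=(64, 64))
audio = image_to_signal(demo_image)
```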

See the code

Demo

Youtube demo

Please excuse our poor videography! We filmed this in one take at the end of the hackathon, so we were very tired.

Dependencies

  1. Hardware
  2. Software
    • python3
    • numpy (for numpy.fft.irfft)
    • alsaaudio
    • picamera
    • hilbert_curve (found here)
    • PIL (python image library)
    • threading

Caveats

The hilbert_curve library that we found generates Hilbert curves fairly inefficiently, so increasing the resolution from 64x64 to, say, 2048x2048 can leave the program running for an obscenely long time just preparing the curve.
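One way to soften this cost, sketched here as a suggestion rather than what the project actually does: compute the curve ordering once at startup and cache it as an index array, so each new frame only pays for a cheap fancy-indexing step. hilbert_d2xy refers to the illustrative helper from the sketch in the Introduction, not the library's API.

```python
import numpy as np

def build_hilbert_index(order):
    """Precompute flat pixel indices in Hilbert-curve order for a 2**order grid."""
    side = 1 << order
    index = np.empty(side * side, dtype=np.int64)
    for d in range(side * side):
        x, y = hilbert_d2xy(order, d)   # illustrative helper from the sketch above
        index[d] = y * side + x
    return index

# Built once at startup; every captured frame is then reordered in one
# vectorized operation instead of re-deriving the curve each time.
HILBERT_INDEX = build_hilbert_index(6)          # 64x64 resolution

def frame_to_spectrum(frame):
    return frame.reshape(-1)[HILBERT_INDEX] / 255.0
```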

Also, for some reason unknown to us, the write function provided by alsaaudio appears to normalize the amplitudes of the signal passed in: when we used it to write a sine wave with amplitude 1e-12, it produced audio exactly as loud as one with amplitude 1e0.
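A possible workaround, sketched under assumptions about the pyalsaaudio setup (the PCM configuration calls vary between library versions, and play() is an illustrative name, not code from this repository): scale and convert the samples to 16-bit integers yourself before calling write, so loudness is set by your own scaling factor rather than by whatever write does internally.

```python
import numpy as np
import alsaaudio

def play(samples, volume=0.5, rate=44100):
    """Play a float array in [-1, 1] at an explicit volume."""
    pcm = alsaaudio.PCM()                        # default playback device
    pcm.setchannels(1)
    pcm.setrate(rate)
    pcm.setformat(alsaaudio.PCM_FORMAT_S16_LE)
    pcm.setperiodsize(1024)

    # Normalize to a known peak, then apply our own volume factor.
    peak = np.max(np.abs(samples)) or 1.0
    scaled = (samples / peak) * volume * 32767
    pcm.write(scaled.astype(np.int16).tobytes())
```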

Finally, the algorithm plays 0.8 seconds of audio (a constant stored as signal_time_length) for each image, and only after that 0.8-second period does it capture a new image. This gives the audio output a refresh period of around one second. In theory we should be able to shrink signal_time_length, but something the Python interpreter does with thread locks prevents smaller values from working.
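For reference, a rough, hypothetical sketch of the loop this paragraph describes; capture_frame, image_to_signal, and play are placeholder names, not the repository's actual functions.

```python
import time

signal_time_length = 0.8   # seconds of audio generated per captured image

def run_loop(capture_frame, image_to_signal, play):
    while True:
        start = time.time()
        frame = capture_frame()                                   # e.g. a 64x64 greyscale array
        samples = image_to_signal(frame, signal_length=signal_time_length)
        play(samples)                                             # blocks for ~0.8 s
        # Capture and synthesis add a few tenths of a second on top of the
        # 0.8 s of playback, giving a refresh period of roughly one second.
        print("refresh period: %.2f s" % (time.time() - start))
```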

Team

Acknowledgements

We would like to acknowledge the following:

  • Build18 officers and sponsors. Build18 is the 4.5-day hackathon in which we completed this project. The hackathon provided us with funding for building materials, space to work in, and snacks to keep us going.
  • 3Blue1Brown (an awesome YouTube channel that made a great video on the Hilbert curve and its potential use in mapping visual space to audio space).
  • minerscale for open-sourcing initial (if extremely inefficient) code.
  • jburkardt for providing Hilbert curve code.