Data Over Sound

Bootstrapping communications between two devices can be a challenge. While your long-term communications might be over WiFi, for example, you may have an initial step of getting a device onto that WiFi network. This is reasonably common in IoT scenarios, where you have some device that has limited input options and you need to teach it a WiFi network SSID and passphrase, so it can try to connect.

There are many options for that bootstrapping process, though each has its own hardware requirements:

  • Bluetooth or BLE
  • NFC
  • QR code scanned via a camera
  • And so on

A lesser-used technique – at least in terms of obvious usage – is “data over sound”. One device plays an audio clip that contains embedded data, and the other device receives it via a microphone and decodes it. Google Nearby uses this, in the form of near ultrasound, as one of its communications paths.
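To make the idea concrete, here is a toy sketch (in Kotlin) of one way data could be modulated into audio: each bit of the payload becomes a short near-ultrasound tone in a raw PCM buffer. The frequencies, bit duration, and overall scheme here are purely illustrative; real systems, Nearby included, use more robust modulation, preambles, and error correction.

```kotlin
import kotlin.math.PI
import kotlin.math.sin

// Toy frequency-shift-keying encoder: each bit of each payload byte becomes a
// short tone, so a receiving device could detect the tone sequence via its
// microphone and reassemble the bytes. Illustration only; real protocols add
// preambles, error correction, and more sophisticated modulation.
fun encodeToPcm(
    payload: ByteArray,
    sampleRate: Int = 48_000,
    bitDurationMs: Int = 50,
    freqZero: Double = 17_500.0,  // near-ultrasound tone for a 0 bit
    freqOne: Double = 18_500.0    // near-ultrasound tone for a 1 bit
): ShortArray {
    val samplesPerBit = sampleRate * bitDurationMs / 1000
    val pcm = ShortArray(payload.size * 8 * samplesPerBit)
    var index = 0

    for (byte in payload) {
        for (bit in 7 downTo 0) {
            val freq = if (((byte.toInt() shr bit) and 1) == 1) freqOne else freqZero
            for (n in 0 until samplesPerBit) {
                val sample = sin(2.0 * PI * freq * n / sampleRate)
                pcm[index++] = (sample * Short.MAX_VALUE * 0.5).toInt().toShort()
            }
        }
    }

    return pcm
}
```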

Google Nearby is nice, but it is closed source as part of Play Services. Google offers very limited cross-platform support, and the system is fairly tightly tied to the Google Play ecosystem. This makes it impractical for many IoT scenarios, and it also limits the auditability of the code (from a privacy standpoint).

There have been various attempts at addressing this over the years. I had contemplated creating one (code name: chirpr) years ago and never found the time.

Georgi Gerganov is taking another stab at it, with an interesting approach. His focus is purely on converting data to and from raw PCM audio clips, leaving it up to individual platforms to handle the actual audio I/O. For the data-to-sound conversions, he has an MIT-licensed C/C++/WASM library that allows for use on just about every platform imaginable. Right now, he has demos for Android, iOS, and macOS/Linux/Windows desktops, in addition to the browser. He also has CLIs and a Web service for the raw sound clip conversions. Basically, he has proven that his code can run pretty much anywhere.
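That split of responsibilities means a platform wrapper is mostly audio plumbing. As a sketch of the transmit half on Android, playing a raw 16-bit mono PCM buffer could look roughly like this; the pcm parameter stands in for whatever the encoding library hands back, and the 48 kHz mono format is just an assumption:

```kotlin
import android.media.AudioAttributes
import android.media.AudioFormat
import android.media.AudioTrack

// Sketch: play a raw 16-bit mono PCM buffer produced by a data-over-sound
// encoder. The pcm parameter stands in for whatever the encoding library
// returns; this is platform plumbing, not part of the encoder itself.
fun playPcm(pcm: ShortArray, sampleRate: Int = 48_000) {
    val track = AudioTrack.Builder()
        .setAudioAttributes(
            AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
                .build()
        )
        .setAudioFormat(
            AudioFormat.Builder()
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .setSampleRate(sampleRate)
                .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                .build()
        )
        .setTransferMode(AudioTrack.MODE_STATIC)
        .setBufferSizeInBytes(pcm.size * 2) // two bytes per 16-bit sample
        .build()

    track.write(pcm, 0, pcm.size)
    track.play()
    // A real implementation would wait for playback to complete, release() the
    // track, and handle build/write failures.
}
```

MODE_STATIC suits short, one-shot clips like a credential broadcast; a streaming transfer mode would make more sense for longer or repeated transmissions.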

His Android and iOS demos, though, are just that: demos. What the project lacks is an Android library and an iOS framework/CocoaPod for adding his data-over-sound code to an app. I took a shot at it, but while I am pretty decent at Android overall, I am not strong with the NDK or with lower-level audio APIs. The Android library would need both of those in time, so I am ill-suited to create it, though I could help test and perhaps maintain it.
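For whoever does pick this up, the receive side is where those lower-level audio APIs come in: capture raw microphone PCM and feed it to the decoder. A rough sketch follows, where decodePcm() is purely a placeholder for whatever the decoder winds up exposing, the 48 kHz mono format is again an assumption, and the app needs the RECORD_AUDIO runtime permission first:

```kotlin
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder

// Placeholder for the real decoder: it would accumulate samples and demodulate
// them, returning a payload once a complete message has been recognized.
fun decodePcm(samples: ShortArray): ByteArray? = null

// Sketch: capture raw 16-bit mono microphone PCM and hand it to the decoder.
// Requires the RECORD_AUDIO runtime permission to have been granted already.
fun listenForPayload(sampleRate: Int = 48_000, onPayload: (ByteArray) -> Unit) {
    val minBuffer = AudioRecord.getMinBufferSize(
        sampleRate,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT
    )
    val recorder = AudioRecord(
        MediaRecorder.AudioSource.MIC,
        sampleRate,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        minBuffer * 4
    )
    val buffer = ShortArray(minBuffer)

    recorder.startRecording()

    try {
        while (true) {
            val read = recorder.read(buffer, 0, buffer.size)
            if (read <= 0) break

            val payload = decodePcm(buffer.copyOf(read)) ?: continue
            onPayload(payload)
            return
        }
    } finally {
        recorder.stop()
        recorder.release()
    }
}
```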

With luck, somebody else stronger in those areas will have a similar itch to scratch and will take the lead on creating a proper Android library for this project.