Radio signal identification «by ear»: Discrete Fast Fourier audio hashes comparison in Python

Most sound identification apps use the fast Fourier transform (FFT) to generate hashes from a sound. These hashes are stored in a huge catalog of audio fingerprints. When a user samples a song for a few seconds, the app generates an audio fingerprint and looks for a match in the database.

«I think that you’re listening to X».
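To make the idea concrete, here is a minimal sketch of FFT-style fingerprinting using only the Python standard library. It is a toy illustration under my own assumptions, not the code used by Shazam or by the script discussed in this article: the sample rate, frame size, the `two_tone` test signal and the "hash consecutive peak pairs" rule are all invented for the example, and a slow naive DFT stands in for a real FFT library.

```python
import cmath
import hashlib
import math

SAMPLE_RATE = 8000   # Hz, assumed for this toy example
FRAME_SIZE = 64      # samples per analysis frame (125 Hz per bin)


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame via a naive DFT (positive bins only)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def peak_bins(samples):
    """Strongest frequency bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE):
        mags = dft_magnitudes(samples[start:start + FRAME_SIZE])
        # Skip bin 0 (DC) when looking for the peak.
        peaks.append(max(range(1, len(mags)), key=mags.__getitem__))
    return peaks


def fingerprint(samples):
    """Hash each pair of consecutive peaks into a short hex string."""
    peaks = peak_bins(samples)
    return {
        hashlib.sha1(f"{a}|{b}".encode()).hexdigest()[:10]
        for a, b in zip(peaks, peaks[1:])
    }


def two_tone(f1, f2, seconds=0.1):
    """Hypothetical test signal: first half at f1 Hz, second half at f2 Hz."""
    n = int(SAMPLE_RATE * seconds)
    return [
        math.sin(2 * math.pi * (f1 if i < n // 2 else f2) * i / SAMPLE_RATE)
        for i in range(n)
    ]


known = fingerprint(two_tone(500, 1500))
same = fingerprint(two_tone(500, 1500))
other = fingerprint(two_tone(1000, 3000))

# Shared hashes: many for the same signal, none for a different one.
print(len(known & same), len(known & other))
```

Real fingerprinting systems pick many peaks per frame from a full spectrogram and hash them together with time offsets, but the principle is the same: the more hashes two recordings share, the more likely they are the same sound.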

This clever guy created his own version of Shazam for experimental purposes. When I saw it, I thought about the possibility of using this application for SIGID.

SIGID (Signal Identification) is the search for unknown radio signals: you try to identify them by comparing them with example sounds and waterfall images, using a software-defined radio (RTLSDR, Airspy, SDRPlay, HackRF, etc.).

So first I downloaded all the audio samples from the Signal Identification Wiki database and generated a SQLite database for the Python script.

After listening for about 5-10 seconds, the script can now identify a lot of known signals (about 350)! In this article you can find a direct link to download my database, which may save you time and trouble.
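For readers curious about how a fingerprint database can be queried, here is a small sketch with Python's built-in `sqlite3` module. The schema (`songs`, `fingerprints`), the helper names `add_signal` and `identify`, and the short placeholder hashes are all hypothetical; the real project's tables and matching logic may differ. The sketch votes for the signal whose stored hashes line up with the sample at a consistent time offset.

```python
import sqlite3
from collections import Counter

# Hypothetical schema: the real project's tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE fingerprints (
        hash TEXT NOT NULL,
        song_id INTEGER NOT NULL REFERENCES songs(id),
        frame_offset INTEGER NOT NULL
    );
    CREATE INDEX idx_fingerprints_hash ON fingerprints(hash);
""")


def add_signal(name, hashes):
    """Store a signal and its hashes, one per frame offset."""
    cur = conn.execute("INSERT INTO songs (name) VALUES (?)", (name,))
    song_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO fingerprints (hash, song_id, frame_offset) VALUES (?, ?, ?)",
        [(h, song_id, offset) for offset, h in enumerate(hashes)],
    )


def identify(sample_hashes):
    """Return (name, votes) for the best match, or None if nothing matches."""
    votes = Counter()
    for sample_offset, h in enumerate(sample_hashes):
        rows = conn.execute(
            "SELECT song_id, frame_offset FROM fingerprints WHERE hash = ?", (h,)
        )
        for song_id, db_offset in rows:
            # Hashes from the same signal agree on one offset difference.
            votes[(song_id, db_offset - sample_offset)] += 1
    if not votes:
        return None
    (song_id, _), count = votes.most_common(1)[0]
    name = conn.execute(
        "SELECT name FROM songs WHERE id = ?", (song_id,)
    ).fetchone()[0]
    return name, count


# Placeholder hashes, invented for the example.
add_signal("FT4", ["a1", "b2", "c3", "d4", "e5"])
add_signal("Example signal B", ["f6", "a1", "g7", "h8", "i9"])

print(identify(["c3", "d4", "e5"]))  # → ('FT4', 3)
```

The index on `hash` is what keeps lookups fast once the table holds hashes for hundreds of signals.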

Testing my database. 347 «songs» are fingerprinted. Very noisy songs, yeah.

How can I do the same?

  • Install python, pip and dependencies:
sudo apt-get install python-tk ffmpeg portaudio19-dev python-pyaudio python-pip
sudo pip install matplotlib termcolor scipy pydub PyAudio
  • Clone and set up SQLite database:
git clone https://github.com/baliksjosay/audio_recogition_system
cd audio_recogition_system
make clean reset
  • Download my generated hash database of Sigidwiki:
https://www.dropbox.com/s/9v7eslkwglzs3ys/fingerprints2.rar?dl=1
  • Extract the archive into the /db/ directory.
  • Test the script! Two ways:
  • Via microphone/virtual cable (5 seconds):
 python recognize-from-microphone.py -s 5
  • Via saved audio file (Update: several of us are having trouble with this option):
 python recognize-from-file.py sampleaudiofile.mp3

Enjoy it!

PS. – Doesn’t work? Think about demodulation. The same signal sounds pretty different when demodulated as, e.g., AM rather than FM. Some known signals (CW, FM broadcast…) may fail.
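A small sketch of why the demodulation mode matters so much. It builds a frequency-modulated tone at complex baseband and demodulates it two ways: as AM (envelope) and as FM (phase difference between samples). All parameters (`SAMPLE_RATE`, `TONE_HZ`, `DEVIATION_HZ`) are assumptions chosen for the example, and this is not how any particular SDR program implements its demodulators.

```python
import cmath
import math

SAMPLE_RATE = 8000     # Hz, assumed
TONE_HZ = 400          # audio tone carried by the transmission, assumed
DEVIATION_HZ = 1000    # peak FM deviation, assumed

n = 800
message = [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) for i in range(n)]

# Frequency-modulate the message onto a complex baseband carrier:
# the instantaneous frequency follows the message, the amplitude stays 1.
phase = 0.0
iq = []
for m in message:
    phase += 2 * math.pi * DEVIATION_HZ * m / SAMPLE_RATE
    iq.append(cmath.exp(1j * phase))

# AM demodulation: the audio is the envelope (magnitude) of the signal.
am_audio = [abs(s) for s in iq]

# FM demodulation: the audio is the phase step between consecutive samples.
fm_audio = [cmath.phase(b * a.conjugate()) for a, b in zip(iq, iq[1:])]


def swing(x):
    """Peak-to-peak amplitude of a sequence."""
    return max(x) - min(x)


print(f"AM demod swing: {swing(am_audio):.6f}")
print(f"FM demod swing: {swing(fm_audio):.6f}")
```

Demodulated as AM, this FM signal is essentially flat (no tone at all), while the FM demodulator recovers the 400 Hz tone. Two recordings of the same transmission made in different modes will therefore produce very different fingerprints.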

PS.2 – Update 4/2/2020. I’ve submitted my project and the Rtl-sdr.com blog published a review 😀


PS.3 – Update. Some users reported problems using «recognize-from-file.py», but no problems with «recognize-from-microphone.py». If you need to work exclusively with the «from file» option, please contact the author of the script (and please share your conclusions with us!). Thank you!

PS.4 – Update 12/2/2020. Dan Maloney wrote a review on Hackaday 😀




Copyright, 2020. José Carlos Rueda, lawyer.

31 comments on “Radio signal identification «by ear»: Discrete Fast Fourier audio hashes comparison in Python”

    1. Hi Gabriele! These instructions are for Linux, but Python is a cross-platform scripting language, so you can run this on Windows without problems 🙂

      1. Unfortunately I don’t have any programming knowledge. This software would be fantastic for me, as I have vision problems. Thank you all the same, and I wish you good luck with your work.

          1. Ok, but that is still over the head of a lot of people who would have a use for this app. Think of most Shazam users: if they had to download Python, bootstrap it and issue Linux commands just to ID a song, most would not do it or would not even have the skills to do it. Is there any way you can compile this into a nice Windows binary?

  1. Well done, José! The code by Joseph Balikuddembe with your DB seems to work flawlessly (at least with several example signals downloaded from YouTube… unfortunately, I’m temporarily away from my radio equipment). In the past, we discussed adding this feature (or a deep learning variation) to Artemis due to the high demand, but we never tried because of the launch of Artemis 3. It’s cool to see that the system works with signals as well! I’m a little concerned about the reliability of detection across different demodulations, but that depends on the DB which, we hope, will be constantly updated and improved.

  2. Thank you, Marco!

    I’m thinking about creating an SDR# plugin in C++ connected to this script to automate the process, plus a connection to an online database of signal hashes and a collaborative network of users sharing captures of known signals. It’s a dream, but it is possible.

    I can’t wait to see your new experiments on Artemis!
    Regards.

  3. Hola Jose!

    This is very exciting. I have tried to identify a signal using an mp3 as suggested, but I get this error:
    Traceback (most recent call last):
      File "recognize-from-file.py", line 1, in <module>
        from libs.reader_microphone import FileReader
      File "/home/bob/audio_recogition_system/libs/reader_microphone.py", line 4, in <module>
        from reader import BaseReader
    ModuleNotFoundError: No module named 'reader'

    Is there a missing dependency, perhaps?
    Thanks!
    Bob N6RFM

    1. Hi Bob!
      I think you’re missing pyPortAudio, pyAudio or something like that!
      Are you using Windows? Have you installed Pip and Conda?
      Have you tried with python, python2.7 and python3.7?

  4. Great idea, I have been trying for months to find software with this feature.
    I’ve tried the signal recognition from mp3 (recognize-from-file.py). I’m not a developer, but it seems very strange that the script consists only of the following code:

    from libs.reader_microphone import FileReader
    song = None
    seconds = 5
    r = FileReader(123)
    r.recognize(seconds=seconds)
    print(song)

    Can you help?
    The script returns the same error reported by another user.

    1. Hi Lukxz! Please test with the microphone script. Some friends reported the same issue to me, and had no problem with the mic feature.
      Well, I think the length isn’t strange, because the first line loads code from the /libs folder with «from… import». Take a look.
      Which version of Python are you using? I’m pretty sure the problem is a dependency.
      Good luck!

  5. Python 2.7.16
    The microphone script is a little difficult on my side because I’m using a remote spyserver on an RPi with SDR# on a Win10 machine as the client, so I would have to set up Python and all the needed libs on Windows in order to redirect audio to the script… 🙁

    Talking about the script, did you mean this line:
    «from libs.reader_microphone import FileReader»
    I’ve looked at it, and in the file libs/reader_microphone.py there is no class named «FileReader», so I suspect that something is wrong.

    many thanks for your support!
    Lukxz.

    1. Hi!
      I’ve never tested the script with a file, and some people have hit the same problem. Personally, I think the first line of the file-input script is wrong. Maybe replacing «reader_microphone» with «reader_file» will fix it. Good luck!

  6. [email protected]:~/audio_recogition_system $ ./recognize-from-file.py python ft4.mp3
    from: can’t read /var/mail/libs.reader_microphone
    ./recognize-from-file.py: line 3: song: command not found
    ./recognize-from-file.py: line 4: seconds: command not found
    ./recognize-from-file.py: line 6: syntax error near unexpected token `(‘
    ./recognize-from-file.py: line 6: `r = FileReader(123)’

    What am I missing??
    Thank you, 73 !

    1. You’re executing a Python script as an executable, and with the arguments in a strange order (bad syntax).
      Use instead:
      python recognize-from-file.py ft4.mp3

      P.S. Users are having trouble with «recognize-from-file». You should use from-mic instead, because I think there’s an error in how the «from-file» script calls a library.

  7. Pretty darn cool!

    Consider saying you make the hashes from Fourier Transformed data. The FFT is just an algorithm for doing a Discrete Fourier Transform and not to be confused with the transform itself.

  8. Thanks for writing this awesome article. I’m a long time reader but I’ve never been compelled to leave a comment. I subscribed to your blog and shared this on my Twitter.
    Thanks again for a great article!

  9. Hi, if possible, I would like to know what software you use to capture real audio to compare with the database, and what settings you use, especially for large signals.

  10. I’m on Windows. I have already copied your script and the hash database, and I have also installed Python and pip, but I cannot set up the database or run the script. Maybe you could create a Windows how-to, as I’m not the only one? 🙂

    When I try to run the script, the following happens in the Python console and also in PowerShell. Please help, I really want to test this useful script on my Windows 10:

    python console:
    >>> C:\Users\*****\Desktop\SDR\audio_recogition_system\recognize-from-microphone.py
    File "<stdin>", line 1
    C:\Users\*****\Desktop\SDR\audio_recogition_system\recognize-from-microphone.py
    ^
    SyntaxError: unexpected character after line continuation character
    >>>

    powershell:
    PS C:\Users\waiss\Desktop\SDR\audio_recogition_system> python .\recognize-from-microphone.py
    File ".\recognize-from-microphone.py", line 47
    print colored(msg, attrs=['dark'])
    ^
    SyntaxError: invalid syntax

    1. Hi. You’re missing the “termcolor” Python lib. Have you installed all the dependencies with pip? In the next few days I’ll compile a Windows binary.

  11. Insane leap here, but would there be an application
    here to record the unique fingerprint of normal
    transmissions on my local repeater then match them with
    whoever is deadkeying & using electronic voices to denigrate other users.
