Bounding box in spectrogram from audio's annotation #2

@edwardnguyen1705

Dear @sebastianmenze,

Thanks for sharing your work.

I have just started working on a sound detection project.

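Given the annotations of sounds in an audio file, I need to calculate the corresponding bounding-box coordinates in the spectrogram. If you know how to do this, please give me a hint. Thank you for your time. Here is an example annotation file, followed by my current spectrogram code: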

[
    {
        "filename": "20250409_140149_cf_7029720.wav",
        "fs": 100000,
        "cf": 7029720,
        "start": "2025-04-09 14:01:49.292337+00:00",
        "length": 21.0,
        "annotations": [
            {
                "start": 0.48589,
                "length": 19.88517,
                "cfoffset": 20795.0,
                "bandwidth": 4493.047619047618,
                "label": "DSB",
                "snr": 30
            },
            {
                "start": 0.01472,
                "length": 20.94979,
                "cfoffset": -4095.0,
                "bandwidth": 7369.047619047618,
                "label": "DSB",
                "snr": 19
            },
            {
                "start": 2.29366,
                "length": 12.75766,
                "cfoffset": -17184.805849344928,
                "bandwidth": 2333.7304862897295,
                "label": "USB",
                "snr": 21
            }
        ]
    }
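]

import numpy as np
import soundfile as sf
import librosa

# wav_path, n_fft and hop are set elsewhere.
sig, _ = sf.read(wav_path, dtype="float32")
# Combine the two channels into one complex signal.
sig = sig[:, 0] + 1j * sig[:, 1]
# np.real() keeps only the first channel, so the STFT is one-sided (0 to fs/2).
spec = librosa.stft(np.real(sig), n_fft=n_fft, hop_length=hop)
spec = librosa.amplitude_to_db(np.abs(spec))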
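
One possible way to get the box coordinates is sketched below. It is not the repository's method and it rests on several assumptions: that `cfoffset` is the offset in Hz of the annotated signal's center from the recording's center frequency `cf`, that `bandwidth` is its full width in Hz, and that `start` and `length` are in seconds from the beginning of the file. Because some offsets are negative, the sketch uses a two-sided spectrogram of the complex signal instead of the one-sided STFT of the real part. The helper names `iq_spectrogram_db` and `annotation_to_box` and the values `n_fft=1024`, `hop=256` are made up for illustration.

```python
import numpy as np


def iq_spectrogram_db(iq, n_fft, hop):
    """Two-sided magnitude spectrogram (dB) of a complex IQ signal.

    Row 0 is -fs/2 and row n_fft // 2 is 0 Hz (the center frequency).
    Column k is centered on sample k * hop, because the signal is
    zero-padded by n_fft // 2 on both sides.
    """
    padded = np.pad(iq, n_fft // 2)
    n_frames = 1 + (len(padded) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [padded[k * hop : k * hop + n_fft] * window for k in range(n_frames)],
        axis=1,
    )
    # FFT along the frequency axis, then move negative frequencies to the top rows.
    spec = np.fft.fftshift(np.fft.fft(frames, axis=0), axes=0)
    return 20.0 * np.log10(np.abs(spec) + 1e-12)


def annotation_to_box(ann, fs, n_fft, hop):
    """Map one annotation to (x0, y0, x1, y1) in spectrogram pixels.

    x is the frame (column) index and y is the frequency-bin (row) index
    of the fftshifted spectrogram above. Values are floats; round or clip
    them as needed.
    """
    t0 = ann["start"]
    t1 = ann["start"] + ann["length"]
    # Assumed: cfoffset is the band center, bandwidth its full width.
    f_lo = ann["cfoffset"] - ann["bandwidth"] / 2
    f_hi = ann["cfoffset"] + ann["bandwidth"] / 2

    x0 = t0 * fs / hop
    x1 = t1 * fs / hop
    y0 = f_lo * n_fft / fs + n_fft / 2
    y1 = f_hi * n_fft / fs + n_fft / 2
    return x0, y0, x1, y1


# First annotation from the file above; n_fft and hop are example values.
ann = {"start": 0.48589, "length": 19.88517,
       "cfoffset": 20795.0, "bandwidth": 4493.047619047618}
print(annotation_to_box(ann, fs=100000, n_fft=1024, hop=256))
```

If the spectrogram is drawn with row 0 at the top, or resized before it is passed to a detector, the y coordinates would need to be flipped or scaled accordingly.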
