EDC

For the source code, please see here.

Feature comparison (Best viewed in color)

Image

Panels: (a) Log mel spectrogram; (b) Features after self-attention; (c) Features after EDC.

A detection demo of the siren sound clip

Image

Panels: (a) Log mel spectrogram; (b) Bottleneck features from the trainable self-attention layer; (c) Acoustic features after EDC; (d) Event probabilities predicted by the model trained with EDC.

Visualization of the distribution of frame-level representations

Image

Distribution of frame-level representations, visualized with unsupervised t-SNE.
Please note that the models in this paper are trained with clip-level weak labels from the DCASE and CHiME datasets. The label of each audio clip is a multi-hot vector, so the label corresponding to an individual frame-level representation is unknown.
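
To make the weak-label point concrete, here is a minimal sketch of how a clip-level multi-hot label might be built. The class names and the `multi_hot` helper are hypothetical and chosen only for illustration; they are not the label set or the code used in the paper.

```python
# Hypothetical class list, for illustration only (not the paper's label set).
CLASSES = ["siren", "speech", "dog_bark", "car_horn"]


def multi_hot(clip_events, classes=CLASSES):
    """Encode the events present anywhere in a clip as a multi-hot vector."""
    present = set(clip_events)
    unknown = present - set(classes)
    if unknown:
        raise ValueError(f"unknown event classes: {sorted(unknown)}")
    return [1 if name in present else 0 for name in classes]


# A clip that contains a siren and speech somewhere in its duration.
clip_label = multi_hot(["siren", "speech"])
print(clip_label)  # [1, 1, 0, 0]
```

The vector records which events occur in the clip, but not when. A given frame of this clip could contain the siren, speech, both, or neither, which is why the points in the t-SNE plot cannot be assigned frame-level ground-truth labels.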

The calculation procedure of EDC

Image

For the source code, please see here.

Attenuation curves for different values of alpha

Image

The curves assume that the attenuation starts from frame 0.
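
The exact attenuation function used by EDC is defined in the paper and the source code, not on this page. Purely as an illustrative assumption, the sketch below uses an exponential decay `exp(-alpha * t)` to show how a larger alpha makes a curve fall faster when the attenuation starts from frame 0. The function name `decay_curve` and its parameters are hypothetical.

```python
import math


def decay_curve(alpha, num_frames, start_frame=0):
    """Return per-frame attenuation weights.

    ASSUMPTION: exponential decay exp(-alpha * t). This form is chosen for
    illustration only and may differ from the function EDC actually uses.
    """
    curve = []
    for frame in range(num_frames):
        if frame < start_frame:
            # No attenuation before the decay starts.
            curve.append(1.0)
        else:
            curve.append(math.exp(-alpha * (frame - start_frame)))
    return curve


# Compare a few alpha values over the first five frames.
for alpha in (0.1, 0.5, 1.0):
    weights = decay_curve(alpha, num_frames=5)
    print(alpha, [round(w, 3) for w in weights])
```

Under this assumed form, every curve starts at 1.0 at frame 0 and decreases monotonically, with larger alpha values giving a steeper drop.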

Further comparison of the effects of EDC

Sample 1

Image

Sample 2

Image

Sample 3

Image

Sample 4

Image

Sample 5

Image

Sample 6

Image

Sample 7

Image