S-TRANSFORM AND GAUSSIAN MIXTURE MODEL FOR ACOUSTIC SCENE CLASSIFICATION

Santosh Kumar Srivastava

doi:10.29284/ijasis.6.1.2020.29-37

Keywords:

Acoustic scene classification, time-frequency representation, S-transform, probabilistic classifiers, Gaussian mixture model.

Abstract

In this study, an Acoustic Scene Classification (ASC) system is designed using the S-transform and a Gaussian Mixture Model (GMM). The S-transform is an extension of the continuous wavelet transform that combines progressive resolution with phase information. Thus, in contrast to the wavelet transform, it exhibits the amplitude response of the frequency samples. The S-transform coefficients are modeled by the GMM, which uses the posterior probabilities of the testing features for classification. The acoustic signals are preprocessed by a series of operations: explosion, pre-emphasis filtering, and windowing. The number of Gaussian components used to model the scene is varied (GMM-4, GMM-8, GMM-16, and GMM-32), and the performance of the ASC system is analyzed using the TAU Urban Acoustic Scenes 2019 dataset. The results show the effectiveness of the system, with average recognition rates of 77.59%, 81.58%, 87.66%, and 84.50% for GMM-4, GMM-8, GMM-16, and GMM-32, respectively; GMM-16 gives the highest rate.
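To make the feature-extraction step more concrete, the following is a minimal sketch of pre-emphasis, windowing, and a discrete S-transform of one frame, following the frequency-domain formulation of Stockwell et al. (1996). It is not the author's implementation: the function names, the pre-emphasis coefficient of 0.97, and the choice of a Hamming window are assumptions made for illustration, and the naive DFT is only practical for short frames. The GMM modeling stage is not shown.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """High-pass filter y[t] = x[t] - alpha * x[t-1]; alpha = 0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [signal[t] - alpha * signal[t - 1]
                          for t in range(1, len(signal))]


def hamming_window(frame):
    """Taper one frame with a Hamming window (the window type is an assumption)."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
            for t, x in enumerate(frame)]


def dft(frame):
    """Naive O(N^2) discrete Fourier transform with 1/N scaling."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)) / n
            for k in range(n)]


def s_transform(frame):
    """Discrete S-transform of one frame.

    Returns a list of voices: voices[n][j] is the complex coefficient at
    frequency index n (0 .. N//2) and time index j (0 .. N-1).
    """
    n_samples = len(frame)
    spectrum = dft(frame)
    # The zero-frequency voice is the frame mean at every time index.
    voices = [[complex(spectrum[0])] * n_samples]
    for n in range(1, n_samples // 2 + 1):
        # Frequency-domain Gaussian whose width grows with n, which gives
        # the frequency-dependent ("progressive") resolution.
        weighted = []
        for m in range(n_samples):
            m_signed = m if m <= n_samples // 2 else m - n_samples
            gauss = math.exp(-2.0 * (math.pi * m_signed / n) ** 2)
            weighted.append(spectrum[(m + n) % n_samples] * gauss)
        # The inverse DFT over m localizes the voice in time and keeps phase.
        voices.append([sum(w * cmath.exp(2j * math.pi * m * j / n_samples)
                           for m, w in enumerate(weighted))
                       for j in range(n_samples)])
    return voices
```

A frame would typically be passed through `pre_emphasis` and `hamming_window` before `s_transform`, and the magnitudes `abs(voices[n][j])` could then serve as features for a per-scene GMM. Averaging any voice over time recovers the corresponding Fourier coefficient, which is the property that ties the S-transform's phase to the ordinary spectrum.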
