Improved Expectation-Maximization Algorithm for Unknown Reverberant Audio-Source Separation

Shaher Slehat

Abstract


Separating unknown reverberant audio sources is a crucial problem in speech and audio processing. Numerous separation strategies have been developed to solve it; however, all of them estimate model parameters in the time–frequency domain, which results in permutation ambiguity and poor separation performance. In addition, one of the main challenges with existing expectation–maximization (EM) strategies is the time each iterative step needs to update the model parameters. In this article, we propose an enhanced EM approach that combines nonnegative matrix factorization (NMF) with time difference of arrival (TDOA) estimates and reduces the time cost by selecting appropriate starting values for the EM algorithm. The proposed approach avoids permutation ambiguity by using the NMF source model, and acoustic localization is accomplished by converting the TDOA estimates. The model parameters are then updated to improve the separation results. Finally, Wiener filters are used to separate the source signals. The experimental findings indicate that the proposed algorithm outperforms current blind separation approaches in terms of separation performance.
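The final two stages named in the abstract, the NMF source model and Wiener filtering, can be illustrated with a minimal sketch. It is not the article's implementation: it assumes a single-channel mixture, NMF parameters that have already been estimated, and toy random data, whereas the abstract describes a multichannel EM procedure whose details are not given here. The names `nmf_variance` and `wiener_separate` are hypothetical.

```python
import numpy as np


def nmf_variance(W, H):
    """Model a source's power spectrogram as the nonnegative product W @ H."""
    return W @ H


def wiener_separate(X, variances, eps=1e-12):
    """Split a mixture STFT X (freq x frames) into one estimate per source.

    variances: list of nonnegative arrays with the same shape as X.
    Source j receives the gain v_j / sum_k v_k, so the estimates add up
    to the mixture (up to the small regularizer eps).
    """
    total = sum(variances) + eps
    return [(v / total) * X for v in variances]


# Toy sizes (assumptions): frequency bins, frames, NMF components per source.
F, N, K = 6, 8, 2
rng = np.random.default_rng(0)

# Hypothetical, already-estimated NMF parameters for two sources.
W1, H1 = rng.random((F, K)), rng.random((K, N))
W2, H2 = rng.random((F, K)), rng.random((K, N))

# Random complex matrix standing in for the mixture STFT.
X = rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N))

estimates = wiener_separate(X, [nmf_variance(W1, H1), nmf_variance(W2, H2)])

# The Wiener gains sum to (almost) one, so the estimates rebuild the mixture.
reconstruction_error = np.max(np.abs(estimates[0] + estimates[1] - X))
```

Because each source keeps its own `W` and `H` across all frequency bins, the source identity is tied to the model rather than to individual bins, which is the intuition behind avoiding permutation ambiguity with an NMF source model.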



Keywords


TDOA; Expectation-Maximization; Audio-Source Separation; Data Mining








Journal of Applied Data Sciences

2723-6471 (Online)
Organized by : MetaBright
Published by : Bright Publisher
Website : bright-journal.org/JADS
Email : info@bright-journal.org

 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0