Wavelet Decompositions, Hierarchical Encoding and Convolutional Neural Network Integrated Lossless Audio Codec
Abstract
This paper proposes a lossless audio codec that combines wavelet transformation, hierarchical encoding, and a convolutional neural network (CNN) architecture. In the first phase, a three-level 1D wavelet decomposition is applied to the input audio to generate approximation and detail coefficients. In the next phase, the approximation and detail coefficients are transformed into binary streams by the proposed dynamic hierarchical encoding algorithm, which converts each coefficient to binary by dynamically accumulating the binary path values. In the subsequent phase, the binary stream is transformed into image patterns and further compressed by reducing its dimensionality with the proposed CNN model. The model's effectiveness is evaluated against conventional lossless audio benchmarks and machine learning-based methods. Experimental results indicate that the proposed method outperforms the existing lossless audio techniques it is compared with.
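As a rough illustration of the first and third phases, the sketch below shows one way a three-level 1D decomposition can be made exactly invertible, and one way a bit stream can be laid out as a 2D pattern. It is not the authors' implementation: the abstract names neither the wavelet family nor the image layout, so the integer Haar lifting step, the row-major layout, and all function names here are assumptions chosen for simplicity. The dynamic hierarchical encoding and the CNN stage are not sketched, since the abstract does not give enough detail to reconstruct them.

```python
from typing import List, Tuple


def haar_forward(x: List[int]) -> Tuple[List[int], List[int]]:
    """One level of integer Haar lifting (assumed wavelet, not from the paper)."""
    approx, detail = [], []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        d = b - a               # detail: difference of the pair
        s = a + (d >> 1)        # approximation: floor of the pair's mean
        approx.append(s)
        detail.append(d)
    return approx, detail


def haar_inverse(approx: List[int], detail: List[int]) -> List[int]:
    """Exact inverse of haar_forward; integer arithmetic only, so no rounding loss."""
    out = []
    for s, d in zip(approx, detail):
        a = s - (d >> 1)
        out.extend([a, a + d])
    return out


def decompose(x: List[int], levels: int = 3) -> Tuple[List[int], List[List[int]]]:
    """Return the final approximation and the detail bands, finest band first."""
    if len(x) % (1 << levels):
        raise ValueError("input length must be a multiple of 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)
    return approx, details


def reconstruct(approx: List[int], details: List[List[int]]) -> List[int]:
    """Undo decompose by inverting the levels from coarsest to finest."""
    for d in reversed(details):
        approx = haar_inverse(approx, d)
    return approx


def bits_to_image(bits: List[int], width: int) -> Tuple[List[List[int]], int]:
    """Lay a bit stream out row by row (assumed layout), zero-padding the last row."""
    padded = list(bits) + [0] * ((-len(bits)) % width)
    rows = [padded[i:i + width] for i in range(0, len(padded), width)]
    return rows, len(bits)      # keep the true length so the padding can be dropped


def image_to_bits(rows: List[List[int]], n_bits: int) -> List[int]:
    """Flatten the 2D pattern and discard the padding bits."""
    return [b for row in rows for b in row][:n_bits]


if __name__ == "__main__":
    samples = [3, -7, 12, 12, 0, 5, -1, 8, 100, -100, 7, 7, 1, 2, 3, 4]
    approx, details = decompose(samples, levels=3)
    assert reconstruct(approx, details) == samples
```

Integer lifting is used here because a lossless codec needs the decomposition to be invertible bit for bit; a floating-point wavelet transform would need extra care to guarantee that.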
DOI: https://doi.org/10.31449/inf.v48i4.5496
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika







