PREPARATION OF A DATA SET FOR DEEP NEURAL NETWORK TRAINING FOR EXTRACTION OF DIGITAL FINGERPRINTS OF AUDIO FILES

V.N. (email: oskage.work@gmail.com)
P.S. Ladygin
Ya.I. Bortsova
V.V. Karev

Abstract

Modern computer algorithms analyze auditory information quite effectively. An important task in building modern expert systems that check music for plagiarism is the construction of a high-quality feature vector for an audio file. Deep neural networks have become one of the most relevant tools for processing this type of information, but training them requires a large data set suited to the task. In this paper, criteria for selecting a data set for training a neural network that extracts digital fingerprints of audio files are defined. Existing data sets are analyzed against these criteria, and a data set has been collected and presented in a form convenient for further use.
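As an illustration of the kind of preprocessing the abstract alludes to, the sketch below converts a raw audio file into a fixed-size time-frequency representation that a fingerprinting network could consume. The use of the librosa library, the log-mel spectrogram features, the 22050 Hz sample rate, and the 64 mel bands are assumptions made for this example only; the paper does not prescribe a specific feature pipeline here.

```python
# Minimal sketch: turn a raw audio file into a feature array suitable as
# input to a fingerprinting network. librosa and log-mel spectrograms are
# an illustrative assumption, not the pipeline described in the paper.
import numpy as np
import librosa


def extract_features(path: str, sr: int = 22050, n_mels: int = 64,
                     duration: float = 10.0) -> np.ndarray:
    """Load up to `duration` seconds of audio and return a log-mel spectrogram."""
    # Load and resample to a common rate so every example in the data set
    # shares the same time-frequency resolution.
    y, sr = librosa.load(path, sr=sr, duration=duration, mono=True)

    # Mel-scaled power spectrogram converted to decibels; the result is a
    # 2-D array (n_mels x frames) that can be fed to a neural network.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)
```

In practice, clips of equal duration (or padded/cropped spectrograms) are typically used so that all feature arrays in the data set have identical shape.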

Article Details

How to Cite
V.N., Ladygin P., Bortsova Ya., Karev V. PREPARATION OF A DATA SET FOR DEEP NEURAL NETWORK TRAINING FOR EXTRACTION OF DIGITAL FINGERPRINTS OF AUDIO FILES // ПРОБЛЕМЫ ПРАВОВОЙ И ТЕХНИЧЕСКОЙ ЗАЩИТЫ ИНФОРМАЦИИ, 2023. № 9. P. 22–27. URL: http://journal.asu.ru/ptzi/article/view/13609.
Section
Problems of technical support for information security
