USE OF SCATTERING TRANSFORM ON DISCRETE WAVELET DECOMPOSITION COEFFICIENTS FOR BIOMETRIC SPEAKER VERIFICATION

A.A. Lependin; D.A. Gaponov; Y.A. Filin; P.S. Ladygin

No. 8 (2020): ПРОБЛЕМЫ ПРАВОВОЙ И ТЕХНИЧЕСКОЙ ЗАЩИТЫ ИНФОРМАЦИИ (Problems of Legal and Technical Protection of Information)

Published: Nov 8, 2020

Keywords:

voice verification, discrete wavelet transform, scattering transform, time-delay neural network, speaker identity vector

A.A. Lependin

Altai State University, Barnaul

Email: andrey.lependin@gmail.com

D.A. Gaponov

Altai State University, Barnaul

Y.A. Filin

Altai State University, Barnaul

P.S. Ladygin

Altai State University, Barnaul

Abstract

In this paper, the authors propose a new approach to computing speech signal features for speaker verification. A multilevel transform was applied to the signal to compute scattering coefficients based on the discrete wavelet decomposition. The resulting feature vectors were used as input to a time-delay neural network, which computed speaker identity vectors that were used directly for biometric verification. The proposed approach was tested on the VoxCeleb1 and VoxCeleb2 voice datasets, and its effectiveness was shown in comparison with existing verification methods based on deep neural networks.
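
The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet family, the number of decomposition levels, or the framing, so the Haar wavelet, three levels, two scattering orders, and the function names `haar_dwt`, `detail_bands`, and `scattering_features` are all assumptions chosen for illustration. The sketch decomposes a frame with a discrete wavelet transform, takes the modulus of each detail band, decomposes that envelope again at coarser scales, and averages the moduli to obtain a feature vector. The time-delay neural network that would consume these vectors is not shown.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad odd-length input with its last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def detail_bands(x, levels):
    """Detail coefficients of a multilevel Haar decomposition, finest scale first."""
    bands = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        if len(approx) < 2:             # nothing left to decompose
            break
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    return bands

def scattering_features(frame, levels=3):
    """Averaged moduli of first- and second-order wavelet coefficients.

    Assumption: Haar wavelet and two scattering orders; the paper's actual
    settings are not given in the abstract.
    """
    feats = []
    for j, d1 in enumerate(detail_bands(frame, levels)):
        u1 = np.abs(d1)                 # modulus nonlinearity
        feats.append(u1.mean())         # first-order coefficient for scale j
        # Second order: decompose the envelope u1 at the remaining coarser scales.
        for d2 in detail_bands(u1, levels - j - 1):
            feats.append(np.abs(d2).mean())
    return np.array(feats)

# Usage: one 64-sample frame of synthetic noise stands in for a speech frame.
frame = np.random.default_rng(0).standard_normal(64)
features = scattering_features(frame, levels=3)
```

With three levels, the sketch yields three first-order and three second-order coefficients per frame. Restricting the second order to coarser scales follows the usual scattering convention that the envelope of a band varies more slowly than the band itself.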

How to Cite

1. Lependin A., Gaponov D., Filin Y., Ladygin P. USE OF SCATTERING TRANSFORM ON DISCRETE WAVELET DECOMPOSITION COEFFICIENTS FOR BIOMETRIC SPEAKER VERIFICATION // ПРОБЛЕМЫ ПРАВОВОЙ И ТЕХНИЧЕСКОЙ ЗАЩИТЫ ИНФОРМАЦИИ, 2020. № 8. P. 35-41. URL: http://journal.asu.ru/ptzi/article/view/13934.

Section

Problems of Technical Support for Information Security
