Xournals

Authors

Ananya Jain, Riya Bansal

Abstract

Voice is the most common form of communication, but it is occasionally used improperly or illegally. Forensic speaker identification systems face a growing challenge from voice imitation, one of the main forms of disguise. A human voice can be concealed in several ways, including self-disguise, impersonating someone else, and stealing another person's identity. Distinguishing an impersonation from a person's real voice is therefore crucial for establishing ownership: when a person disputes ownership of voice evidence that sounds like them, it must be determined whether the voice is an impersonation or belongs to the actual speaker. This research presents a novel two-stage verification approach for mimicked voice signals. The first stage compares intonation patterns using spectrograms of the voices of original artists and their respective mimicry artists, while the second stage is based on differences in fundamental frequency and pitch.

Keywords: Speaker identification, voice disguise, spectrogram, pitch, fundamental frequency.

Introduction

The human voice is regarded as one of our most personal characteristics. People tend to converge towards the language they hear around them, whether by copying word choices, mirroring sentence structures or mimicking pronunciations. In the Odyssey (Homer, 850 BC), Helen of Troy is said to have circled the wooden horse while calling out to different Greek soldiers in the voices of their wives and sweethearts because she suspected betrayal. This tale is noteworthy for two reasons. First, it is one of the earliest documented examples of voice mimicry. Second, it may be the earliest instance of vocal mimicry used for deception (Singh et al., 2017).

This study is primarily concerned with identifying speakers whose voices have been disguised, and it focuses on speaker verification and recognition on the basis of the intonation pattern. The intonation model is typically evaluated solely on the similarities and differences between several samples, without any explicit reference. A mimicked voice, like synthetic speech, can be used against speaker verification based on spectrum and pitch analysis (Hautamäki et al., 2017). Pitch refers to the highness or lowness of the voice, and articulation is the way a speaker produces individual sounds. While articulation concerns the generation of speech sounds, pronunciation concerns whether they are produced correctly. The sound characteristics of articulation and pronunciation influence impressions of both believability and intelligibility. During person-to-person interactions, a speaker's speech frequently has distinct, strong tones.

Articulatory phonetics concerns the apparatus used to produce speech sounds, as well as the cognitive and physical parameters that determine the range of possible speech sounds and sound patterns. The size and shape of a speaker's vocal tract affect the variety of sounds that the speaker can produce. The vocal tract comprises the oral and nasal cavities, the glottis, the tongue, the velum or soft palate, the hard palate, the teeth, and the lips (Delvaux et al., 2017). Because so many factors shape human speech, a person can often be identified by voice alone. These speech traits remain essentially unchanged even when someone tries to impersonate another person, which makes them useful for detecting disguised voices.
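To make the notion of pitch concrete, the following is a minimal sketch of estimating a fundamental frequency by autocorrelation. It is an illustration only, not the procedure used in this study: the autocorrelation method, the function names `synth_voiced` and `estimate_f0`, the synthetic test signal, and all parameter values are assumptions introduced here.

```python
import math


def synth_voiced(f0_hz, sample_rate=8000, duration_s=0.1):
    """Hypothetical stand-in for a voiced sound: a fundamental plus one weaker harmonic."""
    n_samples = int(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * f0_hz * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def estimate_f0(signal, sample_rate=8000, f_min=70.0, f_max=400.0, window=320):
    """Estimate the fundamental frequency (Hz) from the strongest autocorrelation lag.

    The search range f_min..f_max and the window length are illustrative assumptions.
    """
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    if len(signal) < window + max_lag:
        raise ValueError("signal too short for the chosen window and f_min")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate a fixed-length window with a copy of itself shifted by `lag` samples.
        score = sum(signal[n] * signal[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Two synthetic "speakers" with different fundamentals.
original_f0 = estimate_f0(synth_voiced(100.0))
imitator_f0 = estimate_f0(synth_voiced(125.0))
f0_difference = abs(original_f0 - imitator_f0)
```

On real recordings the estimate would be computed frame by frame over voiced segments only, and the resulting pitch contours, rather than a single number, would be compared.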

Additionally, a person trying to mimic another person's voice has their own speech traits, which will occasionally come through in the imitation. A further method of verification is carried out using the Praat software: we compare the spectrograms of the two speakers and look for words that both speakers use frequently in order to determine whether voice impersonation has taken place.
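The spectrogram comparison described above is performed in Praat by inspection. As a rough numerical analogue, the sketch below computes magnitude spectrograms with a short-time Fourier transform and scores their resemblance with a cosine similarity. The similarity score, the helper names `tone`, `magnitude_spectrogram` and `spectrogram_similarity`, and the frame settings are assumptions made for illustration and are not part of the study's procedure.

```python
import cmath
import math


def tone(freq_hz, sample_rate=8000, n_samples=512):
    """Hypothetical test signal: a pure tone standing in for a recorded word."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..frame_len/2."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1)) for i in range(frame_len)]
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = [
            abs(sum(frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                    for i in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        spectrogram.append(spectrum)
    return spectrogram


def spectrogram_similarity(spec_a, spec_b):
    """Cosine similarity between two spectrograms, truncated to the shorter one."""
    n_frames = min(len(spec_a), len(spec_b))
    flat_a = [v for frame in spec_a[:n_frames] for v in frame]
    flat_b = [v for frame in spec_b[:n_frames] for v in frame]
    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(a * a for a in flat_a))
    norm_b = math.sqrt(sum(b * b for b in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


reference = magnitude_spectrogram(tone(500.0))
same_score = spectrogram_similarity(reference, magnitude_spectrogram(tone(500.0)))
different_score = spectrogram_similarity(reference, magnitude_spectrogram(tone(2000.0)))
```

This sketch performs no time alignment, so two utterances of the same word would first need to be segmented and aligned before such a score could be meaningful.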

Intonation, of course, varies. Neither the same person on two occasions nor two different people will pronounce the same word or sentence with exactly the same intonation. Even an expert impersonator or an artist imitating someone else will show differences that can reveal the disguise (Latorre et al., 2014).

Another type of speech disguise is self-disguise, in which speakers conceal their own identity to avoid detection when one of their voice recordings is examined in court (Hautamäki et al., 2017).

In this paper, we attempt to provide a strategy for guarding against mimicked voices, which can pose a security problem. Humans have a tendency to mimic the speaking style of famous personalities. From a security perspective, however, a mimicked voice presented to a voice recognition system as a stand-in for an existing voice model is a difficult problem. Speaker recognition systems are highly vulnerable to mimicked voices. Voice dialling, banking over a telephone network, database access services, security control for secret information, and remote access to computers are all areas where a mimicry attack is likely to occur. To verify the claimed speaker, one must select that speaker's speech file from among the speaker models already enrolled in the system (Kanrar & Mandal, 2015).
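The verification step in the last sentence can be pictured as a lookup followed by a comparison. The sketch below is a deliberately simplified illustration under stated assumptions: the function `verify_claim`, the two-value feature vectors (mean fundamental frequency and its range, in Hz), the enrolled values, and the threshold are all hypothetical and are not taken from Kanrar & Mandal (2015) or from this study.

```python
import math


def verify_claim(enrolled_models, claimed_id, sample_features, threshold):
    """Compare a questioned sample against the enrolled model of the claimed speaker.

    Returns (accepted, distance). Features and threshold are illustrative assumptions.
    """
    if claimed_id not in enrolled_models:
        raise KeyError(f"no enrolled model for {claimed_id!r}")
    model = enrolled_models[claimed_id]
    # Euclidean distance between the enrolled features and the questioned sample.
    distance = math.sqrt(sum((m - s) ** 2 for m, s in zip(model, sample_features)))
    return distance <= threshold, distance


# Hypothetical enrolled models: [mean F0 in Hz, F0 range in Hz].
enrolled = {
    "speaker_a": [120.0, 30.0],
    "speaker_b": [210.0, 55.0],
}
questioned_sample = [123.0, 34.0]

accepted_a, distance_a = verify_claim(enrolled, "speaker_a", questioned_sample, threshold=10.0)
accepted_b, distance_b = verify_claim(enrolled, "speaker_b", questioned_sample, threshold=10.0)
```

A practical system would use far richer features and a statistically calibrated threshold; the point here is only that the claim selects which enrolled model the sample is tested against.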

References

Delvaux, Véronique, et al. “Voice Disguise vs. Impersonation: Acoustic and Perceptual Measurements of Vocal Flexibility in Non Experts.” Conference of the International Speech Communication Association, 2017.

Hautamäki, Rosa González, et al. “Acoustical and Perceptual Study of Voice Disguise by Age Modification in Speaker Verification.” Speech Communication, vol. 95, Elsevier BV, Dec. 2017, pp. 1–15.

Kanrar, Soumen, and Prasenjit Mandal. “Detect Mimicry by Enhancing the Speaker Recognition System.” Advances in Intelligent Systems and Computing, Springer Nature, 2015, pp. 21–31.

Latorre, Javier, et al. “Speech Intonation for TTS: Study on Evaluation Methodology.” Conference of the International Speech Communication Association, 2014.

Singh, Rita, et al. “Voice Disguise by Mimicry: Deriving Statistical Articulometric Evidence to Evaluate Claimed Impersonation.” IET Biometrics, vol. 6, no. 4, Institution of Engineering and Technology, Feb. 2017, pp. 282–89.

How to cite this article?

APA Style	Jain, A. & Bansal, R. (2023). Voice impersonation examination by Spectrographic analysis: A Voice Comparative Study. Academic Journal of Forensic Science, 06(01), 01-06.

Forensic Sciences

Voice Impersonation Examination by Spectrographic Analysis: A Voice Comparative Study
