+91 9711-843-843

KIIT Campus, Sohna Road, Gurgaon


Possible interaction and APSIPA DL at KIIT

Lecture 1: Speech Production-Perception Link via Energy Measure

This lecture establishes a link between the speech production and perception mechanisms via a suitable energy measure. It begins with a brief discussion of the elements of speech production and the basics of human hearing as a process of detecting energy. In this context, the limitations of the usual energy measure in the traditional signal processing literature, the L2 norm of a signal, are examined. The development of a new energy measure in the context of speech production, namely the Teager Energy Operator (TEO), is then presented, along with the capability of TEO with respect to AM-FM modeling and noise suppression. Furthermore, mathematical modeling of the cochlea is discussed along with its link with TEO, to bring out the link of TEO to both production and perception. Finally, various potential applications of TEO in speech, speaker and emotion recognition, stressed speech analysis, energy separation, etc. are discussed.
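As a rough illustration of why the L2-norm energy can be limiting, the sketch below compares it with the standard discrete-time form of the TEO, psi[n] = x[n]^2 - x[n-1] * x[n+1]. For a pure tone A*cos(Omega*n + phi) this expression equals A^2 * sin^2(Omega), so it depends on frequency as well as amplitude, whereas the sum of squared samples does not. The helper `tone` and the chosen frequencies are assumptions made for this example and are not taken from the lecture.

```python
import math

def l2_energy(x):
    """Conventional energy: sum of squared samples (squared L2 norm)."""
    return sum(v * v for v in x)

def teager_energy(x):
    """Discrete TEO, psi[n] = x[n]^2 - x[n-1] * x[n+1], for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def tone(amplitude, omega, length=200):
    """Hypothetical helper: amplitude * cos(omega * n) for n = 0..length-1."""
    return [amplitude * math.cos(omega * n) for n in range(length)]

# Two tones with the same amplitude but different frequencies
# (both lengths cover a whole number of periods).
low = tone(1.0, 0.1 * math.pi)
high = tone(1.0, 0.4 * math.pi)

# The L2 energies agree, so this measure cannot tell the tones apart.
print(l2_energy(low), l2_energy(high))

# The TEO output is constant for each tone and grows with frequency.
print(teager_energy(low)[0], teager_energy(high)[0])
```

The operator needs only three neighbouring samples per output value, which is why it can follow fast changes in amplitude and frequency.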

Lecture 2: Design of Speaker Recognition in Asian Languages: A case study in Indian languages

This lecture discusses the design of speaker recognition systems in Indian languages for tape-recorded speech, and ways of improving their performance, with emphasis on system features. The details of the experimental setup are discussed, such as the dialectal zones selected for data collection (for the Indian languages Marathi, Hindi, Urdu and Oriya), the corpora design and the text material used for recordings in the different languages. The baseline automatic speaker recognition (ASR) system, using LP-based features (such as LPC and LPCC) and filterbank-based features (such as MFCC) with polynomial classifiers of 2nd- or 3rd-order approximation, is described thereafter. A relative comparison of experiments on speaker identification in monolingual, cross-lingual and multilingual modes is made. The spectral resolution problem associated with female speech is resolved to a large extent by employing filterbank-based features. The problem of speaker classification and language identification is identified from the standpoint of ASR, and it is solved by modifying the structure of a polynomial classifier. The work on speaker classification is first supported by spectrogram analysis of voices from rural males, followed by experimental results for open-set and closed-set modes for different Indian languages. For speaker classification, the wavelet packet cepstrum and sub-band cepstrum are employed, and their performance is compared with that of MFCC. Furthermore, the effect of different speech coding standards on the performance of ASR is investigated. Finally, some conclusions and different future research issues in speaker recognition are discussed.
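The abstract does not spell out how the polynomial classifier is built, so the following is only a minimal sketch of the commonly described formulation: each feature vector is expanded into all monomials up to a given order, and a speaker model is a weight vector whose inner product with the expansion is averaged over the frames of an utterance. The function names, the toy feature vectors and the weights are hypothetical, and training of the weights is not shown.

```python
from itertools import combinations_with_replacement
from math import prod

def poly_expand(x, order=2):
    """Return all monomials of the entries of x up to the given order.

    For x = [a, b] and order 2 this gives [1, a, b, a*a, a*b, b*b].
    """
    terms = []
    for degree in range(order + 1):
        for idx in combinations_with_replacement(range(len(x)), degree):
            terms.append(prod(x[i] for i in idx))  # empty product is 1
    return terms

def speaker_score(frames, weights, order=2):
    """Average, over all frames, the inner product of weights and expansion."""
    total = 0.0
    for x in frames:
        expansion = poly_expand(x, order)
        total += sum(w * t for w, t in zip(weights, expansion))
    return total / len(frames)

# Toy 2-dimensional "feature vectors" and made-up weights (illustration only).
frames = [[2.0, 3.0], [1.0, 0.0]]
weights = [0.5, 0.1, 0.0, 0.0, 0.2, 0.0]
print(poly_expand(frames[0]))
print(speaker_score(frames, weights))
```

The size of the expansion grows quickly with the order: a 12-dimensional feature vector gives 91 terms at 2nd order and 455 terms at 3rd order, which is one practical reason for staying at low orders.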

Lecture 3: Spoofing Attacks in Automatic Speaker Verification (ASV)

Speech is the most powerful form of communication between humans, and it carries various levels of information such as linguistic content, emotion, acoustic environment, language, and the speaker’s identity and health condition. Automatic Speaker Verification (ASV) deals with verifying a claimed speaker’s identity with the help of machines. There are various research issues in speaker recognition, such as microphone and intersession variability, acoustic noise, etc. In addition, one of the most challenging but practical research issues in this area is the analysis of spoofing attacks and the development of various countermeasures to alleviate such attacks. In this lecture, we will present an analysis of various spoofing attacks on ASV systems, together with work on the technological challenges posed by voice conversion (VC), speech synthesis (SS), replay, twins and professional mimics, including a detailed literature survey and the recent synergistic activities of the ASVspoof 2015 and ASVspoof 2017 Challenge campaigns at the INTERSPEECH conferences.

Lecture 4: Person Recognition from Humming

Voice biometrics refers to the task of identifying or verifying a person’s identity based on his or her voice with the help of machines. In this lecture, I will present our work addressing this problem using a humming signal rather than normal speech. This kind of biometric may be useful for a person with a disorder. In addition, this work may be useful for designing a person-dependent Query-by-Humming (QBH) system in the context of music information retrieval (MIR) systems. The lecture will first give a brief overview of speaker recognition technology along with various research issues in this area. A feature set newly proposed by the speaker, viz., Variable length Teager Energy Based Features (VTMFCC), will then be discussed. Furthermore, the development of a new feature extraction technique that implicitly exploits phase spectrum information along with magnitude spectrum information from the hum signal will be discussed. To that effect, we have modified the structure of the state-of-the-art feature set, viz., Mel Frequency Cepstral Coefficients (MFCC). In addition, a new energy measure, viz., the Variable length Teager Energy Operator (VTEO), is employed to compute the subband energies of the different time-domain subband signals (i.e., the outputs of the 24 triangular-shaped filters used in the Mel filterbank). Discriminatively trained polynomial classifiers of 2nd-order approximation are used as the basis for all person recognition experiments. The proposed feature set is evaluated (and found to be better than the state-of-the-art MFCC) under various experimental conditions such as polynomial classifier order, dimension of the feature vector, signal degradation, class separability, and static vs. dynamic features. Finally, the lecture will conclude with future research directions and a brief mention of various sponsored projects in the area of speech processing and acoustics at DA-IICT Gandhinagar.
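The abstract gives no formula for the VTEO. The sketch below assumes the commonly cited generalisation of the Teager operator in which the immediate neighbours are replaced by samples i steps away, x[n]^2 - x[n-i] * x[n+i], so that i = 1 gives the ordinary TEO. Averaging the magnitude of the operator output to obtain one energy value per subband signal is also an assumption made for this example, and the Mel filterbank itself is not implemented; a plain cosine stands in for one subband signal.

```python
import math

def vteo(x, i=1):
    """Assumed VTEO form: x[n]^2 - x[n-i] * x[n+i], where both neighbours exist."""
    if i < 1:
        raise ValueError("dependency index i must be a positive integer")
    return [x[n] ** 2 - x[n - i] * x[n + i] for n in range(i, len(x) - i)]

def subband_energy(subband, i=1):
    """Hypothetical summary: mean magnitude of the VTEO profile of one subband."""
    profile = vteo(subband, i)
    return sum(abs(v) for v in profile) / len(profile)

# Stand-in for one time-domain subband signal (illustration only).
omega = 0.1 * math.pi
subband = [math.cos(omega * n) for n in range(100)]

# For a pure tone of unit amplitude the operator output equals sin^2(i * omega).
print(subband_energy(subband, i=1), math.sin(omega) ** 2)
print(subband_energy(subband, i=2), math.sin(2 * omega) ** 2)
```

In a full feature extractor, one such energy value would be computed for each of the 24 subband signals in every analysis frame, before the usual cepstral steps.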



Address : KIIT Campus, Sohna Road, Gurgaon, Haryana
Contact No : 9711-843-843, 9811-62-67-67
Phone : 0124 - 2658000/10/20/30/40/50, 2265265/66
Fax : 0124 - 2265249
Email : info@kiit.in


© 2014 KIIT Group of Colleges. All Rights Reserved
