Automatic Speech Recognition (ASR)


Automatic Speech Recognition (ASR), a sub-field of Machine Learning, is gaining significant traction as new technologies make it possible to communicate with machines and computers hands-free, supporting a wide range of industrial and everyday applications. It is set to become one of the major technologies through which people of all kinds connect and access information interactively through machines. Many people who cannot read or write are still unable to get the right information from government and private organisations, because the existing web-based services are mostly available in English and require typing keywords. This is what motivates engineers to develop applications with voice interfaces in local languages, which are the natural choice for most people. We at Nuronics Labs have developed our own ASR engine and models for building interactive voice applications with rural users in mind, such as farmers, small business owners, women and labourers, so that they can get the right data from government organisations.


Currently our ASR system is available for Telugu with the following features:

  • Real-time, multi-channel TV news transcription with:
  1. More than 90% accuracy on conversational speech in a studio environment
  2. Keyword spotting and text highlighting (search)
  3. Edit options
  4. Locating any word in the video with the right timestamps
  5. Keyword counting (analytics)
  • Offline transcription of any Telugu video or audio into text
  • An engine that is highly customizable to other languages
  • A cloud-based ASR system that can be accessed from anywhere through secure logins
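
As a rough illustration of how locating a word by timestamp and keyword counting can work once speech has been transcribed, here is a minimal Python sketch. It assumes the transcript is available as a list of words with start and end times; the `Word` class, the function names and the romanized sample words are hypothetical and do not describe the actual Nuronics Labs engine or its output format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognised word with its position in the video (hypothetical format)."""
    text: str
    start: float  # seconds from the start of the video
    end: float


def locate(transcript, keyword):
    """Return the (start, end) times of every occurrence of `keyword`."""
    key = keyword.casefold()
    return [(w.start, w.end) for w in transcript if w.text.casefold() == key]


def count_keywords(transcript, keywords):
    """Count how often each keyword occurs; keywords never heard get 0."""
    wanted = {k.casefold() for k in keywords}
    counts = Counter(
        w.text.casefold() for w in transcript if w.text.casefold() in wanted
    )
    return {k: counts[k.casefold()] for k in keywords}


# Made-up sample transcript, romanized for readability.
transcript = [
    Word("varsham", 0.0, 0.4),
    Word("valla", 0.4, 0.7),
    Word("panta", 0.7, 1.1),
    Word("nashtam", 1.1, 1.6),
    Word("varsham", 5.2, 5.6),
]

print(locate(transcript, "varsham"))
print(count_keywords(transcript, ["varsham", "panta", "rythu"]))
```

The timestamps returned by `locate` are what a player would use to jump to the word in the video, and the counts are the raw material for the analytics view.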


Speech Analytics is gaining momentum because most analytics today is performed on text only, not on the audio and video files that are uploaded to the internet in large volumes every day. Once speech is converted into text using ASR, any required information can be extracted from it. This is especially useful for call-centre applications, which can generate hourly or daily reports automatically by converting audio files into text and extracting meaningful keywords.

Using accurate Keyword Spotting and automated Phrase Matching, our system performs various analytics on speech data, including sentiment analysis, and extracts meaningful insights from it.
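
To make the hourly call-centre report more concrete, the following Python sketch groups already-transcribed calls by the hour in which they started and counts how often each watched phrase appears. It is only an illustration under stated assumptions: the `PHRASES` list, the `hourly_report` function and the sample calls are invented for this example, and the matching is plain case-insensitive substring counting rather than the phrase matching used in our system.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical watch list of phrases to report on.
PHRASES = ["refund", "not working", "thank you"]


def hourly_report(calls, phrases=PHRASES):
    """Count phrase occurrences per hour.

    `calls` is an iterable of (start_time, transcript_text) pairs.
    Matching is a simple case-insensitive substring count, so "refund"
    also matches inside "refunded".
    """
    report = defaultdict(Counter)
    for started_at, text in calls:
        hour = started_at.replace(minute=0, second=0, microsecond=0)
        lowered = text.casefold()
        for phrase in phrases:
            hits = lowered.count(phrase.casefold())
            if hits:
                report[hour][phrase] += hits
    return {hour: dict(counts) for hour, counts in sorted(report.items())}


# Made-up transcripts standing in for ASR output.
calls = [
    (datetime(2024, 1, 1, 9, 5), "I want a refund, the app is not working"),
    (datetime(2024, 1, 1, 9, 40), "Refund please"),
    (datetime(2024, 1, 1, 10, 2), "Thank you"),
]

for hour, counts in hourly_report(calls).items():
    print(hour.isoformat(), counts)
```

A daily report follows the same pattern by truncating the start time to the date instead of the hour.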



Text-To-Speech (TTS) is the technology that does exactly the opposite of ASR: it speaks out written text. This helps content owners respond to the different needs of users by speaking in their native language. In combination with ASR, TTS helps in developing interactive voice applications that answer users' queries through speech.

TTS is particularly helpful for people with learning disabilities, literacy difficulties or visual impairment, people who can speak a language but cannot read it, and people who are multitasking.

Our TTS engine can deliver speech in a specific person's voice in Telugu. This is highly useful for famous personalities who want to connect with their audience in a personalized way and capture mind share. Enterprises can use TTS services to market products in the voice of a customer's favourite personality, which improves brand recognition and enhances the customer experience.



Speaker recognition is the automated identification of a person from the characteristics of their voice. It makes it possible to use the speaker's voice to control access to restricted services, for example phone access to banking, database services, shopping or voice mail, and access to secure equipment. The technology requires users to “enroll” in the system, that is, to provide samples of their speech so that the system can characterise (or learn) their voice patterns.

Our Telugu Speaker Recognition system identifies the speaker using various characteristics of that person’s voice. We perform both Speaker Verification (1:1) and Speaker Identification (1:N) using text-dependent and text-independent techniques. This tool is particularly useful for voice biometric applications.
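
The difference between verification (1:1) and identification (1:N) can be sketched in a few lines of Python. The example assumes each enrolled speaker is represented by a fixed-length voice embedding and that samples are compared by cosine similarity against a threshold; this is a common textbook approach used here for illustration, not a description of our system. The names, the three-dimensional vectors and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(enrolled, claimed_id, sample, threshold=0.8):
    """1:1 check: does `sample` match the voice enrolled under `claimed_id`?"""
    return cosine(enrolled[claimed_id], sample) >= threshold


def identify(enrolled, sample, threshold=0.8):
    """1:N search: return the best-matching enrolled speaker, or None."""
    if not enrolled:
        return None
    best_id, best_score = max(
        ((speaker, cosine(vec, sample)) for speaker, vec in enrolled.items()),
        key=lambda pair: pair[1],
    )
    return best_id if best_score >= threshold else None


# Toy embeddings captured at enrollment (real ones have many more dimensions).
enrolled = {
    "asha": [0.9, 0.1, 0.0],
    "ravi": [0.1, 0.9, 0.2],
}
sample = [0.8, 0.2, 0.1]

print(verify(enrolled, "asha", sample))
print(verify(enrolled, "ravi", sample))
print(identify(enrolled, sample))
```

Verification compares the sample against one claimed identity only, while identification has to search every enrolled speaker, so its cost grows with the number of enrollments.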


Copyright – Nuronics Labs
