Patent application number | Description | Published |
20090012785 | SAMPLING RATE INDEPENDENT SPEECH RECOGNITION - A sampling-rate-independent method of automated speech recognition (ASR). Speech energies of a plurality of codebooks generated from training data created at an ASR sampling rate are compared to speech energies in a current frame of acoustic data generated from received audio created at an audio sampling rate below the ASR sampling rate. A codebook is selected from the plurality of codebooks, and has speech energies that correspond to speech energies in the current frame over a spectral range corresponding to the audio sampling rate. Speech energies above the spectral range are copied from the selected codebook and appended to the current frame. | 01-08-2009 |
20090030679 | AMBIENT NOISE INJECTION FOR USE IN SPEECH RECOGNITION - A method of ambient noise injection for use with speech recognition in a production vehicle. The method includes the steps of monitoring audio including user speech, receiving an utterance from the user speech, retrieving vehicle-specific ambient noise, and prepending the vehicle-specific ambient noise to the utterance before pre-processing and decoding the utterance. | 01-29-2009 |
20090138264 | SPEECH TO DTMF GENERATION - A method of speech to DTMF generation involving ASR-enabled and DTMF-controlled communications systems. The ASR-enabled system is used to recognize speech received from the DTMF-controlled telecommunications system using sampling rate independent speech recognition. It then identifies a speech segment contained in the speech received from the DTMF-controlled system that corresponds with at least one keyword associated with user-defined data. Then, the ASR-enabled system transmits at least one DTMF signal to the DTMF-controlled system in response to the identified speech segment. This allows a user of an ASR-enabled system such as a vehicle telematics unit to at least partially automate access to the DTMF-controlled system using the telematics unit, so that voice mailbox numbers, passwords, and the like normally entered via a telephone keypad can be automatically sent to the DTMF-controlled system from the telematics unit without having to be manually input each time by the user. | 05-28-2009 |
20090164216 | IN-VEHICLE CIRCUMSTANTIAL SPEECH RECOGNITION - A method of circumstantial speech recognition in a vehicle. A plurality of parameters associated with a plurality of vehicle functions are monitored as an indication of current vehicle circumstances. At least one vehicle function is identified as a candidate for user-intended ASR control based on user interaction with the vehicle. The identified vehicle function is then used to disambiguate between potential commands contained in speech received from the user. | 06-25-2009 |
20100049516 | METHOD OF USING MICROPHONE CHARACTERISTICS TO OPTIMIZE SPEECH RECOGNITION PERFORMANCE - A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model. | 02-25-2010 |
20100076764 | METHOD OF DIALING PHONE NUMBERS USING AN IN-VEHICLE SPEECH RECOGNITION SYSTEM - A method of dialing phone numbers using an in-vehicle speech recognition system includes receiving speech input at a vehicle, separating the speech input into a word segment and a digit segment, identifying the letters in a word segment, converting the letters in the word segment to digits, and operating an alphanumeric keypad based on the digit speech segment and the converted word segment. | 03-25-2010 |
20110010171 | Singular Value Decomposition for Improved Voice Recognition in Presence of Multi-Talker Background Noise - A system and method for providing speech recognition functionality offers improved accuracy and robustness in noisy environments having multiple speakers. The described technique includes receiving speech energy and converting the received speech energy to a digitized form. The digitized speech energy is decomposed into features that are then projected into a feature space having multiple speaker subspaces. The projected features fall either into one of the multiple speaker subspaces or outside of all speaker subspaces. A speech recognition operation is performed on a selected one of the multiple speaker subspaces to resolve the utterance to a command or data. | 01-13-2011 |
20110046953 | METHOD OF RECOGNIZING SPEECH - A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same. | 02-24-2011 |
20110125500 | AUTOMATED DISTORTION CLASSIFICATION - A method of and system for automated distortion classification. The method includes steps of (a) receiving audio including a user speech signal and at least some distortion associated with the signal; (b) pre-processing the received audio to generate acoustic feature vectors; (c) decoding the generated acoustic feature vectors to produce a plurality of hypotheses for the distortion; and (d) post-processing the plurality of hypotheses to identify at least one distortion hypothesis of the plurality of hypotheses as the received distortion. The system can include one or more distortion models including distortion-related acoustic features representative of various types of distortion and used by a decoder to compare the acoustic feature vectors with the distortion-related acoustic features to produce the plurality of hypotheses for the distortion. | 05-26-2011 |
20110250933 | METHOD OF CONTROLLING DIALING MODES IN A VEHICLE - A dialing mode of a telematics unit in a vehicle is controlled by monitoring for dialing digits from a vehicle occupant, determining whether the type of dialing digits are continuous dialing digits or discrete dialing digits, establishing a continuous mode for receiving continuous dialing digits or a discrete mode for receiving discrete dialing digits based on the determination, and if the type of dialing digits changes, switching the established mode. | 10-13-2011 |
20110282663 | TRANSIENT NOISE REJECTION FOR SPEECH RECOGNITION - A method of and system for transient noise rejection for improved speech recognition. The method comprises the steps of (a) receiving audio including user speech and at least some transient noise associated with the speech, (b) converting the received audio into digital data, (c) segmenting the digital data into acoustic frames, and (d) extracting acoustic feature vectors from the acoustic frames. The method also comprises the steps of (e) evaluating the acoustic frames for transient noise on a frame-by-frame basis, (f) rejecting those acoustic frames having transient noise, (g) accepting as speech frames those acoustic frames having no transient noise and, thereafter, (h) recognizing the user speech using the speech frames. | 11-17-2011 |
20110282668 | SPEECH ADAPTATION IN SPEECH SYNTHESIS - A method of and system for speech synthesis. First and second text inputs are received in a text-to-speech system, and processed into respective first and second speech outputs corresponding to stored speech respectively from first and second speakers using a processor of the system. The second speech output of the second speaker is adapted to sound like the first speech output of the first speaker. | 11-17-2011 |
20110288867 | NAMETAG CONFUSABILITY DETERMINATION - A method of and system for managing nametags including receiving a command from a user to store a nametag, prompting the user to input a number to be stored in association with the nametag, receiving an input for the number from the user, prompting the user to input the nametag to be stored in association with the number, receiving an input for the nametag from the user, processing the nametag input, and calculating confusability of the nametag input in multiple individual domains including a nametag domain, a number domain, and a command domain. | 11-24-2011 |
20120109649 | SPEECH DIALECT CLASSIFICATION FOR AUTOMATIC SPEECH RECOGNITION - Automatic speech recognition including receiving speech via a microphone, pre-processing the received speech to generate acoustic feature vectors, classifying dialect of the received speech, selecting at least one of an acoustic model or a lexicon specific to the classified dialect, decoding the acoustic feature vectors using a processor and at least one of the selected dialect-specific acoustic model or selected lexicon to produce a plurality of hypotheses for the received speech, and post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech. | 05-03-2012 |
20120149356 | METHOD OF INTELLIGENT VEHICLE DIALING - A method of operating a vehicle telematics unit includes determining the location of a vehicle equipped with a vehicle telematics unit; determining if telematics dialing software operated by the vehicle telematics unit includes a verbal dialing protocol used at the determined vehicle location; if not, identifying one or more verbal dialing protocols used at the determined location of the vehicle; requesting telematics dialing software that includes the one or more identified verbal dialing protocols; receiving the requested telematics dialing software from a central facility; and storing the received telematics dialing software at the vehicle. | 06-14-2012 |
20120150541 | MALE ACOUSTIC MODEL ADAPTATION BASED ON LANGUAGE-INDEPENDENT FEMALE SPEECH DATA - A method of generating proxy acoustic models for use in automatic speech recognition includes training acoustic models from speech received via microphone from male speakers of a first language, and adapting the acoustic models in response to language-independent speech data from female speakers of a second language, to generate proxy acoustic models for use during runtime of speech recognition of an utterance from a female speaker of the first language. | 06-14-2012 |
20120197643 | MAPPING OBSTRUENT SPEECH ENERGY TO LOWER FREQUENCIES - A speech signal processing system and method which uses the following steps: (a) receiving an utterance from a user via a microphone that converts the utterance into a speech signal; and (b) pre-processing the speech signal using a processor. The pre-processing step includes extracting acoustic data from the received speech signal, determining from the acoustic data whether the utterance includes one or more obstruents, estimating speech energy from higher frequencies associated with the identified obstruents, and mapping the estimated speech energy to lower frequencies. | 08-02-2012 |
20120323577 | SPEECH RECOGNITION FOR PREMATURE ENUNCIATION - Methods of automatic speech recognition for premature enunciation. In one method, a) a user is prompted to input speech, then b) a listening period is initiated to monitor audio via a microphone, such that there is no pause between the end of step a) and the beginning of step b), and then a begin-speaking audible indicator is communicated to the user during the listening period. In another method, a) at least one audio file is played including both a prompt for a user to input speech and a begin-speaking audible indicator to the user, b) a microphone is activated to monitor audio, after playing the prompt but before playing the begin-speaking audible indicator in step a), and c) speech is received from the user via the microphone. | 12-20-2012 |
20130080173 | CORRECTING UNINTELLIGIBLE SYNTHESIZED SPEECH - A method and system of speech synthesis. A text input is received in a text-to-speech system and, using a processor of the system, the text input is processed into synthesized speech which is established as unintelligible. The text input is reprocessed into subsequent synthesized speech and output to a user via a loudspeaker to correct the unintelligible synthesized speech. In one embodiment, the synthesized speech can be established as unintelligible by predicting intelligibility of the synthesized speech, and determining that the predicted intelligibility is lower than a minimum threshold. In another embodiment, the synthesized speech can be established as unintelligible by outputting the synthesized speech to the user via the loudspeaker, and receiving an indication from the user that the synthesized speech is not intelligible. | 03-28-2013 |
20140302895 | METHOD OF CONTROLLING DIALING MODES IN A VEHICLE - A dialing mode of a telematics unit in a vehicle is controlled by monitoring for dialing digits from a vehicle occupant, determining whether the type of dialing digits are continuous dialing digits or discrete dialing digits, establishing a continuous mode for receiving continuous dialing digits or a discrete mode for receiving discrete dialing digits based on the determination, and if the type of dialing digits changes, switching the established mode. | 10-09-2014 |
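
The letter-to-digit conversion described in entry 20100076764 (METHOD OF DIALING PHONE NUMBERS USING AN IN-VEHICLE SPEECH RECOGNITION SYSTEM) can be illustrated with a minimal sketch. It assumes the speech input has already been recognized and separated into a digit segment and a word segment, and that letters map to digits according to the standard telephone keypad layout. The function names `word_to_digits` and `build_dial_string` are hypothetical and do not come from the application, and the sketch assumes the digit segment precedes the word segment.

```python
# Standard telephone keypad letter groups (assumed mapping; the abstract
# only says letters are converted to digits).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at its keypad digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_digits(word_segment: str) -> str:
    """Convert the letters of a recognized word segment to keypad digits.

    Characters that are not keypad letters (spaces, hyphens) are skipped.
    """
    return "".join(
        LETTER_TO_DIGIT[ch] for ch in word_segment.upper() if ch in LETTER_TO_DIGIT
    )


def build_dial_string(digit_segment: str, word_segment: str) -> str:
    """Join the recognized digit segment with the converted word segment.

    Hypothetical helper: assumes the digits come first, as in "1 800 FLOWERS".
    """
    digits = "".join(ch for ch in digit_segment if ch.isdigit())
    return digits + word_to_digits(word_segment)


if __name__ == "__main__":
    print(build_dial_string("1 800", "FLOWERS"))
```

In a real system the two segments would come from the recognizer rather than from text strings, and the resulting digits would drive the keypad instead of being printed.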