Patent application number | Description | Published |
20090119102 | SYSTEM AND METHOD OF EXPLOITING PROSODIC FEATURES FOR DIALOG ACT TAGGING IN A DISCRIMINATIVE MODELING FRAMEWORK - Disclosed are a system and method for exploiting information in an utterance for dialog act tagging. An exemplary method includes receiving a user utterance, computing at periodic intervals at least one parameter in the user utterance, quantizing the at least one parameter at each periodic interval, approximating conditional probabilities using an n-gram over a sliding window over the periodic intervals and tagging the utterance as a dialog act based on the approximated conditional probabilities. | 05-07-2009 |
20100082326 | SYSTEM AND METHOD FOR ENRICHING SPOKEN LANGUAGE TRANSLATION WITH PROSODIC INFORMATION - Disclosed herein are systems, methods, and computer-readable media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step may be added of synthesizing speech in the target language based on the prosody enriched target language output text. An automatic prosody labeler can generate pitch accent labels. An automatic prosody labeler can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent. A pitch accent label can include an indication of certainty that a respective segment of the speech is prosodically prominent and/or an indication of prosodic prominence of a respective segment of speech. | 04-01-2010 |
20100131260 | SYSTEM AND METHOD FOR ENRICHING SPOKEN LANGUAGE TRANSLATION WITH DIALOG ACTS - Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for enriching spoken language translation with dialog acts. The method includes receiving a source speech signal, tagging dialog acts associated with the received source speech signal using a classification model, dialog acts being domain independent descriptions of an intended action a speaker carries out by uttering the source speech signal, producing an enriched hypothesis of the source speech signal incorporating the dialog act tags, and outputting a natural language response of the enriched hypothesis in a target language. Tags can be grouped into sets such as statement, acknowledgement, abandoned, agreement, question, appreciation, and other. The step of producing an enriched translation of the source speech signal uses a dialog act specific translation model containing a phrase translation table. | 05-27-2010 |
20130151232 | SYSTEM AND METHOD FOR ENRICHING SPOKEN LANGUAGE TRANSLATION WITH DIALOG ACTS - Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for enriching spoken language translation with dialog acts. The method includes receiving a source speech signal, tagging dialog acts associated with the received source speech signal using a classification model, dialog acts being domain independent descriptions of an intended action a speaker carries out by uttering the source speech signal, producing an enriched hypothesis of the source speech signal incorporating the dialog act tags, and outputting a natural language response of the enriched hypothesis in a target language. Tags can be grouped into sets such as statement, acknowledgement, abandoned, agreement, question, appreciation, and other. The step of producing an enriched translation of the source speech signal uses a dialog act specific translation model containing a phrase translation table. | 06-13-2013 |
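
The quantization and sliding-window n-gram steps described in application 20090119102 above can be illustrated with a minimal sketch. It assumes a single prosodic parameter (for example, pitch in Hz) sampled at fixed intervals, and it uses a plain count-based estimate of the conditional probabilities; the function names, the bin range, and the estimator are illustrative assumptions, not the method claimed in the application.

```python
from collections import Counter


def quantize(values, low, high, bins):
    """Map each sampled value to an integer bin index in [0, bins - 1].

    `low`, `high`, and `bins` are hypothetical settings chosen for illustration.
    """
    width = (high - low) / bins
    return [min(bins - 1, max(0, int((v - low) / width))) for v in values]


def ngram_conditional_probs(symbols, n):
    """Estimate P(last symbol | previous n-1 symbols) from n-gram counts.

    The n-grams are collected with a window of length n that slides one
    interval at a time over the quantized sequence.
    """
    ngrams = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    contexts = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count
    return {gram: count / contexts[gram[:-1]] for gram, count in ngrams.items()}


# Example: four pitch samples quantized into four 50 Hz bins.
pitch_bins = quantize([100, 150, 200, 250], low=100, high=300, bins=4)
bigram_probs = ngram_conditional_probs([0, 0, 1, 0, 0, 1], n=2)
```

A dialog act tagger would consume probabilities like these as features; that classification step is not shown here.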
Patent application number | Description | Published |
20130030788 | SYSTEM AND METHOD FOR LOCATING BILINGUAL WEB SITES - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for bootstrapping a language translation system. A system configured to practice the method performs a bidirectional web crawl to identify a bilingual website. The system analyzes data on the bilingual website to make a classification decision about whether the root of the bilingual website is an entry point for the bilingual website. The bilingual site can contain pairs of parallel pages. Each pair can include a first web page in a first language and a second web page in a second language, and a first portion of the first web page corresponds to a second portion of the second web page. Then the system analyzes the first and second web pages to identify corresponding information pairs in the first and second languages, and extracts the corresponding information pairs from the first and second web pages for use in a language translation model. | 01-31-2013 |
20130066632 | SYSTEM AND METHOD FOR ENRICHING TEXT-TO-SPEECH SYNTHESIS WITH AUTOMATIC DIALOG ACT TAGS - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for modifying the prosody of synthesized speech based on an associated speech act. A system configured according to the method embodiment (1) receives text, (2) performs an analysis of the text to determine and assign a speech act label to the text, and (3) converts the text to speech, where the speech prosody is based on the speech act label. The analysis compares the text to a corpus of previously tagged utterances to find a close match, determines a confidence score from a correlation of the text and the close match, and, if the confidence score is above a threshold value, retrieves the speech act label of the close match and assigns it to the text. | 03-14-2013 |
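
The close-match lookup described in application 20130066632 above can be sketched as follows. This is only an illustration under stated assumptions: the tiny tagged corpus, the threshold of 0.6, and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical stand-ins, since the abstract does not say how the correlation is computed.

```python
from difflib import SequenceMatcher

# Hypothetical corpus of previously tagged utterances (text, speech act label).
TAGGED_CORPUS = [
    ("could you open the door", "request"),
    ("what time is it", "question"),
    ("thanks a lot", "appreciation"),
]


def assign_speech_act(text, corpus=TAGGED_CORPUS, threshold=0.6):
    """Return (label, score) for the closest tagged utterance.

    The label is None when the best similarity score is not above the
    threshold. String similarity stands in for the confidence score.
    """
    best_label, best_score = None, 0.0
    for utterance, label in corpus:
        score = SequenceMatcher(None, text.lower(), utterance).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_score > threshold:
        return best_label, best_score
    return None, best_score


label, score = assign_speech_act("What time is it")
```

A synthesizer would then select prosody according to the returned label and fall back to default prosody when no label is assigned.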
Patent application number | Description | Published |
20150120288 | SYSTEM AND METHOD OF PERFORMING AUTOMATIC SPEECH RECOGNITION USING LOCAL PRIVATE DATA - A method of providing hybrid speech recognition between a local embedded speech recognition system and a remote speech recognition system includes receiving speech from a user at a device communicating with a remote speech recognition system. The system recognizes a first part of the speech by performing a first recognition of the first part of the speech with the embedded speech recognition system that accesses private user data, wherein the private user data is not available to the remote speech recognition system. The system recognizes the second part of the speech by performing a second recognition of the second part of the speech with the remote speech recognition system. The final recognition result is a combination of these two recognition processes. The private data can be such local information as a user location, a playlist, frequently dialed numbers or texted people, user contact list information, and so forth. | 04-30-2015 |
20150134320 | SYSTEM AND METHOD FOR TRANSLATING REAL-TIME SPEECH USING SEGMENTATION BASED ON CONJUNCTION LOCATIONS - A system, method and computer-readable storage device which balance latency and accuracy of machine translations by segmenting the speech upon locating a conjunction. The system, upon receiving speech, will buffer speech until a conjunction is detected. Upon detecting a conjunction, the speech received until that point is segmented. The system then continues performing speech recognition on the segment, searching for the next conjunction, while simultaneously initiating translation of the segment. Upon translating the segment, the system converts the translation to a speech output, allowing a user to hear an audible translation of the speech originally heard. | 05-14-2015 |
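
The conjunction-based segmentation described in application 20150134320 above can be sketched on recognized words rather than raw audio. The conjunction list and the choice to start the next segment with the conjunction itself are assumptions made for illustration; the abstract does not specify either.

```python
# Hypothetical conjunction list; a real system would use a fuller inventory.
CONJUNCTIONS = {"and", "but", "or", "so", "because"}


def segment_on_conjunctions(words, conjunctions=CONJUNCTIONS):
    """Yield segments of buffered words, closing one at each conjunction.

    Words are buffered until a conjunction is seen; the buffered words are
    then emitted as a segment and the conjunction begins the next buffer.
    """
    buffer = []
    for word in words:
        if word.lower() in conjunctions and buffer:
            yield buffer
            buffer = []
        buffer.append(word)
    if buffer:
        yield buffer


segments = list(segment_on_conjunctions("i went home and she stayed but we talked".split()))
```

Because the function is a generator, each segment can be handed to translation as soon as it is emitted while later words are still being buffered, which is the latency and accuracy trade-off the abstract describes.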