Patent application number | Description | Published |
20130144592 | Automatic Spelling Correction for Machine Translation - Methods, systems, and apparatus, including computer program products, for correcting spelling in text. A text input is received for translation. One or more suspect words in the text input are identified. For each suspect word, one or more candidate words are identified. A score for the text input and scores for each of one or more candidate inputs are determined, where each candidate input is the text input with one or more of the suspect words each replaced by a respective candidate word. If a candidate input has the highest score among the candidate inputs and that score exceeds the text input score by at least a threshold, that candidate input is selected. Otherwise, the text input is selected. A translation of the selected candidate input or the selected text input is provided as the translation of the text input. | 06-06-2013 |
20130346059 | LARGE LANGUAGE MODELS IN MACHINE TRANSLATION - Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n−1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus. | 12-26-2013 |
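The backoff scoring described in application 20130346059 can be illustrated with a minimal sketch. The abstract only says that a backoff score is a function of a backoff factor and the relative frequency of the backoff n-gram; the recursive form below, the default factor of 0.4, and the names `count_ngrams` and `backoff_score` are assumptions made for illustration, not the claimed method.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram, counts, total_tokens, backoff_factor=0.4):
    """Score an n-gram by relative frequency, backing off when unseen.

    Assumption: an unseen n-gram of order n is scored as backoff_factor
    times the score of its order n-1 backoff n-gram (first token dropped).
    """
    ngram = tuple(ngram)
    if len(ngram) == 1:
        # Unigram relative frequency; 0.0 for an unseen token.
        return counts[ngram] / total_tokens
    if counts[ngram] > 0:
        # Relative frequency of the n-gram given its (n-1)-token context.
        return counts[ngram] / counts[ngram[:-1]]
    return backoff_factor * backoff_score(
        ngram[1:], counts, total_tokens, backoff_factor
    )


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=2)
seen = backoff_score(("the", "cat"), counts, len(tokens))
unseen = backoff_score(("on", "cat"), counts, len(tokens))
```

Here `seen` is a plain relative frequency, while `unseen` is the unigram score of `cat` scaled by the backoff factor. Scores of this kind are not normalized probabilities, which is a deliberate simplification in this sketch.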
Patent application number | Description | Published |
20080262828 | Encoding and Adaptive, Scalable Accessing of Distributed Models - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 10-23-2008 |
20100004919 | OPTIMIZING PARAMETERS FOR MACHINE TRANSLATION - Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one implementation, a method is provided. The method includes determining, for a plurality of feature functions in a translation lattice, a corresponding plurality of error surfaces for each of one or more candidate translations represented in the translation lattice; adjusting weights for the feature functions by traversing a combination of the plurality of error surfaces for phrases in a training set; selecting weighting values that minimize error counts for the traversed combination; and applying the selected weighting values to convert a sample of text from a first language to a second language. | 01-07-2010 |
20100004920 | OPTIMIZING PARAMETERS FOR MACHINE TRANSLATION - Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one implementation, a method is provided. The method includes accessing a hypothesis space, where the hypothesis space represents a plurality of candidate translations; performing decoding on the hypothesis space to obtain a translation hypothesis that minimizes an expected error in classification calculated relative to an evidence space; and providing the obtained translation hypothesis for use by a user as a suggested translation in a target translation. | 01-07-2010 |
20110202330 | Compound Splitting - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for decompounding compound words are disclosed. In one aspect, a method includes obtaining a token that includes a sequence of characters, identifying two or more candidate sub-words that are constituents of the token, and one or more morphological operations that are required to transform the sub-words into the token, where at least one of the morphological operations involves a use of a non-dictionary word, and determining a cost associated with each sub-word and a cost associated with each morphological operation. | 08-18-2011 |
20120123765 | Providing Alternative Translations - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting alternative translations. In one aspect, a method includes receiving source language text; receiving translated text corresponding to the source language text from a machine translation system; receiving segmentation data for the translated text, wherein the segmentation data includes a first segmentation of the translated text, the first segmentation dividing the translated text into two or more segments; receiving one or more alternative translations for each of the two or more segments; presenting the source text and the translated text to a user in a user interface; and in response to a user selection of a first portion of the translated text, displaying, in the user interface, one or more alternative translations for a first segment to which the first portion of translated text corresponds according to the first segmentation. | 05-17-2012 |
20130046530 | ENCODING AND ADAPTIVE, SCALABLE ACCESSING OF DISTRIBUTED MODELS - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 02-21-2013 |
20130144593 | MINIMUM ERROR RATE TRAINING WITH A LARGE NUMBER OF FEATURES FOR MACHINE LEARNING - Systems, methods, and apparatuses including computer program products for machine learning. A method is provided that includes determining model parameters for a plurality of feature functions for a linear machine learning model, ranking the plurality of feature functions according to a quality criterion, and selecting, using the ranking, a group of feature functions from the plurality of feature functions to update with the determined model parameters. | 06-06-2013 |
20130151235 | LINGUISTIC KEY NORMALIZATION - Systems, methods, and apparatuses including computer program products are provided for training machine learning systems. In some implementations, a method is provided. The method includes receiving a collection of phrases, normalizing a plurality of phrases of the collection of phrases, the normalizing being based at least in part on lexicographic normalizing rules, and generating a normalized phrase table including a plurality of key-value pairs, where each key-value pair includes a key corresponding to a normalized phrase and a value corresponding to one or more un-normalized phrases associated with the normalized key, each un-normalized phrase having one or more parameters. | 06-13-2013 |
20140085215 | PROGRESS DISPLAY OF HANDWRITING INPUT - A computer-implemented method includes: receiving, at a user device, user input corresponding to handwritten text to be recognized using a recognition engine; and receiving, at the user device, a representation of the handwritten text. The representation includes the handwritten text parsed into individual handwritten characters. The method further includes: displaying, on a display of the user device, the handwritten characters using a first indicator; receiving, at the user device, an identification of a text character recognized as one of the handwritten characters; displaying, on the display, the text character; and adjusting, at the user device, the one of the handwritten characters from being displayed using the first indicator to using a second indicator in response to the received identification. The first and second indicators are different. | 03-27-2014 |
20140257787 | ENCODING AND ADAPTIVE, SCALABLE ACCESSING OF DISTRIBUTED MODELS - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 09-11-2014 |
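The normalized phrase table described in application 20130151235 can be sketched as a mapping from a normalized key to the un-normalized phrases that share it. The abstract does not specify the lexicographic normalizing rules, so case folding and diacritic stripping are assumed here purely for illustration, and the names `normalize_key`, `build_normalized_phrase_table`, and the `prob` parameter are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalize_key(phrase):
    """Assumed normalizing rules: case-fold, then strip combining marks."""
    decomposed = unicodedata.normalize("NFKD", phrase.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_normalized_phrase_table(phrases):
    """Group (phrase, parameters) pairs under their normalized key.

    Each value keeps the original un-normalized phrases together with
    their parameters, so no surface form is lost by normalizing.
    """
    table = defaultdict(list)
    for phrase, params in phrases:
        table[normalize_key(phrase)].append((phrase, params))
    return dict(table)


phrases = [
    ("Café", {"prob": 0.6}),
    ("cafe", {"prob": 0.3}),
    ("Tea", {"prob": 0.1}),
]
table = build_normalized_phrase_table(phrases)
```

In this sketch, `Café` and `cafe` end up under the single key `cafe`, each retaining its own parameters, while `Tea` is stored under `tea`.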