Patent application number | Description | Published |
20080221874 | Method and Apparatus for Fast Semi-Automatic Semantic Annotation - A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and an SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced. | 09-11-2008 |
20080270115 | SYSTEM AND METHOD FOR DIACRITIZATION OF TEXT - A system and method for restoration of diacritics includes making classification decisions regarding an utterance in accordance with an aggregate of a plurality of information sources in a diacritization model for diacritic restoration. A best diacritic representation is determined for graphemes in the utterance based upon a best match with the diacritization model. A diacritically restored representation of the utterance is output. | 10-30-2008 |
20090018833 | MODEL WEIGHTING, SELECTION AND HYPOTHESES COMBINATION FOR AUTOMATIC SPEECH RECOGNITION AND MACHINE TRANSLATION - A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance. | 01-15-2009 |
20090248394 | MACHINE TRANSLATION IN CONTINUOUS SPACE - A system and method for training a statistical machine translation model and decoding or translating using the same is disclosed. A source word versus target word co-occurrence matrix is created to define word pairs. Dimensionality of the matrix may be reduced. Word pairs are mapped as vectors into continuous space where the word pairs are vectors of continuous real numbers and not discrete entities in the continuous space. A machine translation parametric model is trained using an acoustic model training method based on word pair vectors in the continuous space. | 10-01-2009 |
20110191096 | GAME BASED METHOD FOR TRANSLATION DATA ACQUISITION AND EVALUATION - A method of generating a statistical machine translation database through a game in which a monolingual structure is provided to a plurality of players. A first translation attempt is received from each of the plurality of players. The first translation attempt from each of the plurality of players is compared. Feedback is provided to each of the plurality of players and the attempts are received and compared to provide feedback to iteratively converge subsequent translations from each of the plurality of players into a final translated structure. | 08-04-2011 |
20110282648 | Machine Translation with Side Information - A method of identifying and using side information available to statistical machine translation systems within an enterprise setting, the method including extracting user-specific interaction and non-interaction-based information from at least one corresponding database within the enterprise for each of a plurality of users, aggregating the user-specific interaction and non-interaction based information from a plurality of users, by using a processor on a computer, to tune and adapt background translation and language models, and updating all relevant models within the enterprise after user activity based on the tuned and adapted translation and language models. | 11-17-2011 |
20120245897 | Virtualized Abstraction with Built-in Data Alignment and Simultaneous Event Monitoring in Performance Counter Based Application Characterization and Tuning - Techniques for monitoring a set of one or more event counters of application execution are provided. The techniques include constructing a virtual performance monitoring counter (VPMC) layer as a unified abstraction of a physical performance monitoring counter (PMC) architecture, and incorporating one or more programming interfaces (PIs) in connection with the virtual performance monitoring counter, wherein the one or more programming interfaces facilitate simultaneous access and data monitoring across a set of one or more event counters. | 09-27-2012 |
20130024403 | AUTOMATICALLY INDUCED CLASS BASED SHRINKAGE FEATURES FOR TEXT CLASSIFICATION - A method and apparatus are provided for automatically inducing class based shrinkage features. The method includes clustering each word in a set of word groupings of a given type into a respective one of a plurality of classes. The method further includes selecting and extracting a set of class-based shrinkage features from the set of word groupings based on the plurality of classes. The set of class-based shrinkage features is specifically selected for an intended classification application. | 01-24-2013 |
20130073276 | MT Based Spoken Dialog Systems Customer/Machine Dialog - Operation of an automated dialog system is described using a source language to conduct a real time human machine dialog process with a human user using a target language. A user query in the target language is received and automatically machine translated into the source language. An automated reply of the dialog process is then delivered to the user in the target language. If the dialog process reaches an initial assistance state, a first human agent using the source language is provided to interact in real time with the user in the target language by machine translation to continue the dialog process. Then if the dialog process reaches a further assistance state, a second human agent using the target language is provided to interact in real time with the user in the target language to continue the dialog process. | 03-21-2013 |
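
The rover step described in application 20080221874 above (combining the outputs of three engines and attaching a confidence) can be illustrated with a minimal sketch. It assumes each engine emits one flat tag per word and that the combination is a simple majority vote; the abstract does not specify the combination rule, and the function names `combine_tags` and `needs_review` are hypothetical.

```python
from collections import Counter


def combine_tags(engine_outputs):
    """Majority-vote over per-word tags from several engines.

    engine_outputs: list of tag sequences, one per engine, all the same length.
    Returns a list of (tag, confidence) pairs, where confidence is the
    fraction of engines that agreed on the winning tag.
    NOTE: majority voting is an assumption, not the patented method.
    """
    if not engine_outputs:
        return []
    length = len(engine_outputs[0])
    if any(len(seq) != length for seq in engine_outputs):
        raise ValueError("all engines must tag the same number of words")

    combined = []
    for tags in zip(*engine_outputs):
        tag, votes = Counter(tags).most_common(1)[0]
        combined.append((tag, votes / len(tags)))
    return combined


def needs_review(combined, threshold=1.0):
    """Indices of words whose agreement falls below the threshold."""
    return [i for i, (_, conf) in enumerate(combined) if conf < threshold]


# Hypothetical outputs for a three-word sentence.
parser_tags = ["B-city", "O", "B-date"]
similarity_tags = ["B-city", "O", "O"]
svm_tags = ["B-city", "B-time", "B-date"]

result = combine_tags([parser_tags, similarity_tags, svm_tags])
review = needs_review(result)
```

Under this sketch, only the words on which the engines disagree would be passed to a human annotator, which is one way the annotation effort could be reduced.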
Patent application number | Description | Published |
20130304451 | BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES - Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time. | 11-14-2013 |
20130346066 | Joint Decoding of Words and Tags for Conversational Understanding - Joint decoding of words and tags may be provided. Upon receiving an input from a user comprising a plurality of elements, the input may be decoded into a word lattice comprising a plurality of words. A tag may be assigned to each of the plurality of words and a most-likely sequence of word-tag pairs may be identified. The most-likely sequence of word-tag pairs may be evaluated to identify an action request from the user. | 12-26-2013 |
20140222422 | SCALING STATISTICAL LANGUAGE UNDERSTANDING SYSTEMS ACROSS DOMAINS AND INTENTS - A scalable statistical language understanding (SLU) system uses a fixed number of understanding models that scale across domains and intents (i.e. single vs. multiple intents per utterance). For each domain added to the SLU system, the fixed number of existing models is updated to reflect the newly added domain. Information that is already included in the existing models and the corresponding training data may be re-used. The fixed models may include a domain detector model, an intent action detector model, an intent object detector model and a slot/entity tagging model. A domain detector identifies the different domains within an utterance. All or a portion of the detected domains are used to determine associated intent actions. For each determined intent action, one or more intent objects are identified. Slot/entity tagging is performed using the determined domains, intent actions, and intent object detector. | 08-07-2014 |
20140278355 | USING HUMAN PERCEPTION IN BUILDING LANGUAGE UNDERSTANDING MODELS - An understanding model is trained to account for human perception of the relative importance of different tagged items (e.g. slot/intent/domain). Instead of treating each tagged item as equally important, human perception is used to adjust the training of the understanding model by associating a perceived weight with each of the different predicted items. The relative perceptual importance of the different items may be modeled using different methods (e.g. as a simple weight vector, a model trained using features (lexical, knowledge, slot type, ...), and the like). The perceptual weight vector and/or model are incorporated into the understanding model training process where items that are perceptually more important are weighted more heavily as compared to the items that are determined by human perception as less important. | 09-18-2014 |
20140379323 | ACTIVE LEARNING USING DIFFERENT KNOWLEDGE SOURCES - Different knowledge sources are automatically accessed to identify and obtain additional data to update a conversational dialog system. One of the knowledge sources is initially selected as a seed source. Seed data from the seed source are used to identify related data in at least one other knowledge source. For example, query click logs may be accessed and searched to determine popular queries that use the seed data. A structured knowledge source may be accessed to determine related nodes to the seed data. A query click log, or some other knowledge source, may be used to determine when a node is related to the seed data. Data that is identified to be related may be used to train a language understanding model or update a schema for the SLU system. The data may be automatically annotated or manually annotated. | 12-25-2014 |
20140379326 | BUILDING CONVERSATIONAL UNDERSTANDING SYSTEMS USING A TOOLSET - Tools are provided to allow developers to enable applications for Conversational Understanding (CU) using assets from a CU service. The tools may be used to select functionality from existing domains, extend the coverage of one or more domains, as well as to create new domains in the CU service. A developer may provide example Natural Language (NL) sentences that are analyzed by the tools to assist the developer in labeling data that is used to update the models in the CU service. For example, the tools may assist a developer in identifying domains, determining intent actions, determining intent objects and determining slots from example NL sentences. After the developer tags all or a portion of the example NL sentences, the models in the CU service are automatically updated and validated. For example, validation tools may be used to determine an accuracy of the model against test data. | 12-25-2014 |
20140379353 | Environmentally aware dialog policies and response generation - Environmental conditions, along with other information, are used to adjust a response of a conversational dialog system. The environmental conditions may be used at different times within the conversational dialog system. For example, the environmental conditions can be used to adjust the dialog manager's output (e.g., the machine action). The dialog state information that is used by the dialog manager includes environmental conditions for the current turn in the dialog as well as environmental conditions for one or more past turns in the dialog. The environmental conditions can also be used after receiving the machine action to adjust the response that is provided to the user. For example, the environmental conditions may affect the machine action that is determined as well as how the action is provided to the user. The dialog manager and the response generation components in the conversational dialog system each use the available environmental conditions. | 12-25-2014 |
20150066496 | ASSIGNMENT OF SEMANTIC LABELS TO A SEQUENCE OF WORDS USING NEURAL NETWORK ARCHITECTURES - Technologies pertaining to slot filling are described herein. A deep neural network, a recurrent neural network, and/or a spatio-temporally deep neural network are configured to assign labels to words in a word sequence set forth in natural language. At least one label is a semantic label that is assigned to at least one word in the word sequence. | 03-05-2015 |
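
The perceptual weighting described in application 20140278355 above can be illustrated with a minimal sketch of a weighted training loss. It assumes the "simple weight vector" variant and a negative log-likelihood objective; the abstract does not name a loss function, and `weighted_nll` and the example values are hypothetical.

```python
import math


def weighted_nll(gold_probs, weights):
    """Perceptually weighted negative log-likelihood.

    gold_probs: model probability assigned to the correct label of each item.
    weights:    perceived importance of each item (larger = more important).
    Returns the weight-normalised average loss.
    NOTE: the choice of NLL is an assumption made for illustration only.
    """
    if len(gold_probs) != len(weights):
        raise ValueError("one weight is required per item")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    loss = sum(-w * math.log(p) for p, w in zip(gold_probs, weights))
    return loss / total_weight


# Hypothetical example: the second item is predicted poorly.
probs = [0.9, 0.2]
uniform = weighted_nll(probs, [1.0, 1.0])
perceptual = weighted_nll(probs, [1.0, 3.0])  # second item judged more important
```

With the heavier weight on the poorly predicted item, `perceptual` exceeds `uniform`, so an optimiser minimising this loss would be pushed harder to fix errors on items that people consider important.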