Patent application number | Description | Published |
20130103493 | Search Query and Document-Related Data Translation - The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After incorporating the translation model into model data for a search engine, the translation model may be used as features for producing relevance scores for current search queries and ranking documents/advertisements according to relevance. | 04-25-2013 |
20130124492 | Statistical Machine Translation Based Search Query Spelling Correction - Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s). | 05-16-2013 |
20140149429 | WEB SEARCH RANKING - A computer-implemented method and system for Web search ranking are provided herein. The method includes generating a number of training samples from clickthrough data, wherein the training samples include positive query-document pairs and negative query-document pairs. The method also includes discriminatively training a translation model based on the training samples and ranking a number of documents for a Web search based on the translation model. | 05-29-2014 |
20140222724 | GENERATION OF LOG-LINEAR MODELS USING L-1 REGULARIZATION - A log-linear model may be trained using a modified version of an original limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. The modified version may be based on modifying the original L-BFGS algorithm using a single map-reduce implementation. In another aspect, a sparse log-linear model may be accessed. The sparse log-linear model may be trained with L1-regularization, based on data indicating past user ad selection behaviors. A probability of a user selection of an ad may be determined based on the sparse log-linear model. | 08-07-2014 |
20140365201 | TRAINING MARKOV RANDOM FIELD-BASED TRANSLATION MODELS USING GRADIENT ASCENT - Various technologies described herein pertain to training and utilizing a general, statistical framework for modeling translation via Markov random fields (MRFs). An MRF-based translation model can be employed in a statistical machine translation (SMT) system. The MRF-based translation model allows for arbitrary features extracted from a phrase pair to be incorporated as evidence. The parameters of the model are estimated using a large-scale discriminative training approach based on stochastic gradient ascent and an N-best list based expected Bilingual Evaluation Understudy (BLEU) as an objective function. | 12-11-2014 |
20150032767 | QUERY EXPANSION AND QUERY-DOCUMENT MATCHING USING PATH-CONSTRAINED RANDOM WALKS - Various technologies described herein pertain to use of path-constrained random walks for query expansion and/or query document matching. Clickthrough data from search logs is represented as a labeled and directed graph. Path-constrained random walks are executed over the graph based upon an input query. The graph includes a first set of nodes that represent queries included in the clickthrough data from search logs, a second set of nodes that represent documents included in the clickthrough data from the search logs, a third set of nodes that represent words from the queries and the documents, and edges between nodes that represent relationships between queries, documents, and words. The path-constrained random walks include traversals over edges of the graph between nodes. Further, a score for a relationship between a target node and a source node representative of the input query is computed based at least in part upon the path-constrained random walks. | 01-29-2015 |
20150074027 | Deep Structured Semantic Model Produced Using Click-Through Data - A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures. | 03-12-2015 |
20150278200 | Convolutional Latent Semantic Models and their Applications - Functionality is described herein for transforming first and second symbolic linguistic items into respective first and second continuous-valued concept vectors, using a deep learning model, such as a convolutional latent semantic model. The model is designed to capture both the local and global linguistic contexts of the linguistic items. The functionality then compares the first concept vector with the second concept vector to produce a similarity measure. More specifically, the similarity measure expresses the closeness between the first and second linguistic items in a high-level semantic space. In one case, the first linguistic item corresponds to a query, and the second linguistic item may correspond to a phrase, or a document, or a keyword, or an ad, etc. In one implementation, the convolutional latent semantic model is produced in a training phase based on click-through data. | 10-01-2015 |
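The sketches below illustrate the core computation described in each abstract above; they are minimal reconstructions under stated assumptions, not the patented implementations. For application 20130103493, a word-level translation model can serve as a ranking feature roughly as follows, assuming a pre-estimated table of P(query word | document word) probabilities (the `TRANS_PROB` values here are made up) and an IBM Model 1 style scoring rule:

```python
import math
from collections import Counter

# Hypothetical translation table: P(query_word | document_word),
# assumed to have been estimated from word-aligned query-document pairs.
TRANS_PROB = {
    ("car", "automobile"): 0.32,
    ("car", "car"): 0.55,
    ("cheap", "affordable"): 0.18,
    ("cheap", "cheap"): 0.60,
}

def translation_feature(query_terms, doc_terms, smoothing=1e-6):
    """Log P(Q | D) under a word-level translation model:
    P(Q|D) = prod_q sum_w P(q|w) * P(w|D)."""
    doc_counts = Counter(doc_terms)
    doc_len = sum(doc_counts.values())
    log_score = 0.0
    for q in query_terms:
        p_q = smoothing
        for w, c in doc_counts.items():
            p_w_given_d = c / doc_len                # unigram P(w|D)
            p_q_given_w = TRANS_PROB.get((q, w), 0.0)
            p_q += p_q_given_w * p_w_given_d
        log_score += math.log(p_q)
    return log_score  # used as one feature among others in the ranker

if __name__ == "__main__":
    print(translation_feature(["cheap", "car"],
                              ["affordable", "automobile", "sale"]))
```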
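For application 20130124492, one way to combine probabilistic features from multiple query correction models is a log-linear score per correction candidate, with the top-scoring candidates exposed as suggestions. The candidate feature values and weights below are hypothetical:

```python
import math

def score_candidate(features, weights):
    """Log-linear combination of features from multiple correction models."""
    return sum(weights[name] * value for name, value in features.items())

def rank_corrections(query, candidates, weights, top_k=1):
    """candidates: {corrected_query: {feature_name: log_probability}}."""
    scored = sorted(((score_candidate(f, weights), c)
                     for c, f in candidates.items()), reverse=True)
    return [c for _, c in scored[:top_k]]

if __name__ == "__main__":
    # Hypothetical feature values (log-probabilities) for two candidate
    # corrections of the misspelled query "machne lerning".
    candidates = {
        "machine learning": {"error_model": math.log(0.02),
                             "language_model": math.log(0.30)},
        "machine yearning": {"error_model": math.log(0.001),
                             "language_model": math.log(0.005)},
    }
    weights = {"error_model": 1.0, "language_model": 1.2}
    print(rank_corrections("machne lerning", candidates, weights))
```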
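For application 20140149429, discriminative training from clickthrough data can be sketched as pairwise learning over positive (clicked) and negative (non-clicked) query-document pairs. This example uses a simple pairwise logistic loss and synthetic feature vectors rather than the patent's specific translation-model parameterization:

```python
import numpy as np

def pairwise_logistic_loss_grad(w, x_pos, x_neg):
    """Loss = log(1 + exp(-(w.x_pos - w.x_neg))) for one (clicked, unclicked) pair."""
    margin = w @ (x_pos - x_neg)
    sigma = 1.0 / (1.0 + np.exp(margin))
    loss = np.log1p(np.exp(-margin))
    grad = -sigma * (x_pos - x_neg)
    return loss, grad

def train(pairs, dim, lr=0.1, epochs=20):
    """pairs: list of (features of clicked doc, features of unclicked doc)."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_pos, x_neg in pairs:
            _, g = pairwise_logistic_loss_grad(w, x_pos, x_neg)
            w -= lr * g
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic feature vectors standing in for translation-model features.
    pairs = [(rng.normal(1.0, 1.0, 5), rng.normal(0.0, 1.0, 5)) for _ in range(200)]
    print(train(pairs, dim=5))
```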
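For application 20140222724, the end result is a sparse, L1-regularized log-linear model that scores the probability of an ad selection. The sketch below fits such a model with scikit-learn's liblinear solver on synthetic data; it does not reproduce the patent's modified L-BFGS trained in a single map-reduce implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for logged (features, ad-clicked?) examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))            # user/ad/context features
true_w = np.zeros(50)
true_w[:5] = 2.0                           # only a few features matter
y = (X @ true_w + rng.normal(size=1000) > 0).astype(int)

# L1-regularized log-linear (logistic) model; the L1 penalty drives most
# weights to exactly zero, yielding a sparse model.
model = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
model.fit(X, y)

print("non-zero weights:", np.count_nonzero(model.coef_))
print("P(click) for first example:", model.predict_proba(X[:1])[0, 1])
```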
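For application 20140365201, the training signal is the N-best-list expected BLEU, maximized by stochastic gradient ascent. Assuming each candidate translation in an N-best list comes with a feature vector and a sentence-level BLEU score, the objective and its gradient can be written as:

```python
import numpy as np

def expected_bleu_and_grad(theta, feats, bleu):
    """feats: (N, d) features of N-best candidates; bleu: (N,) sentence BLEU.
    Returns expected BLEU under p_i proportional to exp(theta . x_i), and its gradient."""
    scores = feats @ theta
    scores -= scores.max()                     # numerical stability
    p = np.exp(scores)
    p /= p.sum()
    xbleu = p @ bleu
    mean_feat = p @ feats
    grad = (p * bleu) @ (feats - mean_feat)    # d(expected BLEU) / d(theta)
    return xbleu, grad

def train(nbest_lists, dim, lr=0.05, epochs=10):
    theta = np.zeros(dim)
    for _ in range(epochs):
        for feats, bleu in nbest_lists:        # stochastic: one sentence at a time
            _, g = expected_bleu_and_grad(theta, feats, bleu)
            theta += lr * g                    # gradient *ascent*
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [(rng.normal(size=(20, 8)), rng.uniform(0, 1, 20)) for _ in range(50)]
    print(train(data, dim=8))
```

The arbitrary MRF features the abstract mentions would populate the feature vectors; here they are random placeholders.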
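For application 20150032767, a path-constrained random walk scores how likely a walk starting at a query node ends at a document node while following a fixed sequence of edge labels. The toy graph, relation labels, and uniform transition probabilities below are illustrative assumptions:

```python
from collections import defaultdict

class Graph:
    """Labeled, directed graph over query, word, and document nodes."""
    def __init__(self):
        self.edges = defaultdict(list)   # (node, label) -> [neighbors]

    def add(self, src, label, dst):
        self.edges[(src, label)].append(dst)

def pcrw_score(graph, source, target, path):
    """Probability that a walk from `source`, constrained to follow the edge
    labels in `path` in order (uniform choice at each step), ends at `target`."""
    dist = {source: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            neighbors = graph.edges.get((node, label), [])
            for n in neighbors:
                nxt[n] += p / len(neighbors)
        dist = nxt
    return dist.get(target, 0.0)

if __name__ == "__main__":
    g = Graph()
    # Toy clickthrough graph with hypothetical relation labels.
    g.add("q:cheap cars", "has_word", "w:cheap")
    g.add("q:cheap cars", "has_word", "w:cars")
    g.add("w:cheap", "in_doc", "d:affordable-autos.html")
    g.add("w:cars", "in_doc", "d:affordable-autos.html")
    g.add("w:cars", "in_doc", "d:car-history.html")
    print(pcrw_score(g, "q:cheap cars", "d:affordable-autos.html",
                     ["has_word", "in_doc"]))
```

Scores from several such label paths would typically be combined as features when matching the input query against documents.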
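For application 20150074027, the forward pass of a DSSM-style model maps a query and candidate documents into a shared semantic space and scores the clicked document's conditional likelihood with a softmax over cosine similarities. The hash-based letter-trigram reduction, layer sizes, smoothing factor gamma, and the use of a single shared tower for queries and documents are assumptions of this sketch; the training step that maximizes the likelihood is not shown:

```python
import numpy as np

def letter_trigrams(text, dim=5000):
    """Word-hashing layer: bag of letter trigrams hashed into a fixed-size vector
    (the dimensionality-reduction step applied before the deep layers)."""
    v = np.zeros(dim)
    for word in text.lower().split():
        padded = f"#{word}#"
        for i in range(len(padded) - 2):
            v[hash(padded[i:i + 3]) % dim] += 1.0
    return v

class DSSMTower:
    """One branch of the model: word-hashed input -> two tanh layers -> semantic vector."""
    def __init__(self, dim_in=5000, hidden=300, dim_out=128, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (dim_in, hidden))
        self.W2 = rng.normal(0, 0.1, (hidden, dim_out))

    def __call__(self, text):
        h = np.tanh(letter_trigrams(text) @ self.W1)
        return np.tanh(h @ self.W2)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def clicked_doc_likelihood(query, clicked, others, tower, gamma=10.0):
    """Softmax over cosine similarities: P(clicked doc | query), the quantity
    the training objective maximizes over the click-through data."""
    q_vec = tower(query)
    docs = [clicked] + others
    sims = np.array([gamma * cosine(q_vec, tower(d)) for d in docs])
    sims -= sims.max()
    probs = np.exp(sims) / np.exp(sims).sum()
    return probs[0]

if __name__ == "__main__":
    tower = DSSMTower()
    print(clicked_doc_likelihood("cheap cars",
                                 "affordable used automobiles for sale",
                                 ["history of the automobile",
                                  "best pizza recipes"],
                                 tower))
```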
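For application 20150278200, a convolutional latent semantic model captures local context with a convolution over word n-gram windows and global context with max pooling before projecting to a concept vector; similarity between two items is then measured in that space. The hashing-based n-gram representation and layer sizes here are assumptions:

```python
import numpy as np

def word_ngram_vectors(text, dim=3000):
    """Represent each position by hashing the word tri-gram window centered on it,
    capturing local context (a simplified stand-in for word hashing)."""
    words = ["<s>"] + text.lower().split() + ["</s>"]
    vecs = []
    for i in range(1, len(words) - 1):
        v = np.zeros(dim)
        for w in words[i - 1:i + 2]:
            v[hash(w) % dim] += 1.0
        vecs.append(v)
    return np.stack(vecs)                                    # (positions, dim)

class ConvLatentSemanticModel:
    def __init__(self, dim_in=3000, conv=300, dim_out=128, seed=0):
        rng = np.random.default_rng(seed)
        self.Wc = rng.normal(0, 0.1, (dim_in, conv))         # convolution filter
        self.Ws = rng.normal(0, 0.1, (conv, dim_out))        # semantic projection

    def concept_vector(self, text):
        local = np.tanh(word_ngram_vectors(text) @ self.Wc)  # local features per window
        pooled = local.max(axis=0)                           # max pooling: global context
        return np.tanh(pooled @ self.Ws)                     # continuous concept vector

def similarity(model, a, b):
    va, vb = model.concept_vector(a), model.concept_vector(b)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

if __name__ == "__main__":
    model = ConvLatentSemanticModel()
    print(similarity(model, "cheap cars", "affordable used automobiles"))
```

As with the DSSM sketch, only the forward pass is shown; the abstract's click-through-based training would fit the weight matrices so that similar query-document pairs score highly.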