Patent application number | Description | Published |
20080243481 | Large Language Models in Machine Translation - Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n−1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus. | 10-02-2008 |
20080262828 | Encoding and Adaptive, Scalable Accessing of Distributed Models - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 10-23-2008 |
20100005080 | SYSTEM AND METHOD FOR ANALYZING DATA RECORDS - A method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel. In each respective process of the first plurality of processes, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values. Zero or more emit operators are applied to each of the zero or more produced values so as to add corresponding information to an intermediate data structure. Information from a plurality of the intermediate data structures is aggregated to produce output data. | 01-07-2010 |
20100114965 | SYSTEM AND METHOD FOR IMPROMPTU SHARED COMMUNICATION SPACES - Communications between entities who may share common interests. For entities determined to be sharing common interests (e.g., searching using the same terms or topics, browsing a page, a site or a group of topically related sites), options for communication among the entities are provided. For example, a chat room may be dynamically created for persons who are currently searching or browsing the same or related information. As another example, a “homepage” may be created for each query and contain various types of information related to the query. A permission module controls which entities may participate, what types of information (and from what sources) an entity can (or desires to) receive, and what types of information the entity may (or desires to) share. | 05-06-2010 |
20110022605 | DOCUMENT SCORING BASED ON LINK-BASED CRITERIA - A method may include receiving a document and an initial score for the document; determining that there has been a decrease in a rate or quantity of new links that point to the document over time; classifying the document as stale in response to the determining; decreasing the initial score for the document, resulting in an updated score; and ranking the document with regard to at least one other document based, at least in part, on the score. | 01-27-2011 |
20110029542 | DOCUMENT SCORING BASED ON DOCUMENT INCEPTION DATE - A system may determine a document inception date associated with a document, generate a score for the document based, at least in part, on the document inception date, and rank the document with regard to at least one other document based, at least in part, on the score. | 02-03-2011 |
20110153577 | Query Processing System and Method for Use with Tokenspace Repository - A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document. | 06-23-2011 |
20110179118 | Shared Communication Space Invitations - A computer-implemented method of providing invitations to a shared communication space, performed by a server system, includes providing the shared communication space, which includes content associated with a set of characteristics, and identifying a user, in accordance with a set of characteristics associated with the user and the set of characteristics associated with the content in the shared communication space. The method further includes sending to the identified user an invitation to participate in the shared communication space, and upon acceptance of the invitation by the user, enabling the user to access the shared communication space and to exchange information with other participants in the shared communication space via the shared communication space. | 07-21-2011 |
20110258185 | DOCUMENT SCORING BASED ON DOCUMENT CONTENT UPDATE - A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score. | 10-20-2011 |
20110264671 | DOCUMENT SCORING BASED ON DOCUMENT CONTENT UPDATE - A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score. | 10-27-2011 |
20120005199 | DOCUMENT SCORING BASED ON DOCUMENT CONTENT UPDATE - A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score. | 01-05-2012 |
20120016870 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-19-2012 |
20120016871 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-19-2012 |
20120016874 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-19-2012 |
20120016888 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-19-2012 |
20120016889 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-19-2012 |
20120023098 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 01-26-2012 |
20120209838 | DOCUMENT SCORING BASED ON QUERY ANALYSIS - A system may determine an extent to which a document is selected when the document is included in a set of search results, generate a score for the document based, at least in part, on the extent to which the document is selected when the document is included in a set of search results; and rank the document with regard to at least one other document based, at least in part, on the score. | 08-16-2012 |
20120215787 | System and Method for Analyzing Data Records - A method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel. In each respective process of the first plurality of processes, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values. Zero or more emit operators are applied to each of the zero or more produced values so as to add corresponding information to an intermediate data structure. Information from a plurality of the intermediate data structures is aggregated to produce output data. | 08-23-2012 |
20130046530 | ENCODING AND ADAPTIVE, SCALABLE ACCESSING OF DISTRIBUTED MODELS - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 02-21-2013 |
20130212076 | Generating Content Snippets Using a Tokenspace Repository - A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document. | 08-15-2013 |
20130346059 | LARGE LANGUAGE MODELS IN MACHINE TRANSLATION - Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n−1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus. | 12-26-2013 |
20140096138 | System and Method For Large-Scale Data Processing Using an Application-Independent Framework - A large-scale data processing system and method for processing data in a distributed and parallel processing environment is disclosed. The system comprises a set of interconnected computing systems, each having one or more processors and memory. The set of interconnected computing systems include: a set of application-independent map modules for reading portions of input files containing data, and for producing intermediate data values by applying at least one user-specified, application-specific map operation to the data; a set of intermediate data structures distributed among a plurality of the interconnected computing systems for storing the intermediate data values; and a set of application-independent reduce modules, distinct from the plurality of application-independent map modules, for producing final output data by applying at least one user-specified, application-specific reduce operation to the intermediate data values. | 04-03-2014 |
20140257787 | ENCODING AND ADAPTIVE, SCALABLE ACCESSING OF DISTRIBUTED MODELS - Systems, methods, and apparatus for accessing distributed models in automated machine processing, including using large language models in machine translation, speech recognition and other applications. | 09-11-2014 |
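
The language-model entries above (20080243481 and 20130346059) describe a backoff score computed from a backoff factor and the relative frequency of a lower-order n-gram. The following is a minimal sketch of one way such a scheme could work, not the method claimed in those applications: the function names, the toy corpus, and the backoff factor of 0.4 are all assumptions chosen for illustration.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram, counts, total_tokens, backoff_factor=0.4):
    """Score the last token of `ngram` given the tokens before it.

    If the n-gram was seen, return its relative frequency
    count(ngram) / count(context). Otherwise drop the first token,
    multiply by `backoff_factor` (a hypothetical value), and retry
    with the shorter (n-1)-gram, ending at the unigram frequency.
    """
    ngram = tuple(ngram)
    penalty = 1.0
    while len(ngram) > 1:
        if counts[ngram] > 0:
            # A seen n-gram implies its context was seen, so no zero division.
            return penalty * counts[ngram] / counts[ngram[:-1]]
        ngram = ngram[1:]          # back off to the (n-1)-gram
        penalty *= backoff_factor  # apply the backoff factor once per step
    return penalty * counts[ngram] / total_tokens


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=3)

seen = backoff_score(("the", "cat"), counts, len(tokens))
unseen = backoff_score(("on", "cat"), counts, len(tokens))
print(seen, unseen)
```

In this sketch a seen bigram is scored by its relative frequency alone, while an unseen one falls back to the unigram frequency scaled by the backoff factor. The resulting scores are not normalized probabilities.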