Rakesh Agrawal, San Jose, CA US

Patent application number	Description	Published
20080243524	System and Method for Automating Internal Controls - A computer-based system and method to enforce, monitor, and assess internal controls over financial reporting is provided. A bottom-up approach is used to model transaction-control workflows using logs of past transaction activity executions. Past workflows are reconstructed from these logs and reconstruction rules. The transaction-control workflows are compared with these reconstructed past workflows to determine whether transactions are compliant with the internal controls.	10-02-2008
20080250008	Query Specialization - A system, a method and computer-readable media for identifying and presenting potential query refinements for a user's search input. Documents are identified as being responsive to the search input. A query log is accessed to identify previously entered queries that also returned one or more of the identified documents. From these previously entered queries, a portion of the queries are selected as potential query refinements. Thereafter, the potential query refinements are displayed to the user.	10-09-2008
20080282096	SYSTEM AND METHOD FOR ORDER-PRESERVING ENCRYPTION FOR NUMERIC DATA - A system, method, and computer program product to automatically eliminate the distribution information available for reconstruction from a disguised dataset. The invention flattens input numerical values into a substantially uniformly distributed dataset, then maps the uniformly distributed dataset into equivalent data in a target distribution. The invention allows the incremental encryption of new values in an encrypted database while leaving existing encrypted values unchanged. The flattening comprises (1) partitioning, (2) mapping, and (3) saving auxiliary information about the data processing, which is encrypted and not updated. The partitioning is MDL based, and includes a growth phase for dividing a space into fine partitions and a prune phase for merging some partitions together.	11-13-2008
20090006380	System and Method for Tracking Database Disclosures - A system and method is provided for identifying the source of an unauthorized database disclosure. The system and method stores a plurality of past database queries and determines the relevance of the results of the past database queries (query results) to a sensitive table containing the unauthorized disclosed data. The system and method also ranks the past database queries based on the determined relevance. A list of the most relevant past database queries can then be generated which are ranked according to the relevance, such that the highest ranked queries on the list are most similar to said disclosed data. Three techniques used in embodiments of the invention include partial tuple matching, statistical linkage and deviation probability gain.	01-01-2009
20090006431	SYSTEM AND METHOD FOR TRACKING DATABASE DISCLOSURES - A system and method is provided for identifying the source of an unauthorized database disclosure. The system and method stores a plurality of past database queries and determines the relevance of the results of the past database queries (query results) to a sensitive table containing the unauthorized disclosed data. The system and method also ranks the past database queries based on the determined relevance. A list of the most relevant past database queries can then be generated which are ranked according to the relevance, such that the highest ranked queries on the list are most similar to said disclosed data. Three techniques used in embodiments of the invention include partial tuple matching, statistical linkage and deviation probability gain.	01-01-2009
20090228353	QUERY CLASSIFICATION BASED ON QUERY CLICK LOGS - Methods are provided for the classification of search engine queries and associated documents based on search engine query click logs. One or more seed documents or queries are provided that contain content that is representative of a category. A query click log containing information regarding queries entered by at least one user into the search engine and documents subsequently clicked in search engine results corresponding with the queries is analyzed to determine which one or more queries resulted in clicks on the seed documents. Information is stored associating the one or more queries with the category if they resulted in clicks on the seed documents.	09-10-2009
20090313286	GENERATING TRAINING DATA FROM CLICK LOGS - Data from a click log may be used to generate training data for a search engine. The pages clicked as well as the pages skipped by a user may be used to assess the relevance of a page to a query. Labels for training data may be generated based on data from the click log. The labels may pertain to the relevance of a page to a query.	12-17-2009
20090327748	SYSTEM AND METHOD FOR FAST QUERYING OF ENCRYPTED DATABASES - A system, method, computer program product, and data management service that allows any comparison operation to be applied on encrypted data, without first decrypting the operands. The encryption scheme of the invention allows equality and range queries as well as the aggregation operations of MAX, MIN, and COUNT. The GROUPBY and ORDERBY operations can also be directly applied. Query results produced using the invention are sound and complete, the invention is robust against cryptanalysis, and its security strictly relies on the choice of a private key. Order-preserving encryption allows standard database indexes to be built over encrypted tables. The invention can easily be integrated with existing systems.	12-31-2009
20100114925	CUSTOMIZED SEARCH - Techniques are disclosed herein for providing a custom search engine. In one aspect, a first search query is received from a requester. First search results containing search result items that match the first search query are obtained. At least one sub-query is generated from the first search results. The generating is based on rules for a particular custom search engine. Second search results that match the sub-query are then obtained. A search result set is formed from a corpus that includes the first search results and the second search results. The generating of the search result set is based on the rules for the particular custom search engine. The search result set is provided to the requester. In one aspect, an interface for designing a custom search engine is provided. The interface allows the designer to specify the layout of a search results page.	05-06-2010
20100153388	METHODS AND APPARATUS FOR RESULT DIVERSIFICATION - Methods, apparatus, and systems directed to receiving search queries, retrieving documents, computing the number of categories to present for a given query, computing the number of results to show in each category, computing an ordering of categories, and for all the result pages beyond the first page employing user interface elements that optionally allow the user to quickly zoom in on a specific category and get more results belonging to that category.	06-17-2010
20100185577	OBJECT CLASSIFICATION USING TAXONOMIES - As provided herein objects from a source catalog, such as a provider's catalog, can be added to a target catalog, such as an enterprise master catalog, in a scalable manner utilizing catalog taxonomies. A baseline classifier determines probabilities for source objects to target catalog classes. Source objects can be assigned to those classes with probabilities that meet a desired threshold and meet a desired rate. A classification cost for target classes can be determined for respective unassigned source objects, which can comprise determining an assignment cost and separation cost for the source objects for respective desired target classes. The separation and assignment costs can be combined to determine the classification cost, and the unassigned source objects can be assigned to those classes having a desired classification cost.	07-22-2010
20100250333	OPTIMIZING CASHBACK RATES - A method, system, and medium are provided for determining optimal sales rebate rates. Historical data, including sales data, price data, and rebate data, are received, along with ongoing current data from current rebate transactions. Changes across the spectrum of data are determined and calculations are used to obtain an optimal sales rebate rate for one or more products or services utilizing statistical models, including but not limited to, a linear rebate rate model and a logarithmic-linear rebate rate model for one or more products or services. A mathematical analysis determines the appropriate model to use to obtain the optimal sales rebate rate. The optimal sales rebate rate may be applied to computing or non-computing environments, in whole or as a combination of both computing and non-computing environments.	09-30-2010
20100287060	PROVIDING TIME-SENSITIVE INFORMATION FOR PURCHASE DETERMINATIONS - A method, system, and medium are provided that are directed to providing a user with time-sensitive information that is usable to determine when to purchase a product. In accordance with embodiments of the technology, exemplary steps include using historical product information to generate time-sensitive information. Moreover, in response to receiving from a user a request to receive information describing a given product, time-sensitive information is caused to be presented. For example, time-sensitive information might be usable by the user to determine when to purchase the given product and an alternative product.	11-11-2010
20110066650	QUERY CLASSIFICATION USING IMPLICIT LABELS - Described is a technology for automatically generating labeled training data for training a classifier based upon implicit information associated with the data. For example, whether a query has commercial intent can be classified based upon whether the query was submitted at a commercial website's search portal, as logged in a toolbar log. Positive candidate query-related data is extracted from the toolbar log based upon the associated implicit information. A click log is processed to obtain negative query-related data. The labeled training data is automatically generated by separating at least some of the positive candidate query data from the remaining positive candidate query data based upon the negative query data. The labeled training data may be used to train a classifier, such as to classify an online search query as having a certain type of intent or not.	03-17-2011
20110119269	Concept Discovery in Search Logs - Described is a search (e.g., web search) technology in which concepts are returned in response to a query in addition to (or instead of) search results in the form of traditional links. Each concept generally corresponds to a set of links to content that are more directed towards a possible user intention, or information need, with respect to that query. If a user selects a concept, that concept's links are exposed to facilitate selection of a document the user finds relevant. In this manner, much more than the top ten ranked links may be provided for a query, each set of other links arranged by the concepts. Also described is processing a query log or other data store to optionally find related queries and find the concepts, e.g., by clustering a relationship graph built from the query log to find dense subgraphs representative of the concepts.	05-19-2011
20110307504	COMBINING ATTRIBUTE REFINEMENTS AND TEXTUAL QUERIES - A user submits an unstructured query that is analyzed to determine a mapping from attributes to attribute values. One or more matching items from a structured data set are determined based on the attribute values of attributes associated with the items. The matching items are displayed. One or more refinement attributes are displayed, each with one or more attribute values. The attribute values in the refinements that correspond to the attribute values of the query are shown as selected. If the user selects any of the refinement attributes, the query is revised to incorporate the attribute values of the selected refinements. New matching items are determined using the revised structured query. The revised structured query and the new matching items are displayed. This process can be iterated, by modification of the query or the refinements. The matching items, the selected refinement attribute values and the query are synchronized.	12-15-2011
20110314012	DETERMINING QUERY INTENT - A tree structure has a node associated with each category of a hierarchy of item categories. Child nodes of the tree are associated with sub-categories of the categories associated with parent nodes. Training data including received queries and indicators of a selected item category for each received query is combined with the tree structure by associating each query with the node corresponding to the selected category of the query. When a query is received, a classifier is applied to the nodes to generate a probability that the query is intended to match an item of the category associated with the node. The classifier is applied until the probability is below a threshold. One or more categories associated with the nodes that are closest to the intent of the received query are selected and indicators of items of those categories that match the received query are output.	12-22-2011
20120059739	PROVIDING TIME-SENSITIVE INFORMATION FOR PURCHASE DETERMINATIONS - A method, system, and medium are provided that are directed to providing a user with time-sensitive information that is usable to determine when to purchase a product. In accordance with embodiments of the technology, exemplary steps include using historical product information to generate time-sensitive information. Moreover, in response to receiving from a user a request to receive information describing a given product, time-sensitive information is caused to be presented. For example, time-sensitive information might be usable by the user to determine when to purchase the given product and an alternative product.	03-08-2012
20120066094	PROVIDING TIME-SENSITIVE INFORMATION FOR PURCHASE DETERMINATIONS - A method, system, and medium are provided that are directed to providing a user with time-sensitive information that is usable to determine when to purchase a product. In accordance with embodiments of the technology, exemplary steps include using historical product information to generate time-sensitive information. Moreover, in response to receiving from a user a request to receive information describing a given product, time-sensitive information is caused to be presented. For example, time-sensitive information might be usable by the user to determine when to purchase the given product and an alternative product.	03-15-2012
20120089588	SEARCH RESULT DIVERSIFICATION - Methods, apparatus, and systems directed to receiving search queries, retrieving documents, computing the number of categories to present for a given query, computing the number of results to show in each category, computing an ordering of categories, and for all the result pages beyond the first page employing user interface elements that optionally allow the user to quickly zoom in on a specific category and get more results belonging to that category.	04-12-2012
20120314941	ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA - Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text and images “training sets” to improve automated classifiers including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest.	12-13-2012
20130275441	COMPOSING TEXT AND STRUCTURED DATABASES - A framework is provided for composing texts about objects with structured information about these objects, and thus disclosed are methodologies for linking information from at least two data sources—one comprising a plurality of documents comprising text pertaining to at least one object, and one comprising a plurality of structured records comprising at least one characteristic of the at least one object, each characteristic comprising one property name and an associated property value corresponding to the property name for the at least one object—by determining one or more instance-based traits for each object in both data sources and associating at least one record with at least one document that refers to each object, each trait comprising one or more characteristics that identifiably distinguish each object from all other objects.	10-17-2013
20140270497	ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA - Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text and images “training sets” to improve automated classifiers including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest.	09-18-2014
20140324982	TOPIC IDENTIFIERS ASSOCIATED WITH GROUP CHATS - Text messages over some period of time are collected. Topic identifiers, such as hashtags, are extracted from the text messages. The text messages associated with each topic identifier are processed to identify which topic identifiers are associated with group chats based on information associated with the text messages such as the times when the text messages were generated and whether the text messages identify user accounts. The topic identifiers that are determined to be associated with the group chats are incorporated into applications that allow users to search for group chats, and to view text messages from past group chats.	10-30-2014
