Zheng Chen, Beijing CN

Patent application number	Description	Published
20080208819	GUI BASED WEB SEARCH - An exemplary computer implemented graphics-based Web search system includes a search input control and a results presentation control where the search input control is configured to receive user input to establish a relationship between a query and one or more information tags associated with search results provided by a search engine in response to the query and wherein the results presentation control is configured to re-order the search results in response to the relationship. Such a system allows a user to define and refine search intent and enhance the user's search experience. Various other exemplary systems, methods, devices, etc. are also disclosed.	08-28-2008
20080208840	Diverse Topic Phrase Extraction - Systems and methods for implementing diverse topic phrase extraction are disclosed. According to one implementation, multiple word candidate phrases are extracted from a corpus and weighed. One or more documents are re-weighed to identify less obvious candidate topics using latent semantic analysis (LSA). Phrase diversification is then used to remove redundancy and select informative and distinct topic phrases.	08-28-2008
20080208841	CLICK-THROUGH LOG MINING - Click-through log mining is described. Raw search click-through log data is processed to generate ordered query keywords, utilizing an algorithm to expand user-submitted keywords to include high frequency user queries, managing the keywords for a keyword expansion file, analyzing the algorithm performance on a bidding criteria, and identifying related phrases with similar page-click behaviors for advertisements.	08-28-2008
20080215543	Graph-based search leveraging sentiment analysis of user comments - A search system and method is provided. The method includes constructing a graph-based query that is indicative of a user's preference-levels for different features of a search item (a product, for example). The constructed graph-based query is executed by comparing the user's preference-levels for the different features of the product, which are graphically represented in the query, with information related to sentiments expressed by other users regarding the product. Information related to the sentiments expressed by other users regarding the product can include system-generated product performance graphs constructed from comments regarding the product obtained from the World Wide Web (or other network). Results returned and output upon execution of the graph-based query include system-generated product performance graphs that are similar to the user-submitted query.	09-04-2008
20080215565	SEARCHING HETEROGENEOUS INTERRELATED ENTITIES - Systems and methods for searching heterogeneous interrelated entities for a heterogeneous entities search query are disclosed herein. A user may enter the heterogeneous entities search query. The search retrieves and returns multiple types of heterogeneous entities. The retrieved heterogeneous interrelated entities are searched in a unified matrix that represents relationships between one or more heterogeneous entities. The retrieved heterogeneous interrelated entities may have one or more entity types. The set of retrieved interrelated entities may also be ranked based on the similarity between each entity and the search query. Feedback may also be incorporated into the system to improve search accuracy.	09-04-2008
20080215571	PRODUCT REVIEW SEARCH - This disclosure describes various exemplary methods, computer program products, and user interfaces that provide results for a product review search with opinion snippets and opinion visual graphs. This disclosure describes identifying user opinions by extracting passages that contain subjective opinions from web pages; ranking the user opinions by incorporating sentiment orientations and sentiment topics, where the sentiment orientations are positive or negative; and generating review snippets to indicate user sentiment orientations and to describe user opinions toward product features. This disclosure improves a user product search experience from the following aspects: understanding the product review from snippets instead of browsing the web page; obtaining more information by reading reviews in a shorter time period; and obtaining overall opinions of users of the web through visualized opinion summarization.	09-04-2008
20080215574	Efficient Retrieval Algorithm by Query Term Discrimination - An exemplary method for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term; receiving a plurality of terms, optionally as a query; ranking the plurality of terms for importance based at least in part on the document sets for the plurality of terms where the ranking comprises using an inverse document frequency algorithm; selecting a number of ranked terms based on importance where each selected, ranked term comprises its corresponding document set wherein each document in a respective document set comprises a document identification number; forming a union set based on the document sets associated with the selected number of ranked terms; and, for a document identification number in the union set, scanning a document set corresponding to an unselected term for a matching document identification number. Various other exemplary systems, methods, devices, etc. are also disclosed.	09-04-2008
20080215997	WEBPAGE BLOCK TRACKING GADGET - An exemplary web browser system includes a selection module for selecting a webpage block and recording information about a selected webpage block; a tracking module for tracking changes to a selected webpage block based at least in part on the recorded information for that webpage block; and a display module for displaying a selected webpage block wherein the tracking module updates the display module as to changes to the selected webpage block. Various other exemplary systems, methods, devices are also disclosed.	09-04-2008
20080249762	CATEGORIZATION OF DOCUMENTS USING PART-OF-SPEECH SMOOTHING - A method and system is provided for classifying documents based on the subjectivity of the content of the documents using a part-of-speech analysis to help account for unseen words. A classification system trains a classifier using the parts of speech of training documents so that the classifier can classify unseen words based on the part of speech of the unseen word. The classification system then trains a part-of-speech model using the parts of speech of the n-grams of training data and labels of the training documents, and trains a term model using the term unigrams and labels. To classify a target document, the classification system applies the part-of-speech model to the part-of-speech n-grams of the target document and the term model to term n-grams of the target document.	10-09-2008
20080249764	Smart Sentiment Classifier for Product Reviews - A sentiment classifier is described. In one implementation, a system applies both full text and complex feature analyses to sentences of a product review. Each analysis is weighted prior to linear combination into a final sentiment prediction. A full text model and a complex features model can be trained separately offline to support online full text analysis and complex features analysis. Complex features include opinion indicators, negation patterns, sentiment-specific sections of the product review, user ratings, sequence of text chunks, and sentence types and lengths. A Conditional Random Field (CRF) framework provides enhanced sentiment classification for each segment of a complex sentence to enhance sentiment prediction.	10-09-2008
20080256444	Internet Visualization System and Related User Interfaces - Systems and methods are described for an Internet visualization system and related user interfaces. In one implementation, the system analyzes Internet search logs to determine most popular search queries across the world at a current time. A user interface displays a keyword of each of the most popular queries in a single visual display that relates each query to a geographical location of greatest popularity. The system can also filter queries according to demographics. In one implementation the user interface provides a 3-dimensional Internet visualization that adopts an ocean or seascape theme. The ocean floor displays a map of the world, and query bubbles rise from geographical locations on the map. The size and duration of each query bubble denotes the relative popularity of a given query.	10-16-2008
20080281821	Concept Network - A concept network that can be generated in response to a user query. Various embodiments include analysis of structure information, for example, where such information is based at least in part on Universal Resource Locators (URLs) of Web sites or data storage locations. A concept network may be used with a search tool where the search tool searches a plurality of sites (e.g., Web sites, data storage locations, etc.). In such an example, each site location is arranged with a node. Certain ones of the nodes are connected by at least one link. The concept network selects a portion of certain ones of the nodes based on the link, wherein the at least one link is used for content purposes.	11-13-2008
20080281834	Block tracking mechanism for web personalization - Described is a technology by which blocks of web pages may be selected, such as for building a user-personalized web page containing selected blocks. A selection mechanism, such as a browser toolbar add-on, provides a user interface for selecting blocks, and records information about selected blocks. A block tracking mechanism (e.g., a daemon program) uses the information to locate selected blocks of the web pages, including when the web page containing the block is updated with respect to content and/or layout. The block tracking mechanism may update a local gadget that, when invoked, such as by browsing to a particular web page, shows updated versions of the block on a personalized web page. Blocks may be efficiently located by processing trees representing web pages into reduced trees, and then by performing a minimum distance mapping algorithm on the reduced trees.	11-13-2008
20080288348	Ranking online advertisements using retailer and product reputations - A method for ranking online advertisements using retailer reputation and product reputation. In one implementation, a query may be received. Advertisements may be selected by determining a level of relevance between the query and each advertisement and selecting the advertisements with a level of relevance above a pre-determined level of relevance. A predicted reputation for a retailer and a predicted reputation for a product may be retrieved for each of the selected advertisements. The selected advertisements may then be ranked based on the predicted reputation for the retailer and the predicted reputation of the product. The ranking of the selected advertisements may be accomplished by calculating a ranking score for each selected advertisement based on the retailer predicted reputation and the product predicted reputation. The selected advertisements may then be displayed according to the ranking.	11-20-2008
20080288481	Ranking online advertisement using product and seller reputation - Described is a technology by which online advertisements to be returned with a query response are ranked according to reputation. The reputation may correspond to a product or service and/or seller reputation. In one example, a set of relevant advertisement items are located and ranked using reputation data as a factor. For example, for each item, a ranking value is based on a mathematical combination of a product reputation score, a seller reputation score and a relevance score, with the items ranked by their computed values. The scores may be weighted differently. The reputation data may be mined from a review source, such as customer reviews available on the web. In one example implementation, a 3-gram model that considers terms in the review along with the two terms preceding each term is used to analyze the reviews to determine whether each review is positive or negative with respect to the reputation.	11-20-2008
20080288483	Efficient retrieval algorithm by query term discrimination - Described is an efficient retrieval mechanism that quickly locates documents (e.g., corresponding to online advertisements) based on query term discrimination. A topmost subset (e.g., two) of search terms is selected according to their ranked importance, e.g., as ranked by inverted document frequency. The topmost terms are then used to narrow the number of rows of an inverted query index that are searched to find document identifiers and associated scores, such as computed offline by a BM25 algorithm. For example, for each document identifier of each important term, a fast search within each of the narrowed subset of rows (that also contain that document identifier) may be performed by comparing document identifiers to jump a pointer within each other row, followed by a binary search to locate a particular document. The scores of the set of particular documents may then be used to rank their relative importance for returning as results.	11-20-2008
20080288491	User segment suggestion for online advertising - Described is a behavioral targeting technology for online advertising, by which an original attribute is uniformly expanded. Users that meet an original attribute are aggregated into a mid-result used to determine similarity relative to candidate attribute types. The most similar candidate attributes are selected for the expanded attribute. A URL/URL pattern suggestion technology is provided, with similarity computed from users/URLs visited by the users. URLs are separated into URL tree nodes, for calculating the number of users who have visited each URL and the number of users who have visited the URL on a sub-tree whose root is the node. URL/URL patterns are output based on similarity. Domains are also suggested based on user-visits. Similarities between pairs of domains may be computed (e.g., offline), with an output for a given domain provided based on its similarity with each other domain.	11-20-2008
20080300971	ADVERTISEMENT APPROVAL BASED ON TRAINING DATA - A system for determining whether to approve a target document (e.g., advertisement) is provided. The system trains a classifier using tuples of words from appropriate documents and tuples of words from inappropriate documents. To approve a target document, the system identifies tuples of words of the target document. The system then applies the classifier to the identified tuples to classify the document as being appropriate or inappropriate. If the document is classified as appropriate, the system automatically approves the document.	12-04-2008
20080301117	KEYWORD USAGE SCORE BASED ON FREQUENCY IMPULSE AND FREQUENCY WEIGHT - A method and system for assessing keyword usage based on frequency of usage of the keywords during various periods is provided. A keyword usage measurement system is provided with the frequency of keywords during various periods. The measurement system then calculates a recent usage score for a keyword by combining a frequency impulse score for the keyword with a frequency weight for the keyword. The frequency impulse score for a keyword indicates whether a recent change in the frequency of the keyword has occurred. The frequency weight for a keyword indicates a recent measure of the frequency of the keyword.	12-04-2008
20080313180	IDENTIFICATION OF TOPICS FOR ONLINE DISCUSSIONS BASED ON LANGUAGE PATTERNS - A topic identification system identifies topics of online discussions by iteratively identifying topic words or keywords of the online discussions and identifying language patterns associated with those keywords. The topic identification system starts out with an initial set of keywords and identifies language patterns that each include a keyword. The topic identification system then uses the identified language patterns to identify additional keywords of the online discussion that match the patterns. The topic identification system then again identifies language patterns using the keywords including the newly identified keywords. The topic identification system may repeat the process of identifying language patterns and keywords until a termination criterion is satisfied.	12-18-2008
20090006045	FORECASTING TIME-DEPENDENT SEARCH QUERIES - Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.	01-01-2009
20090006284	FORECASTING TIME-INDEPENDENT SEARCH QUERIES - Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.	01-01-2009
20090006294	IDENTIFICATION OF EVENTS OF SEARCH QUERIES - Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.	01-01-2009
20090006312	DETERMINATION OF TIME DEPENDENCY OF SEARCH QUERIES - Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.	01-01-2009
20090006313	FORECASTING SEARCH QUERIES BASED ON TIME DEPENDENCIES - Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.	01-01-2009
20090006326	REPRESENTING QUERIES AND DETERMINING SIMILARITY BASED ON AN ARIMA MODEL - Representing queries and determining similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model is provided. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.	01-01-2009
20090006365	IDENTIFICATION OF SIMILAR QUERIES BASED ON OVERALL AND PARTIAL SIMILARITY OF TIME SERIES - Techniques for identifying similar queries based on their overall similarity and partial similarity of time series of frequencies of the queries are provided. To identify queries that are similar to a target query, the query analysis system generates, for each query, an overall similarity score for that query and the target query based on the time series of the query and the target query. The query analysis system also generates, for each query, partial similarity scores for the query and the target query based on various time sub-series of the overall time series of the queries. The query analysis system then identifies queries as being similar to the target query based on the overall similarity scores and the partial similarity scores of the queries.	01-01-2009
20090063461	USER QUERY MINING FOR ADVERTISING MATCHING - Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more highly semantically relevant keyword sets after being filtered by a threshold.	03-05-2009
20090106019	METHOD AND SYSTEM FOR PRIORITIZING COMMUNICATIONS BASED ON SENTENCE CLASSIFICATIONS - A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.	04-23-2009
20090119284	METHOD AND SYSTEM FOR CLASSIFYING DISPLAY PAGES USING SUMMARIES - A method and system for classifying display pages based on automatically generated summaries of display pages. A web page classification system uses a web page summarization system to generate summaries of web pages. The summary of a web page may include the sentences of the web page that are most closely related to the primary topic of the web page. The summarization system may combine the benefits of multiple summarization techniques to identify the sentences of a web page that represent the primary topic of the web page. Once the summary is generated, the classification system may apply conventional classification techniques to the summary to classify the web page. The classification system may use conventional classification techniques such as a Naïve Bayesian classifier or a support vector machine to identify the classifications of a web page based on the summary generated by the summarization system.	05-07-2009
20090132530	WEB CONTENT MINING OF PAIR-BASED DATA - Described herein is technology for, among other things, mining pair-based data on the web. The technology involves an online pair-based data mining system as well as an offline SVM training system. By subjecting pair-based input data to the systems, one may grow a pool of pair-based data which share characteristics of the pair-based input data in a more efficient manner.	05-21-2009
20090193047	CONTRUCTING WEB QUERY HIERARCHIES FROM CLICK-THROUGH DATA - The claimed subject matter is directed to constructing query hierarchies in response to a query request. To construct a query hierarchy, a list of related candidate queries is generated in response to the received query request. The list of related candidate queries is generated by determining the relative coverage of information shared by the candidate queries and the query request. Relationships between the submitted query request and the candidate queries in the list are determined based upon the extent of relative coverage of information shared by the candidate queries and the query request. A query hierarchy is then constructed to reflect the determined relationships between the query request and the candidate queries.	07-30-2009
20090222321	PREDICTION OF FUTURE POPULARITY OF QUERY TERMS - Disclosed is a system and method that enables a computer system to predict which query terms in a search will be popular. The system creates a unified model that determines the future popularity of a query term over a period of time in the future. The unified model averages the results of three different prediction models to obtain a prediction of the future popularity of a query term. The prediction from the unified model is compared against a threshold value of popularity over a time period. When the predicted popularity of the query exceeds the threshold, the term is stored. In some embodiments the period that the term exceeds the threshold may also be stored.	09-03-2009
20090299831	ADVERTISER MONETIZATION MODELING - Embodiments of the claimed subject matter provide a method and system for modeling advertiser monetization. The claimed subject matter provides a method and system from which an advertisement may be evaluated according to various metrics to determine a quality relative to other advertisements. The relative quality considers the content of the advertisement, the performance of the advertisement and the history of the advertiser's bidding behavior.	12-03-2009
20090299855	PREDICTING KEYWORD MONETIZATION - Embodiments of the claimed subject matter provide a method and system for predicting bidding keyword monetization. The claimed subject matter provides a method and system with which the value of a keyword for the purpose of relevant online advertisement may be evaluated according to various metrics to determine a bidding landscape for use in advertising campaigns. The value of the keyword considers certain attributes related to the monetization of the keyword.	12-03-2009
20090299967	USER ADVERTISEMENT CLICK BEHAVIOR MODELING - Described herein is technology for, among other things, mining similar user clusters based on user advertisement click behaviors. The technology involves methods and systems for mining similar user clusters based on log data available on an online advertising platform. By building a user linkage representation based on one or more attributes from the log data, the similar user clusters can be harvested in a more efficient manner.	12-03-2009
20090313706	METHOD AND SYSTEM FOR DETECTING WHEN AN OUTGOING COMMUNICATION CONTAINS CERTAIN CONTENT - A method and system for detecting whether an outgoing communication contains confidential information or other target information is provided. The detection system is provided with a collection of documents that contain confidential information, referred to as “confidential documents.” When the detection system is provided with an outgoing communication, it compares the content of the outgoing communication to the content of the confidential documents. If the outgoing communication contains confidential information, then the detection system may prevent the outgoing communication from being sent outside the organization. The detection system detects confidential information based on the similarity between the content of an outgoing communication and the content of confidential documents that are known to contain confidential information.	12-17-2009
20090327320	CLUSTERING AGGREGATOR FOR RSS FEEDS - A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.	12-31-2009
20100023508	SEARCH ENGINE ENHANCEMENT USING MINED IMPLICIT LINKS - An implicit links enhancement system and method for search engines that generates implicit links obtained from mining user access logs to facilitate enhanced local searching of web sites and intranets. Embodiments of the implicit links search enhancement system and method include extracting implicit links by mining users' access patterns and then using a modified link analysis algorithm to re-rank search results obtained from traditional search engines. More specifically, embodiments of the method include extracting implicit links from a user access log, generating an implicit links graph from the extracted implicit links, and computing page rankings using the implicit links graph. The implicit links are extracted from the log using a two-item sequential pattern mining technique. Search results obtained from a search engine are re-ranked based on an implicit links analysis performed using an updated implicit links graph, a modified re-ranking formula, and at least one re-ranking technique.	01-28-2010
20100057798	METHOD AND SYSTEM FOR ADAPTING SEARCH RESULTS TO PERSONAL INFORMATION NEEDS - A method and system for adapting search results of a query to the information needs of the user submitting the query is provided. A search system analyzes click-through triplets indicating that a user submitted a query and that the user selected a document from the results of the query. To overcome the large size and sparseness of the click-through data, the search system when presented with an input triplet comprising a user, a query, and a document determines a probability that the user will find the input document important by smoothing the click-through triplets. The search system then orders documents of the result based on the probability of their importance to the input user.	03-04-2010
20100088303	MINING NEW WORDS FROM A QUERY LOG FOR INPUT METHOD EDITORS - Described is a technology in which new words (including a phrase or set of Chinese characters) are mined from a query log. The new words may be added to (or otherwise supplement) an IME dictionary. A set of candidate queries may be selected from the log based upon market (e.g., the Chinese market) and/or by language. From this set, various filtering steps are performed to locate only new words that are frequently in use. For example, only frequent queries are kept for further processing, which may include filtering out queries based on length (e.g., less than two or greater than eight Chinese characters), and/or filtering out queries based on too many stop-words in the query. Processing may also include filtering out a query that is a substring of a larger query, or vice versa. Also described is Pinyin-based clustering and filtering, and filtering out queries already handled in the dictionary.	04-08-2010
20100150393	SENTIMENT CLASSIFICATION USING OUT OF DOMAIN DATA - Techniques for providing sentiment classification of out-of-domain data are disclosed herein. In some aspects, a source domain having a trained classifier is matched to a target domain having a target classifier. The trained classifier may include identifiers that may be used to predict the sentiment of opinion data for the source domain. The target classifier may use the identifiers of the trained classifier to determine the sentiment of opinion data for the target domain.	06-17-2010
20100161596	Learning Latent Semantic Space for Ranking - A tool facilitating learning latent semantics for ranking (LLSR) tailored to the ranking task via leveraging relevance information of query-document pairs to learn a tailored latent semantic space such that other documents are better ranked for the queries in the subspace. The tool applying a learning latent semantics for ranking algorithm integrating LLSR, thereby enabling learning an optimal latent semantic space (LSS) for ranking by utilizing relevance information in the training process of subspace learning. The tool enabling an optimization of the LSS as a closed form solution and facilitating reporting the learned LSS.	06-24-2010
20100161642	MINING TRANSLATIONS OF WEB QUERIES FROM WEB CLICK-THROUGH DATA - Methods and technologies providing translations of web queries based on an analysis of user behavior in click-through data. These methods and technologies generate large-scale and timely query translation pairs guided by a small set of seed word pairs from a dictionary, without relying on additional knowledge or complex models.	06-24-2010
20100169258	Scalable Parallel User Clustering in Discrete Time Window - Described is an internet user clustering technology, such as useful in behavioral targeting, in which users are clustered together based on MinHash computations that produce signatures corresponding to users' internet-related activities. In one aspect, users are clustered together based on commonality of signatures between each set of signatures associated with each user. The signature sets and/or clusters may be associated with timestamps, whereby clusters may be determined for a given discrete time window or set of discrete time windows. To facilitate efficient processing, existing, prior signature sets of a user may be incrementally updated (e.g., daily), and/or the MinHash computations for users are partitioned among parallel computing machines. The timestamps may be used to selectively determine a cluster within a continuous time, a time window or set of time windows.	07-01-2010
20100185569	Smart Attribute Classification (SAC) for Online Reviews - Techniques for identifying attributes in a sentence and determining a number of attributes to be associated with the sentence are described.	07-22-2010
20100185689	Enhancing Keyword Advertising Using Wikipedia Semantics - Disclosed are systems and methods for extracting semantic-based keywords through mining word semantics using Wikipedia's taxonomy. Described is the use of a semantic bipartite graph that relates candidate keywords and topics.	07-22-2010
20100191746	Competitor Analysis to Facilitate Keyword Bidding - Disclosed herein are one or more embodiments that facilitate selection of keywords for bidding by an advertiser having a website. One or more of the disclosed embodiments may process a click-through log to determine measures of competitiveness for a plurality of websites extracted from the click-through log. Also, the one or more disclosed embodiments may, for one of the websites, determine a ranking of competing websites based at least in part on the measures of competitiveness. The ranking of competing websites may be used to facilitate selection of keywords for bidding.	07-29-2010
20100241663	PROVIDING CONTENT ITEMS SELECTED BASED ON CONTEXT - Systems, methods, and computer storage media having computer-executable instructions embodied thereon that provide content items selected based on context are provided. Contextual indicators associated with a user are identified and utilized to determine one or more content items that the user is likely to desire to access at a particular point in time. Upon receiving an indication that the user desires to perform a context-aware search, the identified content items (or references thereto) are presented automatically to the user, that is, without the user having to input any search query terms. The indication that the user desires to perform a context-aware search may be received, for instance, upon receiving an indication that a selectable context-aware search button has been selected by the user. This single-button action is particularly useful for mobile computing devices, wherein alpha-numeric textual input is relatively difficult.	09-23-2010
20110077998	CATEGORIZING ONLINE USER BEHAVIOR DATA - A method for categorizing online user behavior data, including creating a target set of users based on an advertiser query, identifying two or more users in the target set having one or more first similar behavior attributes using a Minhash algorithm; and modifying the target set according to the two or more identified users.	03-31-2011
20110078157	OPINION SEARCH ENGINE - A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a computer, cause the computer to implement an opinion search engine. The instructions to implement an opinion search engine cause the computer to collect opinion data about one or more objects from the Internet, extract metadata about the opinion data from the opinion data, remove duplicate metadata from the metadata to generate a resulting metadata, categorize the resulting metadata for similar objects according to one or more taxonomies from one or more websites on the Internet and rank the similar objects based on the categorized metadata.	03-31-2011
20110078193	QUERY EXPANSION THROUGH SEARCHING CONTENT IDENTIFIERS - Techniques and technologies for expanding a gallery by seeding the gallery with seed query results. A template is identified which is associated with the subjects of the seed queries and content identifiers are identified which include the template. These content identifiers are used to expand the gallery to include more content identifiers than before the expansion of the gallery.	03-31-2011
20110208715	AUTOMATICALLY MINING INTENTS OF A GROUP OF QUERIES - The automatic search intent mining technique described herein pertains to a technique for mining search intent from a group of queries. The automatic search intent mining technique described herein automatically mines search intents from a group of queries. The technique leverages knowledge of query log data in order to determine search intent. The automatic search intent mining technique, in one embodiment, utilizes three kinds of information sources: Web page content, Web page structure and search engine query log data to mine intents for a group of queries. In one embodiment of the technique, the three data sources are used separately to mine candidate search intents for each of the three sources. The candidate search intents extracted from each of the three sources are then integrated to form the final search intents.	08-25-2011
20110213763	WEB CONTENT MINING OF PAIR-BASED DATA - Described herein is technology for, among other things, mining pair-based data on the web. The technology involves an online pair-based data mining system as well as an offline SVM training system. By subjecting pair-based input data to the systems, one may grow a pool of pair-based data that shares characteristics of the pair-based input data in a more efficient manner.	09-01-2011
20110258229	Mining Multilingual Topics - Techniques for utilizing data mining technology to extract universal topics with multilingual representations from a multilingual database, and to organize existing or new documents in different languages by analyzing their respective topic distributions.	10-20-2011
20110289025	LEARNING USER INTENT FROM RULE-BASED TRAINING DATA - The search intent co-learning technique described herein learns user search intents from rule-based training data and denoises and debiases this data. The technique generates several sets of biased and noisy training data using different rules. It trains each of a set of classifiers using different training data sets independently. The classifiers are then used to categorize the training data as well as any unlabeled data. The classified data confidently classified by one classifier is added to other training data sets, and the wrongly classified data is filtered out from the training data sets, so as to create an accurate training data set with which to train a classifier to learn a user's intent for submitting a search query string or targeting a user for on-line advertising based on user behavior.	11-24-2011
20110295774	Training SVMs with Parallelized Stochastic Gradient Descent - Techniques for training a non-linear support vector machine utilizing a stochastic gradient descent algorithm are provided. The computations of the stochastic gradient descent algorithm are parallelized via a number of processors. Calculations of the stochastic gradient descent algorithm on a particular processor may be combined according to a packing strategy before communicating the results of the calculations with the other processors.	12-01-2011
20110302031	CLICK MODELING FOR URL PLACEMENTS IN QUERY RESPONSE PAGES - A “General Click Model” (GCM) is constructed using a Bayesian network that is inherently capable of modeling “tail queries” by building the model on multiple attribute values that are shared across queries. More specifically, the GCM learns and predicts user click behavior towards URLs displayed on a query results page returned by a search engine. Unlike conventional click modeling approaches that learn models based on individual queries, the GCM learns click models from multiple attributes, with the influence of different attribute values being measured by Bayesian inference. This provides an advantage in learning that enables the GCM to achieve improved generalization and results, especially for tail queries, than conventional click models. In addition, most conventional click models consider only position and the identity of URLs when learning the model. In contrast, the GCM considers more session-specific attributes in making a final prediction for anticipated or expected user click behaviors.	12-08-2011
20110302155	RELATED LINKS RECOMMENDATION - The related links recommendation technique described herein employs combined collaborative filtering to recommend related web pages to users. The technique creates multiple collaborative filters which are combined in order to create a combined collaborative filter to recommend web pages similar to a given web page to a user. One query-based collaborative filter is created by using query search clicks (e.g., user input device selection actions on search results returned in response to a search query). Another user-behavior-based collaborative filter is created by using query search clicks and user clicks while browsing websites (e.g., user input device selection actions while a user is browsing websites). Lastly, another content-based collaborative filter based on similar content of web pages is created by finding web pages with similar content.	12-08-2011
20120078715	ADVERTISING SERVICE BASED ON CONTENT AND USER LOG MINING - A system and method are disclosed for providing documents related to a search request. The search request may include a search query of one or more keywords, or the search request may be a demographic search query including one or more demographic attributes. An index containing data crawled from publisher's websites, demographic information of registered users, along with the search history of the registered users can be created. Once a search request is received, the search request can be compared to the information stored in the index, and one or more documents related to the request can be provided.	03-29-2012
20120116875	PROVIDING ADVERTISEMENTS BASED ON USER GROUPING - Techniques for providing advertisements in association with electronic content are provided. The advertisements may be provided with electronic content that satisfies a content request made by a particular individual. The advertisements provided in association with the electronic content may depend on the results of an auction between advertisers submitting bids to have their respective advertisements provided in association with the electronic content. The bids may be submitted with respect to different value groups. Each value group includes individuals that may provide an estimated amount of value to a respective advertiser as a customer.	05-10-2012
20120123993	Action Prediction and Identification Temporal User Behavior - User behavior modeling can include determining temporal- or time-based actions performed by various users. From the mined temporal-based user actions, future actions can be predicted. Certain implementations include providing information and/or services based on the predicted future actions. Some implementations include providing relevant information, services, and/or goods regarding the predicted future action.	05-17-2012
20120130967	CLASSIFICATION OF TRANSACTIONAL QUERIES BASED ON IDENTIFICATION OF FORMS - A method for identifying transactional queries includes associating user queries with forms clicked on by users who employ a search engine to place the queries during query sessions. A score is assigned to each user query. The score reflects a likelihood that the respective query is a transactional query. The query is classified as a transactional query if the score exceeds a threshold value.	05-24-2012
20120136855	Mobile Query Suggestions With Time-Location Awareness - The techniques describe recommending mobile query suggestions by integrating time and location in response to a query input submitted on a mobile computing device. A process constructs a bipartite graph by extracting users that submitted queries from mobile search logs and associating time and location with the submitted queries. The process determines the users are similar having submitted similar queries at similar times and at similar locations. The process receives a query input associated with a current time and a current location of a mobile computing device. Next, the process computes a relatedness of candidate queries to the query input based on a similarity between the user groups having submitted both the candidate queries and the query input, and distances of times and locations at which the user previously issued the query input and the candidate queries.	05-31-2012
20120143789	CLICK MODEL THAT ACCOUNTS FOR A USER'S INTENT WHEN PLACING A QUIERY IN A SEARCH ENGINE - A method of generating training data for a search engine begins by retrieving log data pertaining to user click behavior. The log data is analyzed based on a click model that includes a parameter pertaining to a user intent bias representing the intent of a user in performing a search in order to determine a relevance of each of a plurality of pages to a query. The relevance of the pages is then converted into training data.	06-07-2012
20120143790	RELEVANCE OF SEARCH RESULTS DETERMINED FROM USER CLICKS AND POST-CLICK USER BEHAVIOR OBTAINED FROM CLICK LOGS - Data from a click log may be used to generate training data for a search engine. User click behavior and user post-click behavior may be used to assess the relevance of a page to a query. Labels for training data may be generated based on data from the click log. The labels may pertain to the relevance of a page to a query. For example, user post-click behavior that may be examined includes the amount of time that a user remains on a target page when a user clicks one of the search results.	06-07-2012
20120151386	IDENTIFYING ACTIONS IN DOCUMENTS USING OPTIONS IN MENUS - Documents such as web pages may be regarded as offering various actions; e.g., a website for a movie theater may offer options for viewing movie listings and purchasing tickets. A user may wish to view the set of actions available for a particular document, and/or the performance of an action. However, it may be difficult to identify available actions with acceptable accuracy in an automated manner, and the set of documents (such as the entire worldwide web) may be too voluminous for human identification. In order to identify available actions, the document may be searched for menus containing options, and the actions associated with each option may be identified according to an option score. Additionally, documents may be grouped into document categories (e.g., websites for movie theaters and websites for musicians) to facilitate the association of options in similar documents with similar sets of actions that are often provided for such documents.	06-14-2012
20120166370	SMART ATTRIBUTE CLASSIFICATION (SAC) FOR ONLINE REVIEWS - Techniques for identifying attributes in a sentence and determining a number of attributes to be associated with the sentence are described.	06-28-2012
20120259801	TRANSFER OF LEARNING FOR QUERY CLASSIFICATION - Transfer of learning trains a new domain for the classification of search queries according to different tasks, as well as the generation of a corresponding domain-specific query classifier that may be used to classify the search queries according to the different tasks in the new domain. The transfer of learning may include preparing a new domain to receive classification knowledge from one or more source domains by populating the new domain with preliminary query patterns extracted for a search engine log. The transfer of learning may further include preparing the classification knowledge in each source domain for transfer to the new domain. The classification knowledge in each source domain may then be transferred to the new domain.	10-11-2012
20120265760	Random Walk on Query Pattern Graph for Query Task Classification - A classification process may reduce the computational resources and time required to collect and classify training data utilized to enable a user to effectively access online information. According to some implementations, training data is established by defining one or more seed queries and query patterns. A bi-partite graph may be constructed using the seed query and query pattern information. A traversal of the bi-partite graph can be performed to expand the training data to encompass sufficient data to perform classification of the present search task.	10-18-2012
20120278162	CONDUCTING AN AUCTION OF SERVICES RESPONSIVE TO POSITIONAL SELECTION - Various technologies pertaining to provision of graphical data to a client computing device responsive to receipt of a positional selection on a web page by a user of a client computing device are described herein. A computer-executable application executing on the client computing device detects that the user has selected a certain position on the web page, wherein this application is not called by code of the web page. The position is transmitted to an ad server, which conducts an auction for display space on the client computing device based at least in part upon the detection of the certain position.	11-01-2012
20120284224	BUILD OF WEBSITE KNOWLEDGE TABLES - Architecture that defines domain knowledge on networks, such as the Internet, as tables where each row is an entity in the target domain and each column is an attribute of these entities. The corresponding values for entity-attribute pairs are the domain knowledge. The architecture provides semi-automatic and systematic ways to extract network knowledge from at least an unstructured and semi-structured network (the Internet), structuralizes the knowledge in table format, and uses the domain tables to build the online updated knowledge base.	11-08-2012
20120294520	GESTURE-BASED VISUAL SEARCH - A user may perform an image search on an object shown in an image. The user may use a mobile device to display an image. In response to displaying the image, the client device may send the image to a visual search system for image segmentation. Upon receiving a segmented image from the visual search system, the client device may display the segmented image to the user who may select one or more segments including an object of interest to instantiate a search. The visual search system may formulate a search query based on the one or more selected segments and perform a search using the search query. The visual search system may then return search results to the client device for display to the user.	11-22-2012
20120323948	DIALOG-ENHANCED CONTEXTUAL SEARCH QUERY ANALYSIS - Embodiments of the present invention relate to systems, methods, and computer-storage media for a method of contextually analyzing terms within a search query. In one embodiment, a received search query is classified into a domain category. Additionally, information is assigned to a schema associated with the domain by analyzing the search query. Further, at least one search result that helps a user complete a task within the domain is provided based on the information in the schema.	12-20-2012
20130073546	Indexing Semantic User Profiles for Targeted Advertising - Embodiments facilitate greater flexibility in definition of user segments for targeted advertising, by employing indexed semantic user profiles. Semantic user profiles are built through extraction of online user behavior data such as user search queries and page views, and include user interest information that is inferred based on user behavior. Semantic user profiles are then indexed to facilitate search for a set of users that fit specified semantic search terms. Search results for semantic profiles are ranked according to a ranking model developed through machine learning. In some embodiments, building and indexing of semantic profiles and learning of the ranking model is performed offline to facilitate more efficient online processing of queries.	03-21-2013
20130138655	Web Knowledge Extraction for Search Task Simplification - Techniques are described for generating structured information from semi-structured web pages, and retrieving the structured knowledge in response to a user query that indicates a query intent. The structured information is automatically extracted offline from semi-structured web pages, through the use of an auto wrapper solution that is noise tolerant, scalable, and automatic. The structured information is stored in a knowledge base, and provided in response to a user search query that indicates a query intent. Extraction of structured information may also include clustering of pages based on their measured similarities. The clusters may be determined based on similar elements in the tag path text data of the pages. A minimum size threshold may be applied to the clusters.	05-30-2013
20130173571	CLICK NOISE CHARACTERIZATION MODEL - The techniques discussed herein consider a degree of noise associated with user clicks performed during search sessions. The techniques then generate a model that characterizes click noise so that search engines can more accurately infer document relevance.	07-04-2013
20130218824	Action Prediction and Identification of User Behavior - User behavior modeling can include determining actions performed by various users. From the mined user actions, future actions can be predicted. Certain implementations include providing information and/or services based on the predicted future actions. Some implementations include providing relevant information, services, and/or goods regarding the predicted future action.	08-22-2013
20130226935	Context-based Search Query Formation - Searching is assisted by recognizing a selection of text from a document as an indication that a user wishes to initiate a search based on the selected text. The user is provided with query suggestions based on the selected text and the query suggestions are ranked based on a context provided by the document. The user may select the text by using a mouse, drawing a circle around the text on a touch screen, or by other input techniques. The query suggestions may be based on query reformulation or query expansion techniques applied to the selected text. Context provided by the document is used by a language model and/or an artificial intelligence system to rank the query suggestions in predicted order of relevance based on the selected text and the context.	08-29-2013
20130246435	FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION - A knowledge extraction framework may iteratively enrich an ontology that is used to classify structured knowledge obtained from web pages based on structured knowledge previously acquired from other web pages. The framework may enable a user to define the ontology for extracting structured knowledge from a plurality of web pages. The framework applies the ontology using a supervised extraction algorithm to extract seed information from a set of web pages. The framework further applies an unsupervised extraction algorithm to extract the structured knowledge from an additional set of web pages. The framework subsequently maps the structured knowledge to the ontology based on the seed information to enrich the ontology.	09-19-2013
20140003714	GESTURE-BASED VISUAL SEARCH	01-02-2014
20140012842	Indexing Semantic User Profiles for Targeted Advertising - Embodiments facilitate greater flexibility in definition of user segments for targeted advertising, by employing indexed semantic user profiles. Semantic user profiles are built through extraction of online user behavior data such as user search queries and page views, and include user interest information that is inferred based on user behavior. Semantic user profiles are then indexed to facilitate search for a set of users that fit specified semantic search terms. Search results for semantic profiles are ranked according to a ranking model developed through machine learning. In some embodiments, building and indexing of semantic profiles and learning of the ranking model is performed offline to facilitate more efficient online processing of queries.	01-09-2014
20140052751	SMART USER-CENTRIC INFORMATION AGGREGATION - A smart user-centric information aggregation system allows a user to define a region of content displayed in a display of a device and performs information aggregation on behalf of the user. The smart user-centric information aggregation system searches, aggregates and groups information related to content included in the region of content for the user while the user can continue to perform his/her original course of actions without interruption. After finding information related to the desired content, the smart user-centric information aggregation system may notify the user and present the found information to the user upon receiving confirmation from the user. The smart user-centric information aggregation system may continue to find new related information and update the presentation with the newly found information periodically, in some instances without user intervention or input.	02-20-2014
20140075393	Gesture-Based Search Queries - An image-based text extraction and searching system extracts an image selected by gesture input by a user, along with the associated image data and proximate textual data, in response to the image selection. Extracted image data and textual data can be utilized to perform or enhance a computerized search. The system can determine one or more database search terms based on the textual data and generate at least a first search query proposal related to the image data and the textual data.	03-13-2014
20140101081	SENTIMENT CLASSIFICATION USING OUT OF DOMAIN DATA - Techniques for providing sentiment classification of out-of-domain data are disclosed herein. In some aspects, a source domain having a trained classifier is matched to a target domain having a target classifier. The trained classifier may include identifiers that may be used to predict the sentiment of opinion data for the source domain. The target classifier may use the identifiers of the trained classifier to determine the sentiment of opinion data for the target domain.	04-10-2014
20140223381	INVISIBLE CONTROL - An invisible control may be implemented in a client device or in an application of the client device. A user may activate the invisible control by applying a gesture on a predetermined region of the client device or the application. In response to receiving the user gesture, a predetermined action associated with the invisible control may be activated. The predetermined action may be applied to the application or some or all of the content associated with the application. An Application Programming Interface may further be provided to allow the user, an application vendor or a content provider to customize the invisible control or operating modes associated with activation of the invisible control.	08-07-2014
20140244629	Communication-Powered Search - A communication-powered searching system provides real-time personalized search assistance to a user by integrating search functionality with real-time communication. Upon submitting a query and receiving search results from the communication-powered searching system, the user may select a communication link included in the search results to activate communication with an entity associated with the communication link. The communication-powered searching system may then refine search results displayed to the user based on information exchanged between the user and the entity. The refinements may be made in real time or substantially in real time.	08-28-2014
20150046459	MINING MULTILINGUAL TOPICS - Techniques for utilizing data mining technology to extract universal topics with multilingual representations from a multilingual database, and to organize existing or new documents in different languages by analyzing their respective topic distributions.	02-12-2015
20150046462	IDENTIFYING ACTIONS IN DOCUMENTS USING OPTIONS IN MENUS - Documents such as web pages may be regarded as offering various actions; e.g., a website for a movie theater may offer options for viewing movie listings and purchasing tickets. A user may wish to view the set of actions available for a particular document, and/or the performance of an action. However, it may be difficult to identify available actions with acceptable accuracy in an automated manner, and the set of documents (such as the entire worldwide web) may be too voluminous for human identification. In order to identify available actions, the document may be searched for menus containing options, and the actions associated with each option may be identified according to an option score. Additionally, documents may be grouped into document categories (e.g., websites for movie theaters and websites for musicians) to facilitate the association of options in similar documents with similar sets of actions that are often provided for such documents.	02-12-2015
