Patent application number | Description | Published |
20080320119 | Automatically identifying dynamic Internet protocol addresses - Dynamic IP addresses may be automatically identified and their dynamics patterns may be analyzed. Multi-user IP address blocks are determined as candidates for further analysis. An entropy score is determined for each IP address in every candidate block to distinguish between a dynamic IP and a static IP shared by multiple users. IP addresses with high entropy scores are grouped, and then analyzed, and may be used in various applications, such as spam filtering. | 12-25-2008 |
20090076794 | ADDING PROTOTYPE INFORMATION INTO PROBABILISTIC MODELS - Mechanisms are disclosed for incorporating prototype information into probabilistic models for automated information processing, mining, and knowledge discovery. Examples of these models include Hidden Markov Models (HMMs), Latent Dirichlet Allocation (LDA) models, and the like. The prototype information injects prior knowledge into such models, thereby rendering them more accurate, effective, and efficient. For instance, in the context of automated word labeling, additional knowledge is encoded into the models by providing a small set of prototypical words for each possible label. The net result is that words in a given corpus are labeled and are therefore in condition to be summarized, identified, classified, clustered, and the like. | 03-19-2009 |
20090228353 | QUERY CLASSIFICATION BASED ON QUERY CLICK LOGS - Methods are provided for the classification of search engine queries and associated documents based on search engine query click logs. One or more seed documents or queries are provided that contain content that is representative of a category. A query click log containing information regarding queries entered by at least one user into the search engine and documents subsequently clicked in search engine results corresponding with the queries is analyzed to determine which one or more queries resulted in clicks on the seed documents. Information is stored associating the one or more queries with the category if they resulted in clicks on the seed documents. | 09-10-2009 |
20090254989 | CLUSTERING BOTNET BEHAVIOR USING PARAMETERIZED MODELS - Identification and prevention of email spam that originates from botnets may be performed by finding similarity in their host property and behavior patterns using a set of labeled data. Clustering models of host properties pertaining to previously identified and appropriately tagged botnet hosts may be learned. Given labeled data, each botnet may be examined individually and a clustering model learned to reflect upon a set of selected host properties. Once a model has been learned for every botnet, clustering behavior may be used to look for host properties that fit into a profile. Such traffic can be either discarded or tagged for subsequent analysis and can also be used to profile botnets preventing them from launching other attacks. In addition, models of individual botnets can be further clustered to form superclusters, which can help understand botnet behavior and detect future attacks. | 10-08-2009 |
20090265786 | AUTOMATIC BOTNET SPAM SIGNATURE GENERATION - A framework may be used for generating URL signatures to identify botnet spam and membership. The framework may take a set of unlabeled emails as input that are grouped based on URLs contained within the emails. The framework may return a set of spam URL signatures and a list of corresponding botnet host IP addresses by analyzing the URLs within the emails that are contained within the groups. Each URL signature may be in the form of either a complete URL string or a URL regular expression. The signatures may be used to identify spam emails launched from botnets, while the knowledge of botnet host identities can help filter other spam emails also sent by them. | 10-22-2009 |
20110246286 | CLICK PROBABILITY WITH MISSING FEATURES IN SPONSORED SEARCH - Sponsored search advertising utilizes a click probability as one factor in selecting and ranking advertisements that are displayed with search results. The probability of click may also be referred to as a predicted click-through rate (“CTR”) that may be multiplied by an advertiser's bid for a particular advertisement to rank the display of advertisements. An accurate prediction of the click probability improves the potential revenue that is generated by advertisements in a pay per click system. Other advertising systems may benefit from an accurate and reliable estimate for an advertisement's probability of click in different environments and scenarios. | 10-06-2011 |
20120022952 | Using Linear and Log-Linear Model Combinations for Estimating Probabilities of Events - A method combines multiple probability-of-click models in an online advertising system into a combined predictive model. The method commences by receiving a feature set slice (e.g., corresponding to demographics, taxonomies, or clusters) and using the sliced data to train multiple slice-wise predictive models. The trained slice-wise predictive models are combined by overlaying a weighted distribution model over them. The combined predictive model is then used to predict the probability of a click given a query-advertisement pair in online advertising. The method can flexibly receive slice specifications, and can overlay any one or more of a variety of distribution models, such as a linear combination or a log-linear combination. Using an appropriate weighted distribution model, the combined predictive model reliably yields predictive estimates of occurrence of click events that are at least as good as the best predictive model in the slice-wise predictive model set. | 01-26-2012 |
20120136722 | Using Clicked Slate Driven Click-Through Rate Estimates in Sponsored Search - A computer-implemented method and system for selecting a subject advertisement in a sponsored search system based on a user's commercial intent (pertaining to the subject advertisement), using techniques for determining intent-driven clicks from a historical database. The method includes aggregating a training model dataset that contains a selected history of clicks, then selecting from the training model dataset a clicked slate (a further selection of clicks) comprising a set of clicked ads, and calculating an intent-driven click feedback value for the subject advertisement. The method includes techniques for selecting a clicked slate using features corresponding to clicks received within a particular time period (the time period determined statically or dynamically). A system for implementing the method aggregates data from a historical database using selectors such as a position selector, a click feature selector, an impression-advertiser-campaign-creative selector, and a commercial intent selector. | 05-31-2012 |
20130275235 | USING LINEAR AND LOG-LINEAR MODEL COMBINATIONS FOR ESTIMATING PROBABILITIES OF EVENTS - A system for determining predictive models associated with online advertising can include a communications interface, a processor, and a display. The communications interface can be configured to receive a partial dataset. The partial dataset may include user information. The processor can be communicatively coupled to the communications interface and configured to identify the partial dataset. The processor can also be configured to determine a first predictive model corresponding to at least part of the partial dataset and a second predictive model by combining a probability distribution with the first predictive model. The display can be communicatively coupled to the processor and configured to display the second predictive model. | 10-17-2013 |
20140358730 | Systems And Methods For Optimally Ordering Recommendations - Example systems and methods for optimally ordering recommendations or search results are described. In one implementation, a method determines a set of items as search results or candidates for recommendation. Each of the items in the set is associated with a respective first parameter representative of a measure of suitability of the respective item for recommendation. The items in the set are associated with a plurality of second parameters, each of which is representative of a measure of similarity between a respective one of the items and another respective one of the items. The method also determines an order in which a subset of the items is to be displayed based at least in part on the representative first parameter and the representative set of second parameters associated with the items. The method further displays the subset of the items in the determined order. | 12-04-2014 |
20140358742 | Systems And Methods For Mapping In-Store Transactions To Customer Profiles - Example systems and methods for mapping in-store transactions to customer profiles are described. In one implementation, a method receives information of a plurality of customer profiles of a plurality of online customers. Each of the customer profiles includes a plurality of types of attributes associated with a respective one of the online customers. The types of attributes include a first type of attribute. The method also receives information of a plurality of in-store transactions by a plurality of in-store customers. The information of each of the in-store transactions includes the first type of attribute associated with the respective in-store customer. The method further maps, for at least one of the in-store customers, one or more of the customer profiles of online customers to the at least one of the in-store customers based at least in part on the first type of attribute. | 12-04-2014 |
20140358771 | SYSTEMS AND METHODS FOR CLUSTERING OF CUSTOMERS USING TRANSACTION PATTERNS - Example systems and methods for clustering of customers using patterns in their transactions are described. In one implementation, a method receives customer information that includes at least a plurality of customer identifications and a plurality of payment options associated with a plurality of customers. The method identifies a subset of payment options, from among the payment options, and a subset of customer identifications, from among the customer identifications, such that each payment option of the subset of payment options is associated with more than one customer identification of the subset of customer identifications. The method then classifies each customer identification of the subset as either one of multiple customer identifications associated with a single one of the customers, or one of multiple customer identifications associated with more than one of the customers who are related to each other. | 12-04-2014 |
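
The subset-identification step described in application 20140358771 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `shared_payment_subsets`, the `(customer_id, payment_option)` record layout, and the sample identifiers are all hypothetical, chosen only for illustration. The sketch keeps the payment options linked to more than one customer identification, together with those identifications. The later classification step (same customer versus related customers) is not specified in the abstract, so it is not attempted here.

```python
from collections import defaultdict


def shared_payment_subsets(records):
    """Find payment options used by more than one customer ID.

    `records` is an iterable of (customer_id, payment_option) pairs;
    this layout is an assumption made for the example.
    Returns (shared, customer_ids): `shared` maps each qualifying payment
    option to its set of customer IDs, and `customer_ids` is the union of
    those sets.
    """
    ids_by_option = defaultdict(set)
    for customer_id, payment_option in records:
        ids_by_option[payment_option].add(customer_id)

    # Keep only the options associated with more than one customer ID.
    shared = {opt: ids for opt, ids in ids_by_option.items() if len(ids) > 1}
    customer_ids = set().union(*shared.values())
    return shared, customer_ids


# Hypothetical sample data.
records = [
    ("cust-1", "card-A"),
    ("cust-2", "card-A"),
    ("cust-3", "card-B"),
    ("cust-4", "card-C"),
    ("cust-5", "card-C"),
    ("cust-1", "card-D"),
]
shared, customer_ids = shared_payment_subsets(records)
```

With this sample data, `card-A` and `card-C` qualify because each is tied to two customer identifications, while `card-B` and `card-D` are dropped because each is tied to only one.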