Data mining

Subclass of:

707 - Data processing: database and file management or data structures

707705000 - DATABASE AND FILE ACCESS

707758000 - Record, file, and data search and comparisons

707769000 - Database query processing

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Class / Patent application number	Description	Number of patent applications / Date published
707777000	Taxonomy discovery	42

Document	Title	Date
Entries
20100049772	EXTRACTION OF ANCHOR EXPLANATORY TEXT BY MINING REPEATED PATTERNS - A method and system for identifying explanatory text for a referenced web page based on a reference to the referenced web page contained in a repeated pattern of a referencing web page is provided. An anchor explanatory text (“AET”) system uses the hierarchical organization of the web page to identify a repeated pattern of hierarchical elements that contain references to other display pages. After the AET system identifies a repeated pattern, it identifies the dominant reference or anchor within each occurrence of the pattern. The AET system uses the explanatory text surrounding a dominant anchor as a description of the referenced web page.	02-25-2010
20100070528	Method and system for apportioning opportunity among campaigns in a CRM system - In accordance with embodiments, there are provided mechanisms and methods for providing apportioning of opportunity among campaigns in an on-demand service in a database system. These mechanisms and methods for providing apportioning of opportunity among campaigns can enable embodiments to automatically determine which campaigns are related to an opportunity and provide a filtered set of campaigns that are related to at least one opportunity. The ability of embodiments to apportion opportunity among campaigns can provide marketing information that accurately reflects the true relationship between an opportunity and a plurality of campaigns.	03-18-2010
20100082673	APPARATUS, METHOD AND PROGRAM PRODUCT FOR CLASSIFYING WEB BROWSING PURPOSES - A web browsing purpose classification apparatus, including a display unit which displays a webpage and a document retrieval unit which retrieves document data from the displayed webpage. A keyword extraction knowledge unit stores knowledge necessary for keyword extraction. This knowledge is used by a keyword extraction unit to extract keywords from the document data. A webpage format determination knowledge unit stores knowledge necessary for the determination of webpage formats which is used by a webpage format determination unit to determine webpage formats. A web browsing history storage unit stores the keywords and webpage formats as web browsing history. A browsing purpose classification knowledge unit stores knowledge necessary for the classification of browsing purposes which is used by a browsing purpose classification unit to classify browsing purposes.	04-01-2010
20100114954	REALTIME POPULARITY PREDICTION FOR EVENTS AND QUERIES - A system, media, and method for realtime popularity prediction for events and queries are provided. The popularity prediction is made by a prediction engine that is coupled to a search engine, a crawler, and a sentiment component. The prediction engine determines a change in popularity for an event or a query based on content provided by the crawler, sentiments identified by the sentiment component, and queries received in realtime by the search engine. The prediction engine may also use the content, sentiments, and queries to predict an outcome for a popularity-based event.	05-06-2010
20100174747	METHODS FOR RECOMMENDING NEW INDIVIDUALS TO BE INVITED INTO A CONFIRMED SOCIAL NETWORK BASED ON MINED SOCIAL DATA - A computer-implemented method that inputs a confirmed social network of a user, performs data mining of electronically accessible data for the user to produce a mined social network including individuals having a social relationship with the user and having an electronically accessible link to the user, subtracts the confirmed social network of the user from the mined social network to produce a recommendation list, in which the recommendation list includes at least one new individual not belonging to the confirmed social network of the user, and in which the recommendation list recommends the at least one new individual not belonging to the confirmed social network for membership in the confirmed social network, and outputs the recommendation list to the user.	07-08-2010
20100185670	MINING TRANSLITERATIONS FOR OUT-OF-VOCABULARY QUERY TERMS - An approach is described for using a query expressed in a source language to retrieve information expressed in a target language. The approach uses a translation dictionary to convert terms in the query from the source language to appropriate terms in the target language. The approach determines viable transliterations for out-of-vocabulary (OOV) query terms by retrieving a body of information based on an in-vocabulary component of the query, and then mining the body of information to identify the viable transliterations for the OOV query terms. The approach then adds the viable transliterations to the translation dictionary. The retrieval, mining, and adding operations can be repeated one or more times.	07-22-2010
20100205212	NON-CONFORMANCE ANALYSIS USING AN ASSOCIATIVE MEMORY LEARNING AGENT - According to an embodiment, a non-conformance analysis system may include at least one information storage tool that stores previously generated non-conformance information; a data mining tool that retrieves specific attributes of the previously generated non-conformance information stored in the at least one information storage tool; an associative memory subsystem that is populated with information involving a plurality of entity types, with each entity type including at least one entity, to form an associative memory; and a user input device that enables a user to input a non-conformance query into the associative memory subsystem, that causes the associative memory subsystem to generate all of the entity types and entities that include information useful for investigating the non-conformance query.	08-12-2010
20100211603	COMPUTER-AIDED METHODS AND SYSTEMS FOR PATTERN-BASED COGNITION FROM FRAGMENTED MATERIAL - A method for obtaining and analyzing information objects including generating, collecting or discovering information objects. The information objects are signified at least in part using deliberately ambiguated signifier prompts, for example, linear scale opposing negatives or positives, and/or multi-dimensional signifier prompts. The information objects may comprise text or non-text fragments, and may be generated or selected. The responses to the signifier prompts are stored with the fragments to provide a dataset of signified fragments. The signified fragments may be analyzed based on the signifiers and can be utilized as part of an explorable knowledge repository, or objective measures can be created to aid in mass opinion capture or human attitude auditing. The fragments may be represented on a graphical template. In one embodiment, fragment exemplars are identified that exemplify significant locations on the template, and the exemplar signifiers are used to automatically locate other signified fragments on the template.	08-19-2010
20100217777	System for Automatic Arrangement of Portlets on Portal Pages According to Semantical and Functional Relationship - The present invention relates to the field of network computing, and in particular to a method and system for designing a Web Portal comprising a hierarchical structure of portal pages and portlets for accessing Web contents accessible via the Portal. A typical larger enterprise's portal contains large numbers, e.g., thousands, of pages and portlets. Due to the complexity of an enterprise portal, manual administration is inefficient as it is time-consuming, error-prone and thus expensive. In order to overcome these disadvantages, it is proposed that a Portal according to the invention performs some mining of the portlet markup and/or that of the portlet description in order to autonomously compute and propose an enhanced portal content structure. This helps to provide a user-friendly content structure that reflects the relationships between portlets well.	08-26-2010
20100223291	Text Mining Device, Text Mining Method, and Text Mining Program - With respect to each part at which a word included in a characteristic condition defining a characteristic text set designated by a user through the input device appears in text, the characteristic condition assurance degree calculating unit of the text mining device obtains a reliability of the word from the word reliability storage unit to operate a value of a characteristic condition assurance degree for each text by predetermined operation based on all the obtained reliabilities. The characteristic condition assurance degree calculating unit executes operation such that when a value of each reliability is large, a value of a degree of assurance becomes large. The representative text output unit outputs text whose characteristic condition assurance degree is the highest among texts whose characteristic condition assurance degrees are calculated together with its characteristic condition assurance degree.	09-02-2010
20100250596	Methods and Apparatus for Identifying Conditional Functional Dependencies - Methods and apparatus are provided for discovering minimal conditional functional dependencies (CFDs). CFDs extend functional dependencies by supporting patterns of semantically related constants, and can be used as rules for cleaning relational data. A disclosed CFDMiner algorithm, based on techniques for mining closed itemsets, discovers constant minimal CFDs. A disclosed CTANE algorithm discovers general minimal CFDs based on the levelwise approach. A disclosed FastCFD algorithm discovers general minimal CFDs based on a depth-first search strategy, and an optimization technique via closed-itemset mining to reduce search space.	09-30-2010
20100250597	MODELING SEMANTIC AND STRUCTURE OF THREADED DISCUSSIONS - A simultaneous semantic and structure threaded discussion modeling system and method for generating a model of a discussion thread and using the model to mine data from the discussion thread. Embodiments of the system and method generate a model that contains both semantic terms and structure terms. The model simultaneously models both semantics and structure of the discussion thread. A model generator includes a semantic module that generates two semantic terms for the model and a structure module that generates two structure terms for the model. The generator combines the two semantic terms and the two structure terms to generate the simultaneous semantic and structure model. Embodiments of the system and method include an applications module, which contains three applications that use the model to reconstruct reply relations among posts in the discussion thread, identify junk posts in the discussion thread, and find experts in each sub-board of web forums.	09-30-2010
20100262620	CONCEPT-BASED ANALYSIS OF STRUCTURED AND UNSTRUCTURED DATA USING CONCEPT INHERITANCE - In one embodiment, a method comprises defining a set of concepts based on a first set of structured and unstructured data objects, defining a business rule based on the set of concepts, applying the business rule to a second set of structured and unstructured data objects to make a determination associated with that set, and outputting to a display information associated with the determination.	10-14-2010
20100274807	Method and system for representing information - A preferred method and system for dynamically and/or statically identifying, manipulating, registering and comparing information are disclosed. In a preferred method, the elements and their respective associations respective to a data string or corpus are identified and/or represented through corresponding network elements and/or configurations. In addition, this disclosure further teaches the methodology of implementing the disclosed methodology of “informational networks” to perform an information application such as that of a search engine while effectively avoiding semantic irrelevance or selecting only relevant information with restrictions of a given grammar.	10-28-2010
20100274808	SYSTEM AND METHOD FOR MAKING A RECOMMENDATION BASED ON USER DATA - There is described a system and computer-implemented method for providing a recommendation based on a sparse pattern of data. An exemplary method comprises determining a likelihood that an item for which no user preference data is available will be preferred. The exemplary method also comprises determining a likelihood that an item for which user preference data is available for users other than a particular user will be preferred based on the likelihood that the item for which no user preference data is available will be preferred. The exemplary method additionally comprises predicting that an item for which no user preference data relative to the particular user is available will be preferred if the likelihood that the particular user will prefer the item exceeds a certain level.	10-28-2010
20100293195	Disambiguation and Tagging of Entities - Tagging of content items and entities identified therein may include a matching process, a classification process and a disambiguation process. Matching may include the identification of potential matching candidate entities in a content item whereas the classification process may categorize or group identified candidate entities according to known entities to which they are likely a match. In some instances, a candidate entity may be categorized with multiple known entities. Accordingly, a disambiguation process may be used to reduce the potential matches to a single known entity. In one example, the disambiguation process may include ranking potentially matching known entities according to a hierarchy of criteria.	11-18-2010
20100293196	METHOD AND SYSTEM FOR ANALYZING ORDERED DATA USING PATTERN MATCHING IN A RELATIONAL DATABASE - Several methods and a system for analyzing ordered data using pattern matching over an indefinitely long ordered sequence of rows in a relational database are disclosed. In one embodiment, a method of a server includes receiving an ordered data in a relational database. The method further includes matching a pattern specified in a query on ordered data in a relational database in a single pass in constant space for overlapping mode of results. The method also includes creating an output data in the single pass in constant space for overlapping mode of results based on the matching of the ordered data with the pattern in the relational database query.	11-18-2010
20100299360	EXTRAPOLATION OF ITEM ATTRIBUTES BASED ON DETECTED ASSOCIATIONS BETWEEN THE ITEMS - An attribute of a first item is extrapolated to a second item that is not known to have that attribute. The extrapolation occurs as a result of a substitution association detected between the first and second items. The substitution association may be detected based on an analysis of the content of the first and second items. The extrapolated attribute may be a behavioral association with a third item, in which case an inference is drawn that the second and third items are behaviorally related. The items may, for example, be products represented in an electronic catalog.	11-25-2010
20100306259	MENU SEARCHING OF A HIERARCHICAL MENU STRUCTURE - A menu search system allows a user to search through a menu structure, rather than only navigate hierarchically through the menu structure. When a user selects a menu search mode, the menu search system allows the user to enter text and, as the text is entered, searches the menu hierarchy for menu items with names that match the text. The menu search system then displays the matching menu items so that the user can select a displayed menu item of interest.	12-02-2010
20100306260	NUMBER SEQUENCES DETECTION SYSTEMS AND METHODS - Numbered sequences detection includes (i) extracting one or more numbered item token patterns from a document comprising an ordered sequence of text units, each numbered item token pattern including an incremental portion and a fixed portion that matches at least one text unit of the document and (ii) identifying at least one numbered sequence in the document conforming with a matching numbered item token pattern of the extracted one or more numbered item token patterns. The identified at least one numbered sequence comprises an ordered sub-sequence of text units of the document that match the matching numbered item token pattern. The detection may further comprise determining that a second type of numbered sequence nests in the document between consecutive text units belonging to a numbered sequence of a first type, and optimizing one or more numbered sequences of the second type based on information provided by the determining.	12-02-2010
20100306261	Localized Gesture Aggregation - Systems, methods and computer readable media are disclosed for a localized gesture aggregation. In a system where user movement is captured by a capture device to provide gesture input to the system, demographic information regarding users as well as data corresponding to how those users respectively make various gestures is gathered. When a new user begins to use the system, his demographic information is analyzed to determine a most likely way that he will attempt to make or find it easy to make a given gesture. That most likely way is then used to process the new user's gesture input.	12-02-2010
20100306262	Extending Dynamic Matrices for Improved Setup Capability and Runtime Search Performance of Complex Business Rules - A mechanism by which rule attributes of varying types and numbers can be stored and searched in an efficient manner is provided by storing attribute values of each rule in a child table of a parent rule table. The child table is normalized and contains a foreign key pointing back to the parent rule table and has attribute-value pairs as table columns of the child table. Each rule is then represented by one row of the parent rule table and one or more corresponding rows of the child rule details table. A variable and unlimited number of attribute dimensions is supported among the rules, and search performance is improved through the use of database indexes on the rule details table attribute columns. Metadata representing the structure of the child rule details table will identify the data attributes for each dimension.	12-02-2010
20100306263	APPARATUSES AND METHODS FOR DETERMINISTIC PATTERN MATCHING - Apparatuses and methods to perform pattern matching are presented. In one embodiment, an apparatus comprises a memory to store a first pattern table comprising information indicative of whether a byte of input data matches a pattern and whether to ignore other matches of the pattern that occur in remaining bytes of the input data. The apparatus further comprises one-byte match logic coupled to the memory, to determine, based on the information in the first pattern table, a one-byte match event with respect to the input data. The apparatus further comprises a control unit to filter the other matches of the pattern based on the information of the first pattern table.	12-02-2010
20100306264	OPTIMIZING PUBLISH/SUBSCRIBE MATCHING FOR NON-WILDCARDED TOPICS - A method, a system and a computer program product for matching a publication to at least one subscriber are disclosed. After receiving a publication request, a matching engine accesses a hash table to determine whether there is a non-wildcarded match corresponding to the publication request. If the matching engine finds the non-wildcarded match in the hash table, the matching engine omits validating a topic of the publication and provides the non-wildcarded match to the broker device without waiting for a result of searching a wildcarded match. Otherwise, the matching engine validates the topic of the publication. The matching engine also starts to search a wildcarded match in a wildcarded subscription data store. Upon finding the wildcarded match, the matching engine provides the wildcarded match to the broker device. The matching engine provides each result of the findings asynchronously to the broker device.	12-02-2010
20100306265	DATA AND EVENT MANAGEMENT SYSTEM AND METHOD - A management system and method for facilitating the management of data and events. The system includes a server, a processor, and a feed in communication with the server. The feed provides real-time information relating to a predetermined group of people and/or facilities. The processor is operable to process the information delivered by the feed and update a first database. The system also includes a computer processing unit (CPU), an interface and a document associating circuit. The CPU processes the server and the second database so as to provide information onto a display. The document associating circuit is operable to process the first and second databases so as to associate documents stored in the first and second databases with each event displayed on the calendar and deliver the associated document with a selected event.	12-02-2010
20100332539	PRESENTING A RELATED ITEM USING A CLUSTER - An initial item is grouped into a cluster defined by a query expression applied to a description of the item. Given the initial item, its associated cluster is accessed, and another item is identified based on the initial item's cluster or from a cluster designated as similar to the initial item's cluster. Once identified, the other item is presented as related to the initial item.	12-30-2010
20100332540	CONDITION MONITORING WITH AUTOMATICALLY GENERATED ERROR TEMPLATES FROM LOG MESSAGES AND SENSOR TRENDS BASED ON TIME SEMI-INTERVALS - An approach is provided for condition monitoring from log messages and sensor trends based on time semi-intervals. The approach may be applied to machine condition monitoring. Patterns are mined from symbolic interval data that extends previous approaches by allowing semi-intervals and partially ordered patterns. The semi-interval patterns and semi-interval partial order patterns are less restrictive than patterns using Allen's relations. Combinations and adaptations of efficient algorithms from sequential pattern and itemset mining for discovery of semi-interval patterns are described.	12-30-2010
20110004624	Method for Customer Feedback Measurement in Public Places Utilizing Speech Recognition Technology - A method, a system and a computer program product for enabling a customer response speech recognition unit to dynamically receive customer feedback. The customer response speech recognition unit is positioned at a customer location. The speech recognition unit is automatically initialized when one or more spoken words are detected. The response statements of customers are dynamically received by the customer response speech recognition unit at the customer location, in real time. The customer response speech recognition unit determines when the one or more spoken words of the customer response statement are associated with a score in a database. An analysis of the words is performed to generate a score that reflects the evaluation of the subject by the customer. The score is dynamically updated as new evaluations are received, and the score is displayed within a graphical user interface (GUI) to be viewed by one or more potential customers.	01-06-2011
20110004625	Multi-Interval Heuristics For Accelerating Target-Value Search - Methods and systems for solving a target value search problem using a multi-interval heuristic are presented. The methods and systems identify a path, or paths, in a graph, whereby a connection graph is created and range sets are generated for each vertex in the connection graph. Range sets include one or more intervals. Thereafter, a best search is performed to identify a path, or paths, from a starting vertex to a goal vertex having a path value closest to a target value.	01-06-2011
20110004626	System and Process for Record Duplication Analysis - A system and process for record duplication analysis that relies on a multi-membership Bayesian analysis to determine the probability that records within a data set are matches. The Bayesian calculation may rely on objective data describing the data set as well as subjective assessments of the data set. In addition, a system and process for record duplication analysis may rely on the predetermination of probabilistic patterns, where the system only searches for patterns exceeding a chosen threshold. Work flow may include selecting which fields within each record should be analyzed, normalizing the values within those fields and removing default data, calculating possible patterns and their match probabilities, analyzing record pairs to determine which have patterns exceeding a chosen threshold to determine the presence of duplicates, and merging duplicates, closing transactions reflecting non-duplicates, identifying records having insufficient data to determine the existence or lack of a match, and/or rolling back accidental merges.	01-06-2011
20110010392	Checkpoint-Free In Log Mining For Distributed Information Sharing - Techniques for replicating data between database systems without taking checkpoints are provided. In an embodiment, a capture process restarts. Upon restarting, the capture process reestablishes an association with an apply process. A particular logical time maintained by the apply process is then communicated to the capture process. Upon receiving the particular logical time, the capture process restarts mining from this particular logical time.	01-13-2011
20110022634	IMAGE SEARCH DEVICE AND IMAGE SEARCH METHOD - Provided is an image search device which relatively easily searches a large amount of stored images for images that a user wishes to use for interpolation, and which includes: an interpolation range computing unit (	01-27-2011
20110055264	DATA MINING ORGANIZATION COMMUNICATIONS - Data mining for organization insights may be provided. Data from a plurality of sources, such as user communications and documents, may be collected. The collected data may be analyzed to identify an insight about users or organizations associated with the communications. The insight may be provided to a user, such as in response to a search query, an analytics tool, or an added application functionality.	03-03-2011
20110066650	QUERY CLASSIFICATION USING IMPLICIT LABELS - Described is a technology for automatically generating labeled training data for training a classifier based upon implicit information associated with the data. For example, whether a query has commercial intent can be classified based upon whether the query was submitted at a commercial website's search portal, as logged in a toolbar log. Positive candidate query-related data is extracted from the toolbar log based upon the associated implicit information. A click log is processed to obtain negative query-related data. The labeled training data is automatically generated by separating at least some of the positive candidate query data from the remaining positive candidate query data based upon the negative query data. The labeled training data may be used to train a classifier, such as to classify an online search query as having a certain type of intent or not.	03-17-2011
20110072047	Interest Learning from an Image Collection for Advertising - Described herein is a technology that facilitates learning interests for advertising based on automated analysis of images. In several embodiments a person's interests are automatically learned based on the person's photographs for targeted advertising. Techniques are described that facilitate automatically detecting a user's interest from images and suggesting user-targeted ads. As described herein, these techniques include computer-annotating images with learned tags, performing topic learning to obtain an interest model, and performing advertisement matching and ranking based on the interest model.	03-24-2011
20110078188	Mining and Conveying Social Relationships - Techniques and tools described herein mine social information from a source and store the social information in a database. Responsive to a search object, the techniques search the stored social information and determine social relationships. The techniques further provide, via a graphical user interface, the social relationships determined from the social information stored in the database. In several embodiments, the techniques enable social relationship feedback.	03-31-2011
20110078189	NETWORK GRAPH EVOLUTION RULE GENERATION - A network's evolution is characterized by graph evolution rules. A graph that represents an evolutionary network is mined to identify evolutional patterns of the network, and graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network.	03-31-2011
20110082883	INTELLIGENT EVENT-BASED DATA MINING OF UNSTRUCTURED INFORMATION - A method, system and computer program product are disclosed for intelligent data mining. The method comprises receiving an event from an application, assigning property weights to properties of the event, and building a query from these properties based on the property weights. The method further comprises assigning search engine weights to a group of search engines, selecting at least some of the search engines based on the search engine weights, and sending the built query to the selected search engines. Results from the selected search engines are stored in a knowledge repository and used to adjust the property weights and the search engine weights. The invention may be used to provide an analysis with information about a problem, and to manage a solutions database which can be used for problem determination. The invention provides a low cost solution for collecting relevant information from online sources.	04-07-2011
20110082884	PATTERN RECOGNITION IN WEB SEARCH ENGINE RESULT PAGES - Described herein are methods and systems for pattern recognition in web search engine result pages. The input data is a result page from a web search engine as well as an integer number for the results on the page. The output is a regular expression that matches all the results on the page, capturing each result and its individual fields.	04-07-2011
20110087700	ABSTRACTING EVENTS FOR DATA MINING - An event is described herein as being representable by a quantified abstraction of the event. The event includes at least one predicate, and the at least one predicate has at least one constant symbol corresponding thereto. An instance of the constant symbol corresponding to the event is identified, and the instance of the constant symbol is replaced by a free variable to obtain an abstracted predicate. Thus, a quantified abstraction of the event is composed as a pair: the abstracted predicate and a mapping between the free variable and an instance of the constant symbol that corresponds to the predicate. A data mining algorithm is executed over abstracted, quantified events to ascertain a correlation between the event and another event.	04-14-2011
20110093501	DOMAIN INDEPENDENT SYSTEM AND METHOD OF AUTOMATING DATA AGGREGATION - A computer automated method of presenting data. The method includes the steps of inputting a set of user-defined instructions into a computer database system, inputting a user query into said computer database system, mining the computer database system for data relevant to said user query, creating a data set comprising the data relevant to the user query, and selecting at least one presentation report for compiling the data, wherein the selection is based on any of predefined and configurable rules and past user usage. At least one presentation report is then displayed to the user, wherein the displaying process further includes the step of graphically arranging the at least one presentation report based on an available viewing area of a device accessing the at least one presentation report.	04-21-2011
20110106849	NEW CASE GENERATION DEVICE, NEW CASE GENERATION METHOD, AND NEW CASE GENERATION PROGRAM - A new case whose type is the same as that of a case about information desired to be extracted can be generated with high accuracy.	05-05-2011
20110125793	METHOD FOR DETERMINING RESPONSE CHANNEL FOR A CONTACT CENTER FROM HISTORIC SOCIAL MEDIA POSTINGS - Some social networks provide message histories that record information about previous posts that users make to the social media network. From this information, a contact center determines trends in the usage of a social media network by a user. The contact center can mine the message history database for times, frequency of posts, location of the user during posts, and other information provided in the message histories. From this information or metadata about the messages, the contact center develops trends about the user's postings of messages on social media networks. The contact center can further receive subsequent posts and read metadata related to the subsequent posts. The new metadata can be used to modify the trends over time.	05-26-2011
20110125794	METHOD OF PROVIDING A CAR POOLING ASSISTANCE THROUGH A WIRELESS COMMUNICATION SYSTEM - The invention relates to a method which receives location information of a mobile terminal of a single user. One or more journeys are extracted from the location information of the single user. The corresponding journey data is stored in a journey database. From the journey data in the journey database, journey patterns for the single user are extracted. A journey pattern indicates at least the regularity of a particular journey in time, i.e. over a number of days. The journey patterns are stored in the pattern database. The journey patterns of the single user are matched with patterns of other users. If a match is found, at least one match based on the journey patterns is sent to the single user. These features enable the carpool service to find a match which takes into account the regularity across a period of days. By identifying the regularity, a better match can be made with users who travel the same route, because the days on which the users travel are also taken into account.	05-26-2011
20110131244	EXTRACTION OF CERTAIN TYPES OF ENTITIES - Certain types of entities may be extracted from a document. In one example, the entities to be recognized are cultural entities, such as the names of movies, video games, books, etc. For each such entity, a concept graph may be built that shows the relationship between the entity itself and other entities, such as the relationship between a movie and the actor(s) who act in the movie. When a candidate entity name is detected in the document, the concept graph may be used to look for other entities that appear in the context of the candidate entity. The presence of related entities in the context of the candidate may be used to disambiguate the meaning of the candidate. For example, a common word like “up” might be recognized as the name of a movie if the names of actors or characters in that movie appear near the word “up”.	06-02-2011
20110145285	SYSTEM AND METHOD FOR INTENT MINING - A method for intent mining is provided. The method includes performing a preliminary search of a constrained source using one or more seed phrases to generate multiple preliminary search results representing different ways of expressing a desired intent. The method also includes identifying each of the plurality of preliminary search results that have expressed the desired intent to generate a plurality of intent results. The method also includes producing multiple action search strings around one or more action verbs in each of the multiple intent results. The method further includes applying each of the multiple action search strings on one or more non-constrained sources to generate multiple action search results.	06-16-2011
20110153663	RECOMMENDATION ENGINE USING IMPLICIT FEEDBACK OBSERVATIONS - Systems and methods to provide a recommendation engine that uses implicit feedback observations are provided. A particular method includes receiving accessing data comprising a plurality of implicit feedback observations for a plurality of users. The plurality of users includes a first user that requested a recommendation. Each implicit feedback observation is associated with a particular user and a particular item of a plurality of items. The method includes determining a plurality of preference ratings and a plurality of confidence ratings for each user of the plurality of users for each item based on the plurality of implicit feedback observations. The method includes generating a recommendation list of one or more of the plurality of items for the first user based on the plurality of preference ratings and the plurality of confidence ratings.	06-23-2011
20110153664	Selective Storing of Mining Models for Enabling Interactive Data Mining - Computerized methods, data processing systems, and computer program products for storing of data mining models (DMMs) are provided. A new DMM is created having at least one of the following characteristics: quality and complexity. The new DMM is handled as a candidate for storing in a storage device if a predefined criterion for the characteristics is met. The sum of the sizes of the new DMM and already stored DMMs is determined. In response to the sum falling below a storage limit, the new DMM is stored in the storage device. In response to the sum exceeding the storage limit, a decision is taken based on priorities of the DMMs which DMMs to store in the storage device. The priorities depend at least on access frequencies of the DMMs. Upon a data mining request, a corresponding DMM is determined and a user is requested to confirm that data mining is to proceed if quality of the determined DMM does not fulfill a further predefined criterion.	06-23-2011
20110153665	APPARATUS FOR PROVIDING SOCIAL NETWORK SERVICE USING RELATIONSHIP OF ONTOLOGY AND METHOD THEREOF - Provided are an apparatus for providing a social network service using the relationship of ontology and a method thereof. The apparatus includes: an ontology storage unit storing social ontology defining relationship information between a user and a social network subscriber, service ontology defining position and relationship information of services, and tag ontology defining tag information related to information included in the social ontology and the service ontology; when a service request is inputted from the user, an ontology analysis unit retrieving a tag corresponding to the user's current position and the service request factor by using the relationship of the ontologies stored in the ontology storage unit; a service processing unit extracting the corresponding service on the basis of the retrieved tag information; and a service providing unit providing the user with the extracted service.	06-23-2011
20110161367	TEXT MINING APPARATUS, TEXT MINING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM - A text mining apparatus, a text mining method, and a program are provided that enable the influence that computer processing errors have on mining results to be reduced during text mining performed on a plurality of text data pieces including a text data piece generated by computer processing. A text mining apparatus	06-30-2011
20110161368	TEXT MINING APPARATUS, TEXT MINING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM - A text mining apparatus, a text mining method, and a program are provided that accurately discriminate inherent portions of each of a plurality of text data pieces including a text data piece generated by computer processing.	06-30-2011
20110173232	STRING SEARCH SCHEME IN A DISTRIBUTED ARCHITECTURE - Methods and apparatuses for searching network data for one or more predetermined strings are disclosed. In one embodiment, the string search is a multi-stage search where the stages of the search are performed by different hardware components. In one embodiment, in a first search stage, a first processor performs a comparison of blocks of incoming data to determine whether the blocks potentially represent the beginning of one of the predetermined strings. If a potential predetermined string is identified, a second processor performs a further search to determine whether the string matches one of the predetermined strings. Because the first processor searches only for the beginning of the predetermined strings, the first stage comparison can be performed quickly, which improves network performance as compared to more detailed searching. The second stage is performed by the second processor, which allows the first processor to search for potential matching strings. Because many strings do not match the one or more predetermined strings, the more detailed search performed by the second processor is performed selectively, which increases network performance as compared to more detailed searches on all network data.	07-14-2011
20110184982	System and method for capturing and reporting online sessions - The present invention discloses a computer system for reporting online sessions and a computer enabled method utilizing the same. The computer system is made up of an icon that preferably appears on a user screen. The icon is capable of capturing a screen session on the user screen and saving it within a recording. The recording may then be communicated to a database server that is capable of extracting a plurality of target components from said recording, and is capable of storing them in a database. The database may contain a benchmark content of the plurality of target components. Target components may then be compared against the benchmark content in a variety of ways to determine whether the level of target components is above or below reasonable and socially accepted levels.	07-28-2011
20110184983	METHOD AND SYSTEM FOR EXTRACTING AND CHARACTERIZING RELATIONSHIPS BETWEEN ENTITIES MENTIONED IN DOCUMENTS - Methods and devices for use in gathering and analyzing data from a corpus of documents. A corpus of documents is initially scanned for words that qualify as entities according to user defined criteria. Multiple counters track the number of documents which mention specific entities. A database of entities mentioned in the documents is maintained and an entry for each entity in the corpus is placed in the entity database. The results are then presented to a user in a spiral form with the most important entity at the center of the spiral. The importance of an entity may be determined by either how many entities it is connected to or how many documents mention that entity. A connection exists between two entities if they are both mentioned in at least one document and the more documents mention two specific entities at the same time, the stronger the connection between those two specific entities. The result presentation to the user is capable of also visually representing connections between entities by connecting connected entities with lines. The strength of a connection can also be represented with the width of the line connecting two entities.	07-28-2011
20110191372	TRIBE OR GROUP-BASED ANALYSIS OF SOCIAL MEDIA INCLUDING GENERATING INTELLIGENCE FROM A TRIBE'S WEBLOGS OR BLOGS - A computer-based method for generating intelligence from social media data, such as blog data, that is publicly available on the Internet. A server is provided that runs a tribe analysis tool, and the method includes accessing a set of the social media data with the tribe analysis tool. The social media data is associated with a plurality of network users or authors. The method continues with operating the tribe analysis tool to identify members of a tribe from the authors by processing the set of social media data to determine the authors having associated portions of the social media data that satisfy tribe membership criteria. Common interests for the identified members of the tribe are determined by processing the social media data associated with the tribe authors. A report is generated for the tribe that includes information related to the set of common interests and additional generated tribe-based intelligence.	08-04-2011
20110191373	Customized Reporting and Mining of Event Data - Event data (e.g., log messages) are represented as sets of attribute/value pairs. An index maps each attribute/value pair or attribute/value tuple to a pointer that points to event data which contains the attribute/value pair or attribute/value tuple. An attribute co-occurrence map or matrix can be generated that includes attribute names that co-occur together. Queries and custom reports can be generated by projecting event data into one or more attributes or attribute/value pairs, and then determining statistics on other attributes using a combination of the inverted index, the attribute co-occurrence map or matrix, operations on sets and/or math and statistical functions.	08-04-2011
20110196895	EXTRAPOLATION-BASED CREATION OF ASSOCIATIONS BETWEEN SEARCH QUERIES AND ITEMS - Behavior-based associations, such as item-to-item or query-to-item associations, are extrapolated to other items to create new associations. The items to which the associations are extrapolated may be “behavior deficient” items, or items for which the quantity of collected user activity data is insufficient to create meaningful or reliable behavior-based associations. The behavior-based associations are extrapolated based on content-based associations, or another type of “substitutability” association, between items. The items can be any type of item (e.g., products, web sites, documents, etc.) for which user behaviors (e.g., purchases, accesses, downloads, etc.) can be monitored and analyzed to detect behavior-based associations, and for which item content or other available information can be used to assess item substitutability.	08-11-2011
20110202561	SYSTEM AND METHOD FOR PROVIDING AN ADJUSTMENT VALUE FOR KEYWORDS RETRIEVED FROM A DATA SOURCE AND ADJUSTING AN AVM VALUE BASED ON THE ADJUSTMENT VALUE - The invention discloses a system for adjusting an automated valuation model (AVM) value. The system includes a property data source for receiving property data for a property, a data mining module for searching the property data for keywords with corresponding values, and a data matching module for recognizing the keywords, for determining an adjustment value based on the corresponding values, for receiving an AVM value representing an estimated value of the property, and for obtaining an adjusted AVM value based on the AVM value and the adjustment value.	08-18-2011
20110202562	SYSTEM AND METHOD FOR DATA MINING WITHIN INTERACTIVE MULTIMEDIA - Systems and methods are provided for data mining in the context of an interactive video. During the presentation of an interactive video, a user may interact with the interactive video by, e.g., making selections, choosing options, etc. related to one or more aspects of the interactive video. Such events and details regarding the events may be recorded, stored, and analyzed in the context of one or more campaigns associated with the interactive video, such as marketing campaigns, advertising campaigns, interactive examinations, etc. Once the details regarding the events have been stored, reports may be extracted based upon the details detailing any desired information relevant to the one or more campaigns.	08-18-2011
20110213804	SYSTEM FOR EXTRACTING RELATION BETWEEN TECHNICAL TERMS IN LARGE COLLECTION USING A VERB-BASED PATTERN - Disclosed herein is a system structure for extracting relations between technical terms within a large amount of literature information using verb-based patterns. The present invention provides a system that is capable of extracting relations based on verb-based patterns from abstract and bibliography databases in all fields of science and technology using a Tech Association Mining Appliance (TAMA) capable of detecting the technical terms of text and relations therebetween in academic literature databases in the fields of science and technology. The present invention has an advantage of providing a practical relation extraction system structure using a number of academic databases.	09-01-2011
20110225193	ACTIVE TAGS - A method for retrieving data in a data source is provided. The method includes receiving a search term; identifying an active tag associated with the search term; correlating the active tag to dynamic data that is operative to adapt to a mining context in which data is stored; and retrieving the data using the dynamic data.	09-15-2011
20110225194	APPARATUS AND METHOD FOR ANALYZING INFORMATION ABOUT FLOATING POPULATION - Disclosed herein is an apparatus and method for analyzing information about floating population. The apparatus includes an information collection unit, a data integration unit, a data mining analysis unit, and an interface unit. The information collection unit collects information about locations provided by mobile communication terminals of moving objects, information about attributes of the moving objects, and information about locations and attributes related to stationary objects. The data integration unit creates integrated data by integrating the information collected by the information collection unit, national statistical information, and map data registered previously. The data mining analysis unit extracts data, consistent with conditions input by a system user, from the integrated data, and searches the map data for based moving patterns of the moving objects using data mining analysis. The interface unit provides a map service in which search results have been applied to the map data.	09-15-2011
20110225195	SYSTEM AND METHOD FOR GATHERING ECOMMERCE DATA - The techniques introduced here provide a method of gathering ecommerce data. The techniques described here allow a system to return information about a product from several non-related ecommerce sites in response to a single search query. Using the techniques described here, a data mining system determines from the search query a product ID and retrieves from a database one or more product links that correspond to the product ID. Using the product links retrieved from the database, the data mining system traverses the links and parses the web pages corresponding to each of the links to determine up-to-date product information. The product information can then be returned to the application that initiated the request.	09-15-2011
20110231443	Query interface to policy server - A scalable access filter that is used together with others like it in a virtual private network to control access by users at clients in the network to information resources provided by servers in the network. Each access filter uses a local copy of an access control data base to determine whether an access request is made by a user. Each user belongs to one or more user groups and each information resource belongs to one or more information sets. Access is permitted or denied according to access policies which define access in terms of the user groups and information sets. The first access filter in the path performs the access check, encrypts and authenticates the request; the other access filters in the path do not repeat the access check. The interface used by applications to determine whether a user has access to an entity is now an SQL entity. The policy server assembles the information needed for the response to the query from various information sources, including sources external to the policy server.	09-22-2011
20110231444	METHOD AND COMPUTER PROGRAM PRODUCT FOR USING DATA MINING TOOLS TO AUTOMATICALLY COMPARE AN INVESTIGATED UNIT AND A BENCHMARK UNIT - Sources of operational problems in business transactions often show themselves in relatively small pockets of data, which are called trouble hot spots. Identifying these hot spots from internal company transaction data is generally a fundamental step in the problem's resolution, but this analysis process is greatly complicated by huge numbers of transactions and large numbers of transaction variables to analyze. A suite of practical modifications is provided to data mining techniques and logistic regressions to tailor them for finding trouble hot spots. This approach thus allows the use of efficient automated data mining tools to quickly screen large numbers of candidate variables for their ability to characterize hot spots. One application is the screening of variables which distinguish a suspected hot spot from a reference set.	09-22-2011
20110246521	SYSTEM AND METHOD FOR DISCOVERING IMAGE QUALITY INFORMATION RELATED TO DIAGNOSTIC IMAGING PERFORMANCE - A system for discovering information related to diagnostic imaging performance at a medical imaging site. The system includes at least one database of stored digital diagnostic images; and a user instruction interface for obtaining an operator request for information related to image quality of the stored digital diagnostic images. A data processor is in communication with the at least one database, the data processor being programmed with instructions to use only information found within the stored digital diagnostic images themselves. A data mining engine is in communication with the data processor, the data mining engine being programmed with instructions to use only information found within the retrieved digital diagnostic images themselves.	10-06-2011
20110258229	Mining Multilingual Topics - Techniques for utilizing data mining technology to extract universal topics with multilingual representations from a multilingual database, and to organize existing or new documents in different languages by analyzing their respective topic distributions.	10-20-2011
20110270882	RESOURCE DESCRIPTION FRAMEWORK NETWORK CONSTRUCTION DEVICE AND METHOD USING AN ONTOLOGY SCHEMA HAVING CLASS DICTIONARY AND MINING RULE - RDF network construction device and method using an ontology schema having class dictionaries and mining rules are provided. The RDF network construction device includes an ontology schema storing module, a class managing module, a mining rule managing module, a mining pattern creating module, and an RDF triple creating module.	11-03-2011
20110295892	SYSTEM AND METHOD FOR WEB MINING AND CLUSTERING - A method and system for web mining and clustering is described. The method includes receiving and dividing input data into a plurality of primitive datasets. Additionally, one or more combinations of the plurality of primitive datasets may be created. Further, a model for each primitive dataset in the plurality of primitive datasets and each of the one or more combinations of the plurality of primitive datasets may be generated. Subsequently, a cost associated with a model corresponding to each primitive dataset in the plurality of primitive datasets, and each of the one or more combinations of the plurality of primitive datasets may be computed. Further, a sum of the costs associated with the models corresponding to each primitive dataset in the plurality of primitive datasets may be compared with the cost associated with each model corresponding to each of the one or more combinations of the plurality of primitive datasets. Finally, the plurality of primitive datasets may be partitioned into one or more clusters based on the comparison of the costs such that each primitive dataset is a part of a cluster in the one or more clusters or a stand-alone primitive dataset.	12-01-2011
20110295893	METHOD OF SEARCHING AN EXPECTED IMAGE IN AN ELECTRONIC APPARATUS - A method of searching an expected image in an electronic apparatus comprises the steps of inputting a hand drawing of the expected image into the electronic apparatus; determining whether or not a text description for partially characterizing the expected image is inputted; identifying and searching the expected image in the electronic apparatus according to the hand drawing if the text description is not inputted, or selecting a text label from the text description and interpreting the selected text label by the electronic apparatus if the text description is inputted; and searching a database in the electronic apparatus according to the text label, and fetching the expected image from the database if the value of the image item matches the text label. The hand drawing and/or text label inputted from a mobile phone screen are provided for arranging and searching pictures or images in the database efficiently.	12-01-2011
20110295894	SYSTEM AND METHOD FOR MATCHING PATTERN - System and method for matching a pattern are provided. The pattern matching method includes performing a sub pattern matching operation to match at least one sub data of a plurality of sub data of a target data with a pre-stored pattern data, and performing a full pattern matching operation to determine whether the target data is identical to at least the pre-stored pattern data by referring to a result of the sub pattern matching operation, and wherein the full pattern matching operation is performed or not performed according to a type of the pre-stored pattern data. Accordingly, an accurate matching operation is performed with respect to the target data of various patterns.	12-01-2011
20110320490	NAMED ENTITY DATABASE OR MINING RULE DATABASE UPDATE APPARATUS AND METHOD USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA - An apparatus and method for updating a named entity dictionary or a mining rule database using the named entity dictionary and a mining rule combined with an ontology schema is provided. The apparatus includes a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a mining pattern generation module recognizing a terminology from a text and converting the terminology into the mining pattern; a named entity and mining rule search module searching for a corresponding named entity and a mining rule from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a named entity dictionary update module estimating a named entity of the terminology using the mining rule and storing the estimated named entity of the terminology in the named entity dictionary depending on a user's selection.	12-29-2011
20110320491	MODULE AND METHOD FOR SEARCHING NAMED ENTITY OF TERMS FROM THE NAMED ENTITY DATABASE USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA - A module and method for determining a named entity of a terminology using a named entity dictionary and a mining rule combined with an ontology schema is provided. The module includes a named entity dictionary and mining rule database storing the named entity dictionary and a mining rule database; a mining pattern generation unit recognizing a terminology from a text and converting the terminology into a mining pattern; a named entity and mining rule search unit searching for a corresponding named entity and a mining rule respectively from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a named entity selection unit selecting, if two or more named entities corresponding to the recognized terminology are searched, a named entity matching to the concept configuring the RDF triple of the searched mining rule as a named entity of the terminology among the searched named entities.	12-29-2011
20110320492	SYSTEM AND METHOD FOR TRACKING VEHICLE OPERATION THROUGH USER-GENERATED DRIVING INCIDENT REPORTS - A method and system that allows anyone to flag good or bad driving incidents by fellow motorists. Driving behavior data is captured as user-generated content, and the system includes the necessary provision to verify the authenticity and accuracy of all records submitted. The database of driving records is subsequently used to calculate a driver risk-score, allowing companies to make informed decisions related to their business when impacted by driver behavior (e.g. calculating a car insurance premium).	12-29-2011
20110320493	METHOD AND DEVICE FOR RETRIEVING DATA AND TRANSFORMING SAME INTO QUALITATIVE DATA OF A TEXT-BASED DOCUMENT - Method for extracting information from a data file comprising a first step wherein the data are transmitted to a device (	12-29-2011
20120011155	Generalized Notion of Similarities Between Uncertain Time Series - Embodiments of the invention relate to a method and system for finding a distance between a plurality of time series, wherein each individual time series in the plurality of time series includes data that is uncertain, and for using the computed distance in business applications.	01-12-2012
20120011156	INTER-CLASS MOLECULAR ASSOCIATION CONNECTIVITY MAPPING - Methods, systems, devices and/or apparatuses are provided for computationally deriving molecular association connectivity maps for the study of inter-class molecular associations in toxicogenomics and drug discovery applications. The inter-class molecular associations can be between at least one bio-molecular entity and at least one therapeutic agent. The methods, systems, devices and/or apparatuses apply integrated molecular interaction network mining and text mining techniques.	01-12-2012
20120023134	PATTERN MATCHING DEVICE, PATTERN MATCHING METHOD, AND PATTERN MATCHING PROGRAM - An optimal subspace for a distance or a similarity cannot be obtained by a pattern matching device which obtains a subspace independent from the distance or the similarity used for matching. A pattern matching device includes a feature extraction unit for extracting a feature value by lowering the dimension of data using a feature extraction parameter; a calculation unit for calculating a distance or a similarity of the data to be matched using the feature value; and a parameter updating unit for comparing the distance or the similarity, and updating the feature extraction parameter so that the value of the distance or the similarity becomes closer to a matching result regarding whether or not the values of the distance or the similarity are in the same category.	01-26-2012
20120023135	METHOD FOR USING VIRTUAL FACIAL EXPRESSIONS - The method is for using a virtual face. The virtual face is provided on a screen associated with a computer system having a cursor. A user manipulates the virtual face with the cursor to show a facial expression. The computer system determines coordinates of the facial expression. The computer system searches for facial expression coordinates in a database to match the coordinates. A word or phrase is identified that is associated with the identified facial expression coordinates. The screen displays the word to the user. The user may also feed a word to the computer system that displays the facial expression associated with the word.	01-26-2012
20120036157	System and Method for Determining Valid Citation Patterns in Electronic Documents - A system and method are provided for comparing portions of document text with potential citation components, determining if individual portions correspond to a citation component, and determining if a set of portions correspond to a valid citation pattern. A set of valid citation patterns is provided. Each citation pattern may include a specified combination of citation components. The invention further relates to identifying potential citation components from text in a document, analyzing a pattern of the identified citation components by comparing the pattern to a set of stored citation patterns to determine if the potential citation is a type of citation, and if so, whether it is a valid (and/or invalid) citation pattern. Once citation patterns have been determined in the document, annotations may be inserted into the document, and subsequent action may be taken, for example, generating a list of citations, providing research services, error-handling, and/or providing other options related to the citations.	02-09-2012
20120041979	METHOD FOR GENERATING CONTEXT HIERARCHY AND SYSTEM FOR GENERATING CONTEXT HIERARCHY - The present disclosure relates to a method for generating a context hierarchy and a system for generating a context hierarchy, and more particularly, to a method for generating a context hierarchy from data streams composed of an infinite set of continuous transactions and a system for generating a context hierarchy from the data streams.	02-16-2012
20120047172	PARALLEL DOCUMENT MINING - A technique includes providing a collection of documents in multiple languages, identifying, from the collection of documents, a group of candidate documents, where each candidate document in the group shares multiple corresponding rare features, evaluating pairs of candidate documents in the group using multiple common features present in the collection of documents, and determining, based on evaluating the pairs of candidate documents, whether each pair of candidate documents corresponds to a translated pair of documents.	02-23-2012
20120059850	COMPUTERIZED FACE PHOTOGRAPH-BASED DATING RECOMMENDATION SYSTEM - A computer vision dating system analyzes combinations of face features of the system's user's photographs and recommends potential dating partners. A user selects preferred and not-preferred faces from a sample of other users' pictures. The system analyzes the features of the preferred and not-preferred faces, comparing the combinations of features in both categories with the features of other users in the database to find the users that most match the collective features preferred by the user. These pictures are presented to the user. Data from the user's profile input are analyzed to automatically generate the sample pictures from which the user selects his/her preferences. As the users are presented pictures after their sample selection, they can continue to select and reject pictures, allowing the system to learn and refine the combinations of features and better locate those that most conform to a user's most preferred photo images.	03-08-2012
20120066259	RECIPROCAL ADDITION OF ATTRIBUTE FIELDS IN ACCESS CONTROL LISTS AND PROFILES FOR FEMTO CELL COVERAGE MANAGEMENT - System(s) and method(s) provide access management to femto cell service through access control list(s) (e.g., white list(s), or black list(s)). White list(s) includes a set of subscriber station(s) identifier numbers, codes, or tokens, and also can include additional fields for femto cell access management based on desired complexity. White list(s) can have associated white list profile(s) therewith to establish logic of femto coverage access based on the white list(s). A mechanism for reciprocal addition of access field attributes in access control lists and white list profiles also is provided. The mechanism allows at least in part for a first subscriber to be added to a configured white list of a second subscriber, when the first subscriber configures a new white list, the second subscriber is reciprocally incorporated in the new white list. Such mechanism can be driven and facilitates generation of associations among groups of subscribers that share specific commonalities.	03-15-2012
20120066260	System And Method For Building Decision Trees In A Database - A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.	03-15-2012
20120072453	SYSTEMS, METHODS, AND MEDIA FOR DETERMINING FRAUD PATTERNS AND CREATING FRAUD BEHAVIORAL MODELS - Systems, methods, and media for analyzing fraud patterns and creating fraud behavioral models are provided herein. In some embodiments, methods for analyzing call data associated with fraudsters may include executing instructions stored in memory to compare the call data to a corpus of fraud data to determine one or more unique fraudsters associated with the call data, associate the call data with one or more unique fraudsters based upon the comparison, generate one or more voiceprints for each of the one or more identified unique fraudsters from the call data, and store the one or more voiceprints in a database.	03-22-2012
20120072454	APPARATUS AND METHOD FOR IDENTIFYING THE CREATOR OF A WORK OF ART - A method for determining the authorship of a picture, wherein the method comprises at least the following steps: —transferring the picture to be examined or parts of the picture to be examined with the aid of a digitizing means, in particular a scanner, into at least one data set, —analyzing the data set(s) and determining characteristic features or parts of characteristic features, in particular dots or lines or dot or line groups or patterns, contained in the data set in digitized form, wherein the characteristic features to be determined are stored in a database, —and wherein the database includes an additional associated data set for each of the stored characteristic features.	03-22-2012
20120084323	GEOGRAPHIC TEXT SEARCH USING IMAGE-MINED DATA - Textual information may be harvested from photos that are associated with a geographic location, and the text may be used to respond to searches. In one example, photos are taken from a vehicle that has a camera and a GPS receiver. Each of the photos is marked with the geographic location at which it was taken, and text is extracted from the photos. Thus, each piece of text is associated with a particular geographic location, and the association between text and location is stored in a database. At some point in time, a query is received from a user, where the query specifies or implies a geographic criterion. The database is then examined to determine what items in the database meet the textual and geographic constraints of the query, and those pieces of information may be provided as search results.	04-05-2012
20120089642	PROVIDING USERS WITH A PREVIEW OF TEXT MINING RESULTS FROM QUERIES OVER UNSTRUCTURED OR SEMI-STRUCTURED TEXT - The system and methods described herein provide results previewing for an interactive text mining system in order to feedback partial query results to users before all results that are responsive to a query have been found. These partial results allow the user to see the progress of their text mining query much sooner.	04-12-2012
20120089643	METADATA RECORD GENERATION - A computer implemented method and system provide for automatic selection and extraction of metadata and media content from projects in a craft tool. Automated identification, classification and management of such metadata and content is provided using techniques such as pattern recognition for audio and visual content. The automatic tracking and centralised storage of metadata and content for compliance purposes can be facilitated, and can enable querying of organised metadata stored in a central database. In an example, metadata and media content are extracted automatically from a project in a craft tool at a client system and are forwarded to a host system for the creation of a cue sheet including timings for media files from timing metadata in a project file to create the timings on the cue sheet.	04-12-2012
20120096031	SYSTEM, METHOD, AND PROGRAM PRODUCT FOR EXTRACTING MEANINGFUL FREQUENT ITEMSET - A system and method for enabling efficient extraction of only meaningful frequent itemsets. The system includes a decision unit that decides a new itemset that becomes an investigation target in the same sequence as that of searching an itemset tree in a depth-first manner and in descending order, a frequent occurrence determining unit that registers the frequency of occurrence of the new itemset in a table if the frequency of occurrence is equal to or more than a predetermined threshold, a correlation determining unit that determines whether there is a correlation between each item in the new itemset and a subset of remaining items that were removed from the new itemset, and a registration unit that registers the new itemset in a set of meaningful frequent itemsets if the determination is positive for all items of the new itemset.	04-19-2012
20120102068	SYSTEM AND METHOD FOR IDENTIFYING NETWORKS OF TERNARY RELATIONSHIPS IN COMPLEX DATA SYSTEMS - A system and method for identifying high order associations between variables in complex systems that is particularly useful where there is no correlation or weak correlation between variables due to the influence of a third variable, a ternary relationship. The ternary relationship describes how the variation in the pattern of association between a pair of variables, including its sign and strength, is mediated by a third variable. In one embodiment applied to gene expression data, the activity of pairs of correlated genes due to the activity of one or more third genes is shown.	04-26-2012
20120110014	Method, apparatus, and program for the discovery of resources in a computing environment - Embodiments of the present invention provide a detector apparatus for detecting a physical resource employed in providing a particular virtual resource in a computer network, the computer network including a plurality of physical resources each being operable to be employed in providing virtual resources and having an environment sensor outputting sensor data representing changes in an operating property of the physical resource. A detector apparatus embodying the present invention comprises a sensor data receptor operable to receive sensor data output by the environment sensors, a pattern extractor operable to extract a pattern from the received sensor data from a physical resource, and a pattern matcher, wherein the pattern matcher is operable to compare the extracted pattern with a unique pattern known to be generated by a particular virtual resource, and to detect that the physical resource is employed in providing the particular virtual resource when a match is found.	05-03-2012
20120117114	SYSTEM AND METHOD FOR SCALABLE SEMANTIC STREAM PROCESSING - A system for collaborative analysis from different processes on different data sources. The system uses a unique approach to lightweight temporary data structures in order to allow communication of interim results among processes, and construction of semantically appropriate reports. The data structures are generated in near real time and their lightweight nature supports massive scaling, including many diverse streaming inputs.	05-10-2012
20120124089	USER INTEREST PATTERN MODELING SERVER AND METHOD FOR MODELING USER INTEREST PATTERN - A user interest pattern modeling server includes a history collection unit, a keyword extraction unit, a time pattern extraction unit, a keyword extension unit, a time pattern analysis unit and a pattern modeling unit. The history collection unit collects a user's use history of a content. The keyword extraction unit extracts a keyword from the use history of the content. The time pattern extraction unit extracts a first time pattern of the keyword. The keyword extension unit extracts an extended keyword through searching related words of the keyword. The time pattern analysis unit analyzes a second time pattern of the extended keyword based on the first time pattern. The pattern modeling unit models a user interest pattern for the keyword and the extended keyword based on the first and second time patterns.	05-17-2012
20120131055	FSTP EXPERT SYSTEM - Let TT.p be the “technique teaching” of a patent or venture, RS a “reference set” of prior art “technique teachings TT.i”, any “element” of any TT be described by its properties, and all this information be presented as meaningful items. Then the FSTP Expert System supports managing an analysis of TT.p over RS such that it is able to reply automatically and instantly to any query for any item in this information. These answers may describe any interrelation between any items or properties/facts or comment on such interrelations or on some insights into them achieved while generating these items by or interactively with the FSTP Expert System. By formalization of these properties it also supports determining the value of q dependably indicating TT.p as trivial/obvious over RS iff q=0 and for q>0 showing the “creative height of TT.p over RS” and quantifying the “power” of this indication.	05-24-2012
20120131056	DROP RECIPE CREATING METHOD, DATABASE CREATING METHOD AND MEDIUM - According to one embodiment, a plurality of test drop recipes are first created based on design data on a semiconductor integrated circuit. Based on a defect inspection result of a pattern of a hardening resin material, which is formed by pressing a template on which patterns of the semiconductor integrated circuit are formed onto the hardening resin material applied to a substrate to be processed by use of the test drop recipes, a drop recipe with least defects is selected per press position on the substrate to be processed from the test drop recipes. The selected drop recipes for respective press positions are collected per functional circuit block configuring the semiconductor integrated circuit, thereby to generate a drop recipe creation assistant database.	05-24-2012
20120136895	LOCATION POINT DETERMINATION APPARATUS, MAP GENERATION SYSTEM, NAVIGATION APPARATUS AND METHOD OF DETERMINING A LOCATION POINT - A location point determination apparatus comprises a geographic feature harvesting module (	05-31-2012
20120143913	Encoding Data Stored in a Column-Oriented Manner - Data stored in a column-oriented manner is encoded using a data mining algorithm for finding column patterns among a set of data tuples, where each data tuple contains a set of columns, and the data mining algorithm treats all columns and all column combinations and column ordering similarly or in the same manner when looking for column patterns. Column values are ordered occurring in the column patterns based on their frequencies into a prefix tree, where the prefix tree defines a pattern order. The data tuples are sorted according to the pattern order, resulting in sorted data tuples, and columns of the sorted data tuples are encoded using run-length encoding.	06-07-2012
20120158783	LARGE-SCALE EVENT EVALUATION USING REALTIME PROCESSORS - Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications of complex queries.	06-21-2012
20120166484	SYSTEM, METHOD AND COMPUTER PROGRAM FOR MULTI-DIMENSIONAL TEMPORAL DATA MINING - The present invention provides a system, method and computer program for multi-dimensional temporal abstraction and data mining. The invention comprises collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on a shared time point of interest.	06-28-2012
20120209879	REAL-TIME INFORMATION MINING - Embodiments of the invention are related to identifying a user's intent dynamically from at least a set of metadata associated with the user, wherein the set of metadata is associated with a user input, and providing to the user a set of labeled instances on determination of a user's intent, the set of labeled instances being directly related to user's intent, where the set of labeled instances are obtained in real-time from a set of information repositories.	08-16-2012
20120209880	METHOD OF CONSTRUCTING A MIXTURE MODEL - A method of constructing a general mixture model of a dataset includes partitioning the dataset into at least two subsets according to predefined criteria, generating a subset mixture model for each of the at least two subsets, and then combining the mixture models from each subset to generate a general mixture model.	08-16-2012
20120209881	METHODS, SYSTEMS AND SOFTWARE FOR SEARCHING ACRONYMS, PHRASES, AND WORD GROUPINGS IN ELECTRONIC DOCUMENTS - Methods, systems and software for searching electronic documents allow a user to enter a single or multiple letters of a word group, name, phrase, or the like, and see hits that include the data inputted. The search tool solves the problem of finding the full words and ultimately the meaning of an acronym when reading a web page, word processing document, or other electronic searchable material. The search tool also solves the problem of searching through a document for particular word groups, phrases, and names, and may be especially useful where the exact spelling is unknown. The search tool may allow consumers the ability to search based on one or more characters of each name or word independently of the remaining characters in the name, phrase or word search.	08-16-2012
20120209882	SYSTEM, METHOD, AND COMPUTER READABLE MEDIA FOR IDENTIFYING A USER-INITIATED LOG FILE RECORD IN A LOG FILE - A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.	08-16-2012
20120221602	METHOD AND APPARATUS FOR WORD QUALITY MINING AND EVALUATING - A method and an apparatus for word quality mining and evaluating are disclosed. The method includes: calculating a Document Frequency (DF) of a word in mass categorized data; evaluating the word in multiple single-aspects according to the DF of the word; and evaluating the word in multiple aspects according to the multiple single aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of the word in the mass categorized data may be evaluated, and words with high quality may be obtained through an integrated evaluation.	08-30-2012
20120233213	NAMED ENTITY DATABASE OR MINING RULE DATABASE UPDATE APPARATUS AND METHOD USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA - A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a mining pattern generation module recognizing terminology from text and converting the terminology into the mining pattern; a named entity and mining rule search module searching for a corresponding named entity and a mining rule from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the recognized terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.	09-13-2012
20120233214	NAMED ENTITY DATABASE OR MINING RULE DATABASE UPDATE APPARATUS AND METHOD USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA - A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a named entity and mining rule search module searching for a corresponding mining rule and a named entity from the mining rule database and the named entity dictionary using a terminology included in an inputted mining pattern and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on user's selection.	09-13-2012
20120246196	NETWORK GRAPH EVOLUTION RULE GENERATION - A network's evolution is characterized by graph evolution rules. A graph that represents an evolutionary network is mined to identify evolutional patterns of the network, and graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network.	09-27-2012
20120254241	MULTIPLE CRITERIA DECISION ANALYSIS - Embodiments of the present disclosure set forth a method for selecting a preferred data set. The method includes generating a candidate data set based on a first data set having a first join attribute, and a first aggregate attribute and a second data set having a second join attribute compatible with the first join attribute, and a second aggregate attribute, wherein the candidate data set includes a total attribute having a value that is based on aggregating a value associated with the first aggregate attribute and a value associated with the second aggregate attribute; and selecting the preferred data set from the candidate data set based on the total attribute.	10-04-2012
20120254242	METHODS AND SYSTEMS FOR MINING ASSOCIATION RULES - Systems, methods, and computer-readable code stored on a non-transitory media for mining association rules include determining a minimum support threshold and a minimum confidence threshold for association rule mining; determining a sampling model; sampling transactions from a transaction dataset; mining association rules from the sampled transactions; and transmitting mined association rules.	10-04-2012
20120259890	KNOWLEDGE-BASED DATA MINING SYSTEM - In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.	10-11-2012
20120278361	USING WEB-MINING TO ENRICH DIRECTORY SERVICE DATABASES AND SOLICITING SERVICE SUBSCRIPTIONS - A system and method are provided for augmenting information in business directory databases and communicating with businesses. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or even subscribe to other paid services provided by the directory service.	11-01-2012
20120290619	METHOD, SYSTEM, AND APPARATUS FOR IDENTIFYING PHARMACEUTICAL PRODUCTS - A method, system and apparatus are provided for identifying pharmaceutical products. A database of known pharmaceuticals is provided with links to virtual 3D models of each pharmaceutical. When a pill needs to be identified, an image of the pill is transmitted to the database CPU. The CPU screens out non-matching records and obtains perspective data based on the orientation of the pill. The CPU manipulates a 3D model into the same perspective as the pill to facilitate identification.	11-15-2012
20120303661	SYSTEMS AND METHODS FOR INFORMATION EXTRACTION USING CONTEXTUAL PATTERN DISCOVERY - Described herein are methods, systems, apparatuses and products for automatically discovering patterns in a text corpus. An aspect provides extracting at least one context string related to at least one annotator from the at least one text corpus; analyzing the at least one context string for at least one sequence, the at least one sequence comprised of at least one subsequence; determining at least one sequence signature for each at least one sequence by applying applicable rules to the at least one sequence; and grouping the at least one sequence signature into at least one group.	11-29-2012
20120310980	INFERRED USER IDENTITY IN CONTENT DISTRIBUTION - Embodiments of the present invention provide a method, system and computer program product for inferred user identity in content distribution. In an embodiment of the invention, a method for inferred user identity in content distribution includes retrieving a set of data of a particular classification from a data store of a computing device of an unidentified user requesting access to content in a content distribution system. The method further includes comparing the set of data of the particular classification to known patterns of data of the particular classification corresponding to different known users. The method yet further includes inferring an identity of the unidentified user based upon at least a partial matching of the compared set of data of the particular classification and known patterns of data of the particular classification. Finally, the method includes managing user interactions of the unidentified user based upon the inferred identity.	12-06-2012
20120310981	APPARATUS AND METHOD FOR PROVIDING SEARCH PATTERN OF USER IN MOBILE TERMINAL - In a mobile terminal, user search information used in a Location Based Service (LBS) application is stored. A user search pattern is determined from the user search information. User search pattern information corresponding to search condition data is extracted and displayed when the search condition data are input in a search pattern mode of the LBS application.	12-06-2012
20120331004	ASSET MANAGING APPARATUS AND ASSET MANAGING METHOD - An asset managing apparatus includes a search extent setting unit that identifies a layer made to correspond to an asset specified by referencing a first database for recording assets made to correspond to each of users by relating each of the assets to a first layer that is a layer related to a virtual system individually used by each of the users, or to a second layer that is a layer related to hardware and software, and sets an extent for extracting information about other assets having a relationship with the specified asset according to a layer of the specified asset, and an extracting unit that extracts other assets that have a relationship with the specified asset and are present in the extent set by referencing the first database and a second database for recording information indicating a relationship among the assets, and the first database based on the first asset.	12-27-2012
20130007061	APPARATUS AND ASSOCIATED METHODS - In one or more embodiments described herein, there is provided an apparatus having a processor, and at least one memory including computer program code. The memory and the computer program code are configured to, with the at least one processor, cause the apparatus to perform the following. Firstly, the apparatus is caused to identify, based on received gesture command signalling associated with two or more content items, one or more common aspects of metadata for those two or more content items. Secondly, the apparatus is caused to use an identified common aspect of said metadata to search for other content items with metadata in common to the identified common aspect of metadata.	01-03-2013
20130018917	Selective Storing of Mining Models for Enabling Interactive Data Mining - A new data mining model (DMM) is created having at least one of the following characteristics: quality and complexity. The new DMM is handled as a candidate for storing in a storage device if a predefined criterion for the characteristics is met. The sum of the sizes of the new DMM and already stored DMMs is determined. In response to the sum falling below a storage limit, the new DMM is stored in the storage device. In response to the sum exceeding the storage limit, a decision is taken based on priorities of the DMMs which DMMs to store in the storage device.	01-17-2013
20130024476	METADATA RECORD GENERATION - A computer implemented method and system provide for automatic selection and extraction of metadata and media content from projects in a craft tool. Automated identification, classification and management of such metadata and content is provided using techniques such as pattern recognition for audio and visual content. The automatic tracking and centralised storage of metadata and content for compliance purposes can be facilitated, and can enable querying of organised metadata stored in a central database. In an example, metadata and media content are extracted automatically from a project in a craft tool at a client system and are forwarded to a host system for the creation of a cue sheet including timings for media files from timing metadata in a project file to create the timings on the cue sheet.	01-24-2013
20130046784	BEGIN ANCHOR ANNOTATION IN DFAs - Disclosed is a method and system of matching a string of symbols to a ruleset. The ruleset comprises a set of rules. The method includes ignoring begin anchor requirements when constructing a DFA from all the rules of the ruleset, annotating the accepting states of the DFA with the begin anchor information, executing the DFA, and checking begin anchor annotations to determine if begin anchor requirements are satisfied if an accepting state is reached. Embodiments also include rulesets with begin anchors on matches, rulesets with early exit information on non-accepting states, and rulesets with accept begin anchors in accepting states.	02-21-2013
20130046785	Automatic Association of Informational Entities - The invention relates to the field of data storage. In particular, it relates to a method and system for allowing flexible creation and management of associations between informational entities on a computing device, such as a work station, a desktop computer, a tablet PC, a laptop computer and/or a mobile device. A storage system configured for storing a network of informational entities is described. The system comprises a storage medium configured to store a plurality of informational entities; to store a corresponding plurality of association records; wherein an association record corresponding to an entity indicates an association and an association strength between the entity and another entity; and to store a corresponding plurality of frequency indicators, wherein a frequency indicator corresponding to the entity indicates the frequency of access to the entity. Furthermore, the system comprises a processor configured to access the plurality of informational entities.	02-21-2013
20130046786	System for Explanation-Based Auditing of Medical Records Data - A system and method is provided for automatically generating explanations for individual records in an access log.	02-21-2013
20130066912	Deriving Dynamic Consumer Defined Product Attributes from Input Queries - Methods and systems of defining product attributes may involve receiving a search query and extracting a user expectation from the search query. In addition, an attribute may be defined for a product based on the user expectation. In one example, consumer generated content such as forum content, review content, blog content and social networking content, is used to define the attribute.	03-14-2013
20130066913	DATASET RATING AND COMPARISON - Providing information about two or more datasets. The method includes accessing metadata for two or more datasets. The method further includes displaying a comparison of the two or more datasets based on metadata for the two or more datasets.	03-14-2013
20130066914	Deriving Dynamic Consumer Defined Product Attributes from Input Queries - Methods and systems of defining product attributes may involve receiving a search query and extracting a user expectation from the search query. In addition, an attribute may be defined for a product based on the user expectation. In one example, consumer generated content such as forum content, review content, blog content and social networking content, is used to define the attribute.	03-14-2013
20130086111	System and Method for Providing Information on Selected Topics to Interested Users - There are disclosed systems and methods which provide for an interaction between unrelated databases such that information provided to a first database by a first provider can be pushed from the first database to a user based, at least in part, on data provided by that user to a second database unrelated to the first database. This then allows a user to have information pushed to him/her based upon information previously obtained from that user or about that user. In one embodiment, the user enters his/her information into one or more databases and the entered information forms the basis for information to be pushed to that user from any database even if the pushed information resides in a non-related database. In another embodiment, the data required from a provider is determined based upon data gathered from other providers for similar items. In some situations certain data is gathered without active participation from the provider.	04-04-2013
20130110874	SYSTEM AND METHOD FOR COMBINATION-BASED DATA ANALYSIS	05-02-2013
20130132435	Method And System For Providing Business Intelligence Data - A system for data mining and providing business intelligence data including a source system having a computer readable database storing data aggregated from one or more data sources, and an analytics server in communication with the source system and including a computer readable medium having an intermediate data file stored thereon. The intermediate data file consists of the data aggregated from the one or more data sources. The analytics server includes computer readable instructions for: importing the data aggregated from the one or more data sources for storing on the intermediate data file in the form of source data; normalizing the source data using predetermined scripts into normalized data stored in normalized data tables on the computer readable medium; generating one or more dimensions from the source data, wherein the one or more dimensions define categories into which portions of the normalized data can be grouped; generating one or more measures from the source data linked to the one or more dimensions; and, generating one or more formulae for calculating information from one or both of the dimensions and the measures.	05-23-2013
20130132436	REGISTRATION AND MAINTENANCE OF ADDRESS DATA FOR EACH SERVICE POINT IN A TERRITORY - A computer system and method is disclosed for mining current and archived address data in order to identify a preferred address for each service point in a territory. The data mining system may start in response to the presentation of a candidate address for matching. The set of mined data may be prioritized by clustering like characteristics, building similarity matrices, and by constructing dendrograms with nodes joined according to common characteristics. A computer system and method for maintaining a central database of preferred addresses is also disclosed. Selected address data gathered in a queue may be scored by characteristic, grouped by consignee location, and staged for processing. The scored queue of data may be prioritized by clustering like characteristics, building similarity matrices, and by constructing dendrograms.	05-23-2013
20130138691	CUSTOMIZABLE MONITORING AND MANAGEMENT TOOL - A method, computer system and computer program product for collecting identified items of interest from text result output using a poller. The steps include identifying items of interest from a device through highlighting at least one portion of the text result output as a data item of interest and recording a location of the at least one highlighted portion of the text result output identified as data items of interest, and at a set time interval accepted from the user, executing the poller, and if at least one data item of interest is present within the text result output, extracting and storing the at least one data item of interest in the repository.	05-30-2013
20130144907	METADATA EXTRACTION PIPELINE - The present discussion relates to patient image data workflows. One example can temporarily serially arrange a set of semantic labeling modules in a patient image data workflow pipeline responsive to receiving an event trigger. The example can also remove the set of modules from the patient image data workflow pipeline responsive to receiving an event completion trigger.	06-06-2013
20130144908	Pattern-Based Stability Analysis Of Complex Data Sets - Methods and systems for identifying stability exceptions in a data log are disclosed. In one method, at least one key that is present in the data log is determined. The data log is comprised of at least one data set, at least one of which includes a plurality of iterations indicating states of the corresponding data set at different points in time. For each data set and for each key, a map is generated. The map indicates, for each iteration of the corresponding data set, whether the corresponding key is present in the corresponding iteration. Moreover, at least one expression pattern rule that models data item stability characteristics over data set iterations is compared to each of the maps to determine whether the corresponding map satisfies the one or more expression pattern rules. Further, at least one unstable data item is identified in the data log based on the comparison.	06-06-2013
20130173663	METHOD, DISTRIBUTED ARCHITECTURE AND WEB APPLICATION FOR OVERALL EQUIPMENT EFFECTIVENESS ANALYSIS - A solution is disclosed for an overall equipment effectiveness (OEE) analysis for a manufacturing execution system (MES) allowing a comparative analysis that makes it possible to build links and relations among the values found in a report with raw data acquired from the field. The solution is a web application running on a web server and on OEE databrowsers running on clients. In order to come to a comparative analysis an OEE-server is provided with the functionality to acquire data from the MES of the production plant, to perform scheduling for reports, and to perform run time query executions. Additionally the web server gets data from a data mining support database.	07-04-2013
20130226968	METHOD AND SYSTEM OF SUBSURFACE HORIZON ASSIGNMENT - Subsurface horizon assignment. At least some of the illustrative embodiments are methods including: obtaining, by a computer system, a seismic data volume; identifying, by the computer system, a plurality of patches in the seismic data volume, and the identifying thereby creating a patch volume; displaying, on a display device, at least a portion of the seismic data volume and the plurality of patches of the patch volume; and assigning a patch of the plurality of patches to a subsurface horizon of the seismic data volume.	08-29-2013
20130246463	PREDICTION AND ISOLATION OF PATTERNS ACROSS DATASETS - Various embodiments pertain to techniques for predicting and isolating patterns or trends across datasets. In various embodiments, one or more Q-entities are extracted from a data seed, associated with one or more dimensions, and classified into one or more clusters for each dimension with which it is associated. In some embodiments, a Q-entity can exist in more than one dimension and/or more than one cluster within a dimension. Once information from the data seed is associated with a dimension and cluster, frequency analysis can be utilized to ascertain a pattern or trend in the data. In various embodiments, additional data can be processed, added to the dimensions and clusters, and frequency analysis can be performed on the updated dataset to provide additional information on the pattern or trend.	09-19-2013
20130254233	SYSTEM AND METHOD FOR CONTEXT-SENSITIVE ADDRESS BOOK - System and method to provide a self-organizing, context aware address book, the method including: receiving sensor data about a status of a user; receiving external data about a status of predetermined contacts of the user; data mining the external data based upon the sensor data and entries within the address book; calculating a respective likelihood value to at least a portion of the entries within the address book; determining a prominence for the portion of entries within the address book; and displaying the portion of entries within the address book with the prominence.	09-26-2013
20130275469	DISCOVERY OF FAMILIAR CLAIMS PROVIDERS - Aspects of the subject matter described herein relate to identity technology. In aspects, profile data is mined to determine claims providers with which a user may be familiar. These familiar claims providers are used in conjunction with claims providers that are allowed by a relying party to determine a candidate set of claims providers that are both familiar to the user and allowed by the relying party. This candidate set of claims providers is then displayed to a user so that the user may select one or more of the claims providers to obtain claims to provide to the relying party.	10-17-2013
20130311513	SYSTEMS AND METHODS FOR INDIRECT ALGEBRAIC PARTITIONING - Systems and methods for storing and accessing data. Example embodiments may perform optimization based on patterns of requests received by the system and relations between data sets identified by the system. Example embodiments may identify restrictions on a data set based on a different data set. Conditions for automatically algebraically partitioning the data set based on a constituent of a different data set may be evaluated, including evaluation of the relationship between the data sets and identification of a pattern of statements restricting the data set using the same logical structure. If the conditions are met, component data sets and a partition data set may be algebraically defined based on ranges applied to constituent(s) of the other data set. The component data sets may also be realized in storage to physically partition the data set.	11-21-2013
20130325896	Suggesting Information to be Associated with Images Based on Metadata Embedded in Images - In one embodiment, receiving, from a user of a social network, an image with embedded metadata; and suggesting, to the user, information to be associated with the image based on the embedded metadata.	12-05-2013
20130325897	SYSTEM AND METHODS FOR PROVIDING CONTENT - Method, system, and programs for generating questions for a user. A request for content from a user is received via the communication platform. The content is retrieved from a content source. A question is generated for the user based on the content requested by the user and a history of previous information accessed or posted by the user. The question is sent to the user.	12-05-2013
20130325898	LARGE-SCALE EVENT EVALUATION USING REALTIME PROCESSORS - Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications to complex queries.	12-05-2013
20130346447	SYSTEMS AND METHODS FOR BEHAVIORAL PATTERN MINING - Methods and systems of performing data mining may include receiving a plurality of web log records and a plurality of call log records; associating one or more web log records with a call log record, wherein the associated user for each of the associated one or more web log records and the call log record are the same; identifying one or more patterns among the web log records for the plurality of call log records, wherein each pattern comprises one or more web accesses, a time stamp at which each of the one or more web accesses is performed and the call topic for the call log record; identifying one or more web log records associated with a new call, and predicting a call topic for the new call based on at least one pattern and the one or more web log records.	12-26-2013
20140006447	GENERATING EPIGENETIC COHORTS THROUGH CLUSTERING OF EPIGENETIC SURPRISAL DATA BASED ON PARAMETERS	01-02-2014
20140012878	DATA MINING SYSTEM FOR AGREEMENT COMPLIANCE CONTROLLED INFORMATION THROTTLE - Enables data mining of monitored information, activities, and agreements associated with a throttling system. An agreement includes one or more conditions to satisfy the agreement, such as one or more tasks or activities to be performed by an agreement performer or events that may be detected, and actions performed to enforce or assert the agreement such as controlling the electronic device and/or enabling or disabling or otherwise limiting, reducing or increasing the amount or type of information allowed with respect to any or all electronic devices associated with the agreement performer.	01-09-2014
20140046977	SYSTEM AND METHOD FOR MINING PATTERNS FROM RELATIONSHIP SEQUENCES EXTRACTED FROM BIG DATA - The various embodiments herein provide a system and method for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from big data. The system comprises a data repository for collecting and storing the big data, an Entity Store for collecting and storing a plurality of entities from the big data, an Entity Hierarchy for representing a hierarchical structure of entities, a Relationship Store for collecting and storing relationship instances between the plurality of entities, a Relationship Hierarchy for representing a hierarchical structure of relationships, a language/domain model for organizing entities and relationships in a hierarchical manner, a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns, and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns.	02-13-2014
20140052757	TECHNIQUES PROVIDING A SOFTWARE FITTING ASSESSMENT - Techniques are presented for providing a software fitting assessment. The techniques may be performed by methods, apparatus, and/or computer program products. The techniques include automatically matching on a computer system one or more specified requirements for a project with one or more software functions stored in a repository. The automatically matching includes mining the repository in order to match requirements. The repository includes software functions, requirements accumulated from previous projects, and results of stored matches between the software functions and the requirements accumulated from previous projects. The techniques include outputting by the computer system one or more results of the matching.	02-20-2014
20140052758	Techniques Providing A Software Fitting Assessment - Techniques are presented for providing a software fitting assessment. The techniques may be performed by methods, apparatus, and/or computer program products. The techniques include automatically matching on a computer system one or more specified requirements for a project with one or more software functions stored in a repository. The automatically matching includes mining the repository in order to match requirements. The repository includes software functions, requirements accumulated from previous projects, and results of stored matches between the software functions and the requirements accumulated from previous projects. The techniques include outputting by the computer system one or more results of the matching.	02-20-2014
20140074885	GENERATION OF SYNTHETIC CONTEXT OBJECTS - A processor-implemented method, system, and/or computer program product generates and utilizes synthetic context-based objects. A non-contextual data object is associated with a context object to define a synthetic context-based object, where the non-contextual data object ambiguously relates to multiple subject-matters, and where the context object provides a context that identifies a specific subject-matter, from the multiple subject-matters, of the non-contextual data object. The synthetic context-based object is then associated with at least one specific data store, which includes data that is associated with data contained in the non-contextual data object and the context object. A request for a data store that is associated with the synthetic context-based object results in the return of at least one data store that is associated with the synthetic context-based object.	03-13-2014
20140089345	Method and Apparatus for Enhancing Electronic Reading by Identifying Relationships between Sections of Electronic Text - An apparatus, method and article of manufacture of the present invention detects the presence of references to the same concept in separate sections of text, and, with no input required from the reader, presents the reader with information concerning the detected references to the concept. The information provided may comprise information related to the location of the reference to the concept in other sections of text, and the reader also is provided the ability to move from one reference to a concept directly to another reference to the same concept.	03-27-2014
20140095542	INTERACTIVE DATA MINING - A data mining system receives a data set that includes a plurality of columns of data. The system determines correlations between columns of data of the data set and displays an interactive listing of a plurality of pairs of columns based on the correlations. The listing includes preview information based on the correlations for each pair. The system receives a selection of a value from the interactive listing from a user and refines the data set in response to the selection.	04-03-2014
20140101201	DISTRIBUTED DATA WAREHOUSE - Methods and data structures are provided for allowing data mining with improved efficiency. During processing of a usage log (or multiple logs) for an activity, such as a usage logfile of network search activity, a common fact table is generated. The common fact table allows a plurality of auxiliary data structures to be formed from the common fact table. These auxiliary data structures are designed to allow users to submit queries against the contents of the data structure in order to investigate the data. The efficiency of access of the common fact table is improved by allowing users to access auxiliary data structures other than the auxiliary data structures that are associated with a user. Optionally, the common fact table and/or the auxiliary data structures can include dimension values that correspond to both pre-identified dimension values as well as dimension values that are identified during processing of the activity logfiles.	04-10-2014
20140108455	Capturing Intentions Within Online Text - A method of capturing intentions within online text comprises with a data mining device (	04-17-2014
20140115002	METHOD FOR MONITORING A NUMBER OF MACHINES AND MONITORING SYSTEM - The present disclosure is related to a method for monitoring at least one event data generating machine, including a data logging device for providing event data. The method comprises transferring logged event data from at least one of the event data generating machines to a central processor, mining a multi-dimensional sequential pattern within said transferred event data wherein at least one dimensional attribute holds information indicating said event data generating machine or the at least one event data generating machine property, and matching said mined multi-dimensional sequential pattern with patterns stored in a central pattern database.	04-24-2014
20140143276	Enterprise Data Mining in a Hosted Multi-Tenant Database - An enterprise software system connected to multi-tenant hosted software offered in a cloud computing environment having the capacity to serve a large number of users with a small number of servers, and means for collecting and reporting statistically relevant information based on an aggregation of the data within the multi-tenant database. The integrated software modules include modules for IT management, financial operations, portfolio management, project management, project budget management, resource management, and operations management. The system permits user-specific lexicography mapped to a Master terminology; ranking projects on financial and non-financial indicators; presentation of a dynamic dashboard of proposals and approved projects; provision of a service catalogue that incorporates budget and asset management processes; a multi-tenanted database that enables users to share data management resources while maintaining their data in confidence; and providing aggregate IT data for competitive intelligence purposes.	05-22-2014
20140143277	METHOD AND DEVICE FOR MATCHING FRIEND RELATIONSHIP CHAIN IN INSTANT MESSAGING TOOL - Disclosed is a method and device for matching a friend relationship chain in an instant messaging tool. The method includes: performing data analysis on data information of a user; performing data mining on data information of other mass users according to an analysis result; and performing data matching between a mining result and the analysis result of the user. The device includes: a data analyzing module, a data mining module, and a data matching module. According to the technical solutions provided in the present disclosure, other users desired by the user are automatically matched for the user. The whole matching process requires no manual operation, which lowers the usage threshold for a friend relationship chain matching system. In addition, because matching is based on the user information, the matched users have a strong correlation with the user, and the matching quality is high.	05-22-2014
20140149458	DATA MINING SHAPE BASED DATA - Embodiments of the disclosure include a method for data mining shape based data, the method includes receiving shape data for each of a plurality of data entries and creating a first abstract from the shape data for each of the plurality of data entries. The method also includes organizing the first abstracts into a plurality of groups based on a criterion and creating a second abstract for each data entry in the plurality of groups based on the criterion and information derived from the first abstract.	05-29-2014
20140164432	ONTOLOGY ENHANCEMENT METHOD AND SYSTEM - An exemplary embodiment of the present disclosure provides an ontology enhancement method. Firstly, at least an input information request is received. Then, based on an ontology, each input information request is expanded to produce at least an expanded information request of each corresponding input information request. Based on a searching model, according to each expanded information request, a file collection is searched to obtain searching results of each corresponding expanded information request. Then, according to each searching result, a plurality of candidate knowledge concepts of each corresponding searching result are extracted. Next, the candidate knowledge concepts of each searching result are selectively added into the ontology.	06-12-2014
20140164433	Internet Data Mining Method and System - A method for automatically acquiring a set of data opens a searchable Internet database; initiates an automated timed search of each one of a plurality of records, each record in the plurality of records includes common criteria with the other records; retrieves information associated with the searched record; and provides the retrieved information in a desired format.	06-12-2014
20140181145	Modular Software System for Use in an Integration Software Technology and Method of Use - A modular software system for use with an integration software technology that facilitates an integration project by providing valuable information resources that assist in the development, deployment, and management of an integrated system as well as the resolution of any integration related issues. The modular software system accomplishes this through the use of a plurality of modules and a method of use. The plurality of modules save time and effort involved in collecting a plurality of technical and non-technical information required to make key decisions during the integration project implementation. By providing information in a packaged and organized way, the plurality of modules saves valuable and skilled resource man hours and effort, reducing the cost spent on project implementation. The method of use allows project managers, technical teams and business users to spend less time on collecting and organizing information before and during the implementation of an integration project.	06-26-2014
20140195562	DETERMINING PRODUCT CATEGORIES BY MINING INTERACTION DATA IN CHAT TRANSCRIPTS - The propensity and intent of a user to make a purchase is predicted based on product search queries and chat streams. The contents of the data sources, including search queries and chat streams, are analyzed for product names and product attributes. The results of the analyses are used to predict user needs. Product names and attributes are extracted from the data sources. The extracted information is mapped onto abstract product categories. Based on the abstract product categories, offers for products and services are made to the user.	07-10-2014
20140207820	METHOD FOR PARALLEL MINING OF TEMPORAL RELATIONS IN LARGE EVENT FILE - Disclosed herein is a method for parallel mining of temporal relations in a large event file using a MapReduce model. In the method for parallel mining of temporal relations in a large event file according to the present invention, an event file is sorted based on customer identification (ID) and event time at which each event has occurred. A set of large event types satisfying a preset support or more is generated from the event file. The event file is converted into a large event sequence including the large event type set. The large event sequence is summarized and then a time interval data file is created. Candidate temporal relations are generated from the time interval data file, and frequent temporal relations satisfying a preset support or more are derived from the candidate temporal relations. A temporal relation rule is generated from the derived frequent temporal relations.	07-24-2014
20140236996	SEARCH DEVICE, SEARCH METHOD, RECORDING MEDIUM, AND PROGRAM - A search device (	08-21-2014
20140250149	IDENTIFYING ELEMENT RELATIONSHIPS IN A DOCUMENT - An information processing apparatus includes a text mining section configured to perform text mining on text data acquired from the outside and to output extracted information; an identification section configured to search a development database storing elements constituting a product and the relationship among the elements by using the information extracted by text mining to identify an element related to the information; and a notification section configured to notify the identified element to a user, a program for use in the information processing apparatus.	09-04-2014
20140250150	METHOD AND APPARATUS FOR SEARCHING PATTERN OF SEQUENCE DATA - A method of searching a pattern of sequence data, includes setting an interest pattern model comprising a length of an interest pattern, a value of an allowed mismatch, and a minimum support, obtaining supports of similar patterns of a child pattern, each of the similar patterns having a mismatch value with the child pattern that is greater than the value of the allowed mismatch, based on mismatch values of similar patterns of a parent pattern, and determining whether a support of the child pattern fulfills a condition of the minimum support based on the supports of the similar patterns of the child pattern, and a support of the parent pattern.	09-04-2014
20140250151	DYNAMIC PATTERN MATCHING OVER ORDERED AND DISORDERED DATA STREAMS - Architecture introduces a new pattern operator referred to as an augmented transition network (ATN), which is a streaming adaptation of non-reentrant, fixed-state ATNs for dynamic patterns. Additional user-defined information is associated with automaton states and is accessible to transitions during execution. ATNs are created that directly model complex pattern continuous queries with arbitrary cycles in a transition graph. The architecture can express the desire to ignore some events during pattern detection, and can also detect the absence of data as part of a pattern. The architecture facilitates efficient support for negation, ignorable events, and state cleanup based on predicate punctuations.	09-04-2014
20140258332	FAST DISTRIBUTED DATABASE FREQUENCY SUMMARIZATION - A mechanism is provided for computing the frequency of packets in network devices. Respective packets are associated with entities in a vector, where each of the entities is mapped to corresponding ones of the respective packets, and the entities correspond to computers. Upon a network device receiving the respective packets, a count is individually increased for the respective packets in the vector respectively mapped to the entities, and computing a matrix vector product of a matrix A and the vector. The matrix A is a product of at least a first matrix and a second matrix. The first matrix includes rows and columns where each of the rows has a single random location with a one value and remaining locations with zero values. The matrix vector product is transmitted to a centralized computer for aggregating with other matrix vector products.	09-11-2014
20140258333	FAST DISTRIBUTED DATABASE FREQUENCY SUMMARIZATION - A mechanism is provided for computing the frequency of packets in network devices. Respective packets are associated with entities in a vector, where each of the entities is mapped to corresponding ones of the respective packets, and the entities correspond to computers. Upon a network device receiving the respective packets, a count is individually increased for the respective packets in the vector respectively mapped to the entities, and computing a matrix vector product of a matrix A and the vector. The matrix A is a product of at least a first matrix and a second matrix. The first matrix includes rows and columns where each of the rows has a single random location with a one value and remaining locations with zero values. The matrix vector product is transmitted to a centralized computer for aggregating with other matrix vector products.	09-11-2014
20140280341	METHOD, APPARATUS, AND COMPUTER-READABLE MEDIUM FOR CONTEXTUAL DATA MINING - An apparatus, computer-readable medium, and computer-implemented method for contextual data mining using a relational data set includes monitoring one or more data sources for information relating to the relational data set, the relational data set comprising one or more data objects in one or more classes, detecting activity corresponding to a first data object in the one or more data objects based at least in part on information gathered from at least one data source, determining whether the activity exceeds a predefined threshold, identifying a second data object in the one or more data objects which is connected to the first data object based at least in part on an analysis of relationships between the one or more data objects, and transmitting information relating to the second data object based at least in part on a determination that the activity exceeds the predefined threshold.	09-18-2014
20140310313	GENERATION OF SYNTHETIC OBJECTS USING BOUNDED CONTEXT OBJECTS - A processor-implemented method, system, and/or computer program product generates and utilizes synthetic context-based objects. A non-contextual data object is associated with a context object, which comports with a predetermined set of constraints, to define a synthetic context-based object, where the non-contextual data object ambiguously relates to multiple subject-matters, and where the context object provides a context that identifies a specific subject-matter, from the multiple subject-matters, of the non-contextual data object. The synthetic context-based object is then associated with at least one specific data store, which includes data that is associated with data contained in the non-contextual data object and the context object. A request for a data store that is associated with the synthetic context-based object results in the return of at least one data store that is associated with the synthetic context-based object.	10-16-2014
20140317140	QUERY PREDICTION - Disclosed here are methods, systems, paradigms and structures for predicting queries, creating tables to store data for the predicted queries, and selecting a particular table to obtain the data from in response to a query. The methods include determining various combinations of a finite set of columns users may query on, based on (i) a list of columns users are interested in obtaining data for, and (ii) cardinality information of a column or combinations of columns in the list of columns. The methods further include creating various tables based on the determined combinations of the columns using a meta query language. A query is responded to by selecting a table that has the least number of rows, among the tables that satisfy query parameters. The methods include selecting a table that has a longest sequence of columns matching with a portion of the query parameters.	10-23-2014
20140324908	METHOD AND SYSTEM FOR INCREASING ACCURACY AND COMPLETENESS OF ACQUIRED DATA - The present disclosure relates to the use of both semantic analysis and statistical text mining to process data records, improving the completeness and accuracy of records so processed. By way of example, a data record may be iteratively processed by text mining using seeds derived from a semantic template and by validating the results based on semantic reasoning based on the semantic template.	10-30-2014
20140365524	INCREMENTAL AGGREGATION-BASED EVENT PATTERN MATCHING - Aspects of the present invention provide a solution for recognizing a pattern in a set of data, such as data streaming over a data communication system. In an embodiment, a set of data events is retrieved in the data stream. The retrieved objects each have a plurality of characteristics that can be matched to a predetermined desired characteristic, such as a key value. The retrieved data events can be evaluated with respect to a pattern, with a characteristic of data events being evaluated with respect to an aggregate value related to the pattern. This aggregate value can be updated incrementally based on the data in the characteristic. Based on the evaluation, a determination is made as to whether the set of data events received subsequent to the first object satisfies the pattern.	12-11-2014
20140372482	PERFORMING DATA MINING OPERATIONS WITHIN A COLUMNAR DATABASE MANAGEMENT SYSTEM - Data mining operations are performed within a columnar database management system. The columnar database management system stores input sets of data for a data mining operation. An input set of data is represented as a column of data in the columnar database management system. The columnar database management system stores instructions to perform one or more data mining operations for processing the input sets of data. The columnar database management system receives requests for performing data mining operations and performs the processing of the data mining operation within the columnar database management system. As a result, the processing of data mining operations is performed without requiring multiple data transfers between an application implementing the data mining operations and the columnar database management system.	12-18-2014
20140372483	SYSTEM AND METHOD FOR TEXT MINING - A multi-user system for text mining a large population of research documents in an efficient and cost-effective fashion includes a content repository, a text mining processor, and a derived data repository that are linked via a user-accessible, central project manager. The content repository includes a data storage device for storing the research documents and a content selection facility for receiving a user-defined query that is able to support cost-related search parameters. The query is utilized by the content selection facility to select an initial collection of documents from the data storage device. Content spread metrics are then displayed through user-intuitive reports to allow for subsequent modification of the search query to yield an optimized document collection. The optimized document collection is then parsed, tagged and clustered by the text mining processor to produce search results that are stored as a data set in the derived data repository.	12-18-2014
20150012563	DATA MINING USING ASSOCIATIVE MATRICES - A method of mining frequent items in data is described. Categorical associations between elements of data are the core of information contained in the data and are all that is needed to perform data mining. These associations are extracted from data and held in optimized associative matrices whose structure is independent of the nature and structure of the data. All data mining operations and discoveries can be performed using only these associative matrices which provides many advantages over present methods. It allows real-time interactive navigation through the information in the data, enables efficient automatic and user guided determination of the most highly correlated data components, and a winnowing navigation through a large number of automatically determined associations, as for example frequent item sets, amongst which the needle-in-the-haystack may be more easily found.	01-08-2015
20150019588	Identifying Implicit Relationships Between Social Media Users To Support Social Commerce - Assuming that an initial social network is unavailable because explicit connections between users are missing or incomplete, temporal analysis may be used to identify the implicit relationship between social media users. Temporal data may be used to extract implicit relationship regardless of their specific activities such as visiting the same web pages or commenting on the same web objects.	01-15-2015
20150081735	SYSTEM AND METHOD FOR FAST IDENTIFICATION OF VARIABLE ROLES DURING INITIAL DATA EXPLORATION - Systems and methods are provided for identifying data variable roles during initial data exploration. A variable type, unique data value count values, and an overflow count value are determined for a variable. The unique data value count values include a number of occurrences of each of a plurality of unique data values for the variable in a data set. The overflow count value is a number of occurrences of data values other than the plurality of unique data values for the variable in the data set. When a number of the plurality of unique data values is greater than a value for a high cardinality threshold, the variable is determined to be a high cardinality variable. When the variable is not determined to be the high cardinality variable, a class variable role is assigned to the variable. When the variable is determined to be the high cardinality variable, whether or not the variable is a numeric variable type is determined based on the determined variable type. When the variable is determined to not be the numeric variable type, the overflow count value is compared to the unique data value count values to determine whether or not rare visible values occurred for the variable. When the determination is that rare visible values occurred for the variable, a record identifier variable role is assigned to the variable.	03-19-2015
20150113018	INTERACTIVE VISUAL ANALYTICS FOR SITUATIONAL AWARENESS OF SOCIAL MEDIA - An adaptive system processes social media streams in real time. The adaptive system includes a data management engine that generates combined data sets by detecting and mining a plurality of text-based messages from a social networking service on the Internet. An analytics engine in communication with the data management engine monitors topics in the text-based messages and tracks topic evolution contained in the text-based messages. A visualization engine in communication with the analytics engine renders historical and current activity associated with the plurality of text-based messages.	04-23-2015
20150120777	System and Method for Mining Data Using Haptic Feedback - Data from at least one outside data source containing Big Data is translated into a virtual three-dimensional object that identifies data of interest. In an embodiment, the data is translated into a tactile three-dimensional object that can be felt, for example, with a haptic controller. Embodiments allow for navigation, mining, and structuring of the data, as well as facilitating real time analysis of the data.	04-30-2015
20150293933	METHOD AND APPARATUS FOR LARGE SCALE DATA STORAGE - A logical apparatus and associated methods provide highly scalable and flexible data storage in a network of computers. The apparatus provides flexible organizational and access control mechanisms and a practical and efficient way to work with smaller portions of a data storage system at a given time to enable sparse population, caching, paging and related functions. A data structure, called a virtual container, comprises references to objects stored in a data storage system such that the same object can be visible from different virtual containers, if such virtual containers hold references to said object. Access controls further enhance the effectiveness of the methods and structures to enable multiple simultaneous organizational schemes and selective sharing of objects. Participating nodes provide access to objects stored on said nodes and their participating peer nodes, employing the data storage apparatus, such that balance in the network is achieved by data placement decisions that may combine common constraints and a node's individual self interest.	10-15-2015
20150302084	DATA MINING APPARATUS AND METHOD - A data mining apparatus and method are provided. The method operates by receiving a keyword list, compiling the keyword list into a finite state machine (FSM), performing data mining on documents in a document repository using a scanner, wherein the scanner uses the FSM to produce a match list comprising information about locations of the keywords in the documents, and processing the match list to produce a grid document comprising information about co-occurrences of keywords from the list in the documents. The apparatus uses a compiler, a scanner, and a builder to implement the method.	10-22-2015
20150317303	TOPIC MINING USING NATURAL LANGUAGE PROCESSING TECHNIQUES - The disclosed embodiments provide a method, system and apparatus for processing data. During operation, the system obtains a set of content items containing unstructured data. Next, the system obtains a set of part-of-speech (POS) tags for lexical items in the set of content items. The system then uses a computer to match the POS tags to one or more POS tagging patterns to obtain a set of candidate topics for the set of content items and extract a set of topics for the set of content items from the set of candidate topics.	11-05-2015
20150339379	METHOD OF SEARCHING FOR RELEVANT NODE, AND COMPUTER THEREFOR AND COMPUTER PROGRAM - Embodiments of the present invention provide a technique of searching for relevant nodes. This technique may include: in response to selection of a first node, displaying, as first relevant nodes, nodes having a first relevance of at least a predetermined value among nodes connected from the first node by two hops; and, in response to selection of at least one of the first relevant nodes, displaying the selected first relevant node as a second node involving the first node. This technique may further include displaying, as second relevant nodes, nodes having a second relevance of at least a predetermined value among nodes connected from the second node by two hops.	11-26-2015
20150347570	CONSOLIDATING VOCABULARY FOR AUTOMATED TEXT PROCESSING - A method includes providing a corpus of text, and using suffix manipulation to obtain a stem for at least some tokens in the corpus. The method also includes using the respective stem for each token of the at least some tokens to form groups of the at least some tokens. In addition, the method includes using the groups of tokens to select lemmas for at least some of the tokens in the groups of tokens.	12-03-2015
20150363472	COMPUTER SYSTEM AND METHOD FOR ANALYZING DATA - A computer system for analyzing data has a local computer network for storing raw data, that includes a local data mining unit for generating local analysis data by a statistical analysis based on the raw data. The computer system also has a central computer network for receiving the local analysis data from the local computer network, that includes a central data mining unit for generating central analysis data by a statistical analysis based on the local analysis data.	12-17-2015
20150363487	EXTRACTING AND MINING OF QUOTE DATA ACROSS MULTIPLE LANGUAGES - Extracting and mining of quote data across multiple languages, including: retrieving, from a plurality of quote sources, a plurality of commentary summarizations, wherein each commentary summarization is embodied as a machine-readable data structure and wherein the plurality of commentary summarizations include information in at least two or more languages; for each commentary summarization: identifying, within the commentary summarization, quote data, wherein the quote data represents a quote from a commentator; creating a quote tuple for the quote data, the quote tuple including information associated with quantifiable aspects of the quote data; and storing, in a quote tuple repository, the quote tuple; mining, for quote analysis information, the quote tuple repository; and presenting, to a user, the quote analysis information.	12-17-2015
20160034541	Operation and Method for Prediction and Management of the Validity of Subject Reported Data - A system for developing and implementing empirically derived algorithms to generate decision rules to predict invalidity of subject reported data and fraud with research protocols in surveys allows for the identification of complex patterns of variables that detect or predict subject invalidity of subject reported data and fraud with the research protocol in the survey. The present invention may also be used to monitor invalidity of subject reported data within a research protocol to determine preferred actions to be performed. Optionally, the invention may provide a spectrum of invalidity, from minor invalidity needing only corrective feedback, to significant invalidity requiring subject removal from the survey. The algorithms and decision rules can also be domain-specific, such as detecting invalidity or fraud among subjects in a workplace satisfaction survey, or demographically specific, such as taking into account gender or age. The algorithms and decision rules may be optimized for the specific sample of subjects being studied.	02-04-2016
20160048565	SYSTEMS AND/OR METHODS FOR INVESTIGATING EVENT STREAMS IN COMPLEX EVENT PROCESSING (CEP) APPLICATIONS - Certain example embodiments relate to techniques for investigating event streams in complex event processing (CEP) environments. Input events from one or more input event streams and query registration-related events from a registration event stream are received. Query registration-related events are associated with actions taken with respect to queries performed on the input event stream(s). Event-based profiles are developed by subjecting the received input events to a profiling CEP engine. Event-based profiles include data mining related and/or statistical characteristics for each input event stream. Query-based profiles are developed by subjecting the received query registration-related events to the CEP engine. Query-based profiles include data indicative of how relevant the queries performed on the input event stream(s) are and/or how those queries are relevant to the input event stream(s) on which they are performed. Query registration-related events are generated when a query on the input event stream(s) is registered, deregistered, etc.	02-18-2016
20160070810	LINK DE-NOISING IN A NETWORK - A method includes obtaining a graph representative of a given network, sampling the graph a given number of times to estimate a level of noisiness for one or more edges in the graph, and annotating the one or more edges of the graph with the respective level of noisiness.	03-10-2016
20160085818	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085819	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085820	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085821	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085822	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085823	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085825	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160085827	REAL-TIME AND ADAPTIVE DATA MINING - A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.	03-24-2016
20160092515	MINING ASSOCIATION RULES IN THE MAP-REDUCE FRAMEWORK - In each iteration of the process of mining association rules from transaction data by a cluster of computing systems, each mapper node in the cluster receives a split of the transaction data. Each mapper node scans the split to count an absolute support value of each candidate itemset for current search level(s), and passes the candidate itemsets and their support values to reducer nodes in the cluster. The number of reducer nodes will be determined adaptively based on the number of the candidate itemsets and the number of maximum available resource nodes in the cluster. Each reducer node combines the absolute support value of each candidate itemset, and finds frequent itemsets among them using a minimum support threshold. For each frequent itemset it finds, the reducer node creates association rule(s) satisfying a minimum confidence threshold, and exports all discovered frequent itemsets and association rules to a file system for storage.	03-31-2016
20160092516	METRIC TIME SERIES CORRELATION BY OUTLIER REMOVAL BASED ON MAXIMUM CONCENTRATION INTERVAL - A correlation relationship between two metric time series is determined after removing the impact of outlying metric values (“outliers”) that are unimportant for analytical purposes. Each of the metric time series can represent values of different system metrics obtained by mining data gathered through the monitoring of cloud deployments. The outliers can be determined based on a maximum concentration interval of the data. Removing the impact of the outliers enhances the correlation of the metric time series and provides a better representation of the correlation relationship.	03-31-2016
20160110428	METHOD AND SYSTEM FOR FINDING LABELED INFORMATION AND CONNECTING CONCEPTS - It is possible to partially or fully automate analysis of synthetic data to find labeled information and authored connecting concepts. This can help individuals to find experts in relevant domains, to identify non-obvious solutions to their R&D problems, to serve as a catalyst (input) for innovation, or to categorize prior art relevant to a technological concept seeking venture capital funding, a scientific area for new product development, and/or a patent application in question.	04-21-2016
20160110429	Systems and Methods for Social Media Data Mining - Systems and methods are provided to collect, analyze and report social media aggregated from a plurality of social media websites. Social media is retrieved from social media websites, analyzed for sentiment, and categorized by topic and user demographics. The data is then archived in a data warehouse and various interfaces are provided to query and generate reports on the archived data. In some embodiments, the system further recognizes alert conditions and sends alerts to interested users. In some embodiments, the system further recognizes situations where users can be influenced to view a company or its products in a more favorable light, and automatically posts responsive social media to one or more social media websites.	04-21-2016
20160125040	DIGITAL CURRENCY MINING CIRCUITRY HAVING SHARED PROCESSING LOGIC - An integrated circuit may be provided with cryptocurrency mining capabilities. The integrated circuit may include control circuitry and a number of processing cores that complete a Secure Hash Algorithm 256 (SHA-256) function in parallel. Logic circuitry may be shared between multiple processing cores. Each processing core may perform sequential rounds of cryptographic hashing operations based on a hash input and message word inputs. The control circuitry may control the processing cores to complete the SHA-256 function over different search spaces. The shared logic circuitry may perform a subset of the sequential rounds for multiple processing cores. If desired, the shared logic circuitry may generate message word inputs for some of the sequential rounds across multiple processing cores. By sharing logic circuitry across cores, chip area consumption and power efficiency may be improved relative to scenarios where the cores are formed using only dedicated logic.	05-05-2016
20160162554	METHODS FOR APPLYING TEXT MINING TO IDENTIFY AND VISUALIZE INTERACTIONS WITH COMPLEX SYSTEMS - A method of detecting textual and behavioral commonalities in warranty reported data. Extracting, by a processor, records of verbatim data from a memory storage unit. A first set of basewords is identified for comparison with the extracted records. A binary flag is set in response to an occurrence of a respective baseword in a respective record. An occurrence matrix is generated that includes entries identifying a number of times basewords are identified in each record. The occurrence matrix is formatted to a format as identified by the user.	06-09-2016
20160162575	MINING MULTI-LINGUAL DATA - Technology is disclosed for mining training data to create machine translation engines. Training data can be mined as translation pairs from single content items that contain multiple languages; multiple content items in different languages that are related to the same or similar target; or multiple content items that are generated by the same author in different languages. Locating content items can include identifying potential sources of translation pairs that fall into these categories and applying filtering techniques to quickly gather those that are good candidates for being actual translation pairs. When actual translation pairs are located, they can be used to retrain a machine translation engine as in-domain for social media content items.	06-09-2016
20160188675	NETWORK FOR DIGITAL EMULATION AND REPOSITORY - A network includes a processor; a memory location; a database stored in the memory location; a fielded entity in communication with the memory location; and a virtual replica of the fielded entity. The database includes historical data associated with the fielded entity and the processor is configured to analyze the data.	06-30-2016
20160196299	Determining Answer Stability in a Question Answering System	07-07-2016
20160196312	Corpus Augmentation System	07-07-2016
20160196334	Corpus Augmentation System	07-07-2016
20190147096	SYSTEMS AND METHODS FOR INTERACTIVE ANALYSIS	05-16-2019
20220138242	CONTENT MANAGEMENT SYSTEMS PROVIDING AUTOMATED GENERATION OF CONTENT SUMMARIES - Systems for generating content summaries in a web content management service, wherein in one embodiment a digital page editor and a component browser are launched to enable selection of a first content item. A summary of the first content item is automatically generated according to parameters that may have default values or values set by a user. The parameters may specify a size for the summary as a percentage of the first content item's size, as a particular number of lines, characters or words, as a size for a particular type of device, etc. The automatically generated summary is provided to the digital page editor, which can edit it and add it to the digital page. The summary is stored in a content repository as an independent summary content item with its own metadata.	05-05-2022

