Entries |
Document | Title | Date |
20100042664 | MUSIC ARTIST RETRIEVAL SYSTEM AND METHOD OF RETRIEVING MUSIC ARTIST - The present invention provides a music artist retrieval system which makes it possible for users to automatically retrieve an unknown music artist similar to the user's favorite artist while actually reproducing and confirming a piece of music of the unknown artist. A music artist similarity map storing section ( | 02-18-2010 |
20100070512 | ORGANISING AND STORING DOCUMENTS - A data handling device has access to a store of existing metadata pertaining to existing documents having associated metadata terms. It selects metadata assigned to documents deemed to be of interest to a user and analyses the metadata to generate statistical data as to the co-occurrence of pairs of terms in the metadata of one and the same document. When a fresh document is received, it is analysed to assign to it a set of terms and determine for each a measure of their strength of association with the document. Then, a score is generated for the document, for each term of the set, the score being a monotonically increasing function of (a) the strength of association with the document and of (b) the relative frequency of co-occurrence of that term and another term that occurs in the set. The score represents the relevance of the document to the users and can be used (following comparison with a threshold, or with the scores of other such documents) to determine whether the document is to be reported to the user, and/or retrieved. | 03-18-2010 |
20100076986 | Recruitment Vendor Management System and Method - A computer system and method for identifying a matching resume for a job description. The system receives and stores the job description that includes job requirements, each including a required skill or experience-related phrase and a required term of experience. The system receives and stores resumes that include skill or experience-related phrases. When the skill or experience-related phrases include the required skill or experience-related phrase for a job requirement, the system computes a term of experience for the required skill or experience-related phrase. To compute the term of experience, the system associates a contextual use and an experience range with each skill or experience-related phrase. A resume is a match when it includes the required skill or experience-related phrase for each job requirement and the term of experience for the required skill or experience-related phrase in the resume is greater than or equal to the required term of experience. | 03-25-2010 |
20100082643 | Computer Implemented Method and Program for Fast Estimation of Matrix Characteristic Values - A term-by-document (or part-by-collection) matrix can be used to index documents (or collections) for information retrieval applications. Reducing the rank of the indexing matrix can further reduce the complexity of information retrieval. A method for index matrix rank reduction can involve computing a singular value decomposition and then retaining singular values based on the singular values corresponding to singular values of multiple topics. The expected singular values corresponding to a topic can be determined using the roots of a specially formed characteristic polynomial. The coefficients of the special characteristic polynomial can be based on computing the determinants of a Gram matrix of term (or part) probabilities, a method of recursion, or a method of recursion further weighted by the probability of document (or collection) lengths. | 04-01-2010 |
20100169327 | TRACKING SIGNIFICANT TOPICS OF DISCOURSE IN FORUMS - Users in public forums often mention certain topics in the course of their discussions. Members' comments in messages to other members are analyzed to obtain terms that co-occur with topics. Frequencies of co-occurrence of a term with topics are normalized based on the frequency of the term in a random sample of messages. The terms are ranked by their normalized frequency of co-occurrence with a topic in messages. The top terms are selected based on their rank. Analysis of demographic information associated with members that mentioned top terms associated with a topic is displayed in graphical format that highlights the relationship between the age, gender, and usage of the top terms over time. The demographic information presented includes average age of members that mentioned a top term or their gender information within a selected time interval. | 07-01-2010 |
20100174726 | Open Profile Content Identification - Open profile data in a user profile, e.g., free-form fields in a user profile, are processed to identify interests and preferences of the user. The interests and preferences are utilized to identify categories associated with the user profile, and content items, e.g., advertisements, can be identified based on the categories. | 07-08-2010 |
20100191745 | MECHANISMS FOR RANKING XML TAGS - XML Schema design often involves repeating work already done by others. The XML modeling of an object may already be represented by one or more XML tags in a collection of documents. Rather than re-inventing what has been done before, or in order to be consistent with what others have already done, schema designers benefit from a system that helps them choose from a set of candidate XML tags that are already in use by others. | 07-29-2010 |
20100191746 | Competitor Analysis to Facilitate Keyword Bidding - Disclosed herein are one or more embodiments that facilitate selection of keywords for bidding by an advertiser having a website. One or more of the disclosed embodiments may process a click-through log to determine measures of competitiveness for a plurality of websites extracted from the click-through log. Also, the one or more disclosed embodiments may, for one of the websites, determine a ranking of competing websites based at least in part on the measures of competitiveness. The ranking of competing websites may be used to facilitate selection of keywords for bidding. | 07-29-2010 |
20100191747 | METHOD AND APPARATUS FOR PROVIDING RELATED WORDS FOR QUERIES USING WORD CO-OCCURRENCE FREQUENCY - A system and a method are provided for extracting, classifying, and displaying related words for a query when a user of the internet, a local computer, or a mobile appliance inputs the query. In a system for providing related words composed of a client and a server connected to the client through a network, the method of providing related words for a query includes the steps of the server searching for the co-occurrence frequency by words from source data, the server storing the result of the search in a database, the server receiving an input of a query from the client, and the server extracting related words related to the input query from the database. | 07-29-2010 |
20100191748 | Method and System for Creating a Data Profile Engine, Tool Creation Engines and Product Interfaces for Identifying and Analyzing Files and Sections of Files - A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces. | 07-29-2010 |
20100198841 | SYSTEMS AND METHODS FOR AUTOMATICALLY IDENTIFYING AND LINKING NAMES IN DIGITAL RESOURCES - The present invention provides systems and methods for automatically identifying name-like-strings in digital resources, matching these name-like-strings against a set of names held in an expertly curated database, and for those name-like-strings found in said database, enhancing the content by associating additional matter with the name, wherein said matter includes information about the names that is held within said database and pointers to other digital resources which include the same name and its synonyms. | 08-05-2010 |
20100205184 | USING SPECIFICITY MEASURES TO RANK DOCUMENTS - A method of ranking documents by specificity values includes specifying a reference set of documents, each document including one or more terms, and specifying a first document that includes one or more terms that are included in the reference set of documents. The method includes determining, from the reference set of documents, one or more term-specificity values for the one or more terms of the first document by calculating frequencies of terms within the reference set of documents, wherein a larger term-specificity value corresponds to a lower likelihood relative to the reference set of documents, and determining a document-specificity value for the first document by combining the one or more term-specificity values for the first document, wherein larger term-specificity values correspond to a larger document-specificity value. | 08-12-2010 |
20100217768 | Query System for Biomedical Literature Using Keyword Weighted Queries - An information retrieval system for biomedical information uses a supervised machine learning system to identify keywords to improve search efficiency. The supervised machine learning system may be trained using a set of clinical questions whose keywords have been extracted, for example, by trained individuals. Weighting of search terms in the document query process is based at least in part on keywords identification. | 08-26-2010 |
20100228742 | Categorizing Queries and Expanding Keywords with a Coreference Graph - A method and apparatus is provided for determining related keywords to narrow a query, and/or for categorizing a query. A keyword graph connects keyword nodes to each other based on degrees of cross-reference indicating how frequently keywords associated with the nodes appear in searches. A domain node representing a category hooks to a category-matching node in the keyword graph. Based at least in part on a degree of cross-reference between another node and the category-matching node, the domain node hooks to the other node. Alternately, the domain node hooks to nodes that match user-identified keywords. At query time, the query is categorized by the domain node closest to a node matching the query. Keywords related to the category may be determined from the nodes that are hooked to the domain node. The related keywords can be used to narrow a search or expand the metadata of a document. | 09-09-2010 |
20100228743 | DOMAIN-BASED RANKING IN DOCUMENT SEARCH - In one example, documents that are examined by a search process may be scored in a manner that is specific to a domain. A domain may be a substantive area, such as medicine, sports, etc. Different scoring approaches that take aspects of the domain into account may be applied to the documents, thereby producing different scores than might have been produced by a simple comparison of the terms in the query with the terms in the documents. These domain-based approaches may take a query into account in scoring the documents, or may be query-independent. Each approach may be implemented by a scorer. The combined output of the scorers may be used to generate a score for each document. Documents then may be ranked based on the scores, and search results may be provided. | 09-09-2010 |
20100293170 | SOCIAL NETWORK MESSAGE CATEGORIZATION SYSTEMS AND METHODS - Systems and methods of identifying and categorizing social network messages that are relevant to selected categories and text terms are provided. The frequency of text terms appearing in social network messages is calculated for multiple categories. Based on the calculated text term frequency, social network messages that match a provided set of text terms can be identified and/or categorized. The selection and/or association of text terms and categories is determined by repeatedly analyzing social network messages. | 11-18-2010 |
20110035386 | System and Method to Manage and Utilize "Social Dynamic Rating" for Contacts Stored by Mobile Device Users - Various implementations of the disclosure relate to social dynamic ratings for contacts stored at a mobile device. For example, a mobile device may be configured to communicate to a server rating information for the contacts. The rating information may include a frequency of access of a contact by the user at the mobile device and/or explicit ratings provided by the user. The server may use the rating information to generate a social dynamic rating for an entity associated with the contact. The server may communicate the social dynamic rating to the mobile device. The mobile device may display the social dynamic rating whenever the contact is displayed at the mobile device or whenever the user accesses a user interface remote from the mobile device configured to display the contact and/or an entity associated with the contact. | 02-10-2011 |
20110040769 | Query-URL N-Gram Features in Web Ranking - In one embodiment, access one or more pairs of search query and clicked Uniform Resource Locator (URL). For each of the pairs of search query and clicked URL, segment the search query into one or more query segments and the clicked URL into one or more URL segments; construct one or more query-URL n-grams, each of which comprises a query part comprising at least one of the query segments and a URL part comprising at least one of the URL segments; and calculate one or more association scores, each of which is for one of the query-URL n-grams, represents a similarity between the query part and the URL part of the query-URL n-gram, and is based on a first frequency of the query part and the URL part, a second frequency of the query part, and a third frequency of the URL part. | 02-17-2011 |
20110055227 | CONFERENCE RELAY APPARATUS AND CONFERENCE SYSTEM - When having obtained sound signals transmitted/received among plural terminal apparatuses, a sound recognition unit of a conference server performs a sound recognition processing on the obtained sound signals and then generates text information. A language analyzing unit of the conference server performs a language analysis processing on the text information generated by the sound recognition unit and then disassembles the text information into words. A deciding unit of the conference server compares the disassembled words and a keyword database, and decides whether or not any keyword stored in the keyword database is contained in each sentence of speech contents of respective speakers. A totalizing unit calculates a cumulative score based on the decision result of the deciding unit which represents a state of a discussion, and then it is decided in accordance with the calculated cumulative score whether or not the discussion matches the subject. | 03-03-2011 |
20110055228 | COOCCURRENCE DICTIONARY CREATING SYSTEM, SCORING SYSTEM, COOCCURRENCE DICTIONARY CREATING METHOD, SCORING METHOD, AND PROGRAM THEREOF - A cooccurrence dictionary creating system includes: a language analyzing section which subjects a text to a morpheme analysis, a clause specification, and a modification relationship analysis between clauses, a cooccurrence relationship collecting section which collects cooccurrences of nouns in each clause of the text, modification relationships of nouns and declinable words, and modification relationships between declinable words as cooccurrence relationships, a cooccurrence score calculating section which calculates a cooccurrence score of the cooccurrence relationship based on a frequency of the collected cooccurrence relationship, and a cooccurrence dictionary storage section which stores a cooccurrence dictionary in which a correspondence between the calculated cooccurrence score and the cooccurrence relationship is described. | 03-03-2011 |
20110060746 | MATCHING REVIEWS TO OBJECTS USING A LANGUAGE MODEL - A method is provided to associate reviews that have unknown correspondences to tangible entities to structured objects that have known correspondences to tangible entities comprising: transforming a respective review and text from a respective structured object to a collection of words that intersect the respective review and text from the respective structured object; determining a measure of a likelihood of a match as a function of respective probabilities of occurrences of respective words of such intersecting collection within generic review text and respective probabilities of occurrences of respective words of such intersecting collection within structured object text. | 03-10-2011 |
20110060747 | Rapid Automatic Keyword Extraction for Information Retrieval and Analysis - Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores. | 03-10-2011 |
20110072025 | RANKING ENTITY RELATIONS USING EXTERNAL CORPUS - Exemplary methods and apparatuses are disclosed that may be used to provide or otherwise support ranking entity relations utilizing the vocabulary of at least one external corpus for use in search engine information management systems. | 03-24-2011 |
20110078160 | RECOMMENDING ONE OR MORE CONCEPTS RELATED TO A CURRENT ANALYTIC ACTIVITY OF A USER - Methods and apparatus are provided for recommending one or more concepts related to a current analytic activity of a user. One or more concepts related to a current analytic activity of a user are recommended by maintaining a logical record of analytic activity of the user by recording one or more visual analytic actions performed by a user; generating a context model for a plurality of the existing notes containing the concepts, wherein the context model for a given existing note represents information interests of the user; determining a weight for each of the plurality of concepts, wherein a given weight characterizes a relevance of a corresponding concept to the current analytic activity; and recommending one or more concepts based on the determined weight. The weight for a given concept is based on the context model for the given concept and a context model for the current analytic activity. The context model for the given concept represents the information interests of the user at a time surrounding the point when the user recorded the corresponding existing note. | 03-31-2011 |
20110113043 | Creation Of A Category Tree With Respect To The Contents Of A Data Stock - Methods for the automatic creation of a category tree with respect to the contents of a data stock, wherein a taxonomy of the data stock will be created on the base of co-occurrences. Another object of the present invention is furthermore a data processing system comprising data which represent information in at least one data stock which is accessible via at least one data source, which is designed and/or adapted to at least partially carry out a method according to the invention. Another object of the present invention is furthermore a data processing device for the electronic processing of data, comprising a control and/or computer unit, an input unit and an output unit, which is designed and/or adapted to at least partially carry out a method according to the invention, preferably using at least a part of a data processing system according to the invention. | 05-12-2011 |
20110191355 | METHOD FOR MONITORING ABNORMAL STATE OF INTERNET INFORMATION - A method for monitoring abnormal state of Internet information by monitoring the change of hot words frequency in the Internet information. The method includes the following steps: 1) obtaining the current date word frequency data of common words appearing in the current date web pages; 2) combining with the hot words dictionary that the user focuses on to determine the current date keywords set of the Internet information; 3) determining the weight of each current date keyword; 4) determining the abnormal threshold of the current date keywords; 5) detecting the abnormal level of the current date keywords to determine the current date hot Internet information. The present invention calculates the abnormal level of keywords by monitoring the change of hot words frequency in the Internet information, predicts and gives alarm for the abnormal level of hot words frequency change, which makes the Internet information user react at the first moment. | 08-04-2011 |
20110208754 | Organization of Data Within a Database - A computer implemented method is provided for processing data representing a data entity having sub entities. The method includes analyzing queries to the data entity for deriving information about sets of the sub entities frequently queried together, and grouping the sub entities to a number of banks, each bank having a maximum width, based on the information about sets of sub entities frequently queried together, in order to reduce an average number of banks to be accessed for data retrieval. | 08-25-2011 |
20110219013 | DETECTING DUPLICATES IN A SHARED KNOWLEDGE BASE - Methods and systems supporting curation of items in a searchable knowledge base are provided. The methods and systems include mining one or more search queries of the searchable knowledge base, where each of the search queries includes a plurality of the items. The method further includes determining one or more pairs of items using a processor, where each of the pairs of items includes a correlation value exceeding a threshold. The correlation values for the pairs of items are based upon the frequency the items of the pairs of items co-occur within the search queries. The method further includes providing the pairs of items to a curator, where the curator reviews the pairs of items. | 09-08-2011 |
20110225174 | MEDIA VALUE ENGINE - Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment. | 09-15-2011 |
20110246486 | Methods and Systems for Extracting Domain Phrases - Methods and systems for extracting domain phrases are provided. First, a domain phrase database including a plurality of domain phrases is provided. For a candidate phrase, it is determined whether the candidate phrase is a domain phrase according to an occurrence condition of at least one part of the candidate phrase in the domain phrases of the domain phrase database and the occurrence condition of the at least one part of the candidate phrase at different relative positions in respective domain phrases. | 10-06-2011 |
20110252045 | LARGE SCALE CONCEPT DISCOVERY FOR WEBPAGE AUGMENTATION USING SEARCH ENGINE INDEXERS - Disclosed is a method and system for retrieving data; extracting information from the data; learning to disambiguate the extracted information such that a particular sense of each phrase within the extracted information is determined; generating a disambiguation classifier from the learning to disambiguate step, the disambiguation classifier configured to determine a sense of a phrase within a document; learning to select a portion of the information as being relevant to a theme of the data; generating a selection classifier from the learning to select step, the selection classifier configured to select a topic in a document that is relevant to a theme of the document; and using the disambiguation classifier and the selection classifier by an indexing computer to determine a set of topics from a web document retrieved by the indexing computer. | 10-13-2011 |
20110264673 | ESTABLISHING SEARCH RESULTS AND DEEPLINKS USING TRAILS - Search and browse trails are temporally-ordered sequences of web pages visited by a user during post-search query navigation beginning with a page associated with one of the search results. The trails can provide useful information for a number of search-related purposes. For example, these trails can be used to leverage the post-query behavior of other users to help the current user search more effectively and allow them to make more informed search interaction decisions. The trails can also be used to establish search results and refine search result rankings, select and evaluate deeplinks, and recommend multi-step trails as an alternative to or enhancement for existing search result presentation techniques. | 10-27-2011 |
20110302176 | DOCUMENT RANKING SYSTEM AND METHOD BASED ON CONTRIBUTION SCORING - Disclosed are a document ranking system and method based on contribution scoring. The document ranking system includes: a content score calculating unit for calculating content scores for documents with respect to at least one word contained in the documents, with regard to each such word; a contribution score calculating unit for calculating contribution scores for the documents with respect to jointly occurring words; and a ranking unit for ranking the documents with respect to the at least one word, with regard to each such word, by using the content scores and the contribution scores. | 12-08-2011 |
20110307499 | SYSTEMS AND METHODS FOR ANALYZING PATENT RELATED DOCUMENTS - Methods and systems are disclosed that analyze patent-related documents having at least one property type. In one implementation, a method involves displaying, in a first graphical element, identifiers of the patent-related documents. The method also involves analyzing the patent-related documents to determine at least one property value for the property type. The property value includes a string of one or more words describing subject matter associated with the patent-related documents and occurring in a subset of the patent-related documents. The method also displays a second graphical element associated with the property type. The second graphical element includes the property value. The method receives, at the second graphical element, a user selection of the property value. The method displays, in the first graphical element, identifiers of the subset of the patent-related documents in which the property value occurs. | 12-15-2011 |
20120011132 | SYSTEM, METHOD AND COMPUTER PROGRAM FOR PREPARING DATA FOR ANALYSIS - A method of preparing data for analysis, comprising the steps of receiving an initial data set including a plurality of records, each of the plurality of records including an identifier attribute and an associative attribute that identifies a further one or more records; | 01-12-2012 |
20120072434 | INFORMATION RETRIEVAL METHOD, INFORMATION RETRIEVAL APPARATUS, AND COMPUTER PRODUCT - An information retrieval apparatus includes an acquiring unit that acquires a numerical value defining a boundary of a numerical range; a detecting unit that detects a number of places in and a head numeral of the numerical value; an extracting unit that extracts from a bit string group, a bit string indicating whether a numerical value in a numerical value group having the number of places and the head numeral is present in files subject to retrieval; a specifying unit that specifies a file corresponding to a bit in the extracted bit string, the bit indicating the presence of a numerical value of the numerical value group; a determining unit that determines whether a numerical value in the specified file meets the boundary condition; and a designating unit that, based on a determination by the determining unit designates the specified file to have a numerical value within the numerical range. | 03-22-2012 |
20120109976 | METHOD FOR ASSISTING IN MAKING A DECISION ON BIOMETRIC DATA - The present invention relates to a method for assisting a user in making a decision when comparing biometric data of an individual with data from a database relating to a large number of individuals, in which biometric data is acquired for the individual concerned, this data is encoded, the data items are compared in pairs with corresponding data from the database, the ratio of duplicate occurrence frequency to non-duplicate occurrence frequency is established for each comparison score, the product of all the available ratios is calculated, this product is standardized, the standardized ratio is compared to a pre-set threshold, the values greater than the pre-set threshold are kept, and this result is submitted to the user for validation as appropriate. | 05-03-2012 |
20120109977 | KEYWORD DETERMINATION BASED ON A WEIGHT OF MEANINGFULNESS - Example embodiments relate to keyword determination based on a weight of meaningfulness. In example embodiments, a computing device may determine a number of occurrences of a word in a particular document and may then determine a weight of meaningfulness for the word based on the number of occurrences. The computing device may then add the word to a set of keywords for the document based on the weight of meaningfulness. | 05-03-2012 |
20120109978 | AUGMENTING QUERIES WITH SYNONYMS FROM SYNONYMS MAP - Methods, systems, and apparatus, including computer program products, operable to perform operations including receiving through a user interface with an interface language a search query having query terms; using the interface language to select one or more mappings and using the selected mappings to simplify each query term; and applying each simplified query term to a synonyms map to identify possible synonyms with which to augment the search query. In alternative embodiments, the operations include generating a synonyms map from a corpus of documents; where the synonyms map maps each of multiple keys to one or more corresponding variants, where each variant is associated with one or more of document languages. In alternative embodiments, the operations include generating a synonyms map from documents by applying document language-dependent mappings to words in the documents to generate keys for the map. | 05-03-2012 |
20120131021 | Phrase Based Snippet Generation - Disclosed herein is a method, a system and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity is selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet. | 05-24-2012 |
20120143881 | METHOD FOR POPULATION OF OBJECT PROPERTY ASSERTIONS - Relay of information from technical documentation by contact center workers to assist clients is limited by industry standard storage formats and query mechanisms. A method is disclosed for processing technical documents and tagging them against a Telecom Hardware domain ontology. The method comprises classical ontological Natural Language Processing (NLP) approaches to extract information from both text segments and tables, identifying text segments, named entities and relations between named entities described by an existing T-Box. A method for scoring candidate object property assertions derived from text before populating the Telecom Hardware ontology is also disclosed. | 06-07-2012 |
20120173550 | SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR IMPROVING MESSAGES CONTENT USING USER'S TAGGING FEEDBACK - The present invention is a system and method to improve the impact of marketing messages broadcasted to various web communities. Marketing communication keywords that are predefined are matched against tags set by private and public user's tagging communities. Semantic analysis is applied on the keywords and the tags and resulting associations allow determining relevance of marketing keywords. Matches indicate where marketing people have met their goals while matching gaps indicate marketing messages have not been perceived by the companies or the market. Valuable feedback is thus obtained to help re-enforce the initial messages that were not received or to replace the message wording by the one perceived from the identified market tags. | 07-05-2012 |
20120179696 | System and Process for Concept Tagging and Content Retrieval - A system and process for tagging electronic documents or other electronic content with concepts mentioned, contained, or otherwise described in that content. Once tagged, the content may be searchable, indexable, and retrievable in order to provide that content to an end user or another recipient. The system may be configured to handle a considerable number of asset files and a large number of users, workflows, and access applications simultaneously. The system may auto-tag the content and also may include a user interface for confirming and updating those tags and for manually creating new or additional tags. Content may include documents such as medical documents relating to procedures, diagnoses, medications or other domains. Alternatively, the content may include information about various care providers, in order to allow a user to locate a physician meeting one or more desired criteria. | 07-12-2012 |
20120197910 | METHOD AND SYSTEM FOR PERFORMING CLASSIFIED DOCUMENT RESEARCH - A system and method for efficiently and accurately identifying relevant document classifications is contemplated. The document analysis system receives classified reference documents along with a relevancy indicator for each document and generates sensory indicators that assist a researcher in identifying relevant classifications that have not been previously researched. In one aspect, the document analysis system generates a table of classifications, the classifications being determined by scoring of each classification cited within each relevant document. The system then determines a sensory indicator (e.g. a color) for each classification that indicates the extent to which the classification has been previously searched. The classification analysis window thus allows the researcher to quickly determine (e.g. by visual inspection) which classification codes have been cited most frequently as well as which classification codes require further search. In this manner the researcher may quickly determine where to direct a next iteration of a search. | 08-02-2012 |
20120209861 | NAVIGATION SYSTEM WITH RULE BASED POINT OF INTEREST CLASSIFICATION MECHANISM AND METHOD OF OPERATION THEREOF - A method of operation of a navigation system includes: generating a point of interest term from an uncategorized point of interest; applying a statistical rule to the point of interest term to generate a category score for the point of interest term; determining a normalized category score based on the category score and on matching the point of interest term and the uncategorized point of interest; and generating a category identifier for the uncategorized point of interest based on the normalized category score being highly ranked for displaying on a device. | 08-16-2012 |
20120215796 | SYSTEM AND METHOD FOR RANKING SEARCH RESULTS WITHIN CITATION INTENSIVE DOCUMENT COLLECTIONS - Systems and methods facilitate a search and identify documents and associated metadata reflecting content of the documents. In one implementation, a method receives a query comprising a set of search terms, identifies a stored document in response to the query, and determines a score value for the retrieved document based on a similarity between one or more of the query search terms and metadata associated with the identified document. The method locates the identified document in a citation network of baseline query results, the citation network comprising a first set of documents that cite to the identified document and a second set of documents cited to by the identified document. The method further determines a new score value of the identified document as a function of the score value and a quantity and a quality of documents within the first and second set of documents. | 08-23-2012 |
20120246177 | CONTENT ITEM SELECTION - A content item, e.g., an icon or advertisement content, is selected for placement in a display environment (e.g., on a map or adjacent to a map) in response to a request for the display environment based on a probability that the content item is relevant to a user that is requesting the display environment. The selection is facilitated by content targeting data (e.g., feature selection and query submission) that can be received from user devices while the map space is presented. | 09-27-2012 |
20120317126 | SYSTEM AND METHOD FOR NEAR AND EXACT DE-DUPLICATION OF DOCUMENTS - A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M near and exact-duplicate documents are identified in the document collection. | 12-13-2012 |
20130007020 | METHOD AND SYSTEM OF EXTRACTING CONCEPTS AND RELATIONSHIPS FROM TEXTS - An exemplary embodiment of the present techniques extracts concepts and relationships from a text. Concepts may be generated from the text using singular value decomposition, and ranked based on a term weight and a distance metric. The concepts that are ranked above a particular threshold may be iteratively extracted, and the concepts may be merged to form larger concepts until the generation of concepts has stabilized. Relationships may be generated based on the concepts using singular value decomposition, then ranked based on various metrics. The relationships that are ranked above a particular threshold may be extracted. | 01-03-2013 |
20130007021 | LINKAGE INFORMATION OUTPUT APPARATUS, LINKAGE INFORMATION OUTPUT METHOD AND COMPUTER-READABLE RECORDING MEDIUM - A linkage information output apparatus includes: a linkage information retrieval unit for acquiring, upon receiving source information, destination information linked with the source information, a frequency of occurrence of the source information, a frequency of occurrence of each of the linked destination information, and a frequency of occurrence of a link of the source information and each of the destination information from a linkage information accumulation unit; a recognition degree calculation unit calculating, based on each acquired frequency of occurrence, a recognition degree of the source information, a recognition degree of each acquired destination information, and a recognition degree of each link; and a high interest information narrowing unit selecting destination information to output from among each destination information based on a combination of two or more among a recognition degree of the source information, a recognition degree of the destination information, and a recognition degree of the link. | 01-03-2013 |
20130041906 | SYSTEM AND METHOD FOR PROFILING CLIENTS WITHIN A SYSTEM FOR HARVESTING COMMUNITY KNOWLEDGE - A privacy-preserving system and method is disclosed for profiling clients within a system for knowledge management. The method of the present invention discloses steps for generating a client profile in support of receiving and processing messages using scoring techniques and/or filtering techniques. The method of the present invention further includes steps for generating a client profile in support of a method for generating and obtaining responses to messages using scoring techniques and/or filtering techniques. The system of the present invention includes all means for implementing the method. | 02-14-2013 |
20130086086 | INFORMATION GENERATING COMPUTER PRODUCT, APPARATUS, AND METHOD; AND INFORMATION SEARCH COMPUTER PRODUCT, APPARATUS, AND METHOD - A computer-readable recording medium stores a program causing a computer to execute an information generating process that includes tabulating an appearance frequency for each designated word in an object file group in which character strings are described; identifying for each designated word and based on the appearance frequency tabulated for the designated word, a rank in descending order up to a target appearance rate for the designated words; detecting in an object file selected from the object file group, specific designated words among the identified ranks; and generating for each of the detected specific designated words, index information that indicates the presence/absence of the specific designated word in each object file among the object file group. | 04-04-2013 |
20130091151 | METHODS AND SYSTEMS FOR PERFORMING TIME-PARTITIONED COLLABORATIVE FILTERING - In accordance with disclosed embodiments, there are provided methods, systems, and apparatuses for performing time-partitioned collaborative filtering in an on-demand service environment including, for example, receiving as input, a plurality of access requests for data stored within the host organization and a corresponding plurality of actions for the data to which access is requested; accessing an input table having a time field, action field, item field, and agent field therein; recording time data and agent data for each of the received plurality of access requests and the corresponding plurality of actions; recording an item within the item field and an action within the action field for each of the received plurality of access requests and the corresponding plurality of actions based on the action performed on an item of the data to which access is requested; and analyzing the input table to generate one or more pairs of first actions and items to second actions and items and a time based score for each of the one or more pairs, in which the time based score is dependent upon a time between the actions for each of the one or more pairs. Other related embodiments are disclosed. | 04-11-2013 |
20130124541 | COLLABORATIVE BOOKMARKING - A method and system for collaborating tags in a bookmarking system wherein the bookmarking system includes a plurality of tags applied to content items by a plurality of users, the method and system including, examining all the tags that are applied to all the content items, determining whether two tags have been assigned to the same content item, if two tags have been assigned to the same content item, computing the relative strength of each of the two tags with respect to each other. | 05-16-2013 |
20130132407 | Robust Fitting of Surfaces from Noisy Data - Various embodiments of methods and apparatus for fitting a surface to a data set are disclosed. A frequency distribution of an input data set is determined. Determining the frequency distribution includes assigning each data point of the input data set to a category representing a value of a variable for the respective data point. Responsive to identifying one or more discontinuities of the frequency distribution, a continuous section of the frequency distribution is identified as a first data set. A first equation is fit to the first data set. | 05-23-2013 |
20130151538 | ENTITY SUMMARIZATION AND COMPARISON - An entity summarization system is described herein that mines the Internet and other data sources to provide answers to questions such as the relative sentiment of users towards various brands. The system uses a controlled vocabulary list describing a specific aspect of entities of interest. Given an entity name, the system scans the whole content corpus to collect statistics on the words that occur most frequently in the context of the entity name, taking into account proximity information, to produce a weighted list of vocabulary terms describing the entity. Two entities can be compared by normalizing and comparing their weighted term lists. In some embodiments, the system performs these procedures efficiently by leveraging an N-gram web model. Thus, the system provides an automated way to compare two entities to derive information about how users feel about the entities at any given time. | 06-13-2013 |
20130232154 | SOCIAL NETWORK MESSAGE CATEGORIZATION SYSTEMS AND METHODS - Systems and methods of identifying and categorizing social network messages that are relevant to selected categories and text terms are provided. The frequency of text terms appearing in social network messages is calculated for multiple categories. Based on the calculated text term frequency, social network messages can be identified and/or categorized that match a provided set of text terms. The selection and/or association of text terms and categories is determined by repeatedly analyzing social network messages. | 09-05-2013 |
20130262481 | ASSISTED HYBRID MOBILE BROWSER - A system and a method are disclosed for identifying video files on a webpage and streaming video files to a client device. A server receives browsing data including uniform resource locator for a webpage and identifies missing videos on the webpage. The server identifies a source file for the missing videos including identifying a location for each missing video. The server retrieves a thumbnail for each missing video and provides it to a client device. Additionally, the server transcodes the video file responsive to a user input provided by a user. The transcoded video is streamed to the client device. | 10-03-2013 |
20130297622 | Measuring informative content of words in documents in a document collection relative to a probability function including a concavity control parameter - Processing methods and systems are provided for representing documents relative to importance of words in the document. A processor comprising a weighting model of word importance in a document in a collection relative to an importance of the word in other documents in the collection computes a deviation of distribution of the word from a probability distribution of the word in other documents in the collection, where the deviation distribution is weighted in accordance with a concavity control function. A concavity control parameter is adjustable relative to word frequency. | 11-07-2013 |
20130318104 | METHOD AND SYSTEM FOR ANALYZING DATA IN ARTIFACTS AND CREATING A MODIFIABLE DATA NETWORK - Computer-implemented systems, methods, and computer-readable media for analyzing data in one or more artifacts and creating a modifiable data network include: extracting the key elements from the one or more artifacts; identifying relationships among the key elements for each of the one or more artifacts; determining a first frequency of each of the key elements; determining a second frequency for each relationship among the key elements; creating a data network showing the key elements and the relationship among the key elements; and enabling a user to modify the data network based on one or more of: the key elements; the relationship among the key elements; the first frequency; and the second frequency. | 11-28-2013 |
20130346424 | COMPUTING TF-IDF VALUES FOR TERMS IN DOCUMENTS IN A LARGE DOCUMENT CORPUS - Technologies pertaining to computing a respective TF-IDF value for each term in each document of a relatively large document corpus are described herein. TF-IDF values are computed with respect to terms in documents of a large document corpus in a single pass over the document corpus. Secondary sorting functionality of a distributed computing framework is exploited to compute TF-IDF values efficiently. | 12-26-2013 |
20140012862 | INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, PROGRAM, AND INFORMATION PROCESSING SYSTEM - An information processing apparatus includes a calculation unit and a generation unit. The calculation unit is configured to calculate a frequency function which is a function relating to an appearance frequency of one or more attribute values of a database having a predetermined attribute and the one or more attribute values relating to the attribute. The generation unit is configured to generate sample data in accordance with the appearance frequency relating to the database on the basis of the frequency function calculated, the sample data including at least a part of the one or more attribute values as one or more sample attribute values. | 01-09-2014 |
20140012863 | SYSTEM AND METHOD FOR TOPIC EXTRACTION AND OPINION MINING - Techniques for topic extraction and opinion mining are described. For example, a document that is pertinent to a topic is selected based on searching, using a key phrase, a plurality of documents. A subtopic referenced in the document is identified. A feature of the subtopic is identified based on the document. A rating of the feature of the subtopic is determined based on the document. Using at least one processor, a sentiment of the document is determined based in part on the feature and the rating of the feature. | 01-09-2014 |
20140059058 | CANDIDATE GENERATION FOR PREDICTIVE INPUT USING INPUT HISTORY - A computing device maintains an input history in memory. This input history includes input strings that have been previously entered into the computing device. When the user begins entering characters of an input string, a predictive input engine is activated. The predictive input engine receives the input string and the input history to generate a candidate list of predictive inputs which are presented to the user. The user can select one of the inputs from the list, or otherwise continue entering characters. The computing device generates the candidate list by combining frequency and recency information of the matching strings from the input history. Additionally, the candidate list can be manipulated to present a variety of candidates. By using a combination of frequency, recency and variety, a favorable user experience is provided. | 02-27-2014 |
20140067832 | ESTABLISHING "IS A" RELATIONSHIPS FOR A TAXONOMY - Disclosed are methods for returning to a user an answer to the question “what is .” Concepts and classes to which the concepts belong are determined from a corpus, such as taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likeliness of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned. | 03-06-2014 |
20140081995 | Method and System for Creating a Data Profile Engine, Tool Creation Engines and Product Interfaces for Identifying and Analyzing File and Sections of Files - A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces. | 03-20-2014 |
20140101172 | Configurable Dynamic Matching System - A system is provided that dynamically matches data originating from one or more data sources. The system analyzes a matching configuration file, where the matching configuration file includes one or more matching configurations. The system modifies a probabilistic matching algorithm of a matching engine at runtime based on the one or more matching configurations and based on two or more data records of the plurality of data records that require matching. The system compares two data records of a plurality of data records using the modified probabilistic matching algorithm. The system generates a match score for the two data records based on the match weight for each data record field. | 04-10-2014 |
20140101173 | METHOD OF PROVIDING INFORMATION OF MAIN KNOWLEDGE STREAM AND APPARATUS FOR PROVIDING INFORMATION OF MAIN KNOWLEDGE STREAM - A method for providing information about a main knowledge stream is disclosed. According to an embodiment of the present invention, the method includes obtaining reference links representing reference relationships among reference documents in each of a plurality of documents stored in a database, determining one or more basic paths connecting the reference links, calculating probability values of the reference links by overlapping the determined basic paths, determining a first document among the documents and an input reference link associated with the first document, and performing a Markov chain model using a probability value of the input reference link, and calculating information about the main knowledge stream associated with the first document using the result obtained by performing the Markov chain model. | 04-10-2014 |
20140136551 | Method and Apparatus of Generating Update Parameters and Displaying Correlated Keywords - Provided is a method of generating updating parameters. The method obtains search keywords used by users within a predetermined time period; counts the search keywords to obtain primary keywords, related keywords, co-search frequencies of each primary keyword and the respective related keywords being searched together, and search frequencies of the primary keywords being searched alone; computes first feature values based on the search frequencies of the primary keywords being searched alone; and then computes second feature values based on the first feature values and the co-search frequencies of the primary keywords and the respective related keywords. The second feature values serve as updating parameters for determining displaying modes of the related keywords. An apparatus of generating updating parameters, and a method and an apparatus of displaying related keywords according to the updating parameters are also provided. The solution keeps abreast with the user trends to allow a better user experience and improve computing performance and efficiency. For a service provider, no special secret algorithm is needed, and the operation is easy with a low development cost. | 05-15-2014 |
20140149433 | Estimating Unique Entry Counts Using a Counting Bloom Filter - A method of estimating a number of unique entry counts of an attribute in a database comprises, with a processor: identifying a sample of entries from an attribute database, determining frequencies of a number of input observations of the sample of entries, determining a number of high frequency values of the sample of entries, and estimating a number of unique entry counts of an attribute within the attribute database using a counting Bloom filter and based on the frequencies of the input observations and the high frequency values. | 05-29-2014 |
20140280242 | Method and apparatus for acquiring hot topics - A method includes: a first word set is acquired from community data within a period; words are selected from the first word set according to a frequency that each word of the first word set appears in the community data during a first group of days, the selected words are determined as hot words and form a second word set, wherein the first group of days are a plurality of days backward from a designated day; and topics are selected from a community topic set according to the second word set, and are determined as hot topics. | 09-18-2014 |
20140297658 | User Profile Recommendations Based on Interest Correlation - A search technology generates recommendations with minimal user data and participation, and provides better interpretation of user data, such as popularity, thus obtaining breadth and quality in recommendations. It is sensitive to the semantic content of natural language terms taken from user profiles, which can include interests, eccentricities, age, gender, and location information associated with the user. The interest information can include music, movies, sports and personality traits. Based on the user's profile information, the system determines which ad from a stock of ads is best suited to a given profile and delivers that ad. The system can be used to match user profiles to provide mate-matching. | 10-02-2014 |
20140297659 | Unsupervised Detection and Categorization of Word Clusters in Text Data - Categorizing data sets obtained from a number of sources includes determining the frequency of appearance of symbols in a first collection of data sets and the frequency of appearance of symbols in a second collection of data sets, determining the most significant symbols for the second collection based on the frequency of appearance in the first collection and the frequency of appearance in the second collection, grouping the most significant symbols into groups according to their appearance in the same data set and ranking the data sets in relation to the symbol groups according to a ranking scheme. Related methods, devices, and/or computer program products are described. | 10-02-2014 |
20140304279 | SYSTEMS AND METHODS FOR OF IDENTIFYING ANOMALOUS DATA IN LARGE STRUCTURED DATA SETS AND QUERYING THE DATA SETS - The technology disclosed relates to automatic generation of tuples from a record set for outlier analysis. Applying this new technology, a user need not specify which 1-tuples to combine into n-tuples. The tuples are generated from structured records organized into features (that also could be fields, objects or attributes). Tuples are generated from combinations of feature values in the records. Thresholding is applied to manage the number of tuples generated. The technology disclosed further relates to indexing and searching high dimensional tuple spaces in a computer-implemented system. | 10-09-2014 |
20140310289 | DATA ANALYTICS WITH NAVIGATION (DAWN) USING ASSOCIATIONS BETWEEN SELECTORS (TERMS) AND DATA ITEMS - Systems and methods are described which use associations between field values, more generally terms, called selectors, and data items, or structures within data items. The associative information is derived from the content of data and can be stored in optimal data structures, generally descriptively named associative matrices, which may be used to perform searches and calculations of data analytics. In some embodiments, calculations use only selector values and their counts, called frequencies, of associated data items, and/or structures within those items. Special queries, executed on the associative information, determine the frequencies. Methods of data analysis use the results of these queries. Applications can display results dynamically as a user creates queries by choosing selectors, changing the queries, and creating new ones, completely intuitively, using point and click. By comparing the results of multiple queries, such an application enables users to dynamically and quantitatively explore associations between facet values. | 10-16-2014 |
20140365510 | DEVICE AND METHOD FOR DETERMINING INTEREST, AND COMPUTER-READABLE STORAGE MEDIUM FOR COMPUTER PROGRAM - A device for determining interest includes a storage portion configured to store, on a user-by-user basis, a co-occurrence frequency in correlation with a user, the co-occurrence frequency indicating how many times a pair of words is used in a same cluster of a first document, on a pair-by-pair basis, to which the user gained access previously; a designating portion configured to allow a person who is to conduct a search to designate a second document and any one of the users; and a determination portion configured to determine that, among the pairs, a pair which is used in a same cluster of the designated second document and which also satisfies a predetermined condition of the co-occurrence frequency corresponding to the designated user is a particular pair in the second document which is probably of high interest to the designated user. | 12-11-2014 |
20140372455 | SMART TAGS FOR CONTENT RETRIEVAL - An aspect provides a method, including: storing an object; obtaining data associated with the object; analyzing, using one or more processors, the data associated with the object to identify one or more key words in the data associated with the object to create one or more tags; and storing the one or more tags in a searchable format. Other aspects are described and claimed. | 12-18-2014 |
20150046472 | WEIGHT ADJUSTMENT IN A PROBABILISTIC MATCHING SYSTEM BASED ON EXTERNAL DEMOGRAPHIC DATA - A record is received including a token without a corresponding predetermined weight. Information pertaining to the received token is retrieved from at least one of external reference information and historic statistics. A token with a predetermined weight closest to the received token is determined based on the retrieved information. The predetermined weight of the closest token is assigned to the received token and data is matched based on the assigned weight of the received token. | 02-12-2015 |
20150058364 | SYSTEMS AND METHODS FOR MATCHING PEOPLE BASED ON PERCEIVED ACTIVITIES - Matching systems and methods for social networking systems can select matches for users based on observed activities. A matching system can include, for example, a preference unit, a monitoring unit, and a matching unit. Generally, the preference unit can receive and process matching preference information for a user; the monitoring unit can monitor the user's activities on or observable by the server; and the matching unit can select and recommend matches for the user based on the monitored activities. Thus, matches can be suggested to the user based on the user's observed activities, and not simply based on the user's potentially inaccurate self-description. | 02-26-2015 |
20150074124 | AUTOMATED DISCOVERY USING TEXTUAL ANALYSIS - An example method includes receiving text from a plurality of documents, segmenting received text of the plurality of documents, calculating a frequency statistic for each segment of each document, determining segments of potential interest of each document based on calculated frequency statistic, calculating distances between each document of the plurality of documents based on a text metric, and storing segments of potential interest of each document and the distances in a search database. The method may further include receiving a search query and performing a search of information contained in the search database, partitioning documents of search results using the distances, for each partition, determining labels of segments of potential interest for documents of that particular partition, the labels being determined based on a plurality of frequency statistics, and providing determined labels of segments of potential interest for documents of each partition. | 03-12-2015 |
20150106391 | METHOD AND APPARATUS OF GENERATING UPDATE PARAMETERS AND DISPLAYING CORRELATED KEYWORDS - Provided is a method of generating updating parameters. The method obtains search keywords used by users within a predetermined time period; counts the search keywords to obtain primary keywords, related keywords, co-search frequencies of each primary keyword and the respective related keywords being searched together, and search frequencies of the primary keywords being searched alone; computes first feature values based on the search frequencies of the primary keywords being searched alone; and then computes second feature values based on the first feature values and the co-search frequencies of the primary keywords and the respective related keywords. The second feature values serve as updating parameters for determining displaying modes of the related keywords. An apparatus of generating updating parameters, and a method and an apparatus of displaying related keywords according to the updating parameters are also provided. The solution keeps abreast with the user trends to allow a better user experience and improve computing performance and efficiency. For a service provider, no special secret algorithm is needed, and the operation is easy with a low development cost. | 04-16-2015 |
20150113006 | SYSTEMS, DEVICES AND METHODS FOR LIST DISPLAY AND MANAGEMENT - Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface. | 04-23-2015 |
20150302083 | A Combinatorial Summarizer - A combinatorial summarizer includes a plurality of summarization engines, a processor in selective communication with each summarization engine, and computer readable instructions executable by the processor and embodied on a tangible, non-transitory computer readable medium. Each summarization engine is to select a respective plurality of sentences, and generate a relative rank and an associated weight for each sentence of the respective plurality of sentences. The computer readable instructions include instructions to determine a combined weight for each sentence of each respective plurality of sentences. The combined weight is based upon the respective associated weight and a respective relative human rank for each sentence in a set of sentences, including all sentences of each respective plurality of sentences. The computer readable instructions further include instructions to determine a total weight for each summarization engine based, respectively, upon the combined weights for each sentence of each respective plurality of sentences. | 10-22-2015 |
20150310002 | Selective Display of Comprehension Guides - Techniques are provided for selectively and dynamically determining one or more words of an electronic book to present with comprehension guides. For instance, an electronic device rendering an electronic book may determine whether to display some, all, or no words of the book with comprehension guides for words within the electronic book based on word difficulty, contextual importance or aspects of the user. Techniques are also provided for determining the content of comprehension guides to be presented with the words. | 10-29-2015 |
20150317314 | CONTENT SEARCH VERTICAL - Disclosed in some examples are methods, systems, and machine readable mediums which find a special set of keywords which, when used to search a supplemental set of search verticals (e.g., the newly added search verticals), return high quality results. When a user enters a search containing one or more keywords from the special set of keywords, the system may search the standard set of search verticals (as normal), and the one or more keywords may also be used to search the supplemental set of search verticals. Results from both may then be presented to the user. | 11-05-2015 |
20150324868 | Query Categorizer - A system and method for receiving, by one or more processing devices, a search query containing one or more query terms from a remote computing device; determining, by the one or more processing devices, a query categorization of the search query based on one or more relevant query terms of the one or more query terms, the query categorization being indicative of one or more application categories to which the search query likely pertains; generating, by the one or more processing devices, an advertisement based on the query categorization; encoding, by the one or more processing devices, the advertisement in search results; and providing, by the one or more processing devices, the search results to the remote computing device. | 11-12-2015 |
20150331879 | SUGGESTED KEYWORDS - A method and system to suggest keywords to a social network member is described. A suggested keywords system, in one example embodiment, examines phrases that appear in profiles maintained by the on-line social networking system that are similar to the target profile and identifies those words and phrases that are most prominent in these profiles, utilizing a graph-based approach. These most prominent words and phrases may be presented to the target member as suggested keywords to be included in the member's professional summary. | 11-19-2015 |
20160004701 | Method for Representing Document as Matrix - A method for representing a document as a matrix in an electronic device comprising a processor and a memory storing instructions executed by the processor. The method includes creating a term vector comprising at least one term in the document, calculating a weight of each of the at least one term for each of at least one concept in the document, and representing the document as a matrix by mapping the at least one term included in the document onto one of the rows and columns of the matrix and mapping the at least one concept onto the other of the rows and columns of the matrix. The matrix comprises, as a component, a weight the at least one term has in the document. | 01-07-2016 |
20160012141 | SYSTEM AND METHOD FOR ANALYZING COMMUNICATIONS | 01-14-2016 |
20160070816 | Real Time Analysis of Big Data - This invention relates to a method for processing large scale unstructured data. The method includes receiving streamed input data from live data sources, deriving emergent patterns in data subsets, identifying a repeating pattern and corresponding data subset within the emergent patterns, reducing the identified data subset and identified pattern to a compressed signature, and storing the streamed input data with the compressed signature and without the identified data subset. The data subset can be rebuilt if necessary using the compressed signature. | 03-10-2016 |
20160078055 | DATA PROCESSING APPARATUS USING CALCULATION OF HISTOGRAM INTERSECTION - In a histogram intersection calculating apparatus, a histogram intersection calculating unit calculates a histogram intersection to compare histograms of query data and target data to obtain a score value of the histogram intersection. A calculation controlling unit makes the histogram intersection calculating unit calculate the histogram intersection in descending order of bin number of the histogram of the query data from the bin number having the maximum frequency value by using frequency values in the bin numbers of the query data and frequency values in the bin numbers of the target data to obtain a score value of the histogram intersection. | 03-17-2016 |
20160085871 | SEARCHING FOR INFORMATION BASED ON GENERIC ATTRIBUTES OF THE QUERY - Searching information includes: receiving current query data from a client; extracting generic attribute features of the current query data, wherein the generic attribute features are used for calculating a plurality of confidence degrees of the current query data that correspond to a plurality of categories, each of the confidence degrees indicating a degree of confidence that the current query data belongs to a respective one of the plurality of categories; determining the plurality of confidence degrees of the current query data based at least in part on the generic attribute features; selecting a category based at least in part on the plurality of confidence degrees, the selected category being one of the plurality of categories and having a confidence degree higher than a confidence degree of another category; searching in the selected category for a search result that corresponds to the current query data; and returning the search result. | 03-24-2016 |
20160098478 | DOCUMENT SORTING SYSTEM, DOCUMENT SORTING METHOD, AND DOCUMENT SORTING PROGRAM - It is possible to analyze digitized document information gathered to be provided as evidence in a legal action and to classify the document information to be easily accessible in the legal action. A document classification system includes a keyword database, a related term database, a first classification unit which extracts a document including a keyword recorded in the keyword database from document information and attaches a specific classification mark to the extracted document based on keyword-corresponding information, and a second classification unit which extracts a document including a related term recorded in the related term database from document information, to which the specific classification mark is not attached in the first classification unit, calculates a score based on an evaluated value of the related term included in the extracted document and the number of related terms, and attaches a predetermined classification mark to a document, for which the score exceeds a given value, among the documents including the related term based on the score and the related term-corresponding information. | 04-07-2016 |
20160117359 | IDENTIFYING ENTITIES IN EMAIL SIGNATURE BLOCKS - Identifying entities in email signature blocks is described. A system scores each token, in a sequence of tokens from an email signature block, based on entity types, wherein each token is a word, a punctuation symbol, or an end-of-line character. The system identifies each entity sequence which includes a number of entities that matches the number of tokens in the sequence of tokens. The system identifies an entity sequence with a highest score based on applying scores for each token in the sequence of tokens to each identified entity sequence. The system outputs the sequence of tokens as an identified set of entities based on the entity sequence with the highest score. | 04-28-2016 |
20160125071 | DYNAMIC LOADING OF CONTEXTUAL ONTOLOGIES FOR PREDICTIVE TOUCH SCREEN TYPING - A system comprising a computer-readable storage medium storing at least one program and a computer-implemented method for generating a navigable user interface for the dynamic loading of contextual ontologies for predictive typing. In some embodiments, the method may include receiving an input from a client device, gathering context data corresponding to the input, and providing a predictive typing entry based on the context data and the received input, in a navigable user interface. | 05-05-2016 |
20160125073 | VISUALIZING CONFLICTS IN ONLINE MESSAGES - Visualizing social media conflict is provided. Active users are selected from a set of human users, each active user having authored more than a threshold number of textual messages regarding a particular topic. Keywords are selected that occur more than a threshold number of times within the textual messages regarding the particular topic. A sentiment score is computed for each of the keywords occurring more than the threshold number of times within the textual messages using a keyword co-occurrence graph. A sentiment of each of the active users is determined based on the computed sentiment score of each of the selected keywords that are authored by a particular active user. Two distinct groups from the active users are selected based on at least one of a relationship between the two distinct groups and a determined degree of conflict between the two distinct groups with regard to the particular topic. | 05-05-2016 |
20160132506 | SYSTEMS AND METHODS FOR ELECTRONICALLY MINING GENOMIC DATA - A data analysis method and computer system electronically mines published articles from existing medical literature sources to discover associations that may exist between various diseases and various genes and/or gene mutations or other genetic changes. The method and system then organizes, categorizes and prioritizes the discovered associations in accordance with the strength of evidence supporting these associations. The resulting information can then be integrated into the processing of genome sequencing data to more quickly determine what genome sequencing data is of most relevance for clinical decision making. | 05-12-2016 |
20160170986 | METHOD AND APPARATUS FOR STAGED CONTENT ANALYSIS | 06-16-2016 |
20160171091 | APPLICATION QUERY CONVERSION | 06-16-2016 |
20160188703 | CONTRASTIVE MULTILINGUAL BUSINESS INTELLIGENCE - Technology is discussed herein for identifying comparatively trending topics between groups of posts. Groups of posts can be selected based on parameters such as author age, location, gender, etc., or based on information about content items such as when they were posted or what keywords they contain. Topics, as one or more groups of words, can each be given a rank score for each group based on the topic's frequency within each group. A difference score for selected topics can be computed based on a difference between the rank score for the selected topic in each of the groups. When the difference score for a selected topic is above a specified threshold, that selected topic can be identified as a comparatively trending topic. | 06-30-2016 |
20160188704 | SYSTEMS AND METHODS TO DETERMINE TRENDING TOPICS FOR A USER BASED ON SOCIAL GRAPH DATA - Systems, methods, and non-transitory computer readable media configured to determine a degree of separation between a user and a connection within a social network of the user, the connection associated with an interaction from which at least one topic is determined. A value of affinity between the user and the connection is determined. A weight reflecting a value of interest similarity between the user and the connection is determined. A term based on the degree of separation, the value of affinity, and the weight reflecting a value of interest similarity is calculated. Terms associated with the at least one topic are combined to generate a composite score associated with the at least one topic to determine whether to present the at least one topic to the user. | 06-30-2016 |
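
As an illustration of the comparison described in entry 20160188703 (CONTRASTIVE MULTILINGUAL BUSINESS INTELLIGENCE), the following is a minimal Python sketch, not the method claimed in that application. It assumes that a topic's rank score in a group is simply the fraction of posts in that group mentioning the topic, and that the difference score is the absolute difference between the two groups' rank scores. The function names, the substring matching, and the sample posts are all hypothetical.

```python
def rank_scores(posts, topics):
    """Score each topic by the fraction of posts in the group that mention it.

    Assumption: "rank score" is modelled as a relative frequency; the
    abstract does not specify the actual scoring function.
    """
    if not posts:
        return {topic: 0.0 for topic in topics}
    lowered = [post.lower() for post in posts]
    return {
        topic: sum(topic in text for text in lowered) / len(lowered)
        for topic in topics
    }


def comparatively_trending(group_a, group_b, topics, threshold):
    """Return topics whose rank scores differ between groups by more than threshold."""
    scores_a = rank_scores(group_a, topics)
    scores_b = rank_scores(group_b, topics)
    # Difference score: absolute gap between the two groups' rank scores.
    differences = {t: abs(scores_a[t] - scores_b[t]) for t in topics}
    return sorted(t for t, d in differences.items() if d > threshold)


# Hypothetical sample data for two groups of posts.
group_a = ["I love football", "football tonight", "new phone released"]
group_b = ["new phone is great", "phone camera review", "phone battery"]
topics = ["football", "phone", "new"]

trending = comparatively_trending(group_a, group_b, topics, threshold=0.5)
```

Here "football" and "phone" differ sharply between the two groups, while "new" appears equally often in both and so falls below the threshold.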