Patent application number | Description | Published |
20140046654 | TEXT PROCESSING METHOD, SYSTEM AND COMPUTER PROGRAM - A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number. | 02-13-2014 |
20140046953 | TEXT PROCESSING METHOD, SYSTEM AND COMPUTER PROGRAM - A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number. | 02-13-2014 |
20150026553 | ANALYZING A DOCUMENT THAT INCLUDES A TEXT-BASED VISUAL REPRESENTATION - A hardware device analyzes a document that includes a text-based visual representation. A correspondence information hardware storage device holds known representations of graphical images as text-based visual representations. The graphical images depict portraits of physical objects. The text-based visual representations are each associated with information that describes one of the physical objects. An identification hardware device identifies a text-based visual representation within a document. The identification hardware device matches the text-based visual representation within the document to one or more of the text-based visual representations stored in the correspondence information hardware storage device. An editing hardware device retrieves information from the correspondence information hardware storage device that is identified, by the identification hardware device, as describing a text-based visual representation component within the document. The editing hardware device displays the text-based visual representation component within the document and information that describes the text-based visual representation on a display. | 01-22-2015 |
20150039535 | GRASPING A BIAS OF INFORMATION FROM AN INTERNET MEDIUM FOR SUPPORTING A SURVEY - In one embodiment of the present invention, an apparatus may be used for supporting a survey based on information in an Internet medium. The apparatus comprises: a first acquisition hardware unit, wherein the first acquisition hardware unit acquires first evaluation information representing a degree of evaluation acquired by a survey of a real society pertaining to a prescribed target; a second acquisition hardware unit, wherein the second acquisition hardware unit acquires second evaluation information representing a degree of evaluation in the Internet medium pertaining to the prescribed target; and an estimator hardware unit, wherein the estimator hardware unit estimates a bias in information in the Internet medium based on a deviation of the second evaluation information from the first evaluation information. | 02-05-2015 |
20150227620 | CATEGORIZING KEYWORDS - A keyword to be categorized is received. A category dictionary including categories having associated registered keywords, and a text corpus are received. Registered keywords are identified in the category dictionary having a degree of similarity to the keyword to be categorized that is equal to or greater than a predetermined value, and the categories associated with the identified registered keywords are extracted. Registered keywords are identified that are co-occurring in the text corpus with the keyword to be categorized, and the categories associated with the identified co-occurring registered keywords are extracted. A degree of importance is determined for each extracted category based on a function of the identified registered keywords in the category dictionary and/or a function of the identified co-occurring registered keywords. The extracted categories are outputted, with at least an indication of each category's relative importance, as category candidates for categorizing the keyword to be categorized. | 08-13-2015 |
20150242537 | PATTERN MATCHING BASED CHARACTER STRING RETRIEVAL - Embodiments relate to generating a retrieval condition for retrieving a target character string from texts by pattern matching. An aspect includes dividing a first text into words. Another aspect includes generating a converted character string by performing at least one of: appending at least one character at a position before or after the target character string, and replacing at least one character of the target character string. Another aspect includes generating the retrieval condition for retrieval candidates in the words of the first text, the retrieval condition comprising determining that a retrieval candidate matches the target character string and does not match the converted character string when a ratio of a part of the retrieval candidate which matches the converted character string and corresponds to the target character string is less than or equal to a reference frequency. | 08-27-2015 |
20150278312 | CALCULATING CORRELATIONS BETWEEN ANNOTATIONS - An apparatus for calculating a correlation between annotations includes a first obtaining unit configured to provide an annotator with a first data group capable of being evaluated to determine whether or not to attach annotations thereto, and to obtain a plurality of first confidence levels indicating certainty of the annotations in the first data group, the annotator outputting confidence levels indicating certainty of annotations to be attached to data when the data is given; a second obtaining unit configured to provide the annotator with a second data group used to calculate a correlation between the plurality of annotations, and thereby to obtain a plurality of second confidence levels indicating the certainty of the annotations in the second data group; and a computing unit configured to compute an estimated value of the correlation between the plurality of annotations based on the plurality of first and second confidence levels. | 10-01-2015 |
20150293907 | CALCULATING CORRELATIONS BETWEEN ANNOTATIONS - An apparatus for calculating a correlation between annotations includes a first obtaining unit configured to provide an annotator with a first data group capable of being evaluated to determine whether or not to attach annotations thereto, and to obtain a plurality of first confidence levels indicating certainty of the annotations in the first data group, the annotator outputting confidence levels indicating certainty of annotations to be attached to data when the data is given; a second obtaining unit configured to provide the annotator with a second data group used to calculate a correlation between the plurality of annotations, and thereby to obtain a plurality of second confidence levels indicating the certainty of the annotations in the second data group; and a computing unit configured to compute an estimated value of the correlation between the plurality of annotations based on the plurality of first and second confidence levels. | 10-15-2015 |
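
Illustrative sketch for the CATEGORIZING KEYWORDS entry (20150227620) listed above. The abstract describes two signals for proposing categories: string similarity to registered keywords, and co-occurrence with registered keywords in a text corpus. The Python below is only one possible reading of that description, not the method claimed in the application. The function name categorize, the 0.6 threshold, the use of difflib's SequenceMatcher as the similarity measure, sentence-level co-occurrence, and the additive importance score are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def categorize(keyword, category_dict, corpus, min_similarity=0.6):
    """Rank candidate categories for `keyword` (hypothetical helper).

    category_dict: {category name: [registered keywords]}
    corpus: list of sentences (plain strings)
    Returns a list of (category, importance) pairs, highest importance first.
    """
    kw = keyword.lower()
    # Assumption: "co-occurring" means "appears in the same sentence".
    sentences = [set(s.lower().split()) for s in corpus]
    with_keyword = [words for words in sentences if kw in words]

    importance = defaultdict(float)
    for category, registered in category_dict.items():
        for reg in registered:
            r = reg.lower()
            # Signal 1: registered keywords similar to the keyword.
            similarity = SequenceMatcher(None, kw, r).ratio()
            if similarity >= min_similarity:
                importance[category] += similarity
            # Signal 2: registered keywords co-occurring with the keyword.
            cooccurrences = sum(1 for words in with_keyword if r in words)
            if cooccurrences:
                importance[category] += cooccurrences

    # Output the extracted categories with their relative importance.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)
```

A call such as categorize("pineapple", {"fruit": ["apple", "banana"], "vehicle": ["car", "truck"]}, ["pineapple and banana smoothie", "the car is red"]) returns only the "fruit" category here, because "apple" passes the similarity threshold and "banana" shares a sentence with the keyword, while no "vehicle" keyword satisfies either condition.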