Class / Patent application number | Description | Number of patent applications / Date published |
382229000 | Context analysis or word recognition (e.g., character string) | 75 |
20080212882 | Pattern Encoded Dictionaries - The present invention is related to a method and system providing a pattern-classifier encoded dictionary for use in language processing systems implemented in computer systems. The pattern encoded dictionary according to the present invention may be utilized in Optical Character Recognition (OCR) systems or Automatic Speech Recognition (ASR) systems to retrieve reliably identified words used in an adaptive manner or as a tool to configure said OCR or ASR system. | 09-04-2008 |
20080240582 | METHOD AND APPARATUS FOR CHARACTER STRING RECOGNITION - A method and an apparatus for character string recognition may be provided that enables prevention of a decrease in recognition accuracy for a character string even when distortion of an image appears in a direction perpendicular to a medium transfer direction. | 10-02-2008 |
20080273802 | Program and apparatus for forms processing - A form processing program which is capable of automatically extracting keywords. When the image of a scanned form is entered, a layout recognizer extracts a readout region of the form image, a character recognizer recognizes characters within the readout region. A form logical definition database stores form logical definitions defining strings as keywords according to logical structures which are common to forms of same type. A possible string extractor extracts as possible strings combinations of recognized characters each of which satisfies defined relationships of a string. A linking unit links the possible strings according to positional relationships, and determines a combination of possible strings as keywords. | 11-06-2008 |
20080317359 | PRINTING APPARATUS AND PRINTING METHOD | 12-25-2008 |
20090016617 | Sender dependent messaging viewer - A mobile apparatus for receiving an electronic message that comprises a text message from a sender. The mobile device comprises a contact records repository that stores a number of digital images, which are associated with a respective number of user identifiers. The mobile device further comprises a text analysis module that identifies predefined expressions in the text message, an image-editing module that matches one of the user identifiers with the sender and edits the associated digital image according to the identified predefined expression, and an output module for outputting the edited digital image. | 01-15-2009 |
20090028446 | DOCUMENT IMAGE PROCESSING APPARATUS, DOCUMENT IMAGE PROCESSING METHOD, DOCUMENT IMAGE PROCESSING PROGRAM, AND RECORDING MEDIUM ON WHICH DOCUMENT IMAGE PROCESSING PROGRAM IS RECORDED - An image of a character string composed of M characters is clipped from a document image, and the image is divided into separate characters. Image features of each character image are extracted. Based on the image features, N (N>1, integer) character images in descending order of degree of similarity are selected as candidate characters from a character image feature dictionary which stores the image features of character images in units of character, and a first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting a first column of the first index matrix is subjected to a lexical analysis according to a language model, whereby a second index matrix having a character string which makes sense is prepared. In the language model, statistics are taken and then the lexical analysis is performed. | 01-29-2009 |
20090074306 | Estimating Word Correlations from Images - Word correlations are estimated using a content-based method, which uses visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words. One aspect of the techniques is based on calculating the visual distance or visual similarity between the sets of retrieved images corresponding to each query word. The other is based on calculating the visual consistency among the set of the retrieved images corresponding to a conjunctive query word. The combination of the content-based method and a text-based method may produce even better results. | 03-19-2009 |
20090092323 | SYSTEMS AND METHODS FOR CHARACTER CORRECTION IN COMMUNICATION DEVICES - A system and method for character error correction is provided, useful for a user of mobile appliances to produce written text with reduced errors. The system includes an interface, a word prediction engine, a statistical engine, an editing distance calculator, and a selector. A string of characters, known as the inputted word, may be entered into the mobile device via the interface. The word prediction engine may then generate word candidates similar to the inputted word using fuzzy logic and user preferences generated from past user behavior. The statistical engine may then generate variable error costs determined by the probability of erroneously inputting any given character. The editing distance calculator may then determine the editing distance between the inputted word and each of the word candidates by grid comparison using the variable error costs. The selector may choose one or more preferred candidates from the word candidates using the editing distances. | 04-09-2009 |
20090154815 | DOCUMENT INFORMATION PROCESSING APPARATUS AND DOCUMENT INFORMATION PROCESSING PROGRAM - A document information processing apparatus is obtained in which there is no need to provide the consistency of management between the instances of documents and their metadata, that is, there is no fear that inconsistency in management might be caused, thereby eliminating the possibility of loading the system, which would otherwise result from the provision of managerial consistency, as well as making it possible to improve their versatility. The apparatus includes a document input and output section that is able to at least input or output a document as an image data, an operation timing detection section that detects predetermined operation timing for the document, a metadata acquisition section that acquires metadata of the document based on the operation timing, and a metadata description section that describes the metadata in a predetermined format based on instance data of the document at predetermined timing with respect to the input or output of the document. | 06-18-2009 |
20090175545 | METHOD FOR COMPUTING SIMILARITY BETWEEN TEXT SPANS USING FACTORED WORD SEQUENCE KERNELS - A computer implemented method and an apparatus for comparing spans of text are disclosed. The method includes computing a similarity measure between a first sequence of symbols representing a first text span and a second sequence of symbols representing a second text span as a function of the occurrences of optionally noncontiguous subsequences of symbols shared by the two sequences of symbols. Each of the symbols comprises at least one consecutive word and is defined according to a set of linguistic factors. Pairs of symbols in the first and second sequences that form a shared subsequence of symbols are each matched according to at least one of the factors. | 07-09-2009 |
20090190841 | WORD RECOGNITION METHOD AND STORAGE MEDIUM THAT STORES WORD RECOGNITION PROGRAM - A recognition process is executed for each character of an input character string corresponding to a word to be recognized, and a probability is determined that the feature appears, which is obtained as a result of character recognition using, as a condition, each character of each word in a word dictionary having stored therein candidates of the word to be recognized, and this probability is divided by a probability that the feature obtained as a result of character recognition appears. Each division result obtained for each character of each word in the word dictionary is multiplied for all the characters, and all the multiplication results obtained for each word in the word dictionary are added. Then, the multiplication result obtained for each word in the word dictionary is divided by the addition result, and based on this result, the recognition result of the particular word is obtained. | 07-30-2009 |
20090196512 | Method And System For Removing Inserted Text From An Image - A method of removing inserted text from a digital image includes recognizing the inserted text in the digital image using optical character recognition; and replacing pixels of the digital image corresponding to the inserted text so as to remove the inserted text from the digital image. A computer program product for removing inserted text from a digital image includes an inserted text removal program stored on a computer-readable medium, the program including an optical character recognition module for recognizing inserted text in a digital image; and an extrapolation module for replacing pixels corresponding to the inserted text in the digital image with replacement image pixels so as to remove the inserted text from the digital image. A photo printing kiosk includes an interface for receiving a digital image; an optical character recognition module for recognizing inserted text in the digital image; and an extrapolation module for replacing pixels corresponding to the inserted text in the digital image with replacement image pixels so as to remove the inserted text from the digital image. | 08-06-2009 |
20090214125 | Image Processing Method and Image Processing Apparatus - An image processing method has the steps of: scanning respective pages in a document; generating respective pieces of image data corresponding to the pages; identifying respective orientations of isolated images contained by each of the pages according to a result of character recognition for the image data; determining whether or not the isolated images contained by the page have different orientations; assigning respective ones of the isolated images to new pages; and setting respective orientations of the isolated images in the new pages as respective upright orientations of the isolated images. | 08-27-2009 |
20090238474 | STRING SEARCHING FACILITY - In embodiments of the present invention improved capabilities are described for scanning a data set for the presence of a target string. The data set may be received at a computing facility and cause a scanning program to execute. A first character pair in the data set may be identified where each character making up the first character pair is identified in a vector map. It may then be confirmed that the first character pair matches a positive indicated bitmask in a bitmap matrix, and verified that the position of the first character pair matches a position of a matching character pair in the target string. An action may be caused to be taken as a result of the verification. | 09-24-2009 |
20090317003 | CORRECTING SEGMENTATION ERRORS IN OCR - A method for encoding characters includes identifying one or more sequences of the character codes that are likely to be generated due to a segmentation error in application of a pattern recognition process, and associating a respective extension character code with each of the sequences. The area of an image containing characters is divided into segments, such that each segment contains approximately one character. The pattern recognition process is applied to each of the segments in order to generate an input string of character codes. At least one of the identified sequences of the character codes in the input string is replaced with the respective extension character code so as to generate a modified string. The output string is determined by comparing the modified string to a directory of known strings. | 12-24-2009 |
20100027896 | AUTOMATED APPLICATION INTERACTION USING A VIRTUAL OPERATOR - A computer-implemented method for automating interaction with a computer system includes linking a control computer system to an input interface and to an output interface of a client computer system, which is operative for producing user interface images on a display device. The control computer system executes distinct software modules that include a virtual operator for simulating actions of a human operator. Execution of the software modules causes the control computer system to capture an image from the output interface, and to recognize information in the image. In response to the information, the virtual operator controls an input device to automatically execute predetermined operations on the client computer system via the input interface. | 02-04-2010 |
20100034471 | METHOD AND APPARATUS THAT ENABLES REMOTE OPERATION OF A PLEASURING DEVICE THROUGH A COMMUNICATIONS NETWORK - A method and apparatus that enables remote operation of a pleasuring device through a communications network is disclosed. The method may include receiving one or more text strings at a first communication device from a second communication device through the communications network, recognizing one or more words or phrases from the received one or more text strings, determining if the recognized one or more words or phrases are found in a lexicon, matching the recognized one or more words or phrases with its corresponding action stored in the lexicon, and signaling the pleasuring device to perform the corresponding action. | 02-11-2010 |
20100054612 | IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, COMPUTER-READABLE MEDIUM AND COMPUTER DATA SIGNAL - An image processing method includes receiving an image including a writing, detecting a position of the writing in the received image, detecting a position of a character image in the received image, performing character recognition on the detected character image, comparing the position of the detected writing with the position of the detected character image to associate the writing with a result of the character recognition, translating the result of the character recognition so as to be recognizable as a translation of the result of the character recognition associated with the writing, generating an image of the translation result associated with the writing, so as to be output in a format different from a format of an image of a translation result that is not associated with the writing, and outputting the image of the translation result associated with the writing. | 03-04-2010 |
20100092095 | DATA GATHERING IN DIGITAL AND RENDERED DOCUMENT ENVIRONMENTS - A system for determining viewership of articles or ads in printed publications and using the determined viewership to determine author compensation, ad rates, or other historical or demographic related information is described. The system receives text capture information of portions of a number of rendered documents that is generated by capture devices operated by a number of users. The system searches digital content using the received capture information and determines a readership value for components of the rendered document based upon the search. | 04-15-2010 |
20100128994 | PERSONAL DICTIONARY AND TRANSLATOR DEVICE - A handheld dictionary and/or translation device includes an optical scanning element activated by one of an optical switch activated by proximity to text to be scanned, a one-way mechanical switch within the scanning element activated by contact with a printed page, and a mechanical switch separate from the optical element. When activated, the device may be used to scan one or more printed word(s) by wanding the optical element over the word(s). In one aspect, the device includes a display that is activated simultaneously with the optical element. Once the unknown word is scanned, optical character recognition software converts the word to an electronic format, and a controller retrieves the word's definition (or translation) from a corpus within the device for output to the display and/or an audio speaker. The device may be incorporated within a cellular telephone, a personal digital assistant (PDA), a pager, or a handheld computer. | 05-27-2010 |
20100177970 | CAPTURING TEXT FROM RENDERED DOCUMENTS USING SUPPLEMENTAL INFORMATION - A system for processing a text capture operation is described. The system receives text captured from a rendered document in the text capture operation. The system also receives supplemental information distinct from the captured text. The system determines an action to perform in response to the text capture operation based upon both the captured text and the supplemental information. | 07-15-2010 |
20100266215 | VARIABLE-STRIDE STREAM SEGMENTATION AND MULTI-PATTERN MATCHING - A variable-stride multi-pattern matching apparatus segments patterns and input streams into variable-size blocks according to a modified winnowing algorithm. The variable-stride pattern segments are used to determine the block-symbol alphabet for a variable-stride discrete finite automaton (VS-DFA) that is used for detecting the patterns in the input streams. Applications include network-intrusion detection and protection systems, genome matching, and forensics. The modification of the winnowing algorithm includes using special hash values to determine the position of delimiters of the patterns and input streams. The delimiters mark the beginnings and ends of the segments. In various embodiments, the patterns are segmented into head, core, and tail blocks. The approach provides for memory, memory-bandwidth, and processor-cycle efficient, deterministic, high-speed, line-rate pattern matching. | 10-21-2010 |
20100284625 | Computing Visual and Textual Summaries for Tagged Image Collections - Described is a technology for computing visual and textual summaries for tagged image collections. Heterogeneous affinity propagation is used to together identify both visual and textual exemplars. The heterogeneous affinity propagation finds the exemplars for relational heterogeneous data (e.g., images and words) by considering the relationships (e.g., similarities) within pairs of images, pairs of words, and relationships of words to images (affinity) in an integrated manner. | 11-11-2010 |
20100316300 | DETECTION OF OBJECTIONABLE VIDEOS - A video that advertises a particular web site may be a form of video spam. For example, pornographers often advertise their web sites by displaying a link to their web sites in videos, and then placing the videos on video-sharing services. This type of video spam may be detected by analyzing the video for the presence of text and then determining whether the text is a URL. If the text is a URL, the URL may be checked to determine whether it points to an objectionable web site. The determination of whether a URL points to an objectionable web site may be made by comparing the URL with a blacklist and/or whitelist, or by retrieving the URL and analyzing the retrieved content. If a video is found to be an advertisement for an objectionable web site, action may be taken, such as removing the video from a content database. | 12-16-2010 |
20100316301 | METHOD FOR EXTRACTING REFERENTIAL KEYS FROM A DOCUMENT - Embodiments of the present invention provide methods, computer-readable media, and systems for extracting referential keys from a document. A document is parsed to identify at least one key, the key being identified from at least one contextual indication. The key is classified according to a key type, the key type being identified from the contextual indication. The key is extracted and then stored in a location in a structured shell with the location corresponding to the key type. As a result, the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted. | 12-16-2010 |
20100316302 | Adaptive Image Maps - A computer implemented method of processing an image for display on a mobile communication device includes extracting a portion of an image based on an image map. The image map relates to the portion of the image. The method also includes generating a document that comprises the extracted portion of the image and transmitting the generated document to a remote device for display. The method may also include assigning a selectable link to the extracted portion of the image and receiving a request from the remote device for an initial document having the image and image map. Additionally, the method may include storing in a database the generated document and transmitting the stored generated document in response to future requests for the initial document. | 12-16-2010 |
20110075940 | METHODS FOR MONITORING USAGE OF A COMPUTER - A computer implemented method for monitoring use of a computer and related systems and compositions of matter are disclosed. In various aspects, the methods may include the step of capturing an image of a monitored region of a computer screen of a computer, and the step of extracting image text from the image. | 03-31-2011 |
20110075941 | DATA MANAGING APPARATUS, DATA MANAGING METHOD AND INFORMATION STORING MEDIUM STORING A DATA MANAGING PROGRAM - A data managing apparatus having a word extracting portion that extracts one or a plurality of words from document data and a correlating portion that correlates the words extracted by the word extracting portion with related data related to the document data, includes a frequency storage portion having information about a frequency of each of the words stored thereon for each word; an infrequently-appearing word selecting portion that selects an infrequently-appearing word having the frequency lower than a given threshold value predetermined among the words extracted by the word extracting portion based on the information stored in the frequency storage portion; and a frequency updating portion that updates the information about frequency stored in the frequency storage portion in accordance with extraction by the word extracting portion or correlation by the correlating portion, the correlating portion correlating the infrequently-appearing word selected by the infrequently-appearing word selecting portion among the words extracted from the document data by the word extracting portion with the related data related to the document data. | 03-31-2011 |
20110097002 | APPARATUS AND METHOD OF PROCESSING IMAGE INCLUDING CHARACTER STRING - Image processing apparatus and method perform a character recognition process on an area indicating a character string included in image data, generate layout information for layout of the character string on the basis of the area, and perform layout of a result of the character recognition process on the basis of the generated layout information, thereby enabling a process that uses the layout information to be performed on a document which includes various layouts. | 04-28-2011 |
20110110599 | DOCUMENT IMAGE GENERATION APPARATUS, DOCUMENT IMAGE GENERATION METHOD AND RECORDING MEDIUM - A character is recognized from an original document image that is obtained, for example, by an image reading apparatus. And a natural language processing is performed on a document configured from the recognized characters. Thus, a translation (supplementary annotation) for a word or a phrase in the document is obtained. Then, a supplementary annotation added document image is generated with an original document image layer configured from an original document image on which a supplementary annotation text layer is superimposed. In the supplementary annotation text layer, the translation is placed at a position corresponding to a position in an interline space near the word or the phrase. Furthermore, in addition to a translation, an underline is placed for a discontinuous phrase. | 05-12-2011 |
20110158548 | WORD RECOGNITION METHOD, WORD RECOGNITION PROGRAM, AND INFORMATION PROCESSING DEVICE - A word recognition method in which as a result of a recognition process performed on an image of a character string, one or more character candidates are obtained for each of characters forming the character string, according to which a word corresponding to the character string is recognized using a word database having registered therein a plurality of words includes setting a predetermined number of words included in the word database, as initial word candidates, performing a process in which the characters forming the recognition target character string are set as processing targets, one character by one character, and every time a processing target character is set, word candidates present at a time of the setting are narrowed down to words in which character candidates obtained for the processing target character are arranged at a same location as a location where the processing target character is arranged in the recognition target character string, and identifying, when a narrowing-down process performed on a last processing target character in the recognition target character string is completed, a word corresponding to the character string from among word candidates extracted at a point in time of the completion of the narrowing-down process. | 06-30-2011 |
20110170788 | METHOD FOR CAPTURING DATA FROM MOBILE AND SCANNED IMAGES OF BUSINESS CARDS - According to various embodiments of the invention, methods are provided for capturing various data fields from mobile and scanned images of business cards. Most embodiments are provided for capturing Personal and Company name fields, which are difficult to identify using conventional OCR and data capture techniques. In addition, some embodiments of the invention involve methods for capturing an email, URL or telephone number from an image of a business card. | 07-14-2011 |
20110222788 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM - An information processing device includes a recognition section for recognizing a feature keyword representing a feature of at least part of text content, an additional information acquisition section for acquiring additional information related to the text content from an outside of the text content in response to the recognized feature keyword, and a control section for controlling the additional information acquired by the additional information acquisition section to be output along with the part of the text content. | 09-15-2011 |
20110222789 | CHARACTER STRING SENSING DEVICE, CHARACTER EVALUATING DEVICE, IMAGE PROCESSING DEVICE, CHARACTER STRING SENSING METHOD, CHARACTER EVALUATION METHOD, CONTROL PROGRAM, AND RECORDING MEDIUM - Reduction of a processing load, and shortening of a processing time, is realized by performing character string sensing processing on an image. A character string sensing device senses a character string including at least one character from an image. The character string sensing device includes a character information storage unit in which an evaluation value, expressing difficulty of false sensing of the character, is stored in each character. The character string sensing device also includes a search sequence determining unit that determines a search sequence of each character based on the evaluation value of each character included in a keyword input to the character string sensing device as the character string to be sensed. The evaluation value is stored in the character information storage unit. A character search unit searches each character included in the keyword according to the determined search sequence. | 09-15-2011 |
20110255795 | APPARATUS AND METHOD FOR CHARACTER STRING RECOGNITION - An apparatus and a method for character string recognition for correctly recognizing a character string placed on a medium, even in a recognition process system in which a plurality of formats are handled. An image processing area is set on a medium. The image processing area is divided in a placement direction of character strings so as to make up a plurality of segments. An image data projection in a direction of character strings is calculated for each segment. The number of character string lines for each segment is calculated according to the image data projection. The number of character string lines is determined for the image processing area as a whole, according to the number of character string lines for each segment, and it is judged whether or not the character strings are predetermined character strings. | 10-20-2011 |
20120002889 | USING HANDWRITTEN NOTATIONS IN DIGITAL VIDEO PRESENTATIONS - A method for producing a slide show video from a collection of hardcopy media includes digitizing the media, detecting handwritten information, and estimating the age of the media; determining an order of presentation for the slide show video based on the detected handwritten information and estimated ages; and producing a slide show video from the hardcopy media using the determined order of presentation. | 01-05-2012 |
20120014613 | INTELLIGENT DOCUMENT SCANNING - A method, apparatus, and system for scanning a first portion of data to generate a second portion of data is provided. A control parameter relating to a level of detail associated with filtering a first portion of data is received. The filtering of the first portion of data is performed based upon the control parameter. The filtering of the first portion of data includes a rule-based filtering, a context-based filtering, a statistical-based filtering, or a semantic-based filtering. Performing the filtering provides for a reduction of a portion of the first portion of data. A second portion of data that is smaller than the first portion of data is provided based upon the filtering of the first portion of data. | 01-19-2012 |
20120020577 | SYSTEM AND METHOD FOR EFFICIENT UNIFIED MESSAGING SYSTEM SUPPORT FOR SPEECH-TO-TEXT SERVICE - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for communicating information about transcription progress from a unified messaging (UM) server to a UM client. In one embodiment, the transcription progress describes speech to text transcription of speech messages such as voicemail. The UM server authenticates and establishes a session with a UM client, then receives a get message list request from a UM client as of a first time, responds to the get message list request with a view of a state of messages and available transcriptions for transcribable messages in a list of messages associated with the get message list call at the first time, and, at a second time subsequent to the first time, transmits to the UM client a notification that provides an indication of progress for at least one transcription not yet complete in the list of messages. The messages can include video. | 01-26-2012 |
20120020578 | Identifying Establishments in Images - Establishments are identified in geo-tagged images. According to one aspect, text regions are located in a geo-tagged image and text strings in the text regions are recognized using Optical Character Recognition (OCR) techniques. Text phrases are extracted from information associated with establishments known to be near the geographic location specified in the geo-tag of the image. The text strings recognized in the image are compared with the phrases for the establishments for approximate matches, and an establishment is selected as the establishment in the image based on the approximate matches. According to another aspect, text strings recognized in a collection of geo-tagged images are compared with phrases for establishments in the geographic area identified by the geo-tags to generate scores for image-establishment pairs. Establishments in each of the large collection of images as well as representative images showing each establishment are identified using the scores. | 01-26-2012 |
20120121195 | Identifying Establishments in Images - Establishments are identified in geo-tagged images. According to one aspect, text regions are located in a geo-tagged image and text strings in the text regions are recognized using Optical Character Recognition (OCR) techniques. Text phrases are extracted from information associated with establishments known to be near the geographic location specified in the geo-tag of the image. The text strings recognized in the image are compared with the phrases for the establishments for approximate matches, and an establishment is selected as the establishment in the image based on the approximate matches. According to another aspect, text strings recognized in a collection of geo-tagged images are compared with phrases for establishments in the geographic area identified by the geo-tags to generate scores for image-establishment pairs. Establishments in each of the large collection of images as well as representative images showing each establishment are identified using the scores. | 05-17-2012 |
20120177297 | Image Analysis System and Method Using Image Recognition and Text Search - Provided herein are systems and methods for obtaining contextual information of an image published on a digital medium. The methods and systems disclosed herein generally identify and analyze the image to obtain image descriptors corresponding to the image. The methods also identify and analyze text published proximate to the image to obtain textual descriptors, which function to describe, identify, index, or name the image or content within the image. The textual descriptors are then matched to the image descriptors to provide contextual information of the published image. | 07-12-2012 |
20120183230 | METHOD AND APPARATUS TO ENHANCE SECURITY AND/OR SURVEILLANCE INFORMATION IN A COMMUNICATION NETWORK - Existing video surveillance security approaches enhanced with suitable functionality of the telecommunications wireless network are provided. Security personnel are equipped with hand-held devices capable of recording video, photos, audio, and text. This data is geo-tagged and time-stamped by the application and uploaded to the telecommunications network and stored in the network. As such, the geo-tagged, time-stamped information is immediately available to other investigators who are in the same geographic vicinity through access controls administered by a secure social network. The information may also be accessible from remote locations via the internet. All wireless and Internet communications may be protected using end-to-end secure transport layer communications protocols. | 07-19-2012 |
20120213446 | METHOD OF FAST TYPING TWIN SPECIAL CHARACTERS - A method for inputting character pairs in an electronic device having a user input device, a display for displaying characters input through the user input device, and a memory for storing characters input through the user input device, including storing a character input through the user input device in the memory and displaying the input character on the display; and determining if the input character is an opening character of a predefined character pair, and if so, automatically and without further user input, causing a corresponding closing character of the predefined character pair to be inserted in the memory and on the display, and locating an input pointer so that subsequently input characters will be inserted between the opening and closing characters in the memory and on the display. | 08-23-2012 |
20120237131 | INFORMATION PROCESSING APPARATUS TO ACQUIRE CHARACTER INFORMATION - An information processing apparatus according to one aspect of the present invention includes an area recognizing unit to recognize, with respect to areas specified in predetermined representations within image data, a first area specified in a first area specifying representation and a second area specified in a second area specifying representation different from the first area specifying representation; a position information acquiring unit to acquire position information of the first area, which is recognized by the area recognizing unit, as the position information for specifying a character recognition target area within the image data; and a name-of-item acquiring unit to acquire character information obtained by recognizing characters existing in the second area recognized by the area recognizing unit as a name of item with respect to the character recognition target area specified by the position information acquired by the position information acquiring unit. | 09-20-2012 |
20130022284 | Method and system for updating business cards - Techniques for capturing images of business cards, uploading the images to a designated computing device for processing and recognition are disclosed. A mechanism is provided to update extracted data from the images when there are any changes. Depending on implementation, there are a number of ways to capture images of business cards (e.g., via a phone camera, a PC camera, or a scanning device). A transmission means is provided to transport the images to the designated computing device for centralized management of integrated contact information for individual users. As a result, a user may access his/her updatable integrated contact information database anywhere anytime from a chosen device. | 01-24-2013 |
20130114908 | IMAGE PROCESSING APPARATUS AND CONTROL METHOD CAPABLE OF PROVIDING CHARACTER INFORMATION - An image processing apparatus and control method capable of providing character information are disclosed. The apparatus includes a signal receiving unit which receives an image signal; an image processing unit which processes the received image signal so that an image based on the image signal can be displayed; a searching unit which searches search words; and a controller which controls the searching unit to search at least one of the search words included in the displayed image and provide a user with a result of the search for the search word. With this configuration, users can more conveniently search character information included in contents being watched. | 05-09-2013 |
20130208991 | INFORMATION PROCESSING APPARATUS FOR DETERMINING MATCHING LANGUAGE FOR CHARACTERS IN IMAGE - An information processing apparatus of the present invention selects one language group, then selects one language from the selected language group, and performs OCR processing appropriate for the selected language on characters included in an image. From an obtained OCR processing result, a matching degree indicating a degree of similarity between the recognized characters in the image and the language selected for the OCR processing is calculated. Then, in a case where the matching degree is equal to or smaller than a particular value, a language belonging to a different language group is selected to further perform OCR processing. The information processing apparatus of the present invention allows improvement in the efficiency of the OCR processing. | 08-15-2013 |
20140037219 | CHARACTER STRING EXTRACTION METHOD AND CHARACTER STRING EXTRACTION DEVICE - A character string extraction device according to the present invention includes a replacement information registering unit in which replacement information to replace character information expected to be erroneously recognized is registered, a candidate character data registering unit in which supposed candidate character data is registered, an image information converting unit for converting the read image information into character information, a character information replacing unit for replacing a specific character with a designated character when the character information includes the specific character and generating read character data by using the converted character information when the character information does not include the specific character, a search character generating unit for replacing a predetermined character of the read character data with a special character and generating search character data from the read character data, and a first comparing unit for comparing the search character data with the candidate character data. | 02-06-2014 |
20140044365 | Digital Image Archiving and Retrieval - A computer-implemented method of managing information is disclosed. The method can include receiving a message from a mobile device configured to connect to a mobile device network (the message including a digital image taken by the mobile device and including information corresponding to words), determining the words from the digital image information using optical character recognition, indexing the digital image based on the words, and storing the digital image for later retrieval of the digital image based on one or more received search terms. | 02-13-2014 |
20140099038 | INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM - An information processing apparatus includes a reading unit, a recognition unit, a table-of-contents analysis unit, a main-body analysis unit, and a creation unit. The reading unit reads a table of contents page and a main body page as images. The recognition unit performs character recognition on the images of the table of contents and main body pages. The table-of-contents analysis unit analyzes the image of the table of contents page, and acquires at least a heading item in accordance with a result of character recognition. The main-body analysis unit analyzes the image of the main body page, and associates an image including the heading item with the heading item in accordance with a result of character recognition. The creation unit creates electronic bookmarked information in which bookmark information for associating the heading item with the image of the main body page is added to electronic information of the read images. | 04-10-2014 |
20140133767 | SCANNED TEXT WORD RECOGNITION METHOD AND APPARATUS - A method for converting digital images to words includes receiving a digital image comprising text, generating a binary image from the digital image for each of N binarization threshold values to provide N binary images, converting each of the N binary images to text, and aligning the text from the N binary images to provide a word lattice for the digital image. Aligning the text may include prioritizing the text from the N binary images according to error rates on a training set. The training set may be a synthetic training set. An apparatus corresponding to the above method is also disclosed herein. | 05-15-2014 |
20140161365 | Method of Perspective Correction For Devanagari Text - An electronic device and method identify regions that are likely to be text in a natural image or video frame, followed by processing as follows: lines that are nearly vertical are automatically identified in a selected text region, oriented relative to the vertical axis within a predetermined range −max_theta to +max_theta, followed by determination of an angle θ of the identified lines, followed by use of the angle θ to perform perspective correction by warping the selected text region. After perspective correction in this manner, each text region is processed further, to recognize text therein, by performing OCR on each block among a sequence of blocks obtained by slicing the potential text region. Thereafter, the result of text recognition is used to display to the user, either the recognized text or any other information obtained by use of the recognized text. | 06-12-2014 |
20140212056 | Composite Label with History Feature - A code that stores a history of what has been done to it and where it has been. The history can be stored in a local memory. The code can be changed based on that history. | 07-31-2014 |
20140219571 | TIME-BASED SENTIMENT ANALYSIS FOR PRODUCT AND SERVICE FEATURES - Provided are a method, computer program product and system for reporting time-based sentiment for a product. Text analysis is performed on at least one communication. At least one feature for the product is determined based on the text analysis. A sentiment value is generated for the at least one feature for the product. A date associated with the sentiment value is determined, and the sentiment value is reported for at least one feature over time. | 08-07-2014 |
20140355896 | IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD - According to an embodiment, an image processing apparatus selects as an output image a candidate character component, from which a non-character component is removed, in a gradation having the largest number of pixels when there is a significant difference between the number of character pixels in the gradation having the largest number of character pixels and the number of character pixels in a gradation having the second largest number of character pixels, and selects as an output image a candidate character component, from which the non-character component is removed, in a gradation having the smallest number of edge pixels when there is no significant difference between the number of character pixels in the gradation having the largest number of character pixels and the number of character pixels in the gradation having the second largest number of character pixels. | 12-04-2014 |
20150043832 | INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER READABLE MEDIUM - An information processing apparatus includes a storage unit, an interpretation unit, and a correction unit. The storage unit stores plural correction instructions. The interpretation unit interprets a correction instruction stored in the storage unit. The correction unit corrects a recognized character string in accordance with the correction instruction interpreted by the interpretation unit. The interpretation unit determines the type of the correction instruction, and extracts a first character string including one or more characters serving as a target of the correction instruction and a second character string obtained by performing conversion of a part or the whole of the first character string, in accordance with the type of the correction instruction. The correction unit, in a case where the first character string exists in the recognized character string, converts a part or the whole of the first character string within the recognized character string into the second character string. | 02-12-2015 |
20150063714 | CAPTURING TEXT FROM RENDERED DOCUMENTS USING SUPPLEMENTAL INFORMATION - A system for processing a text capture operation is described. The system receives text captured from a rendered document in the text capture operation. The system also receives supplemental information distinct from the captured text. The system determines an action to perform in response to the text capture operation based upon both the captured text and the supplemental information. | 03-05-2015 |
20150131918 | DISTRIBUTED DOCUMENT PROCESSING - A system for document processing including decomposing an image of a document into at least one data entry region sub-image, providing the data entry region sub-image to a data entry clerk available for processing the data entry region sub-image, receiving from the data entry clerk a data entry value associated with the data entry region sub-image, and validating the data entry value. | 05-14-2015 |
20150146992 | ELECTRONIC DEVICE AND METHOD FOR RECOGNIZING CHARACTER IN ELECTRONIC DEVICE - An electronic device and a method are provided for recognizing a character in the electronic device. The electronic device includes a display unit configured to, upon receipt of an image in a real-time character recognition method, display a character recognition area defined for character recognition adjusted to an inclination angle of an object included in the received image. The electronic device also includes a controller configured to detect the inclination angle of the object included in the image, to adjust an angle of the character recognition area defined for character recognition to the inclination angle of the object, and to control recognition of a character from the object in the angle-adjusted character recognition area. | 05-28-2015 |
20150302277 | IMAGE PROCESSING APPARATUS, IMAGE PROCESSING SYSTEM, AND IMAGE PROCESSING METHOD - An image processing apparatus includes a receiving unit configured to present an image that is a processing target to a user and receive a specification of an area in the image; a character recognition unit configured to perform a character recognition process on the area for which the receiving unit has received the specification in the image that is the processing target, and acquire an information item of a character string in the area; and a setting unit configured to set management information of the image that is the processing target, based on the character string acquired by the character recognition unit. | 10-22-2015 |
20150379343 | System and Method for Data Extraction and Searching - Systems and methods are provided for quickly and efficiently searching and receiving results for real estate-related information without or at least with minimal human processing of real estate-related documents. Optical character recognition on a plurality of scanned document images is performed to obtain a plurality of textual data representations of the real estate-related documents. Data is extracted from the textual data representations, and subsequently contextualized according to a real estate-related context. Aspects of the extracted data as well as the textual data representations are provided as search results based on one or more searches for real estate-related information. | 12-31-2015 |
20150379346 | PROPERTY RECORD DOCUMENT DATA VERIFICATION SYSTEMS AND METHODS - A data verification system is configured to verify machine-recognized data elements acquired during a machine-implemented data acquisition process. The system includes a data verification workstation, an image server, and a data entry server. The data verification workstation is configured to obtain document images from the image server, present portions of document images to an operator, wherein the document images include text, and receive input from the operator based on the text. The input includes data elements. The data verification workstation is also configured to acquire machine-recognized data elements from the data entry server. The machine-recognized data elements were acquired from the document image during a machine-implemented data acquisition process based on the text. The data verification workstation is also configured to compare the data elements received from the operator to the machine-recognized data elements and selectively prompt the operator to re-input the data elements based on the comparison. | 12-31-2015 |
20160004937 | SYSTEM AND METHOD FOR DETERMINING STRING SIMILARITY - Provided are string similarity assessment techniques. In one embodiment, the techniques include receiving a plurality of input strings comprising characters from a character set and generating hashtables for each respective input string using a hash function that assigns the characters as keys and character positions in the strings as values. The techniques may also include determining a character similarity index for at least two of the input strings relative to each other by comparing a similarity of the values for each key in their respective hashtables; determining a total disordering index representative of an alignment of the at least two input strings by determining differences between a plurality of index values for each individual key in their respective hashtables and determining the total disordering index based on the differences; and determining a string similarity metric based on at least one character similarity index and the total disordering index. | 01-07-2016 |
20160005202 | INFORMATION PROCESSING APPARATUS - In the case where a table region is erroneously recognized, the editing of a character is facilitated. An information processing apparatus selects one selectable region from an image. According to the change of the position of the region selected by a selecting unit, a region that is included in the region before the position change and that is not included in the region after the position change is set to a new selectable region. | 01-07-2016 |
20160034753 | IMAGE PROCESSING APPARATUS AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM STORING AN IMAGE PROCESSING PROGRAM - In an image processing apparatus, a character recognizing unit identifies a character image in a document image. A font matching unit determines a character code and a font type corresponding to the identified character image. A fore-and-background setting unit sets the document image as a background image and sets a standard character image based on the determined character code and the determined font type. A background image correcting unit (a) deletes a deletion area in the background image, the deletion area taking a same position as the character image or the standard character image, (b) interpolates a differential area between the character image and the standard character image in a specific neighborhood area that contacts with the deletion area on the basis of the background image, and (c) interpolates the deletion area on the basis of the background image. | 02-04-2016 |
20160063339 | SCRAPPED INFORMATION PROVIDING METHOD AND APPARATUS - An information providing method of an electronic device is provided. The information providing method includes determining a selected area based on a user input, determining an extraction method based on types of one or more objects included in the selected area, extracting information from the selected area according to the determined extraction method, and performing a specific function based on the information. | 03-03-2016 |
20160070992 | PRUNING AND LABEL SELECTION IN HIDDEN MARKOV MODEL-BASED OCR - Systems and techniques are provided for pruning a node from a possible nodes list for Hidden Markov Model with label transition node pruning. The node may be a label transition node. A frame may be at a predicted segmentation point in decoding input with the Hidden Markov Model. The node may be scored at the frame. The node may be pruned from the possible nodes list for the frame when the score for the node is greater than the sum of a best score among nodes on the possible nodes list for the frame and a beam threshold minus a penalty term. A possible nodes list may be generated for a subsequent frame using label selection. A second node may be pruned from the possible nodes list for the subsequent frame with early pruning. | 03-10-2016 |
20160085726 | CONVERTING TEXT STRINGS INTO NUMBER STRINGS, SUCH AS VIA A TOUCHSCREEN INPUT - Systems and methods are provided for detecting numerical text strings within a text string and converting those numerical text strings into digit strings. The digit strings may be reflected in real-time, such as when the user is typing a text message. If more than one possible format of the digit string is determined, the system may then provide a selection of the various formats for selection. Once the proper format for the digit string is determined, that digit string may replace the numerical string previously detected in the text string. The text to digit conversion and associated formatting expedites user text entry such that the user is not required to switch keyboard views (e.g., virtual keyboards). Additionally, converting to digit strings compresses message length, as well as providing other benefits. | 03-24-2016 |
20160098611 | TEXT ENTITY RECOGNITION - Various embodiments enable the identification of semi-structured text entities in an image. The identification of the text entities is a relatively simple problem when the text is stored in a computer and free of errors, but much more challenging if the source is the output of an optical character recognition (OCR) engine from a natural scene image. Accordingly, output from an OCR engine is analyzed to isolate a character string indicative of a text entity. Each character of the string is then assigned to a character class to produce a character class string and the text entity of the string is identified based in part on a pattern of the character class string. | 04-07-2016 |
20160110598 | DATA PROCESSING SYSTEMS, DEVICES, AND METHODS FOR CONTENT ANALYSIS - Systems, devices and methods operative for identifying a reference within a figure and an identifier in a text associated with the figure, the reference referring to an element depicted in the figure, the reference corresponding to the identifier, the identifier identifying the element in the text, placing the identifier on the figure at a distance from the reference, the identifier visually associated with the reference upon the placing, the placing of the identifier on the figure is irrespective of the distance between the identifier and the reference. | 04-21-2016 |
20160125275 | CHARACTER RECOGNITION DEVICE, IMAGE DISPLAY DEVICE, IMAGE RETRIEVAL DEVICE, CHARACTER RECOGNITION METHOD, AND COMPUTER PROGRAM PRODUCT - According to an embodiment, a device includes a detector, a first recognizer, an estimator, a second recognizer, and an output unit. The detector is configured to detect a visible text area including a visible character from an image. The first recognizer is configured to perform character pattern recognition on the visible text area, and calculate a recognition cost according to a likelihood of a character pattern. The estimator is configured to estimate a partially-hidden text area into which a hidden text area estimated to have a hidden character and the visible text area are integrated. The second recognizer is configured to calculate an integrated cost into which the calculated cost and a linguistic cost corresponding to a linguistic likelihood of a text that fits in the entire partially-hidden text area are integrated. The output unit is configured to output a text selected or ranked based on the integrated cost. | 05-05-2016 |
20160140426 | SYSTEMS AND METHODS FOR MULTI-FACTOR IMAGE RECOGNITION - A mechanism for image recognition based on multiple factors is described. A method, system and computer-readable medium for multi-factor image recognition includes using environmental contextual attributes to create likelihood tiers in an image recognition database such that irrelevant entries are excluded from the search. The mechanism described here limits and sorts by contextual likelihood the number of entries to be searched, increasing both the speed and accuracy of the image recognition process. | 05-19-2016 |
20180025225 | SYSTEM AND METHOD FOR GENERATING CONSOLIDATED DATA FOR ELECTRONIC DOCUMENTS | 01-25-2018 |
20180025256 | METHOD AND APPARATUS FOR RECOGNIZING CHARACTER STRING IN IMAGE | 01-25-2018 |
382230000 | Trigrams or digrams | 1 |
20090034851 | MULTIMODAL CLASSIFICATION OF ADULT CONTENT - Systems and methods for classifying content as adult content and, if desired, blocking content so classified from presentation to a user are provided. Received content is analyzed using a sequential series of classification techniques, each successive technique being implemented only if the previous technique did not result in classification of the content as adult content. In this way, adult content may be identified across a variety of different media types (e.g., text, images, video, etc.) and yet processing power may be reserved if one or more techniques requiring less power is sufficient to determine that the received content is, in fact, adult content. Content classification may be performed in-band (that is, in substantially real-time such that content may be identified and/or blocked at the time results of a user query are returned) or out-of-band (that is, prospectively as new content is received but not in association with a user query). | 02-05-2009 |