Entries |
Document | Title | Date |
20080212877 | HIGH SPEED ERROR DETECTION AND CORRECTION FOR CHARACTER RECOGNITION - Systems and methods for high speed error detection and correction are disclosed. An exemplary method may include grouping character images (ci) by suspected character code (cc) to generate a set of CI(cc). The method may also include displaying the set of CI(cc) for manual verification. The method may also include determining a set of RS(cc) of representative shapes (rs) of character images codes for each CI(cc). The method may also include displaying the set of RS(cc) for manual verification. | 09-04-2008 |
20080240567 | Displaying text of a writing system using syntax-directed translation - A method for displaying an input string of character codes as a sequence of glyphs. In one implementation, an ordered list of instructions for transforming an input string of character codes may be generated using syntax-directed translation. The ordered list of instructions may be executed to generate a sequence of glyph indices. A sequence of glyphs corresponding to the sequence of glyph indices may be displayed. | 10-02-2008 |
20080253657 | Geometric parsing of mathematical expressions - A processing device may parse a group of strokes representing a mathematical expression. The group of strokes may be examined to determine whether the group of strokes satisfies any of a finite set of rules. When the group of strokes, included in a region, satisfies any of the finite set of rules, the region may be partitioned according to a satisfied one of the finite set of rules. The group of strokes included in the region may be further examined to determine whether the group of strokes may be further partitioned according to any of the finite set of rules. After all regions have been examined and no further partitioning of regions may be performed, all mathematical symbols of the mathematical expression may be isolated in at least some of the regions and may be recognized. | 10-16-2008 |
20080310721 | Method And Apparatus For Recognizing Characters In A Document Image - A method of recognizing characters in a document image comprises examining the intensity of pixels in the document image and identifying a peak intensity deemed to represent foreground in the document image. A threshold level for distinguishing the foreground from background in the document image as a function of the identified peak intensity is determined. The document image is thresholded using the threshold level to identify the foreground. Character recognition is performed on the foreground of the document image. | 12-18-2008 |
20080310722 | IDENTIFYING CHARACTER INFORMATION IN MEDIA CONTENT - Implementations of identifying character information in media content are described. In one implementation, a frame of media content is marked with a frame identifier including one or more known characters. These known characters can uniquely identify the frame of media content. During transmission, compression, decompression, etc., of the frame, loss can occur. This loss can affect a quality of presentation of one or more of the known characters in the frame identifier. Therefore, when the frame is subsequently examined, the frame identifier can be identified, and best matches of known characters from a character recognition library can be found for characters in the frame identifier. | 12-18-2008 |
20080310723 | TEXT PREDICTION WITH PARTIAL SELECTION IN A VARIETY OF DOMAINS - A computing system may predict a word based on received user input that selects a part of the word (e.g., the first characters, the first root, etc.). Specifically, a program, when run on the computing system, may perform a method including creating a candidate list of words based on received user input. These words may be then organized into a hierarchy, or tree structure, in which each word is associated with a parent and each parent is a partial match for its associated words. The top-tier partial matches may be presented, and user input corresponding to a selected partial match may be received. A set of candidates related to the selected partial match may then be presented for user selection. | 12-18-2008 |
20080317346 | Character and Object Recognition with a Mobile Photographic Device - Character and object recognition are provided from digital photography followed by digitization and integration of recognized textual and non-textual content into a variety of software applications for enabling use of data associated with the photographed content. A digital photograph may be processed by an optical character recognizer or optical object recognizer for generating data associated with a photographed object. A user of the photographed content may tag the photographed content with descriptive or analytical information that may be used for improving recognition of the photographed content and that may be used by subsequent users of the photographed content. Data generated for the photographed object may then be passed to a variety of software applications for use in accordance with respective application functionalities. | 12-25-2008 |
20080317347 | Rendering engine test system - A system to compare a reference image of a text character, word or phrase with another image of the character, word or phrase that was rendered by a text rendering engine. Differences between the reference image and the rendered image may be recorded for subsequent analysis. Performance of a text rendering engine producing text according to typographical rules applicable to a natural language can be evaluated by one with no knowledge or ability to read the natural language. | 12-25-2008 |
20080317348 | IMAGE PROCESSING APPARATUS, IMAGE REPRODUCTION APPARATUS, SYSTEM, METHOD AND STORAGE MEDIUM FOR IMAGE PROCESSING AND IMAGE REPRODUCTION - An original document image is inputted as multi-valued image data (original image data) from an input unit. The multi-valued image data is binarized by a binary image generation unit. Then, layout analysis is performed based on the binary image data. Based on the layout information, a partial image having a text attribute and a partial image having a non-text attribute are extracted from the multi-valued image data. One of the partial images is encrypted, and the encrypted data is stored with the partial image that is not encrypted and the layout information. | 12-25-2008 |
20090028434 | SYSTEM AND METHOD FOR DISPLAYING CONTEXTUAL SUPPLEMENTAL CONTENT BASED ON IMAGE CONTENT - An image-based content item is analyzed to determine information about a subject of the content item. The analysis may include performing image analysis on at least an image of the content item. An inference may be programmatically made about one or more of (i) a viewer or holder of the content item, or (ii) the subject of content item. | 01-29-2009 |
20090074294 | DOCUMENT-IMAGE-DATA PROVIDING SYSTEM, DOCUMENT-IMAGE-DATA PROVIDING DEVICE, INFORMATION PROCESSING DEVICE, DOCUMENT-IMAGE-DATA PROVIDING METHOD, INFORMATION PROCESSING METHOD, DOCUMENT-IMAGE-DATA PROVIDING PROGRAM, AND INFORMATION PROCESSING PROGRAM - In a document-image-data providing device, a document image inputting unit is configured to input document image data. An area recognition unit is configured to recognize a text area of a document image element containing text data among document image elements constituting the document image data, and another area of a document image element containing data other than the text data. A text data acquiring unit is configured to acquire text data contained in the recognized text area. A providing unit is configured to provide, in response to a document image data request received from the information processing device, both image data generated from the input document image data to have a resolution lower than a resolution of the input document image data and the text data acquired by the text data acquiring unit, to the information processing device. | 03-19-2009 |
20090110283 | METHOD AND APPARATUS FOR OPERATING, INTERFACING AND/OR MANAGING FOR AT LEAST ONE OPTICAL CHARACTERISTIC SYSTEM FOR CONTAINER HANDLERS IN A CONTAINER YARD - Methods and several apparatus embodiments are disclosed for operating Optical Characteristic Systems (OCS) in a container storage and/or transfer yard supporting the automated recognition of container codes displayed on various sides of the containers being stored and/or transferred. At least one processor may initiate an operational process by an OCS mounted on a container handler to create an operational result, select the operational process based upon an operational schedule and communicate with at least one OCS to receive an image of a container being handled by the container handler to at least partly create a container code estimate for a container inventory management system. A program system may direct at least one computer implementing these operations, and may reside in computer readable memory, an installation package and/or a download server. The computer readable memory may or may not be accessibly coupled to the computer. | 04-30-2009 |
20090154810 | IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND PROGRAM AND RECORDING MEDIUM THEREOF - In an electronic document of drawing descriptions of a page image and a character, it is desired that although a font data necessary for drawing the character is held in the electronic document, the size of the electronic document is minimized. Furthermore, it is desired to ensure visibility at the time of highlighting of search. There is generated an electronic document in which a document image, a plurality of character codes obtained by executing a character recognition processing with respect to the document image, and a plurality of kinds of glyph data to be utilized in common with respect to the plurality of character codes when drawing characters corresponding to the plurality of character codes are stored. The plurality of kinds of glyph data are selectively used when characters corresponding to the character codes are drawn. It is desirable that the glyph data be the one in a simple form. | 06-18-2009 |
20090214115 | IMAGE PROCESSING APPARATUS AND COMPUTER READABLE MEDIUM - An image processing apparatus includes: an image acceptance unit that accepts an image; a character information adding unit that adds a character identifier for uniquely identifying a character, a character position indicating a position of the character in the image, a character size indicating a size of the character, and a character color indicating a color of the character as character information, to the image accepted by the image acceptance unit; a font allocation unit that allocates a font without drawing element as a font corresponding to the character identifier within the character information added by the character information adding unit; and an electronic document generation unit that generates an electronic document for reference to the font information allocated by the font allocation unit, based on the character information added to the image by the character information adding unit. | 08-27-2009 |
20090214116 | IMAGE PROCESSING METHOD, IMAGE PROCESSING APPARATUS, IMAGE FORMING APPARATUS, AND STORAGE MEDIUM - In the image processing apparatus of the present invention, when a document is read, a document matching process section determines whether the document is similar to a reference document or not. When the document is similar to the reference document, the document matching process section further determines whether the document has been zoomed (size of the document has been changed). When the document has been zoomed, an editing process section restores the size of the document to the size of the reference document. This provides an image processing apparatus capable of restoring the changed size of a document in a predetermined format such as a form document and an application document to its original size. | 08-27-2009 |
20090220154 | IMAGE PROCESSING APPARATUS, IMAGE READING APPARATUS, IMAGE DATA OUTPUT PROCESSING APPARATUS, AND IMAGE PROCESSING METHOD - A ruled-line extraction section can be performed with high precision by providing a main-scanning ruled-line extraction section for determining whether a target pixel of binary image data of a document image is a black pixel or a white pixel, for counting the number of black pixels connected one after another upstream in a main scanning direction with respect to the target pixel of the binary image data and for, when the target pixel of the binary image data is a black pixel and when a value counted for the target pixel is not less than a main-scanning run determination threshold value that has been set in advance, generating ruled-line image data by correcting, to pixel values corresponding to black pixels, pixel values of a predetermined number of pixels connected to the target pixel upstream in the main scanning direction. | 09-03-2009 |
20090245644 | OPTICAL CHARACTER READERS FOR READING CHARACTERS PRINTED ON WIRES OR WIRE SLEEVES - A scanning system for scanning a wire to determine the characters provided on the wire. | 10-01-2009 |
20090274369 | IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, PROGRAM, AND STORAGE MEDIUM - An image processing device includes a dividing unit for dividing objects of an input image, a metadata adding unit for adding metadata to each of the divided objects by performing OCR processing and morpheme analysis, a display unit for displaying at least one of the divided objects and the metadata added to the divided object, and a metadata accuracy determining unit for determining accuracies of the added metadata. The display unit preferentially displays metadata determined as being low in accuracy by the metadata accuracy determining unit. | 11-05-2009 |
20090310863 | FINDING IMAGE CAPTURE DATE OF HARDCOPY MEDIUM - A method of determining the image capture date of a scanned hardcopy medium having an image side and a non-image side, includes scanning the hardcopy medium to produce a scanned digital image; detecting handwritten annotations in the scanned digital image of the hardcopy medium; and using the handwritten annotations to determine the image capture date of the hardcopy medium by analyzing the handwritten annotations to identify names of people and associated ages; providing the names and lifespan information for a set of persons likely to appear in the hardcopy medium; and using the identified names of people and the associated ages along with the lifespan information to determine the image capture date. | 12-17-2009 |
20100067794 | OPTICAL CHARACTER RECOGNITION VERIFICATION - A method for optical character recognition (OCR) verification, the method includes: receiving a first character image that was obtained from applying an OCR process on a document; wherein the first character image is classified, by the OCR, as being associated with a first character; receiving a first character code of a text; replacing the first character code by the first character image; and evaluating a correctness of the OCR based upon a response of a user to a display of the text first character image. | 03-18-2010 |
20100067795 | METHOD AND APPARATUS FOR PATTERN PROCESSING - An apparatus for pattern processing exhibits a discretizing device for discretizing an input pattern, a device for generating a number n of discrete variants of the quantized input pattern in accordance with established rules, a number n of input stages ( | 03-18-2010 |
20100086209 | METHOD OF IMAGING POSITION-CODING PATTERN HAVING TAG COORDINATES ENCODED BY BIT-SHIFTED SUBSEQUENCES OF CYCLIC POSITION CODE - A method of decoding a position-coding pattern disposed on a surface of a substrate. The method comprises the steps of: (a) operatively positioning an optical reader relative to the surface and capturing an image of a portion of the coding pattern; (b) sampling a windowed subsequence of a cyclic code sequence; (c) identifying a coordinate codeword using the windowed subsequence; and (d) determining a position of the optical reader from the coordinate codeword. The imaged portion has a diameter of more than one tag diameter and less than two tag diameters. | 04-08-2010 |
20100092088 | Methods and data structures for improved searchable formatted documents including citation and corpus generation - Searchable annotated formatted documents are produced by correlating documents stored as photographic or scanned graphic representations of an actual document (evidence, report, court order, etc.) with textual version of the same documents. A produced document will provide additional details in a data structure that supports citation annotation as well as other types of analysis of a document. The data structure also supports generation of citation reports and corpus reports. A method of creating searchable annotated formatted documents including citation and corpus reports by correlating and correcting text files with photographic or scanned graphic of the original documents. Data structures for correlating and correcting text files with graphic images. Generation of citation reports, concordance reports, and corpus reports. Data structures for citation reports, concordance reports, and corpus reports generation. | 04-15-2010 |
20100150445 | TEXT VECTORIZATION USING OCR AND STROKE STRUCTURE MODELING - Systems and methods are described that facilitate dominant point detection for text in a scanned document. The dominant points are classified as “major” (e.g., structural) and “minor” (e.g., serif). A set of rules or parameters for each character is determined off-line. During the text vectorization, OCR is performed and the rules (parameters) associated with the recognized character are selected. Both major and minor dominant points are detected as a maximization process with the parameter set. For minor dominant points, additional processes are optionally employed. | 06-17-2010 |
20100177965 | IMAGE PROCESSING APPARATUS, CONTROL METHOD THEREFOR, AND RECORDING MEDIUM - Even if an image processing apparatus which can recognize a certain character string is available on the network, processing results of an OCR process are determined by character recognition ability of an image processing apparatus which has happened to perform the OCR process. Thus, after an MFP performs a character recognition process based on image data contained in a character region of an image, if it is determined that processing results of the character recognition process are highly likely to contain recognition errors, the processing results are output to another MFP together with first information which indicates a high likelihood of the processing results containing recognition errors. Upon acquiring the processing results, the other MFP with higher character recognition capabilities performs a character recognition process on the image data contained in the character region if the first information is attached. | 07-15-2010 |
20100215272 | AUTOMATIC FILE NAME GENERATION IN OCR SYSTEMS - Methods and system for processing document images in OCR systems, particularly for selecting a proper file name for a recognized document. The method comprises generating at least one document type hypothesis for the document; verifying each document type hypothesis; selecting a best document type hypothesis and saving the document with a proper name based on the best type hypothesis and unique features. The method further includes determining a logical structure of a document and selecting a best document model hypothesis that has the best degree of correspondence with the selected best block hypotheses for the document. On the basis of the best document model hypothesis the text document reflecting the logical structure of the source document in extended computer-editable format is formed and saved with a proper file name. | 08-26-2010 |
20100254608 | METHOD AND SYSTEM FOR AIDED INPUT ESPECIALLY FOR COMPUTER MANAGEMENT TOOLS - A method of aided input especially for a computer management tool, the management tool being executed in a computer system possessing an operating system furnished with instrumentation services, characterized in that it comprises the following steps: (a) entering raw data from an exterior source, (b) extracting relevant data from said raw data, (c) using said instrumentation services to transcribe said extracted data to corresponding fields of a preexisting input interface belonging to the management tool, with a view to allowing further inputs and overall validation. Application in particular to the semi-automated input of accounting items such as supplier invoices and the like. | 10-07-2010 |
20100266205 | Device and Method to Assist User in Conducting A Transaction With A Machine - A device for assisting a user to perform a transaction on a machine is described. The device receives data that specifies a transaction mode to use for processing an image and accesses a knowledge base to provide data to configure the device for the transaction mode, the data including data specific to the transaction mode. The device receives an image or images of a portion of a machine that the user will use to perform the transaction and processes the image or images to identify a pattern of controls on the machine and to detect the presence of a user-controlled pointing item over controls on the machine. The device announces to the user the name or function of the control closest to an end of the user-controlled pointing item. | 10-21-2010 |
20100272359 | METHOD FOR RESOLVING CONTRADICTING OUTPUT DATA FROM AN OPTICAL CHARACTER RECOGNITION (OCR) SYSTEM, WHEREIN THE OUTPUT DATA COMPRISES MORE THAN ONE RECOGNITION ALTERNATIVE FOR AN IMAGE OF A CHARACTER - The present invention is related to a method for resolving contradicting output data from an Optical Character Recognition (OCR) system providing a conversion of pixelized documents into computer coded text as the output data, wherein the OCR output data comprises at least a first and second character listed as being likely candidates for an exemplar of a same sampled character instance from the pixelized document, by providing steps that identify locations of differences in graphical appearance between the candidate characters, and then using the location information to identify corresponding locations in the sampled character instance. Based on a correlation technique, this location information is used to select the correct candidate character as the identification of the sampled character instance. | 10-28-2010 |
20100272360 | METHOD FOR OUTPUTTING CONSECUTIVE CHARACTERS IN VIDEO-RECORDING MODE - The invention discloses a method for outputting consecutive characters in a video-recording mode. The method includes obtaining a first image and a second image from an object, comparing the first image and the second image to obtain a third image which is the overlapping part of the first image and the second image, removing the third image from the second image to generate a fourth image, integrating the fourth image with the first image to obtain a fifth image, recognizing characters on the fifth image by OCR software, and outputting the characters of the fifth image. | 10-28-2010 |
20100303356 | METHOD FOR PROCESSING OPTICAL CHARACTER RECOGNITION (OCR) DATA, WHEREIN THE OUTPUT COMPRISES VISUALLY IMPAIRED CHARACTER IMAGES - The present invention provides a method for an Optical Character Recognition (OCR) system providing recognition of characters that are partly hidden by crossing outs due to for example an imprint of a stamp, handwritten signatures, etc. The method establishes a set of template images of certainly recognized characters from the image of the text being processed by the OCR system, wherein the effect of the crossed out section is modelled into the template images before comparing these images with the image of a visually impaired crossed out character. The modelled template image having the highest similarity with the visually impaired crossed out character is the correct identification for the visually impaired character instance. | 12-02-2010 |
20100310171 | METHOD AND APPARATUS FOR ANALYSIS OF A DATABASE - A method for analyzing at least one database which contains a multiplicity of reference data items, in particular for determining the quality of the database in which, in the case of a data field which has a multiplicity of objects each having one information item, data elements are determined from the data field and these are checked and confirmed by comparison with the reference data items and comparison results resulting from this are recorded. It is proposed that a legibility degree is determined for at least some of the data elements, and a state of the database is determined automatically on the basis of the legibility degree and the comparison results. | 12-09-2010 |
20100316295 | IMAGE PROCESSING METHOD, IMAGE PROCESSING APPARATUS, IMAGE FORMING APPARATUS, AND STORAGE MEDIUM - An image processing apparatus includes: a division section for dividing input image data into portions; an orientation determining section for calculating reliabilities of directions of image data of each portion when the directions are regarded as orientations, and setting an orientation with the highest reliability as an orientation of each portion; a display control section for generating display image data including an image of a target portion whose reliability of an orientation is less than a predetermined value and images of designation regions from which a user's input to designate the orientation of the target portion is entered; and a character recognition section for recognizing characters of each portion in such a manner that the orientation is designated from the designation regions or set by the orientation determining section. This allows prompt recognition of characters of a portion whose reliability of orientation is low, in accordance with a right orientation. | 12-16-2010 |
20110019915 | Methods and data structures for multiple combined improved searchable formatted documents including citation and corpus generation - Searchable annotated formatted documents are produced by correlating documents stored as photographic or scanned graphic representations of an actual document (evidence, report, court order, etc.) with textual version of the same documents. A produced document will provide additional details in a data structure that supports citation annotation as well as other types of analysis of a document. The data structure also supports generation of citation reports and corpus reports. Methods of creating searchable annotated formatted documents including citation and corpus reports by correlating and correcting text files with photographic or scanned graphic of the original documents. Data structures for correlating and correcting text files with graphic images. Generation of citation reports, concordance reports, and corpus reports. Data structures for citation reports, concordance reports, and corpus reports generation. Multiple document data structures are used to create multiple citation documents and reports. Embodiments of citation reports and corpus reports contain correlated, comprehensive multiple citations. | 01-27-2011 |
20110033111 | Processing Method And Apparatus For Recording Media Having Printed Magnetic Ink Characters - In a method of processing recording media on which magnetic ink characters are printed, the media is transported at a first speed in an upright position along a transportation path from a supply unit to a discharge unit. The magnetic characters are read and output signals representative of the reading generated. The output signals are analyzed, including comparing the output signals with previously stored signal patterns of magnetic ink characters to determine if the magnetic characters can be recognized or not. The transporting of the recording media is paused, or slowed to a second speed substantially lower than the first speed, for a period of time during the analyzing of the output signals. A processing apparatus includes components for carrying out the operations of such method. | 02-10-2011 |
20110038542 | COMPUTER APPLICATION ANALYSIS - A method, system, and computer program product for computer application analysis are provided. The method for computer application analysis includes monitoring a computer system on which an application to be analyzed is executed and interacted with by a user of the computer system. The monitoring includes: capturing screen data of the application as displayed on a display screen of the computer system including interpreting the screen data using optical character recognition (OCR); and capturing user inputs to the application to input devices of the computer system. The method further includes analyzing the captured screen data and user inputs to generate a summary of the usage of the application. | 02-17-2011 |
20110052064 | METHOD FOR PROCESSING OPTICAL CHARACTER RECOGNITION (OCR) OUTPUT DATA, WHEREIN THE OUTPUT DATA COMPRISES DOUBLE PRINTED CHARACTER IMAGES - The present invention is related to a method of processing of output data from an Optical Character Recognition (OCR) system, wherein the output data comprises images of double printed characters. The method identifies the respective members of a suspected double printed character image by first providing a set of single character template images from images of characters identified in the text being processed by the OCR system, then combining the single character templates providing candidate models for the suspected double printed character image. Correlation between each respective candidate model and the suspected double printed character image provides an indication of which pair of modelled single character template images is most probably the correct identification of the respective character images in the double printed character image. | 03-03-2011 |
20110058742 | SYSTEM AND METHOD FOR DETERMINING AUTHORSHIP OF A DOCUMENT - Systems, methods, and computer-readable mediums for determining authorship of a handwritten document for which the authorship is not known. A method includes scanning a document to produce a high-quality scanned image of the document, and identifying stylus information corresponding to the document. The method includes identifying authorship information corresponding to the document, and determining an authorship of the document based on the stylus information and the authorship information. In some cases, content analysis of the document is also performed and used to determine authorship. | 03-10-2011 |
20110081083 | GESTURE-BASED SELECTIVE TEXT RECOGNITION - An image is displayed on a touch screen. A user's underline gesture on the displayed image is detected. The area of the image touched by the underline gesture and a surrounding region approximate to the touched area are identified. Skew for text in the surrounding region is determined and compensated. A text region including the text is identified in the surrounding region and cropped from the image. The cropped image is transmitted to an optical character recognition (OCR) engine, which processes the cropped image and returns OCR'ed text. The OCR'ed text is outputted. | 04-07-2011 |
20110081084 | CHARACTER RECOGNITION DEVICE, MOBILE COMMUNICATION SYSTEM, MOBILE TERMINAL DEVICE, FIXED STATION DEVICE, CHARACTER RECOGNITION METHOD AND CHARACTER RECOGNITION PROGRAM - Words possibly included in a scene image shot by a mobile camera can be efficiently extracted using a word dictionary or a map database. Positional information acquiring means | 04-07-2011 |
20110085732 | QR CODE PROCESSING METHOD AND APPARATUS THEREOF - A QR code processing method includes an edge processing process, a QR code positioning process and a projection modification process. The edge processing process converts an original image into a binarized input image. The QR code positioning process includes a group search process and a tag search process. The group search process includes: deriving a plurality of luminance groups according to luminance values of pixels within an input image; identifying a plurality of finder pattern groups complying with QR code finder pattern among the plurality of luminance groups according to a central point of each luminance group; and deriving position information of each finder pattern group. The tag search process derives position information of the QR code according to the position information of the finder pattern groups. The projection modification process converts the input image into a modified image according to the position information of the QR code. | 04-14-2011 |
20110103688 | System and method for increasing the accuracy of optical character recognition (OCR) - A system and/or method for increasing the accuracy of optical character recognition (OCR) for at least one item, comprising: obtaining OCR results of OCR scanning from at least one OCR module; creating at least one OCR seed using at least a portion of the OCR results; creating at least one OCR learn set using at least a portion of the OCR seed; and applying the OCR learn set to the at least one item to obtain additional optical character recognition (OCR) results. | 05-05-2011 |
20110103689 | SYSTEM AND METHOD FOR OBTAINING DOCUMENT INFORMATION - A method and system for determining at least one target value of at least one target in at least one document, comprising: determining, utilizing at least one scoring application, at least one possible target value, wherein the at least one scoring application utilizes information from at least one training document; and applying the information, utilizing the at least one scoring application, on the at least one new document to determine at least one value of the at least one target on the at least one new document. | 05-05-2011 |
20110110592 | ELECTRONIC APPARATUS AND IMAGE DISPLAY METHOD - According to one embodiment, an electronic apparatus includes a text recognition module, a group creation module, a group extraction module, an arrangement module, and a movie generator. The text recognition module recognizes a character string in a plurality of still images. The group creation module creates a plurality of groups by classifying the plurality of still images. The group extraction module extracts, from the plurality of groups, groups including a still image which meets a predetermined condition. The arrangement module arranges still images included in the extracted groups in a predetermined order, and inserts a still image included in the extracted groups and including the character string at a predetermined position of the still images which are arranged. The movie generator generates movie data for successively displaying the arranged still images in the extracted groups. | 05-12-2011 |
20110129153 | Identifying Matching Canonical Documents in Response to a Visual Query - A server system receives a visual query from a client system. The visual query is an image containing text such as a picture of a document. At the receiving server or another server, optical character recognition (OCR) is performed on the visual query to produce text recognition data representing textual characters. Each character in a contiguous region of the visual query is individually scored according to its quality. The quality score of a respective character is influenced by the quality scores of neighboring or nearby characters. Using the scores, one or more high quality strings of characters are identified. Each high quality string has a plurality of high quality characters. A canonical document containing the one or more high quality textual strings is retrieved. At least a portion of the canonical document is sent to the client system. | 06-02-2011 |
20110142344 | BROWSING SYSTEM, SERVER, AND TEXT EXTRACTING METHOD - In order to precisely extract a character in an image displayed at a terminal device in the case that an imaged web page is sent to the terminal device and the web page is browsed at the terminal device, a server acquires the web page from the Internet, generates the image from the acquired web page, and sends the image to a client terminal, the client terminal receives the image, displays the image on a display part, specifies a rectangular area, and sends information regarding the specified rectangular area to the server, and the server extracts the image in the rectangular area from the image of the web page, recognizes a text by an OCR process, extracts a text from a source of an HTML file which matches the recognized text most closely, and sends the extracted text to the client terminal. | 06-16-2011 |
20110150336 | Hardware Management Based on Image Recognition - Embodiments of the disclosed technology allow for the control, monitoring, and/or configuration of specialized hardware devices with proprietary interfaces from a central interface capable of interacting with one or a plurality of specialized hardware devices via respective proprietary interfaces. Such embodiments are especially useful in controlling medical equipment, such as radiology equipment at a central and/or remote location, where otherwise, only a proprietary interface at a proximate location could be used to do same. | 06-23-2011 |
20110211759 | CHARACTER RECOGNITION APPARATUS AND METHOD BASED ON CHARACTER ORIENTATION - A character recognition apparatus and method based on a character orientation are provided, in which an input image is binarized, at least one character area is extracted from the binarized image, a slope value of the extracted at least one character area is calculated, the calculated slope value is set as a character feature value, and a character is recognized by using a neural network for recognizing a plurality of characters by receiving the set character feature value. Accordingly, the probability of wrongly recognizing a similar character decreases, and a recognition ratio of each character increases. | 09-01-2011 |
20110222772 | RESOLUTION ADJUSTMENT OF AN IMAGE THAT INCLUDES TEXT UNDERGOING AN OCR PROCESS - An optical character recognition process characterizes text lines in a textual image by their base-line, mean-line and x-height. The base-line for at least one text line in the image is determined by finding a parametric curve that maximizes a first fitness function that depends on the values of pixels through which the parametric curve passes and pixels below the parametric curve. The base-line corresponds to the parametric curve for which the first fitness function is maximized. The first fitness function is designed so that it increases with increasing lightness or brightness of pixels immediately below the parametric curve while also increasing with decreasing lightness of pixels through which the parametric curve passes. The mean-line is determined by incrementally shifting the base-line upward by predetermined amounts (e.g., a single pixel) until a second fitness function for the shifted base-line is maximized. The second fitness function is essentially the inverse of the first fitness function. Specifically, the second fitness function increases with increasing lightness of pixels immediately above the shifted base-line while also increasing with decreasing lightness of pixels through which the shifted base-line passes. The x-height is equal to the sum of the predetermined amounts by which the base-line is shifted upward in order to maximize the second fitness function. In some cases different groups of text-lines in the textual image may be characterized differently from one another. For example, each group may be characterized by a most probable x-height for that group. | 09-15-2011 |
20110222773 | PARAGRAPH RECOGNITION IN AN OPTICAL CHARACTER RECOGNITION (OCR) PROCESS - An image processing apparatus for detecting paragraphs in a textual image includes an input component for receiving an input image in which textual lines and words have been identified and a page classification component for classifying the input image as a first or second page type. The apparatus also includes a paragraph detection component for classifying all textual lines on the input image as a beginning paragraph line or a continuation paragraph line. The apparatus is also provided with a paragraph creation component for creating paragraphs that include textual lines between two successive beginning paragraph lines, including a first of the two successive beginning paragraph lines. The paragraphs that have been identified may be classified by the type of alignment they exhibit. For instance, paragraphs may be classified according to whether they are left aligned, right aligned, center aligned or justified. | 09-15-2011 |
20110229036 | METHOD AND APPARATUS FOR TEXT AND ERROR PROFILING OF HISTORICAL DOCUMENTS - The present invention enables the computation of various types of information for a particular scanned and OCR recognised or retyped historical input document. It provides a global view of the “patterns” for historical language variation (text profiling) and the OCR errors most frequently found in the text (error profiling). For each individual token of the OCR output, an interpretation is given which, based on the document-specific information, attempts to describe both the underlying correct word of the text and the corresponding modern spelling of the word. This not only provides input for optimised OCR recognition of historical documents, but also for quality assurance and improved information retrieval. | 09-22-2011 |
20110229037 | CHARACTER RECOGNITION APPARATUS AND CHARACTER RECOGNITION METHOD - An objective is to eliminate dotted lines in a character box in image data to increase the character recognition rate. There are some cases in which a dotted line candidate cannot be extracted due to many overlapping parts of dotted lines and characters or due to a blurry part in a dotted line. In such cases, the position of a dotted line candidate is estimated referring to features such as the interval, length, width, etc. of a dotted line candidate in the same character box (or in a character box for another relevant item), and image data of the estimated position and image data of a previously extracted dotted line (or a reference dotted line) are compared to determine whether or not they are an identical dotted line. | 09-22-2011 |
20110243446 | CODE READING APPARATUS, SALES REGISTERING APPARATUS, AND SALES REGISTERING METHOD - According to one embodiment, a code reading apparatus includes a commodity-information reading unit, a commodity-information output unit, a benefit-information reading unit, and a benefit-information output unit. The commodity-information reading unit reads commodity information from a code symbol attached to a commodity. The commodity-information output unit outputs the commodity information read by the commodity-information reading unit. The benefit-information reading unit detects an image of benefit indication from an image imaged by an imaging unit and reads benefit information corresponding to the benefit indication from the detected image. The benefit-information output unit outputs the benefit information read by the benefit-information reading unit. | 10-06-2011 |
20110243447 | METHOD AND APPARATUS FOR SYNTHESIZING SPEECH - Method and apparatus of synthesizing speech from a plurality of portions of text data, each portion having at least one associated attribute. The invention is achieved by determining ( | 10-06-2011 |
20110255784 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS USING MULTIPLE CHARACTER RECOGNITION ENGINES - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically extracting data from each received electronic document using a plurality of character recognition engines is provided. The method includes: automatically processing each received electronic document page using each of a plurality of recognition engines to extract data; comparing quality of data extracted from each of the recognition engines to assign a confidence score to the extracted data; and selecting the extracted data having the highest confidence score as the correct extracted data. | 10-20-2011 |
20110255785 | Character area extracting device, imaging device having character area extracting function, recording medium saving character area extracting programs, and character area extracting method - A character area extracting device includes a reflective and non-reflective area separation unit separating image data into reflective and non-reflective areas, and binarizing the image data by changing a first threshold value when it is inappropriate; a reflective area binarizing unit separating the reflective area into character and background areas, and binarizing it by changing a second threshold value when it is inappropriate; a non-reflective area binarizing unit separating the non-reflective area into the character and background areas, and binarizing it by changing a third threshold value when it is inappropriate; a reflective and non-reflective area separation evaluation unit; and a line extracting unit connecting the character areas of the reflective and non-reflective areas and extracting positional information of the connected character areas in the image data. | 10-20-2011 |
20110268361 | Method for Locating and Decoding Distorted Two-Dimensional Matrix Symbols - A method is presented for processing an image of a two-dimensional (2D) matrix symbol having a plurality of data modules and a discontinuous finder pattern, each distorted by “donut effects”. A resulting processed image contains an image of the 2D matrix symbol having a continuous finder pattern suitable for conventional 2D matrix symbol locating techniques, and having a plurality of data modules, each data module having a center more truly representative of intended data, and suitable for conventional 2D matrix symbol sampling and decoding. The method includes sharpening the distorted image of the 2D matrix symbol to increase a difference between low frequency and high frequency image feature magnitudes, thereby providing a sharpened image, and smoothing the sharpened image using a moving window over the sharpened image so as to provide a smoothed image, the moving window and a module of the 2D matrix code being of substantially similar size. | 11-03-2011 |
20110280483 | Shape Clustering in Post Optical Character Recognition Processing - Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. | 11-17-2011 |
20110286668 | IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND COMPUTER READABLE MEDIUM - An image processing device includes a storage module, a character recognition module, a circumscribed rectangle extraction module, a ratio extraction module, and a character size calculation module. The storage module stores a reference ratio between a reference size of a reference circumscribed rectangle and a reference character size in a reference character image representing a reference character in association with a reference character identification code which uniquely identifies the reference character. The character recognition module recognizes a character image in an image to get a character identification code from the recognized character image. The circumscribed rectangle extraction module extracts a circumscribed rectangle of the character image. The ratio extraction module extracts the reference ratio corresponding to the reference character identification code stored in the storage module based on the character identification code. The character size calculation module calculates a character size of the character image. | 11-24-2011 |
20110286669 | FORM PROCESSING SYSTEM, OCR DEVICE, FORM CREATION DEVICE, AND COMPUTER READABLE MEDIUM - There is provided a form processing system including a form creation device and an OCR device, wherein the form creation device includes a layout generation unit that generates layout information denoting a layout of a form and a layout transmission unit that transmits the layout information generated to the OCR device, and the OCR device includes a layout acquisition unit that acquires the layout information transmitted from the form creation device and an OCR processing unit that performs OCR processing on image data of the form read by a scanner, based on the layout information acquired. | 11-24-2011 |
20110293183 | SCANNING SYSTEM WITH OPTICAL CHARACTER RECOGNITION - A system includes an imaging device and an acquisition layer. The imaging device acquires an image. The acquisition layer is logically located between a source manager and the imaging device, the source manager being called by an application when a user of the system requests to acquire the image. The acquisition layer includes imaging acquisition logic that receives the image from the imaging device and performs optical character recognition (OCR) that extracts machine editable text from the image. The acquisition layer forwards the image to the application and makes the machine editable text available to the user. | 12-01-2011 |
20110293184 | METHOD OF IDENTIFYING PAGE FROM PLURALITY OF PAGE FRAGMENT IMAGES - A method of identifying a physical page containing printed text from a plurality of page fragment images captured by a camera. The method includes the steps of: placing a handheld electronic device in contact with a surface of the physical page; moving the device across the physical page and capturing the plurality of page fragment images at a plurality of different capture points; measuring a displacement or direction of movement; performing OCR on each captured page fragment image; creating a glyph group key for each page fragment image; looking up each created glyph group key in an inverted index of glyph group keys; comparing a displacement or direction between glyph group keys in the inverted index with a measured displacement or direction between the capture points for corresponding glyph group keys created using OCR; and identifying a page identity corresponding to the physical page using the comparison. | 12-01-2011 |
20110293185 | HYBRID SYSTEM FOR IDENTIFYING PRINTED PAGE - A hybrid system for identifying a printed page. The system includes: (i) the printed page having human-readable content and a coding pattern printed in every interstitial space between portions of human-readable content, the coding pattern being either absent from the human-readable content or unreadable when superimposed with the human-readable content; and (ii) a handheld device for overlaying and contacting the printed page. The handheld device includes: a camera for capturing page fragment images; and a processor configured for: decoding the coding pattern and determining the page identity in the event that the coding pattern is visible in and decodable from the captured page fragment image; and otherwise initiating OCR or SIFT techniques to identify the page. | 12-01-2011 |
20110299779 | Methods and Systems for Detecting Numerals in a Digital Image - Aspects of the present invention are related to systems and methods for determining the location of numerals in an electronic document image. | 12-08-2011 |
20110305393 | TECHNIQUES IN OPTICAL CHARACTER RECOGNITION - An image deskew system and techniques are used in the context of optical character recognition. An image is obtained of an original set of characters in an original linear (horizontal) orientation. An acquired set of characters, which is skewed relative to the original linear orientation by a rotation angle, is represented by pixels of the image. The rotation angle is estimated, and a confidence value may be associated with the estimation, to determine whether to deskew the image. In connection with rotation angle estimation, an edge detection filter is applied to the acquired set of characters to produce an edge map, which is input to a linear Hough transform filter to produce a set of output lines in parametric form. The output lines are assigned scores, and based on the scores, at least one output line is determined to be a dominant line with a slope approximating the rotation angle. | 12-15-2011 |
20110311140 | Selecting Representative Images for Establishments - Establishments are identified in geo-tagged images. According to one aspect, text regions are located in a geo-tagged image and text strings in the text regions are recognized using Optical Character Recognition (OCR) techniques. Text phrases are extracted from information associated with establishments known to be near the geographic location specified in the geo-tag of the image. The text strings recognized in the image are compared with the phrases for the establishments for approximate matches, and an establishment is selected as the establishment in the image based on the approximate matches. According to another aspect, text strings recognized in a collection of geo-tagged images are compared with phrases for establishments in the geographic area identified by the geo-tags to generate scores for image-establishment pairs. Establishments in each of the large collection of images as well as representative images showing each establishment are identified using the scores. | 12-22-2011 |
20120008865 | SYSTEM AND METHOD OF DETERMINING BUILDING NUMBERS - A system and method are provided for automatically recognizing building numbers in street level images. In one aspect, a processor selects a street level image that is likely to be near an address of interest. The processor identifies those portions of the image that are visually similar to street numbers, and then extracts the numeric values of the characters displayed in such portions. If an extracted value corresponds with the building number of the address of interest, such as by being substantially equal to it, the extracted value and the image portion are displayed to a human operator. The human operator confirms, by looking at the image portion, whether the image portion appears to be a building number that matches the extracted value. If so, the processor stores a value that associates that building number with the street level image. | 01-12-2012 |
20120020562 | CAMERA-VISION SYSTEMS, USED IN COLLABORATION WHITEBOARDS, FOR PRE-FORMATTED, REUSABLE, ANNOTATABLE, MOVABLE MENUS AND FORMS. - Systems and devices for, and methods of, image-based processing where a device embodiment comprises: (a) a processor; (b) an addressable memory, the memory comprising a set of one or more image references, and where the set of image references comprises a rule of interpretation and a rule of execution; and the processor is configured to: (1) compare captured surface indicia of a sheet with the set of at least one image reference; (2) determine the image reference associated with the surface indicia based on the comparison of the surface indicia and the set of at least one image reference; (3) extract a marking by differencing the surface indicia and the image reference; (4) interpret the extracted marking based on the rule of interpretation associated with the image reference; and (5) invoke the rule of execution based on the rule of interpretation. | 01-26-2012 |
20120020563 | Systems and Methods for Automated Extraction of Measurement Information in Medical Videos - Systems and methods providing automated extraction of information contained in video data and uses thereof are described. In particular, systems and associated methods are described that provide techniques for extracting data embedded in video, for example measurement-value pairs of medical videos, for use in a variety of applications, for example video indexing, searching and decision support applications. | 01-26-2012 |
20120020564 | SHAPE CLUSTERING IN POST OPTICAL CHARACTER RECOGNITION PROCESSING - Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. The output of an OCR process is classified into a plurality of clusters of clip images and a representative image for each cluster is generated to identify clusters whose clip images were incorrectly assigned character codes by the OCR process. | 01-26-2012 |
20120020565 | Selecting Representative Images for Establishments - Establishments are identified in geo-tagged images. According to one aspect, text regions are located in a geo-tagged image and text strings in the text regions are recognized using Optical Character Recognition (OCR) techniques. Text phrases are extracted from information associated with establishments known to be near the geographic location specified in the geo-tag of the image. The text strings recognized in the image are compared with the phrases for the establishments for approximate matches, and an establishment is selected as the establishment in the image based on the approximate matches. According to another aspect, text strings recognized in a collection of geo-tagged images are compared with phrases for establishments in the geographic area identified by the geo-tags to generate scores for image-establishment pairs. Establishments in each of the large collection of images as well as representative images showing each establishment are identified using the scores. | 01-26-2012 |
20120039537 | METHOD, APPARATUS, AND SYSTEM FOR WORKFLOW PARTICIPATION OF AN IMAGING DEVICE - A method, apparatus, and system for communicating between an apparatus hosting a workflow application and an imaging device, the system including a state engine configured to read and extract data from a first message received from the imaging device, to communicate with an application component, and to advance to a workflow state, a state translator configured to receive the workflow state from the state engine, to convert the workflow state into an imaging device instruction, and to send the imaging device instruction to the imaging device, a state instantiater configured to change a state of a component of the imaging device in accordance with the imaging device instruction, an event responder configured to assemble data in a second message based on the changed state of the component of the imaging device, and an interface configured to send the second message to the apparatus. | 02-16-2012 |
20120076415 | COMPUTER AIDED VALIDATION OF PATENT DISCLOSURES - A method and system for analyzing a patent disclosure is disclosed. The method and system comprise a computerized cross-check of reference labels within drawings of a disclosure to reference labels found within the text of the disclosure, and generating warnings for reference labels that are missing from either the drawings or the text. | 03-29-2012 |
20120082382 | DISTRIBUTED DOCUMENT PROCESSING - A system for document processing including decomposing an image of a document into at least one data entry region sub-image, providing the data entry region sub-image to a data entry clerk available for processing the data entry region sub-image, receiving from the data entry clerk a data entry value associated with the data entry region sub-image, and validating the data entry value. | 04-05-2012 |
20120087587 | Binarizing an Image - The invention provides various methods and techniques for binarizing an image, generally in advance of further processing such as optical character recognition (OCR). One step includes establishing boundaries of image objects of an image and classifying each image object as either suspect or non-suspect. Another step includes creating a local binarization threshold map that may include or store threshold binarization values associated with image objects classified as non-suspect. Yet another step includes expanding the local binarization threshold map to cover the entire image thereby creating a global binarization threshold map for the entire image. The methods and techniques are capable of identifying and working with separation objects and incuts in images. | 04-12-2012 |
20120093415 | Dynamic Recognition of Web Addresses in Video - One embodiment described herein may take the form of a system or method for dynamically recognizing an Internet address within a video or audio component of a multimedia presentation on a distribution system or network such as, but not limited to, a satellite, cable or Internet network. In general, the embodiment may analyze the audio portion of the presentation or one or more frames of a video component to detect the presence of a web address within the one or more frames. In the embodiment where the audio portion is analyzed, the system may perform a voice recognition or a similar analysis on the audio portion to detect the utterance of a web address. Similarly, one embodiment analyzing the one or more frames of the video component may comprise performing an optical character recognition (OCR) of the frame. | 04-19-2012 |
20120099792 | ADAPTIVE OPTICAL CHARACTER RECOGNITION ON A DOCUMENT WITH DISTORTED CHARACTERS - A computer implemented method for adaptive optical character recognition on a document with distorted characters includes performing a distortion-correction transformation on a segmented character of the document assuming the segmented character to be a candidate character. The method further includes comparing the transformed segmented character to the candidate character by calculating a comparison score. If the calculated score is within a predetermined range, the segmented character is identified with the candidate character. The method may be implemented either in computer hardware configured to perform the method, or in computer software embodied in a non-transitory, tangible, computer-readable storage medium. Also disclosed are a corresponding computer program product and data processing system. | 04-26-2012 |
20120106845 | Replacing word with image of word - First data represents an image of text including words. Second data represents the text in a non-image form. A particular word within the second data is replaced with a corresponding part of the first data representing the image of the particular word. | 05-03-2012 |
20120114243 | Shape Clustering in Post Optical Character Recognition Processing - Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. | 05-10-2012 |
20120128250 | Generating a Combination of a Visual Query and Matching Canonical Document - A server system receives a visual query from a client system distinct from the server system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query, and scores each textual character in the plurality of textual characters. The server system identifies, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query; retrieves a canonical document having the one or more high quality textual strings; generates a combination of the visual query and at least a portion of the canonical document; and sends the combination to the client system. | 05-24-2012 |
20120128251 | Identifying Matching Canonical Documents Consistent with Visual Query Structural Information - A server system receives a visual query from a client system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The method further includes identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system. | 05-24-2012 |
20120134589 | Optical character recognition (OCR) engines having confidence values for text types - An image of a known text sample having a text type is generated. The image of the known text sample is input into each OCR engine of a number of OCR engines. Output text corresponding to the image of the known text sample is received from each OCR engine. For each OCR engine, the output text received from the OCR engine is compared with the known text sample, to determine a confidence value of the OCR engine for the text type of the known text sample. | 05-31-2012 |
20120134590 | Identifying Matching Canonical Documents in Response to a Visual Query and in Accordance with Geographic Information - A server system receives a visual query from a client system distinct from the server system. The server system performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system scores each textual character in the plurality of textual characters in accordance with the geographic location of the client system. The server system identifies, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. Then the server system retrieves a canonical document having the one or more high quality textual strings and sends at least a portion of the canonical document to the client system. | 05-31-2012 |
20120141030 | Code Recognition Method, Device and Computer Readable Storage Medium for Storing Code Recognition Method - A code recognition method includes the following steps. A first code-image block, on which several first codes are displayed, is received. The first code-image block is partitioned into several second code-image blocks, each of which displays a second code, and each of the second codes is one of the first codes. Each of the second code-image blocks is recognized as several third codes corresponding to each of the second codes respectively. Some of the neighboring second code-image blocks are combined to form several third code-image blocks, each of which displays a first code set comprising some of the second codes. Each of the third code-image blocks is recognized as a second code set corresponding to each of the first code sets respectively, wherein each of the second code sets includes codes selected from the third codes. | 06-07-2012 |
20120141031 | ANALYSING CHARACTER STRINGS - A method for analyzing a character string, the method including: analyzing a character string to determine one or more characters of the character string; determining from a dictionary source, an alternative character string to the analyzed character string; comparing the analyzed character string with the alternative character string to determine a weighting factor for each of the characters of the analyzed character string relative to the positional arrangement of the characters in the alternative character string; and for each determined weighting factor, generating for each of the characters in the analyzed character string a corresponding character of a particular size as determined by the weighting factor. | 06-07-2012 |
20120183222 | COMPUTING DEVICE AND METHOD FOR AUTOMATICALLY TYPESETTING PATENT IMAGES - A method for automatically typesetting patent images extracts a brief introduction of each patent image from a description part of a patent document, and records a keyword of the brief introduction. The method distinguishes an image label of each patent image from an image part of the patent document. The method rotates the patent image by ninety degrees clockwise when the image label of the patent image does not contain the keyword, and outputs the rotated image onto a display device. | 07-19-2012 |
20120201461 | CHARACTER DETECTION APPARATUS, CHARACTER DETECTION METHOD, AND COMPUTER-READABLE STORAGE MEDIUM - A character detection apparatus is provided that detects, from an image including a first image representing a character and a second image representing a translucent object, the character. The character detection apparatus includes a calculating portion that, for each of blocks obtained by dividing an overlapping region in which the first image is overlapped by the second image, calculates a frequency of appearance of pixels for each of gradations of a property, and a detection portion that detects the character from the overlapping region based on the frequency for each of the gradations. | 08-09-2012 |
20120207392 | OPTICAL IMAGING AND ANALYSIS OF A GRAPHIC SYMBOL - Method, computer program product, and apparatus are provided for identifying a graphic symbol within an image obtained by optical scanning. An image intensity is measured for each of a plurality of columns of the image, wherein each column has a length that extends across the graphic symbol in a first direction, and wherein the plurality of columns collectively extend across the graphic symbol in a second direction. The graphic symbol is then identified by matching a profile of the image intensity to a predetermined image intensity profile associated with a given graphic symbol. Optionally, the image is a digital image and the image intensity for each column is the sum of the image intensity for each pixel in that individual column. An image intensity differential between adjacent columns may be calculated for matching with a predetermined differential profile or comparison with an electronic profile generated by a magnetic scan. | 08-16-2012 |
20120213442 | CHARACTER RECOGNITION APPARATUS, CHARACTER RECOGNITION METHOD, AND COMPUTER READABLE MEDIUM STORING PROGRAM - A character recognition apparatus includes an acquisition unit, a specification unit, a movement unit, and a recognition unit. The acquisition unit acquires data representing a character string. The specification unit specifies an element of a compound character satisfying a predetermined condition for determining the compound character from the character string. The movement unit moves the element of the compound character close to an adjacent character. The recognition unit recognizes a changed character string in which the movement unit has moved the element of the compound character, based on a shape of characters and relevance between adjacent characters. | 08-23-2012 |
20120230587 | SYSTEMS AND METHODS FOR TESTING CONTENT OF MOBILE COMMUNICATION DEVICES - The embodiments described herein relate to systems and methods for testing user content of given mobile communication devices. According to one aspect, there is provided a method for testing user content of given mobile communication devices that includes the steps of providing at least one model image associated with at least one graphical user interface (“GUI”) screen of a model mobile communication device corresponding to the given mobile communication device, obtaining at least one test image associated with at least one GUI screen of the given mobile communication device, comparing the test image with the model image, and determining whether the user content of the given mobile communication device is different from the desired content of the model mobile communication device. | 09-13-2012 |
20120251004 | IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD - An image processing apparatus supports image processing in multiple languages via a user interface, a determining unit, a setting unit, and a character recognizing unit. The user interface sets an instruction from a user for various functions performed by the image processing apparatus. The user interface displays characters in a language. The determining unit automatically determines the language currently used for the characters displayed in the user interface of the various functions. The setting unit sets, in response to the determining unit automatically determining the language currently used for the characters displayed in the user interface, the determined language as a scanned document language for use in recognizing characters in a scanned document which is obtained by scanning a paper document. The character recognizing unit utilizes the scanned document language set by the setting unit to recognize characters in the scanned document and create text data. | 10-04-2012 |
20120257832 | INFORMATION PROCESSING APPARATUS AND METHOD, PROGRAM, AND IMAGING APPARATUS - An information processing apparatus includes: a character recognition processing portion which performs a character recognition processing with respect to a character string region in an image; a character string information extraction portion which extracts character string information being information related to a character string from the character string in which a character is recognized by the character recognition processing portion; a display character string generation portion which generates a display character string of a character font corresponding to the character string information which is extracted by the character string information extraction portion; and a display control portion which performs control so as to display the display character string in the vicinity of the character string region in the image. | 10-11-2012 |
20120263380 | DATA PROCESSING APPARATUS, METHOD FOR CONTROLLING DATA PROCESSING APPARATUS, AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIUM - When a display language is different from an OCR language, which is used for document name OCR, the name of a document to be sent may not be correctly displayed on a screen. A data processing apparatus is provided that includes a document name setting unit configured to set a document name including a character string recognized on the basis of document data for the document data generated by a read unit, and a control unit configured to restrain the document name setting unit from setting the document name when a language specified by a character recognition language specifying unit is different from a language specified by a display language setting unit. | 10-18-2012 |
20120269438 | IMAGE PROCESSING APPARATUS - The object of this invention is to provide an image processing apparatus in which, in processing of a document image read by a document reading device, an inclination of a character string in the document image which is recognized in character recognition is obtained more accurately. The image processing apparatus includes a similar character extraction portion which extracts and outputs a character group comprised of characters having a shape and a size that are the same as or similar to each other from among characters constituting a character string comprised of characters recognized in optical character recognition from a document image read by a document reading device; and an inclination calculation portion which calculates an inclination value of the character string based on position information of each character of the character group output from the similar character extraction portion. | 10-25-2012 |
20120281920 | PARALLEL TEST PAYLOAD - A parallel test payload includes a bit sequence configured to be segmented into a plurality of sub-sequences having variable bit length carriers. Respective carriers are represented uniformly in each one of the plurality of sub-sequences. | 11-08-2012 |
20120288202 | INTERIOR LOCATION IDENTIFICATION - A parse module calibrates an interior space by parsing objects and words out of an image of the scene and comparing each parsed object with a plurality of stored objects. The parse module further selects a parsed object that is differentiated from the stored objects as the first object and stores the first object with a location description. A search module can detect the same objects from the scene and use them to determine the location of the scene. | 11-15-2012 |
20120314954 | EMBEDDED FORM EXTRACTION DEFINITION TO ENABLE AUTOMATIC WORKFLOW CONFIGURATION - A system and methods are disclosed to automatically extract data from documents, such as scanned paper forms and/or digital forms that need to be pre-configured to understand a layout for the forms to be processed. The system extracts data from the form definition at a two dimensional barcode and dynamically configures a workflow with services for extracting desired user filled information from the data fields present on the form. Support for a re-flowable service is provided. | 12-13-2012 |
20120321191 | READING ORDER DETERMINATION APPARATUS, METHOD, AND PROGRAM FOR DETERMINING READING ORDER OF CHARACTERS - A method and apparatus for determining a reading order of characters. The method includes preparing a list of character information, which is character information extracted from image data by character recognition processing, and preparing a list of line information, which is made up of a line box surrounding a set of characters which are continuously aligned in the same direction in image data and an alignment direction of characters in the line box. In response to a request for adding character information to the list of character information, the method extracts a line box containing a character region of the character to be added, obtains all character information having the character region contained in the concerned line box from the list of character information, and rearranges it according to the position with respect to the alignment direction of characters corresponding to the line box to determine a new reading order of characters. | 12-20-2012 |
20130004077 | METHOD OF AND DEVICE FOR IDENTIFYING DIRECTION OF CHARACTERS IN IMAGE BLOCK - The present embodiments disclose a method of and device for identifying the direction of characters in an image block. The method includes: performing optical character recognition processing on the image block by assuming various directions as assumed character directions to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each assumed character directions; in sub image blocks in the assumed character directions with | 01-03-2013 |
20130004078 | DOCUMENT MANAGEMENT SYSTEM, EVALUATION DEVICE, DATA OUTPUT CONTROL DEVICE, DOCUMENT MANAGEMENT METHOD AND DOCUMENT MANAGEMENT PROGRAM - According to one embodiment, a document management system includes an information acquisition unit that acquires a management ID, acquires, using the management ID, document type information, and outputs the document type information. The document management system also includes a policy selection evaluation unit that acquires operation information, user information, and the document type information, selects policy information defining an operation extent of a user based on the document type information, and evaluates whether or not a user defined in the user information is authorized to perform an operation defined in the operation information in accordance with a definition of the selected policy information. | 01-03-2013 |
20130022270 | Optical Character Recognition of Text In An Image for Use By Software - A computer implemented method and apparatus that can OCR an image, or selected portions of an image, and then provide options to a user for use of the results of the OCR, including passing the results of the OCR to a software program so the software program can perform some action on the results of the OCR. | 01-24-2013 |
20130022271 | METHOD OF AND DEVICE FOR IDENTIFYING DIRECTION OF CHARACTERS IN IMAGE BLOCK - The embodiments disclose a method of and a device for identifying direction of characters in image block. The method includes: performing optical character recognition processing on the image block by assuming various directions as assumed character directions to obtain sub image blocks, recognized characters and correctness measures in each assumed direction; in sub image blocks in the assumed directions with a 180° mutual relation, searching for a minimum matching pair; when there is one sub image block in each assumed direction in a minimum matching pair and recognized characters belonging to the minimum matching pair are the same rotation invariant character or belong to the same rotation invariant character pair, adjusting their correctness measures to the same; calculating an accumulative correctness measure in each assumed direction based on the adjusted results; and identifying the direction of the characters in the image block according to the accumulative correctness measures. | 01-24-2013 |
20130022272 | METHOD OF AND DEVICE FOR IDENTIFYING DIRECTION OF CHARACTERS IN IMAGE BLOCK - The present embodiments disclose a method of and a device for identifying the direction of characters in an image block. The method includes: performing optical character recognition processing on the image block by assuming various directions as assumed character directions, respectively, to obtain sub image blocks, recognized characters corresponding to the sub image blocks and correctness measures thereof in each of the assumed character directions; determining a language group to which the characters in the image block belong; adjusting a correctness measure corresponding to a sub image block which corresponds to a recognized character not belonging to the determined language group in each of the assumed character directions; calculating an accumulative correctness measure in each of the assumed character directions based on the adjusted correctness measure; and identifying the direction of the characters in the image block according to the accumulative correctness measures. | 01-24-2013 |
20130034302 | CHARACTER RECOGNITION APPARATUS, CHARACTER RECOGNITION METHOD AND PROGRAM - The character recognition apparatus recognizes characters from a read document original to correct a character string as a character recognition result in a word unit with a space character as a separator. The character recognition apparatus includes a circumscribed rectangle formation portion which forms a circumscribed rectangle for each recognized alphabet character string, a fixed-pitch font determination portion which determines whether or not a font is a fixed-pitch font based on a distance between center lines in a width direction of adjacent circumscribed rectangles, a portion for determining an excess space character which determines, in the case of a fixed-pitch font, that a space character in the character string is an excess when its width is narrower than a predetermined width, and a portion for deleting the space character determined as an excess from the character string. | 02-07-2013 |
20130084009 | SYSTEMS, METHODS AND USER INTERFACES IN A PATENT MANAGEMENT SYSTEM - A system and method of providing data for a patent management system is proposed. The system presents one or more data fields of interest to a user and the method comprises downloading at least one patent document from an external patent database; applying optical character recognition to the downloaded document to provide a text-readable version of the at least one patent document; automatically applying electronic text analysis to the text-readable version to extract one or more data elements associated with a field of interest, and transmitting the data elements to a user. | 04-04-2013 |
20130084010 | IN-FIELD DEVICE FOR DE-CENTRALIZED WORKFLOW AUTOMATION - In one example, a system is provided. The system includes a portable, in-field unit including: a tag reader to acquire an ID tag identifier from a tag located in or on a physical item positioned within functional range of the in-field unit tag reader; a digital processor arranged for executing software code stored in the in-field unit responsive to the acquired ID tag identifier, the stored software code including—a customer application layer; and a database adapter component configured to provide database services to the processor; wherein the database services include accessing a stored database to acquire stored data associated with the acquired ID tag identifier. | 04-04-2013 |
20130084011 | PROOF READING OF TEXT DATA GENERATED THROUGH OPTICAL CHARACTER RECOGNITION - A novel system includes: a first proof reading tool for performing carpet proof reading on text data; a second proof reading tool for performing side-by-side proof reading on the text data; a storage unit configured to store a log of proof reading operations having been performed by using the first and second proof reading tools; and an analysis unit configured to determine, for each attribute serving as units in which carpet proof reading is performed with the first proof reading tool, whether or not to use the first proof reading tool in proof reading of the attribute, by comparing a first estimated value of a time taken when proof reading is performed by using the first proof reading tool with a second estimated value of a time taken when proof reading is performed by using the second proof reading tool without using the first proof reading tool, the first and second estimated values being calculated on the basis of the log. | 04-04-2013 |
20130108162 | INFORMATION OUTPUT DEVICE AND INFORMATION OUTPUT METHOD | 05-02-2013 |
20130114900 | METHODS AND APPARATUSES FOR MOBILE VISUAL SEARCH - Methods, apparatuses, and computer program products are herein provided for providing a REVV system that is configured to provide an MVS that is operable on a mobile terminal. One example method may include causing a plurality of vector word residuals to be aggregated for at least one visual word using local feature descriptors extracted from an image. The method may further include causing the dimensionality of the aggregated at least one vector word residual for each visual word to be reduced by using a classification aware linear discriminant analysis. The method may further include computing, using a processor, a weighted correlation for at least one compact image signature that is binarized from the aggregated at least one vector word residual when compared to a list of candidates. The method may further include determining a ranked list of candidates based on the computed weighted correlation. | 05-09-2013 |
20130121580 | ANALYSIS OF SERVICE DELIVERY PROCESSES BASED ON INTERROGATION OF WORK ASSISTED DEVICES - A method of monitoring input devices to discover units of work and type of work includes recording uses of input devices of a computer, analyzing the recorded uses against pre-defined use patterns to determine sets of the recorded uses that correspond to one of a plurality of units of work, and outputting an indicator indicating which of the units of work have occurred. A method of accessing a call center includes performing speech to text transcription on audio recordings from the center, determining an identifier identifying an operator for a call from the text, estimating a phase of the call based on the text, recording an entry including the identifier, the phase, and a time period of the phase, correlating the entry with another entry including information on an application run during the estimated phase to generate a correlated entry, and determining a quality level of the operator based on the correlated entry. | 05-16-2013 |
20130121581 | IDENTIFICATION METHOD AND APPARATUS OF CONFUSABLE CHARACTER - An identification method and apparatus for confusable characters are provided. The method involves: identifying the detected character image to obtain the initial character information corresponding to the character image; counting the step change times of the corresponding external outline of the character image if the initial character information is a confusable character; and confirming the final character information corresponding to the character image according to the step change times. Because the final character information of the character image can be determined conveniently from the step change times, the correct character information of the character image can be identified more precisely. The possibility of wrongly identifying the character image because of a confusable character can be reduced, and the identification precision rate for confusable characters can be improved. | 05-16-2013 |
20130129217 | COLLECTION AND USE OF MONITORED DATA - A device is configured to capture an image of a monitoring device display, perform optical character recognition to identify alphanumeric data in the image, apply a device profile to map each identified alphanumeric datum to a parameter associated with the monitoring device; and store each datum along with its associated parameter. | 05-23-2013 |
20130129218 | SYSTEM AND METHOD FOR PROCESSING RECEIPTS AND OTHER RECORDS OF USERS - A service can perform optical character recognition (OCR) on an image of a record to determine a first set of information items about the record. A second set of information items can be identified that are likely part of the record but not determinable from performing OCR on the image. Another resource can be utilized to determine the second set of information items. A classification for the record can be determined based on first and second sets of information items. The record can be associated with a financial resource of the user based at least in part on the classification. | 05-23-2013 |
20130129219 | PATTERN RECOGNITION APPARATUS, PATTERN RECOGNITION METHOD, IMAGE PROCESSING APPARATUS, AND IMAGE PROCESSING METHOD - An image and supplementary information of the image, such as a photographing point and time, are input by an image input section and are stored in an image data storage section. Character recognition in the image is performed by a character recognition section, and the recognition result is stored in a character recognition result storage section. An analysis section extracts object character information relevant to an object from the image, the supplementary information, and the character recognition result on the basis of the analysis conditions input in a designation section to thereby analyze an object, and the analysis result is output to a result output section. Accordingly, a change in the object can be analyzed by analyzing a change in character patterns indicating the identical object. | 05-23-2013 |
20130142430 | INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD - An information processing apparatus encodes an input pattern to a code including a plurality of bits, calculates reliabilities for respective bits of the code, generates similar codes, each similar to the code, based on the reliabilities, and recognizes the input pattern based on the code and the similar codes. | 06-06-2013 |
20130156317 | Enhanced Note Processing - Techniques and systems are disclosed to perform, in some examples, the steps of receiving a note or an image of a note, imaging at least a portion of the note, determining a value of at least one field indicated by a predetermined identifier of the note through character and mark recognition, and storing information regarding the note in a memory. The information regarding the note that may be stored in a memory may be forwarded to a regulatory agency or an external entity for reporting or record-keeping. | 06-20-2013 |
20130170752 | IMAGE, AUDIO, AND METADATA INPUTS FOR KEYWORD RESOURCE NAVIGATION LINKS - A system, method, and computer-readable medium are described that implement a resource navigation links tool that receives one or more inputs, extracts information from the inputs into a submission string, submits the submission string to a resource navigation links tool, and receives resource navigation links based on the submission string. Input types may include images, audio clips, and metadata. The input sources may be processed to extract information related to the image source to build the submission string. | 07-04-2013 |
20130177246 | Identification and Separation of Form and Feature Elements from Handwritten and Other User Supplied Elements - A system and methods for progressive feature evaluation of an electronic document image to identify user supplied elements are disclosed. The system includes a controller in communication with a storage device configured to receive and accessibly store a generated plurality of candidate images. The controller is operable to analyze the electronic document image to identify a first feature set and a second feature set, wherein each of the first and second feature sets represents a different form feature, compare the first feature set to the second feature set, and define a third feature set based on the intersection of the first and second feature sets, wherein the third feature set represents the user provided elements. | 07-11-2013 |
20130188872 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM THAT HAS RECORDED INFORMATION PROCESSING PROGRAM - An appropriate search is carried out even with images including a complicated layout structure, decorated characters, and so on. An image search device | 07-25-2013 |
20130195360 | LOWER MODIFIER DETECTION AND EXTRACTION FROM DEVANAGARI TEXT IMAGES TO IMPROVE OCR PERFORMANCE - Systems, apparatus and methods for extracting lower modifiers from a word image, before performing optical character recognition (OCR), based on a plurality of tests comprising a first test, a second test and a third test are presented. The method obtains the word image and performs a plurality of tests (e.g., a first test, a second test and a third test). The first test determines whether a vertical line spanning the height of the word image exists. The second test determines whether a jump of a number of components in the lower portion of the word image exists. The third test determines sparseness in a lower portion of the word image. The plurality of tests may run sequentially and/or in parallel. Results from the plurality of tests are used to decide whether a lower modifier exists by comparing and accumulating test results from the plurality of tests. | 08-01-2013 |
20130202207 | METHOD, SERVER, AND COMPUTER-READABLE RECORDING MEDIUM FOR ASSISTING MULTIPLE USERS TO PERFORM COLLECTION SIMULTANEOUSLY - The present invention relates to a method for assisting multiple users to perform a collection simultaneously. The method includes the steps of: (a) acquiring digital data created with respect to recognition reference information of an object from a terminal of each of the multiple users; (b) determining or recognizing whether the respective digital data on the recognition reference information acquired through the terminals were created within a preset place condition and whether the respective digital data on the recognition reference information acquired through the terminals were created within a preset scope of the time; (c) selecting a specified group of users, including a first to an n-th user among the multiple users, who create the digital data within the preset place condition and within the preset scope of the time; and (d) providing information on rewards corresponding to the object for users included in the specified group of users. | 08-08-2013 |
20130202208 | INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD - An information processing device comprises a word string acquirer which acquires a word string that is a target of analysis; a partial string extractor which extracts, using two words on either side of each space in the word string, a partial string containing one word but not the other, a partial string not containing the one word but containing the other, and a partial string containing both words from the word string; a division coefficient acquirer which acquires, for each partial string, division coefficients indicating degree of reliability in dividing the partial string by respective division patterns that divide the partial string into words; a probability coefficient acquirer which calculates a coefficient indicating probability that the word string is divided at the space based on the division coefficients; and an outputter which determines division of the word string based on the coefficient, and divides and outputs the word string. | 08-08-2013 |
20130223744 | DATA ACCESS BASED ON CONTENT OF IMAGE RECORDED BY A MOBILE DEVICE - Embodiments of the invention are directed to using image data and contextual data to determine information about a scene, based on one or more previously obtained images. Contextual data, such as the location of image capture, can be used to determine previously obtained images related to the contextual data and other location-related information, such as billboard locations. With even low resolution devices, such as cell phones, image attributes, such as a histogram or optically recognized characters, can be compared between the previously obtained images and the newly captured image. Attributes matching within a predefined threshold indicate matching images. Information on the content of matching previously obtained images can be provided back to a user who captured the new image. User profile data can refine the content information. The content information can also be used as search terms for additional searching or other processing. | 08-29-2013 |
20130230248 | ENSURING VALIDITY OF THE BOOKMARK REFERENCE IN A COLLABORATIVE BOOKMARKING SYSTEM - A method, system and computer program product for ensuring that the tags accurately describe a resource referenced by a bookmark in a collaborative bookmarking system. A user bookmarking an Internet resource that is referenced by a bookmark is detected. The user provides a description of the bookmark in the form of metadata, which includes tags, to be associated with the bookmark. The Internet resource is analyzed to determine its meaning. A second user bookmarking the same Internet resource that is referenced by the bookmark is detected. The second user provides a description of the bookmark in the form of metadata, which includes tags. The Internet resource is analyzed a second time to determine its meaning. If the relatedness of these meanings is beyond a threshold limit, then the original bookmark metadata is invalidated and the invalidated tags are replaced with the tags provided by the second user. | 09-05-2013 |
20130243325 | COMPARING SETS OF CHARACTER DATA HAVING TERMINATION CHARACTERS - Multiple sets of character data having termination characters are compared using parallel processing and without causing unwarranted exceptions. Each set of character data to be compared is loaded within one or more vector registers. In particular, in one embodiment, for each set of character data to be compared, an instruction is used that loads data in a vector register to a specified boundary, and provides a way to determine the number of characters loaded. Further, an instruction is used to find the index of the first delimiter character, i.e., the first zero or null character, or the index of unequal characters. Using these instructions, a location of the end of one of the sets of data or a location of an unequal character is efficiently provided. | 09-19-2013 |
20130243326 | INTERIOR LOCATION IDENTIFICATION - A parse module calibrates an interior space by parsing objects and words out of an image of the scene and comparing each parsed object with a plurality of stored objects. The parse module further selects a parsed object that is differentiated from the stored objects as the first object and stores the first object with a location description. A search module can detect the same objects from the scene and use them to determine the location of the scene. | 09-19-2013 |
20130272613 | DETERMINING SCALING FACTORS FOR DEVICES - An image scaling service includes determining an image as a candidate for a scaling process, scanning the image for an initial text value, and scaling the image to a next lower resolution. The image scaling service also includes iteratively performing the scaling process until a threshold value of a readability metric is reached, the scaling process includes scanning the scaled image for a scaled text value, comparing a difference between the initial text value and the scaled text value, the difference indicative of the readability metric, and scaling the scaled image to a next lower resolution. In response to reaching the threshold value of the readability metric, the image scaling service further includes selecting from scaled images an image having a lowest resolution resulting from the scaling process before the threshold value of the readability metric was reached. | 10-17-2013 |
20130294694 | Zone Based Scanning and Optical Character Recognition for Metadata Acquisition - There is disclosed a method and apparatus for zone based scanning and optical character recognition for metadata acquisition comprising receiving user input identifying a first zone and a second zone on a visible representation of an electronic document and associating the first zone with a first database category and the second zone with a second database category, the association made using a metadata map. The method further comprises scanning a physical document in order to obtain a digital representation of the physical document as an electronic document, performing optical character recognition on the first zone and the second zone on the electronic document to thereby obtain a first metadata element and a second metadata element, and storing the electronic document along with the first metadata element and the second metadata element in a database, the first and second metadata elements stored in the database as directed by the metadata map. | 11-07-2013 |
20130294695 | POST OPTICAL CHARACTER RECOGNITION DETERMINATION OF FONT SIZE - A method and system are disclosed for post optical character recognition font size determination. Optical character recognition output from an optical character recognition engine that includes character and bounding box information is aggregated into character strings. Measurements are then collected from each character in each character string that correspond to alignment heights of the top or bottom of the character with an ascender-line, a cap-line, a digit-line, a mean-line, a base-line, or a descender-line. Histograms are formed for each of these heights for each character string from the collected measurements. Based on the histograms, a pivot height is selected and used to determine the relative font size of the character string. The relative font size is normalized using a preselected factor associated with the selected pivot height. The normalized font size is then output as the font size of characters in the optical character recognition output. | 11-07-2013 |
20130294696 | IMAGE PROCESSING METHOD AND APPARATUS - An image processing method and apparatus is provided. The image processing method includes steps of: generating a first scale binary image from an image, wherein the first scale is smaller than the original scale of the image; detecting at least one text line in the image based on the first scale binary image; generating a second scale binary image from the image, wherein the second scale is larger than the first scale; for each text line, calculating a similarity between corresponding sections in the first scale binary image and the second scale binary image, and removing the text line for which the similarity is lower than a predetermined level; for one or more of the remaining text line(s), performing OCR on corresponding section(s) in the second scale binary image to determine character orientation(s) of corresponding text line(s); and determining the orientation of the image according to the determined character orientation(s). | 11-07-2013 |
20130301919 | SELECTION FEATURES FOR IMAGE CONTENT - A method for serving content based on a selection feature for a campaign includes receiving an image associated with particular content and analyzing the image content of the image to derive a selection feature from the image. The selection feature is descriptive of image content. The method further includes identifying at least one keyword based on the selection feature. The method further includes associating the at least one keyword with the particular content and storing the particular content and its associated at least one keyword for serving in response to a content request. | 11-14-2013 |
20130301920 | METHOD FOR PROCESSING OPTICAL CHARACTER RECOGNIZER OUTPUT - A method, a system, and a computer program product for processing the output of an OCR are disclosed. The system receives a first character sequence from the OCR. A first set of characters from the first character sequence are converted to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores. | 11-14-2013 |
20130308862 | IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND COMPUTER READABLE MEDIUM - An image processing apparatus includes an accepting unit, a recognizing unit, and a selecting unit. The accepting unit accepts character information about a character image in a character region in an image. The recognizing unit performs character recognition on the character image in the character region. The selecting unit selects a character recognition result which matches the character information accepted by the accepting unit, from multiple character recognition results that are obtained by the recognizing unit. | 11-21-2013 |
20130315485 | TEXTUAL INFORMATION EXTRACTION METHOD USING MULTIPLE IMAGES - A method for extracting textual information from a document containing text characters using a digital image capture device. A plurality of digital images of the document are captured using the digital image capture device. Each of the captured digital images is automatically analyzed using an optical character recognition process to determine extracted textual data. The extracted textual data for the captured digital images are merged to determine the textual information for the document, wherein differences between the extracted textual data for the captured digital images are analyzed to determine the textual information for the document. | 11-28-2013 |
20130322757 | Document Processing Apparatus, Document Processing Method and Scanner - The disclosure provides a document processing apparatus, method and a scanner. The document processing apparatus includes: a text line extraction unit extracting a text line from an input document; a language classification unit determining whether an OCR process is necessary for a language of the input document; an OCR unit determining, by performing the OCR process, an OCR confidence in the case that it is determined that the OCR process is necessary; a graphic feature recognition unit determining a graphic feature recognition confidence; and a determination unit determining a combination confidence based on at least one of the determined graphic feature recognition confidences and the determined OCR confidences, and determining an orientation of the input document based on the combination confidences. This technical solution can better determine an orientation of the document, and is especially applicable when the quality of the image of the document is deteriorated. | 12-05-2013 |
20130322758 | IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND PROGRAM - To make it easier to grasp characters that appear across different images by determining a pair of character area images to be a combination target based on a degree of similarity or a position of each character area image extracted from different images, and connecting and combining overlapping area images that are the determined pair of character area images and that have a similar image feature amount. | 12-05-2013 |
20130322759 | METHOD AND DEVICE FOR IDENTIFYING FONT - A technique for identifying font in connection with text data processing. An original font corresponding to an embedded font used in an electronic document is identified. At least one glyph is selected from a glyph collection of the embedded font. The font corresponding to each selected glyph is identified, and the original font corresponding to the embedded font is identified according to the font that corresponds to each selected glyph. | 12-05-2013 |
20130330005 | ELECTRONIC DEVICE AND CHARACTER RECOGNITION METHOD FOR RECOGNIZING SEQUENTIAL CODE - A character recognition method to apply one or more user-controlled definitions to scanned writing is implemented by an electronic device. The electronic device includes an image capturing device where a recognition rule is set for recognizing a type of a sequential code according to an arrangement rule of characters of the type of sequential codes. A first sequential code is extracted from a captured image of an object having a sequential code and when the first sequential code does not match with the recognition rule, the first sequential code is corrected according to the preset recognition rule to obtain a second sequential code for matching with the recognition rule. | 12-12-2013 |
20130343652 | CHARACTER STRING EXTRACTION METHOD AND CHARACTER STRING EXTRACTION DEVICE - In a character string extraction method, a character portion, a rim portion, a character frame, and a character string frame are set, a feature value of each image in the character portion and the rim portion is calculated for each character frame, a character string frame evaluation value is calculated based on the feature value for the character string frame, a position of the character string frame is moved on the paper sheet image, and the image in the character portion is extracted by using the character string frame at a position at which the character string frame evaluation value reaches a maximum. | 12-26-2013 |
20140003721 | SYSTEM AND METHOD FOR LOCALIZING DATA FIELDS ON STRUCTURED AND SEMI-STRUCTURED FORMS | 01-02-2014 |
20140003722 | ANALYZING STRUCTURED LIGHT PATTERNS | 01-02-2014 |
20140003723 | Text Detection Devices and Text Detection Methods | 01-02-2014 |
20140010452 | OPTICAL CHARACTER RECOGNITION VERIFICATION AND CORRECTION SYSTEM - A system for verifying and correcting errors after translation of printed text into machine-readable text. The system includes a memory for storing formulas defining relationships between data fields. A processor evaluates the formulas according to data values associated with the data fields to determine whether the formulas evaluate as truthful statements. The processor marks the data fields of the formulas as unverified or as verified based upon this evaluation. The system also uses the processor to calculate a determined value for data fields in an attempt to correct errors in the translation of the printed text into machine-readable text. If different determined values are calculated for the same data field, based upon different formulas, the data field is marked as uncertain. The system iterates based upon the marking of the data fields of the formulas as verified or unverified and as uncertain or not uncertain. | 01-09-2014 |
20140016867 | SERIAL TEXT DISPLAY FOR OPTIMAL RECOGNITION APPARATUS AND METHOD - Various embodiments are disclosed that relate to serially displaying text on an electronic display using techniques for placement of an optimal recognition position of words at a fixed display location. In some embodiments, an optimal recognition position character is displayed at the fixed display location. In other embodiments, an optimal recognition proportionate position is displayed at the fixed display location. Various related techniques for processing and displaying text are further disclosed herein. | 01-16-2014 |
20140023274 | Method Of Handling Complex Variants Of Words Through Prefix-Tree Based Decoding For Devanagiri OCR - An electronic device and method identify a block of text in a portion of an image of real world captured by a camera of a mobile device, slice sub-blocks from the block and identify characters in the sub-blocks that form a first sequence to a predetermined set of sequences to identify a second sequence therein. The second sequence may be identified as recognized (as a modifier-absent word) when not associated with additional information. When the second sequence is associated with additional information, a check is made on pixels in the image, based on a test specified in the additional information. When the test is satisfied, a copy of the second sequence in combination with the modifier is identified as recognized (as a modifier-present word). Storage and use of modifier information in addition to a set of sequences of characters enables recognition of words with or without modifiers. | 01-23-2014 |
20140023275 | REDUNDANT ASPECT RATIO DECODING OF DEVANAGARI CHARACTERS - An electronic device and method receive a block sliced from a rectangular portion of an image of a scene of real world captured by a camera and use a property of the block to operate one of multiple optical character recognition (OCR) decoders. In an illustrative aspect, a first OCR decoder is configured to recognize characters whose property satisfies the test based on a first limit, the first limit being obtained by reducing a predetermined limit by an overlap amount. In this illustrative aspect, a second OCR decoder is configured to recognize characters whose property does not satisfy the test based on a second limit, the second limit being obtained by increasing the predetermined limit by the overlap amount. When the property of the block satisfies the test, the first OCR decoder is operated and alternatively the second OCR decoder is operated, resulting in candidates for a character being identified. | 01-23-2014 |
20140029852 | SYSTEMS AND METHODS FOR MULTI-DIMENSIONAL OBJECT DETECTION - Systems and methods for multi-dimensional object detection are described. Embodiments disclose receiving image frames, extracting image components in the image frame, identifying line segments in the extracted components, grouping the line segments into groups, based at least in part on one or more similarities between the slope associated with a line segment and the spatial proximity between the line segments, and merging each of the one or more identified line segments in a selected group into a single line segment. Embodiments additionally disclose detecting the position of one or more objects in the image frame by identifying objects in the image frame, producing a second version of the image frame, applying at least one image classifier to the image frame and the second version of the image frame, and identifying coordinates associated with at least one target object. Some embodiments additionally couple lane and object detection with alert generation. | 01-30-2014 |
20140029853 | FORM RECOGNITION METHOD AND DEVICE - Embodiments of the present application relate to a form recognition method, a form recognition system, and a computer program product for recognizing forms. A form recognition method is provided. The method includes conducting a straight line detection of a form in a form binary image to acquire a plurality of form boundaries of the form and a plurality of positional relationships between the plurality of form boundaries, extracting a plurality of features from the form using the plurality of form boundaries and the positional relationships between the plurality of form boundaries, establishing a feature vector associated with the form based at least in part on the plurality of features, calculating similarities between the form and respective ones of a plurality of template forms based at least in part on the feature vector of the form, and identifying the form based on the calculated similarities. | 01-30-2014 |
20140044356 | ARRANGEMENT FOR AND METHOD OF READING SYMBOL TARGETS AND FORM TARGETS BY IMAGE CAPTURE - An arrangement for, and a method of, electro-optically reading different types of targets by image capture, include an imaging assembly for capturing an image of a target over a field of view, and a controller for automatically distinguishing between the different types of targets, for decoding a symbol if the target being imaged is a symbol target, and for identifying and processing individual fields on a form if the target being imaged is a form target. | 02-13-2014 |
20140056521 | DISTRIBUTED DOCUMENT PROCESSING - A system for document processing including decomposing an image of a document into at least one data entry region sub-image, providing the data entry region sub-image to a data entry clerk available for processing the data entry region sub-image, receiving from the data entry clerk a data entry value associated with the data entry region sub-image, and validating the data entry value. | 02-27-2014 |
20140056522 | METHOD AND APPARATUS FOR PROCESSING DATA USING OPTICAL CHARACTER READER - A method for processing data by using an optical character reader (OCR) is provided. The method includes obtaining OCR data from each image file of a plurality of image files and storing the obtained OCR data, receiving a search command with respect to an object, extracting the object from the stored OCR data, selecting OCR data which includes the object from among the OCR data, and displaying a list of image files which correspond to the selected OCR data. | 02-27-2014 |
20140064618 | DOCUMENT INFORMATION EXTRACTION USING GEOMETRIC MODELS - A receipt processing system includes at least one imaging device which generates an image of a paper receipt. At least one processor is programmed to acquire the image of a paper receipt from the at least one imaging device, textualize the data from the acquired receipt image, define specific local partial models based on regular expressions and geometric proximity, extract information from the textualized data according to the local partial models, apply rules to the extracted information, and generate receipt data from the application of the rules. | 03-06-2014 |
20140064619 | IDENTIFIER AND METHOD OF ENCODING INFORMATION - A two-dimensional code that includes a bar code readable by a scanning operation is provided. The two-dimensional code has an associated alphanumeric representation, and the bar code and the alphanumeric representation represent the same first information. Further, a position of at least one character of the alphanumeric representation with respect to at least one element of the bar code represents second information. | 03-06-2014 |
20140072224 | PAGE LAYOUT DETERMINATION OF AN IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION - A method and system is provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order. | 03-13-2014 |
20140086488 | IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD - An image processing device includes a processor; and a memory which stores a plurality of instructions which, when executed by the processor, cause the processor to execute: acquiring a picked image; selecting pixels, which are adjacent to each other, to be connected based on value of the pixels in the image; generating a pixel connected area which includes the connected pixels; extracting a feature point from an outer edge of the pixel connected area; and calculating a moved amount of the feature point on the basis of the feature point of a plurality of images that have been picked at the first time and the second time by the acquiring. | 03-27-2014 |
20140093171 | SYSTEM AND METHOD FOR GENERATING MACHINE READABLE MEDIUM - A system and method is provided that enables a business to purchase a generic, but unique, kit containing one or more signs, with a machine readable medium. The computer readable medium stores information relating to a unique web address of a configurable web site landing page. An administrator configures the web site as desired so that when a user scans the machine readable medium, the user will be directed to the web site, and will have access to the content configured by the administrator. A system and method is also provided for programming or generating machine readable medium. | 04-03-2014 |
20140093172 | IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD - An image processing method for identifying a region in an input image by character recognition, the region coinciding with a predetermined search condition, includes receiving the search condition, the search condition including assignments of plural format character strings, each format character string including an assignment of a character type or a specific character for each character of a recognition target, extracting a character string region becoming a candidate from the input image, calculating a similarity between a character recognition result and the plural format character strings with respect to each group of plural character string regions, the character recognition result being of each character string region included in each group, and determining the group coinciding with the search condition among the groups of plural character string regions according to the calculated similarity. | 04-03-2014 |
20140105501 | CHARACTER RECOGNITION APPARATUS, NON-TRANSITORY COMPUTER READABLE MEDIUM, AND CHARACTER RECOGNITION METHOD - A character recognition apparatus includes an extracting unit that extracts a numerical value reading region in which a numerical value that is a character string of a numeral is read from an image that is a character recognition target, a character recognizing unit that performs character recognition on the extracted numerical value reading region and obtains plural recognition candidates for each numerical value, and a selecting unit that selects a potential candidate from the plural recognition candidates on the basis of a numerical value range set for each numerical value reading region. | 04-17-2014 |
20140140622 | SYSTEM AND METHOD FACILITATING DESIGNING OF CLASSIFIER WHILE RECOGNIZING CHARACTERS IN A VIDEO - The present disclosure relates to designing of a hierarchy of feature vectors. In one embodiment, a method for facilitating design of a hierarchy of feature vectors while recognizing one or more characters in a video is disclosed. The method comprises collecting one or more features from each of the segments in a video frame extracted from a video; preparing multi-dimensional feature vectors to classify the one or more characters; calculating a minimum distance between the multi-dimensional feature vectors of a test character and the multi-dimensional feature vectors of a pre-stored character template; selecting, with respect to a decreasing order of the minimum distance, the multi-dimensional feature vectors to design a hierarchy of the multi-dimensional feature vectors; and classifying the characters based on the hierarchy of the multi-dimensional feature vectors. | 05-22-2014 |
20140147045 | Method and Apparatus for Operating, Interfacing and/or Managing for at Least One Optical Characteristic System for Container Handlers in a Container Yard - Methods and several apparatus embodiments are disclosed for operating Optical Characteristic Systems (OCS) in a container storage and/or transfer yard supporting the automated recognition of container codes displayed on various sides of the containers being stored and/or transferred. At least one processor may initiate an operational process by an OCS mounted on a container handler to create an operational result, select the operational process based upon an operational schedule and communicate with at least one OCS to receive an image of a container being handled by the container handler to at least partly create a container code estimate for a container inventory management system. A program system may direct at least one computer implementing these operations, and may reside in computer readable memory, an installation package and/or a download server. The computer readable memory may or may not be accessibly coupled to the computer. | 05-29-2014 |
20140147046 | METHOD AND DEVICE FOR AUTHENTICATING A TAG - A method for authenticating a tag, includes: | 05-29-2014 |
20140169674 | METHOD AND SYSTEM FOR A SELECTION OF A SOLUTION TECHNIQUE FOR A TASK - A method, system, and computer program product for selecting a solution technique from a plurality of solution techniques for accomplishing a task is provided. The plurality of solution techniques are ranked according to a set of parameters. A first set of solutions are then obtained based on each of the plurality of solution techniques until at least the first predefined number of solutions from the first set of solutions matches with the corresponding solution from the second set of solutions. The second set of solutions corresponds to correct solutions for the task. Thereafter, one of the plurality of solution techniques is selected for which at least the first predefined number of solutions from the first set of solutions matches with the corresponding solution from the second set of solutions. | 06-19-2014 |
20140169675 | Method and system for character recognition - Character recognition is described. In one embodiment, it may use matched sequences rather than character shape to determine a computer-legible result. | 06-19-2014 |
20140185934 | PROCESS AND SYSTEM FOR AUTHENTICATING OWNERSHIP OF A PHYSICAL BOOK TO A THIRD PARTY VIA A MOBILE APPLICATION - A user's mobile device has an application that, when employed, verifies and authenticates to a third party that the user has ownership of a physical book. Use of the application's process allows unique identification of each book by means of ISBN lookup in conjunction with a) signature image analysis or b) a generated unique identifier code, and assigns that book to the user's account. User possession of the book is associated with ownership in the user's account and ownership is authenticated by assigning the unique book to the user's account. Fraud or multiple use prevention is built into the application and process. Evidence of authentication is sent to the interested third party for use in distribution of e-books in a manner contingent on ownership of the physical book. | 07-03-2014 |
20140193075 | Local Scale, Rotation and Position Invariant Word Detection for Optical Character Recognition - A system and method using a text extraction application for identifying words with multiple orientations from an image are described. The text extraction application receives an input image, generates progressively blurred images, detects blobs in the blurred images, outputs ellipses over the blobs, detects a word in the input image, orients and normalizes a first version of the word, generates an inverted version of the word, performs OCR on the first version and the inverted version of the word, generates confidence scores for the first version and the inverted version of the word and outputs text associated with the word. | 07-10-2014 |
20140212039 | Efficient Verification or Disambiguation of Character Recognition Results - Machines, systems and methods for character recognition disambiguation are provided. The method comprises selecting a first set of characters that match a first visual profile based on results of a character recognition process applied to target content; selecting a subset of the first set based on criteria associated with at least one of confidence level with which characters grouped in the subset are recognized or fragmentation associated with the characters grouped in the subset; and disambiguating recognition results for the characters grouped in the subset by displaying the characters along with context information, wherein reviewing two or more of the characters on a display screen along with context information associated with said two or more characters allows a human operator to select one or more suspect characters from among the two or more characters. | 07-31-2014 |
20140212040 | Document Alteration Based on Native Text Analysis and OCR - Example embodiments relate to document alteration based on native text analysis and optical character recognition (OCR). In example embodiments, a system analyzes native text obtained from a native document to identify a text entity in the native document. At this stage, the system may use a native application interface to convert the native document to a document image and perform OCR on the document image to identify a text location of the text entity. The system may then generate an alteration box (e.g., redaction box, highlight box) at the text location in the document image to alter a presentation of the text entity. | 07-31-2014 |
20140212041 | Apparatus for Identifying Documents - An apparatus for document identification, having a capture device for capturing a document feature of a document, a processor that is designed to perform document identification locally using the document feature if a processing criterion for the local performance of document identification by means of the apparatus for document identification is satisfied, and a transmitter that is designed to send a data record that is dependent on the document feature via a communication network to a communication network address if the processing criterion for the local performance of document identification by means of the apparatus for document identification is not satisfied. | 07-31-2014 |
20140219563 | LABEL-EMBEDDING FOR TEXT RECOGNITION - A system and method for comparing a text image and a character string are provided. The method includes embedding a character string into a vectorial space by extracting a set of features from the character string and generating a character string representation based on the extracted features, such as a spatial pyramid bag of characters (SPBOC) representation. A text image is embedded into a vectorial space by extracting a set of features from the text image and generating a text image representation based on the text image extracted features. A compatibility between the text image representation and the character string representation is computed, which includes computing a function of the text image representation and character string representation. | 08-07-2014 |
20140247991 | SYSTEM AND METHOD FOR GENERATING MACHINE READABLE MEDIUM - A system and method is provided that enables a business to purchase a generic, but unique, kit containing one or more signs, with a machine readable medium. The computer readable medium stores information relating to a unique web address of a configurable web site landing page. An administrator configures the web site as desired so that when a user scans the machine readable medium, the user will be directed to the web site, and will have access to the content configured by the administrator. A system and method is also provided for programming or generating machine readable medium. | 09-04-2014 |
20140270528 | LOCAL IMAGE ENHANCEMENT FOR TEXT RECOGNITION - Various embodiments enable regions of text to be identified in an image captured by a camera of a computing device for preprocessing before being analyzed by a visual recognition engine. For example, each of the identified regions can be analyzed or tested to determine whether a respective region contains a quality associated with poor text recognition results, such as poor contrast, blur, noise, and the like, which can be measured by one or more algorithms. Upon identifying a region with such a quality, an image quality enhancement can be automatically applied to the respective region without user instruction or intervention. Accordingly, once each region has been cleared of the quality associated with poor recognition, the regions of text can be processed with a visual recognition algorithm or engine. | 09-18-2014 |
20140286573 | SYSTEM AND METHOD OF DETERMINING BUILDING NUMBERS - A system and method is provided for automatically recognizing building numbers in street level images. In one aspect, a processor selects a street level image that is likely to be near an address of interest. The processor identifies those portions of the image that are visually similar to street numbers, and then extracts the numeric values of the characters displayed in such portions. If an extracted value corresponds with the building number of the address of interest such as being substantially equal to the address of interest, the extracted value and the image portion are displayed to a human operator. The human operator confirms, by looking at the image portion, whether the image portion appears to be a building number that matches the extracted value. If so, the processor stores a value that associates that building number with the street level image. | 09-25-2014 |
20140294304 | METHOD AND SYSTEM FOR CREATING OPTIMIZED IMAGES FOR DATA IDENTIFICATION AND EXTRACTION - A viewfinder screen display is generated and positioned such that a source document is displayed in the viewfinder screen display. Source document image blocks corresponding to different portions of the source document are then defined. For each source document image block, the image capture parameter of an image capture device is set to an optimized image capture parameter setting for the source document image block. The image capture device then captures an image block optimized image of the source document optimized for the source document image block. The optimized source document image blocks are then extracted from each image block optimized image of the source document. The extracted optimized source document image blocks are then aggregated and used to construct an image capture parameter optimized image of the source document. | 10-02-2014 |
20140294305 | DETECTING A DOCUMENT - A method is proposed for detecting a document in which image data are recorded by means of a camera, in which filtered picture data are determined by a first processing unit on the basis of the recorded image data, and a camera picture is stored by a second processing unit on the basis of the filtered picture data if a stability criterion is fulfilled. Also specified correspondingly are a device, computer program product and storage medium. | 10-02-2014 |
20140301645 | METHOD AND APPARATUS FOR MAPPING A POINT OF INTEREST BASED ON USER-CAPTURED IMAGES - An approach is provided for mapping a point of interest (POI) based on user-captured images. One or more user-captured images are queried based, at least in part, on a POI data record. One or more identifying features of a POI associated with the POI data record are recognized in the one or more user-captured images and a position of the one or more identifying features is determined in the one or more user-captured images. The POI is mapped based, at least in part, on the position of the one or more identifying features and image metadata associated with the one or more user-captured images. | 10-09-2014 |
20140363081 | MACHINE READING OF PRINTED DATA - A method of reading data represented by characters formed of an x by y array of dots, e.g. as printed by a dot-matrix printer, is described. An image of the character(s) is captured by a digital camera device and transmitted to a computer, and by using analysis software operating in the computer to which the camera image has been sent, dot shapes are identified and their positions within the captured image detected, using the similarity of dots to idealised representations of dots using a combination of covariance, correlation or colour data. The position information about the detected dots is then processed to determine the distance between dots and to identify “clusters” of adjacent dots in groups of dots close to one another, and to enable such clusters to be mapped on to a notional x by y grid that defines the intended positions of the dots where grid elements intersect. The image is then analysed to determine, for each intersection of the grid, whether a dot is present or not, and starting at one corner of the x by y grid, a binary number is generated corresponding to the presence or absence of a dot at each intersection. This binary number is compared with the binary number in a reference table of binary numbers referenced to information corresponding to a dot-matrix printed character, and an output then produced corresponding to the character(s) identified. By using Reed Solomon mathematics, characters which have been misprinted can still be reliably identified. | 12-11-2014 |
20140369602 | Methods for Automatic Structured Extraction of Data in OCR Documents Having Tabular Data - Methods to select and extract tabular data among the optical character recognition returned strings to automatically process documents, including documents containing academic transcripts. | 12-18-2014 |
20150010235 | System and Method for Feature Recognition and Document Searching Based on Feature Recognition - A system for document searching can include a camera. The system may further include an image capturing module configured to capture a first image of a first portion of a document, a feature recognition module in communication with the image capturing module, the feature recognition module configured to determine a first feature associated with the first image, a search module configured to send search information to a server and receive a first result from a first search of a set of documents that was performed based on one or more search criteria determined based on the first feature associated with the first image, and a search interface configured to present the first result on the device. | 01-08-2015 |
20150023599 | Optical Match Character Classification - Machines, systems and methods for enhanced optical character recognition are provided. In one embodiment, the method comprises identifying a sample character in a textual context to be optically recognized; comparing the sample character with a template character, wherein the sample character is scaled into a first grid and the template character is scaled into a second grid; identifying one or more pixels in the sample character within the first grid and one or more pixels in the template character in the second grid, wherein the one or more pixels are identified as belonging to a foreground category in the textual content, a foreground pixel having at least N gradients corresponding to edges of the foreground pixel that are juxtaposed to a neighbor pixel, wherein a contour foreground pixel has at least one gradient that is neighbored by a background pixel in the textual context. | 01-22-2015 |
20150023600 | AUTHORIZING THE USE OF A BIOMETRIC CARD - Embodiments of the present invention provide a system and method for authorizing the use of a biometric transaction card. Specifically, embodiments of the present invention provide a biometric card having a biometric sensor to determine whether the biometric information (fingerprint) is from human skin. In a typical embodiment, the cardholder approaches a magnetic reader with the card. The user places his/her finger on the SpO | 01-22-2015 |
20150043822 | Machine And Method To Assist User In Selecting Clothing - A device to convey information to a user regarding clothing. The device receives data that specifies a clothing mode to use for processing an image, accesses a knowledge base to provide data to configure the computer program product for the clothing mode, the data including data specific to the clothing mode and receives an image or images of an article of clothing. The device processes the image or images to identify patterns in the image corresponding to items of clothing based on information obtained from the knowledge base. | 02-12-2015 |
20150043823 | METHOD AND SYSTEM FOR CAPTURING AND UTILIZING ITEM ATTRIBUTES - Various embodiments of a method and system for capturing and utilizing item attributes are described. Various embodiments may include a mobile image capture apparatus, which may include a computer system configured to execute an image capture application. The image capture application may instruct an agent to capture an image of an item label. A data extraction component may be configured to process the images captured by the mobile image capture apparatus. For a given captured image, the data extraction component may in various embodiments be configured to perform OCR to determine one or more strings of characters from the image. The data extraction component may be further configured to determine whether one or more patterns match a determined string of characters. In response to the detection of a particular pattern matching a particular string of characters, the data extraction component may extract and store an attribute of the corresponding item. | 02-12-2015 |
20150049947 | DYNAMIC SERVICE CONFIGURATION DURING OCR CAPTURE - Dynamically configuring OCR processing may include determining a device type and determining whether to perform optical character recognition (OCR) processing of the received image locally based on one or more OCR parameters. Example OCR parameters may include the device type, the image type, the size of the received image, the available amount of the memory, the measured/benchmarked throughput of OCR processing on the device relative to an OCR server throughput and network throughput, and/or the current level of network connectivity. If it is determined that OCR processing of the received image should be performed locally, the device may compute one or more name-value pairs corresponding to the received image and transmit the name-value pairs to a remote data server for processing. | 02-19-2015 |
20150049948 | MOBILE DOCUMENT CAPTURE ASSIST FOR OPTIMIZED TEXT RECOGNITION - A device and method for providing a visual cue for improved text imaging on a mobile device. The method includes determining a minimum text size for accurate optical character recognition (OCR) of an image captured by the mobile device, receiving an image stream of a printed substrate, and displaying the image stream and a visual cue superimposed onto the image stream, wherein the visual cue is indicative of the minimum text size. The method further includes capturing a digital image of the image stream, wherein the digital image does not include the visual cue. Additionally, the method further includes notifying a user of the mobile device when text displayed within the image stream is at least as large as the minimum text size. | 02-19-2015 |
20150049949 | Redigitization System and Service - A system and method to error correct extant electronic documents is disclosed. An electronic document may be rasterized to obtain a pixel representation of the electronic document (e.g., raster image). One or more optical character recognition (OCR) tasks may be performed on the raster image of the electronic document. Errors discovered by the OCR tasks may be corrected and a customized error corrected version of the electronic document may be created and stored. If the author of the electronic document is known, the raster image may be compared to a personalized tf*idf error dictionary associated with the author to determine known OCR errors specific to the author. The raster image may also be compared to a personalized electronic error dictionary associated with the author to determine known typographical errors specific to the author. | 02-19-2015 |
20150055867 | SYSTEM AND METHOD FOR INDEXING ELECTRONIC DISCOVERY DATA - Systems and methods for efficiently processing electronically stored information (ESI) are described. The systems and methods describe processing ESI in preparation for, or association with, litigation. The invention preserves the contextual relationships among documents when processing and indexing data, allowing for increased precision and recall during data analytics. | 02-26-2015 |
20150063700 | MULTIPLE HYPOTHESIS TESTING FOR WORD DETECTION - Embodiments disclosed pertain to Optical Character Recognition using Multiple Hypothesis Testing based techniques on images occurring in a variety of settings, including images captured by mobile stations. In some embodiments, a set of bifurcation points for a character cluster in an image may be determined. The character cluster may comprise non-uniformly spaced text or closely spaced text. A plurality of hypotheses may be determined for the character cluster, where each hypothesis is based on a subset of the bifurcation points and comprises a set of words generated from the character cluster. A plurality of scores corresponding to the plurality of hypotheses may be determined, where each score corresponds to a hypothesis, and a hypothesis may be selected from among the plurality of hypotheses based on a score associated with the selected hypothesis. | 03-05-2015 |
20150110401 | DOCUMENT REGISTRATION APPARATUS AND NON-TRANSITORY COMPUTER READABLE MEDIUM - A document registration apparatus includes a receiving unit that receives a request for registration of a registration candidate document from a new registrant, a word extracting unit that extracts a word from the registration candidate document, a registrant information acquiring unit that acquires information on the new registrant, an associating unit that associates the extracted word with a group to which the new registrant belongs, a first storage unit that stores history information, a second storage unit that stores an identifier of a previous registrant and a group to which the previous registrant has belonged, an extracting unit that extracts an identifier of a previous registrant who registered a word identical to the extracted word, and extracts a group to which the previous registrant has belonged, a registration permission determining unit that determines whether to allow registration, and a document registering unit that registers the registration candidate document. | 04-23-2015 |
20150117780 | FAST SINGLE-PASS INTEREST OPERATOR FOR TEXT AND OBJECT DETECTION - The invention provides a method of using machine vision to recognize text and symbols, and more particularly traffic signs. | 04-30-2015 |
20150117781 | METHOD, APPARATUS AND SYSTEM FOR INFORMATION IDENTIFICATION - Methods, apparatus, and systems for information identification are provided. A card image of a pre-set collection area is photographed and obtained, when a request event for information identification is detected. Edge-size information of the card image obtained by photographing is determined. A target area of the card image is marked according to the edge-size information. An image of the target area is extracted. Character shapes to be identified in the image of the target area are determined based on a pre-set character pattern library. A character corresponding to each character shape to be identified is identified according to each character shape to be identified that is determined and according to the character pattern library. | 04-30-2015 |
20150131908 | CHARACTER RECOGNITION METHOD AND DEVICE - A character recognition method may include at least the following steps. A location step may include acquiring an image and locating a character region of the image. The character region may include a character and a local background. The method may further include a background judgment step for determining whether the local background is a complex background; a determination step for determining a color of the character if the local background is a complex background; a construction step for constructing a mask for the character by combining the color of the character and a character region; and a first recognition step for extracting the character from the character region by using the mask, recognizing the character, and outputting the recognition result. A character recognition device is further provided. | 05-14-2015 |
20150146983 | IMAGE PROCESSING SYSTEM, INFORMATION PROCESSING APPARATUS, AND RECORDABLE MEDIUM - An image processing system comprises a generating unit that reads documents and generates a read image including document images; an extracting unit that extracts each of the document images; a character recognizing unit that performs character recognition processing on each of the document images; a determining unit that determines whether first information and second information have a pair relationship, the first information being any one of recognition results obtained on one side of the document images, and the second information being any one of recognition results obtained on the other side of the document images; and a registering unit that registers pieces of information for the recognition results corresponding to the one side and to the other side respectively including the first information and the second information which are determined that both pieces of information have the pair relationship, as information for a piece of the documents, in a storage unit. | 05-28-2015 |
20150146984 | SYSTEM AND METHOD FOR IDENTIFICATION AND EXTRACTION OF DATA - A system and method for describing target data as a sequence of pattern elements and pattern element groups that comprise an overall target pattern is described. Pattern elements may utilize regular expression syntax along with other metadata that describe the behavior of the element. A pattern element group may be a collection of fully defined pattern elements where at least one pattern element from the group must have a match for the overall pattern to match. Patterns contain both pattern elements and pattern element groups. The general process involves first performing optical character recognition (OCR) on the document, which in turn produces a sequence of text tokens representing the lines of text on each page of the document. The search algorithm may then apply each defined pattern to the entire document capturing and/or extracting data that match each pattern's required elements and element groups. | 05-28-2015 |
20150146985 | HANDWRITTEN DOCUMENT PROCESSING APPARATUS AND METHOD - According to one embodiment, a handwritten document processing apparatus is provided with a stroke acquisition unit, a stroke group generation unit and an additional information generation unit. The stroke acquisition unit acquires stroke data. The stroke group generation unit generates stroke groups each including one or a plurality of strokes, which satisfy a predetermined criterion, based on the stroke data. The additional information generation unit generates additional information which indicates a relationship between a first stroke group of the stroke groups and a second stroke group of the stroke groups, and assigns the additional information to the first stroke group. | 05-28-2015 |
20150293915 | SYSTEM AND METHODS FOR REMOTE IMAGE ACQUISITION AND REMOTE IMAGE PROCESSING OF A DOCUMENT - A document processing system for remotely processing an image frame of a document, and methods of use thereof. The system includes a remote server, having a processing unit and a data repository unit. The system further includes a personal mobile device having an image acquisition device for acquiring at least one image frame of a document, a communication unit adapted to communicate with the processing unit and an image-transmission-management module. Upon receiving the at least one image frame of a document by the remote server from the personal mobile device, via the wireless network, the processing unit extracts textual data, image data or both from the received at least one image frame to thereby create extracted data; associates an access code to the extracted data; and stores the at least one image frame, the extracted data and the associated access code in the data repository unit. | 10-15-2015 |
20150294170 | METHODS AND SYSTEMS FOR DETERMINING ASSESSMENT CHARACTERS - A method of determining an input character based upon character recognition output of an education assessment system may include receiving, by a processing device, a proposed value generated using character recognition. The proposed value may be associated with at least one handwritten character of an assessment. The method may include determining, by the processing device, whether the proposed value is correct, by determining a posterior probability associated with each of one or more possible characters, identifying the possible character associated with the posterior probability having a highest value, and in response to identifying the proposed value as the possible character associated with the posterior probability having a highest value, determining, by the processing device, that the proposed value is correct, otherwise, determining that the proposed value is incorrect. | 10-15-2015 |
20150294174 | METHOD OF VEHICLE IDENTIFICATION AND A SYSTEM FOR VEHICLE IDENTIFICATION - A method for vehicle identification to determine at least one vehicle characteristic, comprising: obtaining an input image ( | 10-15-2015 |
20150294437 | SYSTEMS AND METHODS FOR SCANNING PAYMENT AND LOYALTY CARDS AS A SERVICE - The systems and methods of the present disclosure enable a scanner cloud server to control a mobile device to scan a payment or loyalty card and to convert the scanned card into an electronic card. This is accomplished through the use of a scanner cloud interface that is incorporated into a mobile application installed on the mobile device. The scanner cloud server receives a request for scanning a card from the mobile application via the scanner cloud interface, and sends a request to the mobile application via the scanner cloud interface to scan a card using a scanning device, such as a camera, of the mobile device. The scanner cloud interface connects to the scanning device using the authorizations granted to the mobile application and controls the scanning device to scan the card to obtain an image of the card. The mobile application transmits the image to the scanner cloud server, which recognizes information in the image and generates an electronic card based on the recognized information. The mobile application displays the electronic card when it is received from the scanner cloud server via the scanner cloud interface in response to a user's request to display the electronic card. | 10-15-2015 |
20150310270 | Portable Optical Reader, Optical Reading Method Using The Portable Optical Reader, And Computer Program - The present invention provides a portable optical reader, an optical reading method using the portable optical reader and a computer program capable of detecting a high possibility of a reading error and notifying a user of a possibility of a reading error. A character string as a reading target is imaged and a character string is recognized based on the captured image. A plurality of reading formats defining an attribute of the character string is stored, and a first reading format matched with the recognized character string is searched for among the plurality of stored reading formats. Among the plurality of stored reading formats, a second reading format that contains, as a partial character string, a character string matched with the first reading format is also searched for. Based on the search result, a possibility of a reading error regarding the recognized character string is notified. | 10-29-2015 |
20150310290 | TECHNIQUES FOR DISTRIBUTED OPTICAL CHARACTER RECOGNITION AND DISTRIBUTED MACHINE LANGUAGE TRANSLATION - A technique for selectively distributing OCR and/or machine language translation tasks between a mobile computing device and server(s) includes receiving, at the mobile computing device, an image of an object comprising a text. The mobile computing device can determine a degree of optical character recognition (OCR) complexity for obtaining the text from the image. Based on this degree of OCR complexity, the mobile computing device and/or the server(s) can perform OCR to obtain an OCR text. The mobile computing device can then determine a degree of translation complexity for translating the OCR text from its source language to a target language. Based on this degree of translation complexity, the mobile computing device and/or the server(s) can perform machine language translation of the OCR text from the source language to a target language to obtain a translated OCR text. The mobile computing device can then output the translated OCR text. | 10-29-2015 |
20150310291 | TECHNIQUES FOR DISTRIBUTED OPTICAL CHARACTER RECOGNITION AND DISTRIBUTED MACHINE LANGUAGE TRANSLATION - A technique for selectively distributing OCR and/or machine language translation tasks between a mobile computing device and server(s) includes receiving, at the mobile computing device, an image of an object comprising a text. The mobile computing device can determine a degree of optical character recognition (OCR) complexity for obtaining the text from the image. Based on this degree of OCR complexity, the mobile computing device and/or the server(s) can perform OCR to obtain an OCR text. The mobile computing device can then determine a degree of translation complexity for translating the OCR text from its source language to a target language. Based on this degree of translation complexity, the mobile computing device and/or the server(s) can perform machine language translation of the OCR text from the source language to a target language to obtain a translated OCR text. The mobile computing device can then output the translated OCR text. | 10-29-2015 |
20150317530 | KEY WORD DETECTION DEVICE, CONTROL METHOD, AND DISPLAY APPARATUS - A key word detection device and a method for detecting a search key word from a target image in order to perform a search with a search engine on the internet, the key word detection device comprising: a processor configured to operate as a feature point detector configured to detect a feature point of a specific character string from the target image, the specific character string prompting a user to perform the search; a key word recognition unit configured to recognize a character string existing in surroundings of the feature point detected by the feature point detector as the search key word in the target image; and a storage for storing character information and data of the target image used by the processor. | 11-05-2015 |
20150324638 | SYSTEM AND METHOD FOR IDENTIFICATION AND SEPARATION OF FORM AND FEATURE ELEMENTS FROM HANDWRITTEN AND OTHER USER SUPPLIED ELEMENTS - A system and methods for progressive feature evaluation of an electronic document image to identify user supplied elements are disclosed. The system includes a controller in communication with a storage device configured to receive and accessibly store a generated plurality of candidate images. The controller is operable to analyze the electronic document image to identify a first feature set and a second feature set, wherein each of the first and second feature sets represents a different form feature, compare the first feature set to the second feature set, and define a third feature set based on the intersection of the first and second feature sets, wherein the third feature set represents the user provided elements. | 11-12-2015 |
20150331887 | MEDICINAL SUBSTANCE RECOGNITION SYSTEM AND METHOD - Provided is an apparatus for identifying a medicinal substance. A tray receives and concurrently supports a plurality of pills formed at least in part from the medicinal substance. A computer-readable memory stores a drug database including one or more identifying features for identifying different pills formed at least in part from different medicinal substances. A recognition device is arranged to interrogate the pills on the tray and detect at least one of the identifying features from the pills. A controller receives the identifying feature(s) detected by the recognition device and determines the identity of the medicinal substance from among the different medicinal substances in the drug database based on the identifying feature(s). | 11-19-2015 |
20150332128 | NETWORK-IMPLEMENTED METHODS AND SYSTEMS FOR AUTHENTICATING A PAPER FINANCIAL INSTRUMENT - The disclosure relates generally to financial instruments and particularly to mitigating exposure to financial fraud. Specifically, the disclosure relates to systems and methods of authenticating paper financial instruments (PFI) issued by a first entity for the benefit of a third party. | 11-19-2015 |
20150339536 | COLLABORATIVE TEXT DETECTION AND RECOGNITION - Various embodiments provide methods and systems for identifying text in an image by applying suitable text detection parameters in text detection. The suitable text detection parameters can be determined based on parameter metric feedback from one or more text identification subtasks, such as text detection, text recognition, preprocessing, character set mapping, pattern matching and validation. In some embodiments, the image can be defined into one or more image regions by performing glyph detection on the image. Text detection parameters applying to each of the one or more image regions can be adjusted based on measured one or more parameter metrics in the respective image region. | 11-26-2015 |
20150339543 | METHOD AND APPARATUS FOR CLASSIFYING MACHINE PRINTED TEXT AND HANDWRITTEN TEXT - A method, non-transitory computer readable medium, and apparatus for classifying machine printed text and handwritten text in an input are disclosed. For example, the method defines a perspective for an auto-encoder, receives the input for the auto-encoder, wherein the input comprises a document comprising the machine printed text and the handwritten text, performs an encoding on the input using an auto-encoder to generate a classifier, applies the classifier on the input and generates an output that separates the machine printed text and the handwritten text in the input based on the classifier in accordance with the perspective. | 11-26-2015 |
20150347834 | IMAGE PROCESSING DEVICE AND IMAGE FORMING APPARATUS - According to a processing period for OCR processing of image data for character recognition for a given page, the image processing section determines whether or not to generate image data for character recognition for a next page subsequent to the given page in accordance with an image quality setting different from that set for the given page. Upon determining generation in accordance with an image quality setting different from that set for the given page, the image processing section determines the image quality setting for the next page based on document type and size of the given page of the original document. The image processing section generates image data for character recognition for the next page in accordance with the determined image quality setting. | 12-03-2015 |
20150371086 | EXTRACTING CARD DATA FROM MULTIPLE CARDS - Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data. | 12-24-2015 |
20150371100 | CHARACTER RECOGNITION METHOD AND SYSTEM USING DIGIT SEGMENTATION AND RECOMBINATION - Methods and systems are provided for recognizing characters in an original image. The image is received in the system as a set of pixels, and the original image is represented as a character skeleton and a chaincode representation thereof. Skeleton intersection points are identified and used as a basis for determining cutting points in the chaincode contours. The cutting points are then used to define cutting lines for segmenting the original image into distinct segments. The segments are analyzed with respect to their geometric properties, individually and relative to adjacent segments, to determine which of the segments may be combined, wherein the combination is expected to have a high probability of conformance to a likely digit or character. Verification that the combined string is a recognizable digit or character is accomplished using a convolutional neural network digit recognizer. | 12-24-2015 |
20160019431 | EXTRACTING CARD IDENTIFICATION DATA USING CARD TYPES - Extracting card information comprises a server at an optical character recognition (“OCR”) system that interprets data from a card. The OCR system performs an optical character recognition algorithm on an image of a card and a data recognition algorithm on a machine-readable code on the image of the card. The OCR system compares a series of extracted alphanumeric characters obtained via the optical character recognition process to data extracted from the machine-readable code via the data recognition process and matches the alphanumeric series of characters to a particular series of characters extracted from the machine-readable code. The OCR system determines if the alphanumeric series and the matching series of characters extracted from the machine-readable code comprise any discrepancies and corrects the alphanumeric series of characters. The OCR system may also determine a card type and use a card format associated with the card type to improve the data extraction. | 01-21-2016 |
20160034774 | IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD - A font recognition section recognizes a font of characters depicted by character image data included in image data upon first character data corresponding to the character image data being acquired through character recognition. A font character recognition section compares the character image data to font character image data depicting characters in the recognized font and acquires second character data corresponding to font character image data when the font character image data matches the character image data. A recognized character determination section determines whether or not the first character data matches the second character data, and when determining that the first character data does not match the second character data, adopts the second character data, instead of the first character data, as recognized character data for the character image data. | 02-04-2016 |
20160034775 | METHODS AND APPARATUS FOR BOUNDED IMAGE DATA ANALYSIS AND NOTIFICATION MECHANISM - The present invention relates to a method and apparatus for preliminary optical character recognition document imaging and user notification on an electronic device with image capture capabilities. More specifically, this disclosure relates to an electronic device capable of connecting to a network, wherein a document image may be transmitted and stored securely in an external server. | 02-04-2016 |
20160055374 | Enhanced Interpretation of Character Arrangements - Technologies are described herein for interpreting character arrangements. An image including an arrangement of characters may be received or captured by a computing device. Techniques described herein generate data representative of the characters. Characteristics and other information interpreted from the image may be processed to determine a data type. The data representative of the characters may be arranged into a data structure based on the data type, an arrangement type and/or other information interpreted from the image. The data type may indicate one or more attributes of the arranged data such as a format, font, date, language, or currency. The data type may also indicate how data is used in a process, equation or calculation. In addition, the data type may identify an anchor that may be used to merge data generated from the image with other data generated from another image. | 02-25-2016 |
20160063102 | SEARCHING AND RANKING OF CODE IN VIDEOS - A method comprising: receiving a multiplicity of videos from a source; for each video: receiving meta data related to the video; extracting from the video a video frame containing computer code; identifying a region of interest (ROI) within the video frame; performing OCR of the ROI to extract a code segment; analyzing the code segment by: semantically analyzing the code segment to obtain a first rank, structurally analyzing the code segment to obtain a second rank, and analyzing the meta data to obtain a third rank; and combining the first rank, second rank and third rank into a total rank associated with the code segment; receiving a query; matching the query to each code segment to identify matching code segments and associated videos; and providing the associated videos in accordance with total ranks associated with the matching code segments. | 03-03-2016 |
20160063354 | SYSTEM AND METHOD FOR TRANSCRIBING HANDWRITTEN RECORDS USING WORD GROUPINGS BASED ON FEATURE VECTORS - A handwriting recognition system converts word images on documents, such as document images of historical records, into computer searchable text. Word images (snippets) on the document are located, and have multiple word features identified. For each word image, a word feature vector is created representing multiple word features. Based on the similarity of word features (e.g., the distance between feature vectors), similar words are grouped together in clusters, and a centroid that has features most representative of words in the cluster is selected. A digitized text word is selected for each cluster based on review of a centroid in the cluster, and is assigned to all words in that cluster and is used as computer searchable text for those word images where they appear in documents. An analyst may review clusters to permit refinement of the parameters used for grouping words in clusters, including the adjustment of weights and other factors used for determining the distance between feature vectors. | 03-03-2016 |
20160063355 | SYSTEM AND METHOD FOR TRANSCRIBING HANDWRITTEN RECORDS USING WORD GROUPING WITH ASSIGNED CENTROIDS - A handwriting recognition system converts word images on documents, such as document images of historical records, into computer searchable text. Word images (snippets) on the document are located, and have multiple word features identified. For each word image, a word feature vector is created representing multiple word features. Based on the similarity of word features (e.g., the distance between feature vectors), similar words are grouped together in clusters, and a centroid that has features most representative of words in the cluster is selected. A digitized text word is selected for each cluster based on review of a centroid in the cluster, and is assigned to all words in that cluster and is used as computer searchable text for those word images where they appear in documents. An analyst may review clusters to permit refinement of the parameters used for grouping words in clusters, including the adjustment of weights and other factors used for determining the distance between feature vectors. | 03-03-2016 |
20160086007 | SYSTEM AND METHOD TO MANIPULATE AN IMAGE - A method of operating an image reader typically includes: searching a digital image for nominally straight edges; characterizing the nominally straight edges in terms of length and/or direction; determining a predominant orientation of the nominally straight edges; establishing a group of edges as a function of their proximity to the center of the image; establishing a group of edges as a function of their proximity to other remaining edge positions; and transmuting a rectangle bounding those edges into a rectified image. The rectified image is typically an image that is cropped or rotated. | 03-24-2016 |
20160098599 | Microform Word Search Method and Apparatus - A digital imaging system for real-time searching for expressions that appear on a microform medium, the system comprising a computer including a processor, a display, a temporary memory, and a non-volatile memory, and a digital microform imaging apparatus including a controller and an area sensor generating a real-time digital microform image of the microform medium. The controller is in communication with the area sensor and receives the image. The controller is in communication with the processor and outputs the image to the processor that temporarily stores the image in the temporary memory. The processor uses the image output by the controller to drive the display with the image being generated by the area sensor in real-time. The processor searches the image used to drive the display for instances of a search expression, the processor visually distinguishing identified instances of the search expression on the display. | 04-07-2016 |
20160125237 | CAPTURING SPECIFIC INFORMATION BASED ON FIELD INFORMATION ASSOCIATED WITH A DOCUMENT CLASS - A device may obtain a document, of a document type, from which specific information is to be captured. The specific information to be captured may depend on a document class associated with the document. The device may identify the document class, associated with the document, based on a characteristic of the document. The document class may identify a category of document in which the document is included, and may be associated with multiple document types. The device may determine field information associated with the document class. The field information may include information that identifies a portion of the document where the specific information is located, or may include information that identifies a manner in which the specific information can be identified within the document. The device may analyze the document, based on the field information, in order to capture the specific information. The device may provide the captured specific information. | 05-05-2016 |
20160125254 | USING EXTRACTED IMAGE TEXT - Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image. | 05-05-2016 |
20160125391 | BACKGROUND OCR DURING CARD DATA ENTRY - Financial transaction card data can be entered by providing a picture of the card to a server programmed with a text recognition algorithm. The server can perform text recognition on the image at the same time that a consumer enters additional required data, such as a zip code. The server can perform as much text recognition processing as possible in the time the consumer is entering the additional data. Once the additional data is received, a signal can be provided to the server indicating that the user is now waiting for results of the text recognition process, meaning the server should provide them as quickly as possible. Once text recognition results are received, a consumer can make a selection to identify a character which the text recognition algorithm did not sufficiently identify. Based on known account number constraints, the user selection can cause multiple characters to be identified. | 05-05-2016 |
20160132738 | Template Matching with Data Correction - An approach is provided to generate forms with template inclusions. In the approach, optical character recognition (OCR) text is compared to corresponding text in a selected form. Characters of text in the OCR text are then replaced with text from the template text; the replacing results in a form with template inclusions. The form with template inclusions is then processed by a forms processing operation. | 05-12-2016 |
20160132739 | METHOD AND APPARATUS FOR INFORMATION RECOGNITION - A method for information recognition using an Optical Character Recognition (OCR) program includes acquiring an image of an object to be recognized, analyzing a layout of the contents of the image and extracting character area blocks in the image, determining character lines in the character area blocks, and recognizing, by the OCR program, character information of the key character lines in the character area blocks. | 05-12-2016 |
20160132740 | COMPARING EXTRACTED CARD DATA WITH USER DATA - Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy. | 05-12-2016 |
20160140409 | TEXT CLASSIFICATION BASED ON JOINT COMPLEXITY AND COMPRESSED SENSING - A server computes a sparsifying matrix from a set of reference blocks that is selected from first blocks of text based on joint complexities of each pair of the first blocks of text. The server determines one of the set of reference blocks that is most similar to a second block of text based on the sparsifying matrix, a measurement matrix, and a measurement vector formed by compressing the second block of text using the measurement matrix. The server transmits a signal representative of the one of the set of reference blocks to indicate a classification of the second block of text. | 05-19-2016 |
20160171329 | SYSTEM AND METHOD FOR USING PRIOR FRAME DATA FOR OCR PROCESSING OF FRAMES IN VIDEO SOURCES | 06-16-2016 |
20160183672 | ORAL CARE IMPLEMENT - Disclosed is a toothbrush comprising an optical machine-readable representation of data, such as a barcode. Also disclosed is an oral care implement comprising a surface and a plurality of protrusions extending at least 1 millimeter from the surface, the protrusions at least partially defining an optical machine-readable representation of data, such as a barcode. Also disclosed is a system, comprising: an oral care implement comprising an optical machine-readable representation of data, such as a barcode; and a device comprising a processor and memory storing computer readable instructions for causing the processor to process an image of the optical machine-readable representation of data to extract the data from the image. | 06-30-2016 |
20160188991 | SYSTEM AND METHOD OF DETERMINING BUILDING NUMBERS - A system and method are provided for automatically recognizing building numbers in street level images. In one aspect, a processor selects a street level image that is likely to be near an address of interest. The processor identifies those portions of the image that are visually similar to street numbers, and then extracts the numeric values of the characters displayed in such portions. If an extracted value corresponds with the building number of the address of interest, such as by being substantially equal to the address of interest, the extracted value and the image portion are displayed to a human operator. The human operator confirms, by looking at the image portion, whether the image portion appears to be a building number that matches the extracted value. If so, the processor stores a value that associates that building number with the street level image. | 06-30-2016 |