Patent application number | Description | Published |
20110069180 | CAMERA-BASED SCANNING - Embodiments of camera-based scanning are described. In various embodiments, scanned documents can be created using images captured by a camera associated with a device. An image captured by the camera is processed to identify portions within the image that correspond to rectangular objects such as paper, business cards, whiteboards, screens, and so forth. One or more of these portions can be selected for scanning automatically based on a scoring scheme and/or semi-automatically with the aid of input from a user. One or more scanned documents are created from the selected portions by un-warping the selected portions to remove effects of perspective (e.g., rectify the portions to rectangles) and applying various image enhancements to improve appearance. | 03-24-2011 |
20110222771 | PAGE LAYOUT DETERMINATION OF AN IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION - A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order. | 09-15-2011 |
20110222772 | RESOLUTION ADJUSTMENT OF AN IMAGE THAT INCLUDES TEXT UNDERGOING AN OCR PROCESS - An optical character recognition process characterizes text lines in a textual image by their base-line, mean-line and x-height. The base-line for at least one text line in the image is determined by finding a parametric curve that maximizes a first fitness function that depends on the values of pixels through which the parametric curve passes and pixels below the parametric curve. The base-line corresponds to the parametric curve for which the first fitness function is maximized. The first fitness function is designed so that it increases with increasing lightness or brightness of pixels immediately below the parametric curve while also increasing with decreasing lightness of pixels through which the parametric curve passes. The mean-line is determined by incrementally shifting the base-line upward by predetermined amounts (e.g., a single pixel) until a second fitness function for the shifted base-line is maximized. The second fitness function is essentially the inverse of the first fitness function. Specifically, the second fitness function increases with increasing lightness of pixels immediately above the shifted base-line while also increasing with decreasing lightness of pixels through which the shifted base-line passes. The x-height is equal to the sum of the predetermined amounts by which the base-line is shifted upward in order to maximize the second fitness function. In some cases different groups of text-lines in the textual image may be characterized differently from one another. For example, each group may be characterized by a most probable x-height for that group. | 09-15-2011 |
20110280481 | USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS - An electronic model of an image document is created as the document undergoes an OCR process. The electronic model includes elements (e.g., words, text lines, paragraphs, images) of the image document that have been determined by each of a plurality of sequentially executed stages in the OCR process. The electronic model serves as input information which is supplied to each of the stages by a previous stage that processed the image document. A graphical user interface is presented to the user so that the user can provide user input data correcting a mischaracterized item appearing in the document. Based on the user input data, the processing stage which produced the initial error that gave rise to the mischaracterized item corrects the initial error. Stages of the OCR process subsequent to this stage then correct any consequential errors arising in their respective stages as a result of the initial error. | 11-17-2011 |
20130070122 | Camera-Based Scanning - Embodiments of camera-based scanning are described. In various embodiments, one or more quadrangular objects are automatically selected from a captured image for scanning. The automatic selection is determined to be successful based on the selected quadrangular objects having an associated score that exceeds a predefined threshold. One or more scanned documents are created from portions of the captured image corresponding to the selected quadrangular objects, and the created scanned documents include corrections for perspective distortions of the selected quadrangular objects in the captured image. | 03-21-2013 |
20140072224 | PAGE LAYOUT DETERMINATION OF AN IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION - A method and system are provided for identifying a page layout of an image that includes textual regions. The textual regions are to undergo optical character recognition (OCR). The system includes an input component that receives an input image that includes words around which bounding boxes have been formed and a text identifying component that groups the words into a plurality of text regions. A reading line component groups words within each of the text regions into reading lines. A text region sorting component sorts the text regions in accordance with their reading order. | 03-13-2014 |
20140112527 | SIMULTANEOUS TRACKING AND TEXT RECOGNITION IN VIDEO FRAMES - Architecture that enables optical character recognition (OCR) of text in video frames at the rate at which the frames are received. Additionally, conflation is performed on multiple text recognition results in the frame sequence. The architecture comprises an OCR text recognition engine and a tracker system; the tracker system establishes a common coordinate system in which OCR results from different frames may be compared and/or combined. From a set of sequential video frames, a keyframe is chosen from which the reference coordinate system is established. An estimated transformation from keyframe coordinates to subsequent video frames is computed using the tracker system. When text recognition is completed for any subsequent frame, the result coordinates can be related to the keyframe using the inverse transformation from the processed frame to the reference keyframe. The results can be rendered for viewing as the results are obtained. | 04-24-2014 |
20140169668 | AUTOMATIC CLASSIFICATION AND COLOR ENHANCEMENT OF A MARKABLE SURFACE - Architecture that automatically computes if a quadrangular object captured in a given image is or is not a markable board (e.g., a whiteboard, green board, chalkboard, etc.). The markable board has a surface on which marks can be applied using chalk, ink, dry ink, or any other suitable marking instrument or tool for the given surface. The imaged quadrangular object can be defined as having a background image and a foreground image. The background image is representative of a markable board with no applied surface marks and the foreground image comprises all discernible marks applied to the board surface, but does not include the background image. A set of performance-friendly features is received and processed by a machine-learning classifier to compute if the given quadrangular object is a markable board. Thereafter, if the given image is determined to be a markable board, image enhancement is performed. | 06-19-2014 |
20140172408 | TEXT OVERLAY TECHNIQUES IN REALTIME TRANSLATION - Architecture that employs techniques for overlaying (superimposing) translated text on top of (over) scanned text in realtime translation to provide clear visual correlation between original text and translated text. Algorithms are provided that overlay text in three cases: scanned text translated from a language written in a first direction to a language written in the same direction, scanned text translated from a first language written in a first direction to a second language written in the opposite direction, and scanned text translated from a language written in a first direction to a language written in a different direction. | 06-19-2014 |
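
The keyframe step described in entry 20140112527 (mapping OCR results from a later frame back into the keyframe's coordinate system via the inverse transformation) can be illustrated with a small sketch. This is not the patented method: it assumes, purely for illustration, that the tracker's estimated transformation is a 2D affine map, and the names `apply_affine`, `invert_affine`, and `box_to_keyframe` are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: (a, b, c, d, tx, ty) meaning
#   x' = a*x + b*y + tx
#   y' = c*x + d*y + ty
Affine = Tuple[float, float, float, float, float, float]
Point = Tuple[float, float]


def apply_affine(t: Affine, p: Point) -> Point:
    """Map a point through an affine transform."""
    a, b, c, d, tx, ty = t
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def invert_affine(t: Affine) -> Affine:
    """Return the inverse affine transform, or raise if it is singular."""
    a, b, c, d, tx, ty = t
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transform is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    # The inverse translation is the negated translation pushed
    # through the inverse linear part.
    return (ia, ib, ic, id_, -(ia * tx + ib * ty), -(ic * tx + id_ * ty))


def box_to_keyframe(keyframe_to_frame: Affine, box: List[Point]) -> List[Point]:
    """Express an OCR bounding box found in a later frame in keyframe coordinates."""
    frame_to_keyframe = invert_affine(keyframe_to_frame)
    return [apply_affine(frame_to_keyframe, corner) for corner in box]


if __name__ == "__main__":
    # Hypothetical tracker estimate: scale by 2, then shift by (10, 20).
    keyframe_to_frame = (2.0, 0.0, 0.0, 2.0, 10.0, 20.0)
    word_box_in_frame = [(12.0, 22.0), (30.0, 22.0), (30.0, 28.0), (12.0, 28.0)]
    print(box_to_keyframe(keyframe_to_frame, word_box_in_frame))
```

Once every frame's results are expressed in keyframe coordinates, boxes from different frames can be compared or combined directly, which is the role the abstract assigns to the common coordinate system.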