Patent application number | Description | Published |
20100188419 | SELECTIVE DISPLAY OF OCR'ED TEXT AND CORRESPONDING IMAGES FROM PUBLICATIONS ON A CLIENT DEVICE - Text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process. A document is generated containing text segments of the extracted text. The document includes a control module that responds to user interactions with the displayed document. Responsive to a user selection of a displayed text segment, a corresponding image segment from the source image containing the text is retrieved and rendered in place of the selected text segment. The user can select again to toggle the display back to the text segment. Each text segment can be tagged with a garbage score indicating its quality. If the garbage score of a text segment exceeds a threshold value, the corresponding image segment can be automatically displayed instead. | 07-29-2010 |
20130002710 | Selective Display of OCR'ed Text and Corresponding Images from Publications on a Client Device - Text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process. A document is generated containing text segments of the extracted text. The document includes a control module that responds to user interactions with the displayed document. Responsive to a user selection of a displayed text segment, a corresponding image segment from the source image containing the text is retrieved and rendered in place of the selected text segment. The user can select again to toggle the display back to the text segment. Each text segment can be tagged with a garbage score indicating its quality. If the garbage score of a text segment exceeds a threshold value, the corresponding image segment can be automatically displayed instead. | 01-03-2013 |
20130259378 | METHODS AND SYSTEMS FOR ASSESSING THE QUALITY OF AUTOMATICALLY GENERATED TEXT - A set of ordered characters is received in association with information specifying the locations of the characters within the image of the document. Language-conditional character probabilities for each character are determined based on a set of language models and the ordering of the characters. Neighbor characters associated with a target character are identified based on the locations of the characters. Language-conditional character probabilities associated with the neighbor characters and language-conditional character probabilities associated with the target character are combined to generate a local language-conditional likelihood associated with the target character, the local language-conditional likelihood representing a concordance of the target character to a language model. | 10-03-2013 |
20130265325 | Selective Display Of OCR'ed Text And Corresponding Images From Publications On A Client Device - Text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process. A document is generated containing text segments of the extracted text. The document includes a control module that responds to user interactions with the displayed document. Responsive to a user selection of a displayed text segment, a corresponding image segment from the source image containing the text is retrieved and rendered in place of the selected text segment. The user can select again to toggle the display back to the text segment. Each text segment can be tagged with a garbage score indicating its quality. If the garbage score of a text segment exceeds a threshold value, the corresponding image segment can be automatically displayed instead. | 10-10-2013 |
20140125693 | Selective Display of OCR'ed Text and Corresponding Images From Publications on a Client Device - Text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process. A document is generated containing text segments of the extracted text. The document includes a control module that responds to user interactions with the displayed document. Responsive to a user selection of a displayed text segment, a corresponding image segment from the source image containing the text is retrieved and rendered in place of the selected text segment. The user can select again to toggle the display back to the text segment. Each text segment can be tagged with a garbage score indicating its quality. If the garbage score of a text segment exceeds a threshold value, the corresponding image segment can be automatically displayed instead. | 05-08-2014 |
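The display logic shared by the "Selective Display of OCR'ed Text" abstracts above can be sketched in a few lines. This is only an illustration under stated assumptions, not the applications' actual implementation: the names `Segment`, `initial_display`, `toggle`, and `render`, the string image reference, and the example threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One OCR'ed text segment and a reference to its source-image crop (hypothetical structure)."""
    text: str
    image_ref: str
    garbage_score: float      # higher means lower OCR quality
    show_image: bool = False  # current display state


def initial_display(segments: List[Segment], threshold: float) -> List[Segment]:
    """Show the image instead of the text when the garbage score exceeds the threshold."""
    for seg in segments:
        seg.show_image = seg.garbage_score > threshold
    return segments


def toggle(seg: Segment) -> None:
    """A user selection flips the segment between its text and its image."""
    seg.show_image = not seg.show_image


def render(seg: Segment) -> str:
    """Return whatever is currently displayed for the segment."""
    return seg.image_ref if seg.show_image else seg.text
```

A segment whose score is above the threshold starts out as an image, and each user selection calls `toggle` to switch it back and forth.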
Patent application number | Description | Published |
20080243481 | Large Language Models in Machine Translation - Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus. | 10-02-2008 |
20110129153 | Identifying Matching Canonical Documents in Response to a Visual Query - A server system receives a visual query from a client system. The visual query is an image containing text such as a picture of a document. At the receiving server or another server, optical character recognition (OCR) is performed on the visual query to produce text recognition data representing textual characters. Each character in a contiguous region of the visual query is individually scored according to its quality. The quality score of a respective character is influenced by the quality scores of neighboring or nearby characters. Using the scores, one or more high quality strings of characters are identified. Each high quality string has a plurality of high quality characters. A canonical document containing the one or more high quality textual strings is retrieved. At least a portion of the canonical document is sent to the client system. | 06-02-2011 |
20110202330 | Compound Splitting - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for decompounding compound words are disclosed. In one aspect, a method includes obtaining a token that includes a sequence of characters, identifying two or more candidate sub-words that are constituents of the token, and one or more morphological operations that are required to transform the sub-words into the token, where at least one of the morphological operations involves a use of a non-dictionary word, and determining a cost associated with each sub-word and a cost associated with each morphological operation. | 08-18-2011 |
20120047172 | PARALLEL DOCUMENT MINING - A technique includes providing a collection of documents in multiple languages, identifying, from the collection of documents, a group of candidate documents, where each candidate document in the group shares multiple corresponding rare features, evaluating pairs of candidate documents in the group using multiple common features present in the collection of documents, and determining, based on evaluating the pairs of candidate documents, whether each pair of candidate documents corresponds to a translated pair of documents. | 02-23-2012 |
20120128250 | Generating a Combination of a Visual Query and Matching Canonical Document - A server system receives a visual query from a client system distinct from the server system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query, and scores each textual character in the plurality of textual characters. The server system identifies, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query; retrieves a canonical document having the one or more high quality textual strings; generates a combination of the visual query and at least a portion of the canonical document; and sends the combination to the client system. | 05-24-2012 |
20120128251 | Identifying Matching Canonical Documents Consistent with Visual Query Structural Information - A server system receives a visual query from a client system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The method further includes identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system. | 05-24-2012 |
20120134590 | Identifying Matching Canonical Documents in Response to a Visual Query and in Accordance with Geographic Information - A server system receives a visual query from a client system distinct from the server system. The server system performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system scores each textual character in the plurality of textual characters in accordance with the geographic location of the client system. The server system identifies, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. Then the server system retrieves a canonical document having the one or more high quality textual strings and sends at least a portion of the canonical document to the client system. | 05-31-2012 |
20130346059 | LARGE LANGUAGE MODELS IN MACHINE TRANSLATION - Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n−1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus. | 12-26-2013 |
20140334746 | Identifying Matching Canonical Documents Consistent With Visual Query Structural Information - A server system receives a visual query from a client system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The method further includes identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system. | 11-13-2014 |
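The two "Large Language Models in Machine Translation" abstracts above describe a backoff score computed from a backoff factor and the relative frequency of a lower-order n-gram. The sketch below is one possible reading, not the applications' definitive method. It assumes that the factor is applied multiplicatively at each backoff step, that the backoff n-gram is formed by dropping the left-most token, and that relative frequency means the n-gram count divided by the count of its context. The function names and the default factor of 0.4 are illustrative choices.

```python
from collections import Counter
from typing import Sequence


def count_ngrams(tokens: Sequence[str], max_order: int) -> Counter:
    """Count every n-gram of order 1..max_order in the token sequence."""
    counts: Counter = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram: Sequence[str], counts: Counter,
                  total_tokens: int, alpha: float = 0.4) -> float:
    """Score an n-gram, backing off to shorter n-grams when it is unseen.

    Assumption: each backoff step multiplies the score by alpha and drops
    the left-most token to obtain the order n-1 backoff n-gram.
    """
    ngram = tuple(ngram)
    factor = 1.0
    while len(ngram) > 1:
        if counts[ngram] > 0:
            # Relative frequency of the n-gram given its context.
            return factor * counts[ngram] / counts[ngram[:-1]]
        factor *= alpha
        ngram = ngram[1:]
    # Unigram case: relative frequency in the whole corpus.
    return factor * counts[ngram] / total_tokens


tokens = "a b a b c".split()
counts = count_ngrams(tokens, max_order=2)
seen = backoff_score(("a", "b"), counts, len(tokens))    # bigram seen in the corpus
unseen = backoff_score(("c", "a"), counts, len(tokens))  # backs off to the unigram "a"
```

Because the scores are plain scaled frequencies rather than normalized probabilities, no discounting or renormalization step is needed in this sketch.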
Patent application number | Description | Published |
20140270329 | EXTRACTION OF FINANCIAL ACCOUNT INFORMATION FROM A DIGITAL IMAGE OF A CARD - Capturing information from payment instruments comprises receiving, using one or more computer devices, an image of a back side of a payment instrument, the payment instrument comprising information imprinted thereon such that the imprinted information protrudes from a front side of the payment instrument and the imprinted information is indented into the back side of the payment instrument; extracting sets of characters from the image of the back side of the payment instrument based on the imprinted information indented into the back side of the payment instrument and depicted in the image of the back side of the payment instrument; applying a first character recognition application to process the sets of characters extracted from the image of the back side of the payment instrument; and categorizing each of the sets of characters into one of a plurality of categories relating to information required to conduct a payment transaction. | 09-18-2014 |
20150186738 | Text Recognition Based on Recognition Units - Methods and systems for grapheme splitting of text input for recognition are provided. A method may include receiving a text input in a script and segmenting the text input into one or more graphemes. Each of the one or more graphemes may be split into one or more recognition units based on one or more recognition unit identification criteria associated with the script. Next, a text recognition system may be trained using the recognition units. Text input may be handwritten text input received from a user or a scanned image of text. | 07-02-2015 |
20150287002 | EXTRACTION OF FINANCIAL ACCOUNT INFORMATION FROM A DIGITAL IMAGE OF A CARD - Capturing information from payment instruments comprises receiving, using one or more computer devices, an image of a back side of a payment instrument, the payment instrument comprising information imprinted thereon such that the imprinted information protrudes from a front side of the payment instrument and the imprinted information is indented into the back side of the payment instrument; extracting sets of characters from the image of the back side of the payment instrument based on the imprinted information indented into the back side of the payment instrument and depicted in the image of the back side of the payment instrument; applying a first character recognition application to process the sets of characters extracted from the image of the back side of the payment instrument; and categorizing each of the sets of characters into one of a plurality of categories relating to information required to conduct a payment transaction. | 10-08-2015 |
Patent application number | Description | Published |
20120058150 | Methods for Delivering Compositions by Electrospraying a Medical Device - Methods are provided for administering a phospholipid composition to a subject, comprising coating a medical device with at least one layer of a phospholipid composition, wherein the coating is achieved by electrospraying the device with the composition, and wherein the composition is carrying or can carry at least one therapeutic agent. | 03-08-2012 |
20150196688 | Glycosaminoglycan and Synthetic Polymer Material for Blood-Contacting Applications - Provided herein is a composite, comprising: a polymer host selected from the group consisting of low-density polyethylene (LDPE), linear low-density polyethylene (LLDPE), polyethylene terephthalate (PET), polytetrafluoroethylene (PTFE), polypropylene (PP), polyurethane, polycaprolactone (PCL), polydimethylsiloxane (PDMS), polymethylmethacrylate (PMMA), and polyoxymethylene (POM); and a guest molecule comprising hyaluronic acid; wherein the guest molecule is disposed within the polymer host, and wherein the guest molecule is covalently bonded to at least one other guest molecule. Also provided herein are methods for forming the composite, and blood-contacting devices made from the composite, such as heart valves and vascular grafts. | 07-16-2015 |
Patent application number | Description | Published |
20130256149 | MICROBIAL ELECTROLYSIS CELLS AND METHODS FOR THE PRODUCTION OF CHEMICAL PRODUCTS - A microbial electrolysis cell having a brush anode is described. A method of producing products, such as hydrogen, at the cathode of the microbial electrolysis cell is also provided. The microbial electrolysis cell is configured in a cylindrical shape having an anode, cathode and anion exchange membrane all disposed concentrically. A brush anode spirally wound around the outside of the cylindrical microbial electrolysis cell is described. The method may include sparging the anode and/or cathode with air in some cases. In addition, CO | 10-03-2013 |
20130345990 | TOOL FOR OPTIMIZING CHLORINATED-SOLVENT BIOREMEDIATION THROUGH INTEGRATION OF CHEMICAL AND MOLECULAR DATA WITH ELECTRON AND ALKALINITY BALANCES - A prediction and assessment tool for bioremediation performance based on a comprehensive understanding of the link between chemical flow and microbial community interactions includes linking molecular microbial ecology data with electron and alkalinity balances to make it possible to understand dechlorinating microbial communities and their metabolic processes. The interactions of biological processes and site mineralogy result in changes to alkalinity and pH that can lead to incomplete reductive dechlorination resulting from suboptimal pH. Understanding these interactions allows for strategies to predict expected bioremediation outcomes and/or to mitigate incomplete reductive dechlorination. | 12-26-2013 |
20140273143 | Methods, Systems, and Culture Medium for Production of Dechlorinating Microorganisms - Methods, systems, and compositions for growing high density cultures of dechlorinating microorganisms, such as the bacteria | 09-18-2014 |
20150030888 | METHODS AND SYSTEMS FOR MICROBIAL FUEL CELLS WITH IMPROVED CATHODES - Methods and systems for microbial fuel cells with improved cathodes are provided. In accordance with some embodiments, methods for microbial fuel cells with improved cathodes are provided, the methods comprising: abiotically reducing oxygen on a cathode having a catalyst layer bound to a gas diffusion layer using an anion conductive polymer, consequently accumulating OH⁻ at the catalyst layer, and reducing local pH by conducting the OH⁻ away from the catalyst layer, directly or by transport of anionic buffers that act as OH⁻ carriers, through the anion conductive polymer. In accordance with some embodiments, a system for microbial fuel cells is provided, the system comprising: a container, an anode, anode-respiring bacteria, and a cathode having a catalyst layer bound to a gas diffusion layer using an anion conductive polymer. | 01-29-2015 |