Patent application number | Description | Published |
20100268701 | NAVIGATIONAL RANKING FOR FOCUSED CRAWLING - Systems and methods of navigational ranking for focused crawling are disclosed. In an exemplary embodiment, a method may include using a classifier to distinguish at least one target web page from other web pages on a website. The method may also include modeling the web pages on the website by a directed graph G=(V, E), wherein each web page is represented by a vertex (V), and a link between two web pages is represented by an edge (E). The method may also include assigning each web page (u) in V a weight p(u) based on the classifier to calculate a navigational ranking indicating relevance of a web page. | 10-21-2010 |
20100293116 | URL AND ANCHOR TEXT ANALYSIS FOR FOCUSED CRAWLING - Systems and methods of URL and anchor text analysis for focused crawling are disclosed. In an exemplary embodiment, a method may include training a focused crawler by: obtaining a training set of at least URLs or anchor text for a website, computing a score for the training set, extracting a plurality of features of the training set, and computing a score for each of the plurality of features. The features identify key information contained in the website. The method may also include executing a trained focused crawler on other websites. | 11-18-2010 |
20100293159 | SYSTEMS AND METHODS FOR EXTRACTING PHASES FROM TEXT - Systems and methods for extracting phrases from text are disclosed. In an exemplary embodiment, a method may include preprocessing desired phrases into at least one phrase indexing data structure for efficient matching. The method may also include scanning text to construct a hash table including keys and corresponding entries. The method may also include locating suffix trie trees for each word in the hash table. The method may also include matching each position in the hash table against the suffix trie trees, and outputting phrases matched in the scanned text. | 11-18-2010 |
20110173145 | CLASSIFICATION OF A DOCUMENT ACCORDING TO A WEIGHTED SEARCH TREE CREATED BY GENETIC ALGORITHMS - A device for classifying a document comprises a module to generate a data tree structure and configured to assign terms to a first plurality of nodes of the data tree structure, where each of the first plurality of nodes is assigned a weight. In assigning the weights of the first plurality of nodes, a first generation of combinations of possible weights assignable as the weights of the first plurality of nodes is obtained, and a second generation of combinations of possible weights assignable as the weights of the first plurality of nodes is obtained by performing the genetic algorithms on the first generation of combinations of possible weights. The device determines whether the document is in a document class based at least on the weights of the first plurality of nodes. | 07-14-2011 |
20120021786 | METHOD OF SENDING A MESSAGE USING A MOBILE PHONE - Presented are a method and system for sending a message using a mobile phone. The method includes composing a message for sending to a recipient, generating a contact-label having contact information of the recipient, combining the message and the contact-label, capturing an image of the message and the contact-label combination for sending to the recipient using the mobile phone, decoding the captured image for identifying the contact information of the recipient, and sending the message to the identified contact information. | 01-26-2012 |
20120059859 | Data Extraction Method, Computer Program Product and System - Disclosed is a method of automatically extracting data from a target web page, comprising selecting ( | 03-08-2012 |
20120066587 | Apparatus and Method for Text Extraction - A method of determining main text in a mark-up document is provided, which comprises determining a length of each paragraph in the mark-up document; and determining one or more main paragraphs of the mark-up document based upon the length of the paragraphs in the mark-up document. | 03-15-2012 |
20120109974 | Acronym Extraction - Disclosed is a system and computer-implemented method for extracting an acronym and one or more corresponding expansions of the acronym from a document represented in a markup language. The computer-implemented method comprises: identifying at least one acronym contained in the document; determining one or more expansions of the at least one identified acronym based on a portion of the document located proximate to the identified acronym; determining a ranking for each determined expansion based on attributes of the document; and selecting one or more expansions for an identified acronym using the determined rankings. | 05-03-2012 |
20120130999 | Method and Apparatus for Searching Electronic Documents - Disclosed is a method and apparatus for searching electronic documents. The | 05-24-2012 |
20120143971 | Communicating Electronic Mail - Proposed is the use of an email-stamp for representing an email address. By comprising information about one or more email addresses of a recipient, an email stamp may be processed in accordance with an optical recognition process so as to identify the email address of the recipient and enable an email to be automatically sent to the recipient. | 06-07-2012 |
20120278705 | System and Method for Automatically Extracting Metadata from Unstructured Electronic Documents - A system and method for automatically extracting metadata from unstructured electronic documents is disclosed. In one embodiment, the unstructured electronic document is converted into a plain text document. Further, a document header of the unstructured electronic document is extracted from the plain text document using a rule-based document header extractor, where the rule-based document header extractor may be based on a rule that includes determining a ratio of a number of words with their initial letters capitalized in a text line over a total number of words in the text line in the plain text document. Moreover, metadata is extracted from the extracted document header using a heuristic approach. | 11-01-2012 |
20120303636 | System and Method for Web Content Extraction - A method and system for extracting Web content is disclosed. In one embodiment, Web content in a Webpage is extracted by identifying paragraphs in the Web content based on line-break node determination. A range of text-body associated with the identified paragraphs is then identified using a maximum scoring subsequence. Further, the identified text-body is refined using a heuristic rule of substantially horizontal alignment. Furthermore, one or more titles and one or more images associated with the Web content are extracted. Moreover, the Web content, including the identified paragraphs, the one or more titles and the one or more images, is output. | 11-29-2012 |
20130024250 | SYSTEMS AND METHODS FOR GROUP BUYING AND SOCIAL NETWORK - Computer-implemented methods, systems facilitating the methods, and computer-readable media storing computer-readable code that, when executed, performs the methods are provided for group-buying and social network services. The methods may include steps of (1) receiving a request from a host to initiate an event that requires a consumption of an item offered by a provider, (2) performing at least one of (a) sending invitations to the event to at least one guest selected by the host, and (b) causing display of an advertisement of the event to be viewed by the general public or a particular group, (3) receiving acceptance of participation in the event from at least one participant, and (4) informing the host and the at least one participant whether the sale of the item is executed or rescinded. | 01-24-2013 |
20130073385 | COMMUNICATION METHOD AND SYSTEM FOR ONLINE AND OFFLINE SOCIAL COMMERCE - Computer-implemented methods, systems facilitating the methods, and computer-readable media storing computer-readable code that, when executed, performs the methods are provided for online to offline social commerce services. The methods may include (1) receiving a request from a content provider to post content including an advertisement of an item for sale, (2) causing display of the content to a designated group, (3) receiving a request from a user to purchase the item after reviewing the content, (4) receiving payment for the item from the user, and (5) informing the user that the purchase is executed. The content can be a blog message, wherein the designated group includes followers of the content provider's blog, or the content can be a game, wherein the designated group includes players of the game. | 03-21-2013 |
20130091150 | DETERMIINING SIMILARITY BETWEEN ELEMENTS OF AN ELECTRONIC DOCUMENT - Disclosed is a computer-implemented method of determining similarity between first and second elements of an electronic document. The method uses a computer to calculate a plurality of measures of similarity between the first and second elements in at least two representations of the electronic document. A computer program product and system implementing this method are also disclosed. | 04-11-2013 |
20130114105 | Semantically Ranking Content in a Website - Semantically ranking content in a website ( | 05-09-2013 |
20130124953 | PRODUCING WEB PAGE CONTENT - A method for producing web page content includes identifying blocks within a web page. The blocks are selectively assembled into sections. The sections are selectively assembled into article candidates. An article candidate that includes article content is distinguished from article candidates that do not include article content. Content is produced only from the article candidate distinguished as including article content. | 05-16-2013 |
20130138496 | SYSTEMS, DEVICES AND METHODS FOR OFFLINE COUPON VERIFICATION - Computer-implemented methods, systems and devices facilitating the methods, and computer-readable media storing computer-readable code that, when executed, performs the methods are provided for offline coupon verification. The methods may include inputting a verification code of a coupon to a coupon verification terminal, decrypting the verification code using a terminal private key to obtain a merchant ciphertext, decrypting the merchant ciphertext using a merchant-specific server public key to obtain purchase information corresponding to the coupon, comparing the purchase information with a database to determine whether the purchase information is correct, displaying a negative verification result if the purchase information is not correct, displaying a positive verification result if the purchase information is correct and the coupon has not been previously consumed, registering the coupon as consumed in the database for future verifications when or after the coupon has been verified, and synchronizing the database with that of a server. | 05-30-2013 |
20130238607 | SEED SET EXPANSION - Systems and methods for seed set expansion are provided. A context-based extractor ( | 09-12-2013 |
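
The header-line rule summarized in the entry for application 20120278705 above (the ratio of words with a capitalized initial letter to the total number of words in a text line) can be illustrated with a minimal Python sketch. The function names, the whitespace tokenization, and the 0.6 threshold are all assumptions made for illustration; the abstract does not give a threshold or say how the ratio is combined with other rules.

```python
def capitalized_ratio(line: str) -> float:
    """Return (# words starting with an uppercase letter) / (# words) for one text line.

    Words are split on whitespace, which is an assumption of this sketch.
    An empty or blank line yields 0.0.
    """
    words = line.split()
    if not words:
        return 0.0
    capitalized = sum(1 for word in words if word[0].isupper())
    return capitalized / len(words)


def looks_like_header_line(line: str, threshold: float = 0.6) -> bool:
    """Flag a line as a header candidate when its ratio meets the threshold.

    The default threshold is a hypothetical value chosen for this example,
    not one taken from the patent application.
    """
    return capitalized_ratio(line) >= threshold


if __name__ == "__main__":
    samples = [
        "System and Method for Web Content Extraction",
        "the method may also include scanning text",
    ]
    for sample in samples:
        print(f"{capitalized_ratio(sample):.2f}  {looks_like_header_line(sample)}  {sample}")
```

A title-like line scores high because most of its words are capitalized, while ordinary body text scores near zero; a real extractor would presumably combine this signal with others.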