Patent application number | Description | Published |
20090171661 | METHOD FOR ASSESSING PRONUNCIATION ABILITIES - Techniques for assessing pronunciation abilities of a user are provided. The techniques include recording a sentence spoken by a user, performing a classification of the spoken sentence, wherein the classification is performed with respect to at least one N-ordered class, and wherein the spoken sentence is represented by a set of at least one acoustic feature extracted from the spoken sentence, and determining a score based on the classification, wherein the score is used to determine an optimal set of at least one question to assess pronunciation ability of the user without human intervention. | 07-02-2009 |
20100185648 | ENABLING ACCESS TO INFORMATION ON A WEB PAGE - Techniques for enabling voice access to information residing on the World Wide Web are provided. The techniques include receiving a query from a user, wherein the query comprises a voice-based request to access information residing on the World Wide Web, identifying one or more websites corresponding to the query, fetching the information from a website, wherein fetching the information comprises executing a hypertext transfer protocol (HTTP) request, organizing the information into a voice-based response and delivering the response to the user. | 07-22-2010 |
20110029491 | DYNAMICALLY DETECTING NEAR-DUPLICATE DOCUMENTS - Techniques for detecting one or more documents that are duplicate or near-duplicate of a first document are provided. The techniques include obtaining a first document, obtaining one or more additional documents, retrieving a set of one or more document signatures for each document, and detecting one or more documents that are duplicate or near-duplicate of the first document by detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document, wherein detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document comprises dynamically using at least one of a user-configurable similarity definition and a user-configurable similarity threshold value. | 02-03-2011 |
20110166850 | CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A system and associated method for cross-guided data clustering by aligning target clusters in a target domain to source clusters in a source domain. The cross-guided clustering process takes the target domain and the source domain as inputs. A common word attribute shared by both the target domain and the source domain is a pivot vocabulary, and all other words in both domains are a non-pivot vocabulary. The non-pivot vocabulary is projected onto the pivot vocabulary to improve measurement of similarity between data items. Source centroids representing clusters in the source domain are created and projected to the pivot vocabulary. Target centroids representing clusters in the target domain are initially created by a conventional clustering method and then repeatedly aligned to converge with the source centroids by use of a cross-domain similarity graph that measures a respective similarity of each target centroid to each source centroid. | 07-07-2011 |
20110167064 | CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively. | 07-07-2011 |
20110270808 | Systems and Methods for Discovering Synonymous Elements Using Context Over Multiple Similar Addresses - A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process. | 11-03-2011 |
20120047179 | SYSTEMS AND METHODS FOR STANDARDIZATION AND DE-DUPLICATION OF ADDRESSES USING TAXONOMY - Systems and associated methods for address standardization and applications related thereto are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a possible path from a root of the taxonomy to a leaf in the taxonomy that can possibly generate a given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute the representations of the elements and find a closest matching leaf in the taxonomy. Embodiments then traverse the path to a root node to detect the agreement and disagreement between the path and the address entry. Taxonomical structure is thus used to detect, segregate and standardize the expected fields. | 02-23-2012 |
20120066209 | ELECTRONIC MAIL DUPLICATE DETECTION - Embodiments of the invention are related to a method and system for identifying linked electronic mails by receiving a query from a user, wherein the query comprises at least a segment of an electronic mail; and based on the segment received, rendering to the user at least one of related subsets or related supersets of electronic mails related to the received segment, wherein the related subsets and related supersets are threads of the segment received and are arranged in a hierarchical manner. | 03-15-2012 |
20120066227 | E-MAIL THREAD HIERARCHY DETECTION - A plurality of segments in an e-mail collection is generated by parsing content of e-mails. A corresponding segment signature is created for each segment, and a signature index is populated using the generated segment signatures. After a query e-mail is received, a plurality of query segments in the query e-mail is generated using content of the query e-mail, and a corresponding query segment signature is generated for each query segment. A query root segment is identified and a corresponding query root segment signature is generated. A set of root segment signatures of the signature index is identified, and the query root segment signature is compared with each root segment signature from the signature index. A subset of the signature index is identified using a match between the root segment signature and the query root segment signature. An e-mail thread hierarchy is built using the identified subset of the signature index. | 03-15-2012 |
20120150867 | CLUSTERING A COLLECTION USING AN INVERTED INDEX OF FEATURES - Provided are techniques for creating an inverted index for features of a set of data elements, wherein each of the data elements is represented by a vector of features, wherein the inverted index, when queried with a feature, outputs one or more data elements containing the feature. The features of the set of data elements are ranked. For each feature in the ranked list, the inverted index is queried for data elements having the feature and not having any previously selected feature and a cluster of the data elements is created based on results returned in response to the query. | 06-14-2012 |
20120191712 | CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A computer program product evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively. | 07-26-2012 |
20120191713 | CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A process for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively. | 07-26-2012 |
20120197892 | CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A computer system for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively. | 08-02-2012 |
20130018649 | System and a Method for Generating Semantically Similar Sentences for Building a Robust SLM (inventors: Om D. Deshmukh, New Delhi, IN; Sachindra Joshi, New Delhi, IN; Shajith I. Mohamed, Karnataka, IN; Ashish Verma, New Delhi, IN) - A system and method are described for generating semantically similar sentences for a statistical language model. A semantic class generator determines for each word in an input utterance a set of corresponding semantically similar words. A sentence generator computes a set of candidate sentences each containing at most one member from each set of semantically similar words. A sentence verifier grammatically tests each candidate sentence to determine a set of grammatically correct sentences semantically similar to the input utterance. The generated semantically similar sentences are not restricted to being selected from an existing sentence database. | 01-17-2013 |
20130339021 | Intent Discovery in Audio or Text-Based Conversation - Techniques, an apparatus and an article of manufacture for identifying one or more utterances that are likely to carry the intent of a speaker, from a conversation between two or more parties. A method includes obtaining an input of a set of utterances in chronological order from a conversation between two or more parties, computing an intent confidence value of each utterance by summing intent confidence scores from each of the constituent words of the utterance, wherein intent confidence scores capture each word's influence on the subsequent utterances in the conversation based on (i) the uniqueness of the word in the conversation and (ii) the number of times the word subsequently occurs in the conversation, and generating a ranked order of the utterances from highest to lowest intent confidence value, wherein the highest intent confidence value corresponds to the utterance which is most likely to carry intent of the speaker. | 12-19-2013 |
20140004495 | ENHANCING POSTED CONTENT IN DISCUSSION FORUMS | 01-02-2014 |
20140006524 | ENHANCING POSTED CONTENT IN DISCUSSION FORUMS | 01-02-2014 |
20140122492 | CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A method and system for evaluating cross-domain clusterability upon a target domain and a source domain. Target clusterability is calculated as an average of a respective clusterability of at least one target data item comprised by the target domain. Target-side matchability is calculated as an average of a respective matchability of each target centroid of the target domain to source centroids of the source domain, wherein the source domain comprises at least one source data item. Source-side matchability is calculated as an average of a respective matchability of each source centroid of said source centroids to the target centroids. Source-target pair matchability is calculated as an average of the target-side matchability and the source-side matchability. Cross-domain clusterability between the target domain and the source domain is calculated as a linear combination of the calculated target clusterability and the calculated source-target pair matchability. The cross-domain clusterability is transferred to a device. | 05-01-2014 |
20140180692 | INTENT MINING VIA ANALYSIS OF UTTERANCES - According to example configurations, a speech processing system can include a syntactic parser, a word extractor, word extraction rules, and an analyzer. The syntactic parser of the speech processing system parses the utterance to identify syntactic relationships amongst words in the utterance. The word extractor utilizes word extraction rules to identify groupings of related words in the utterance that most likely represent an intended meaning of the utterance. The analyzer in the speech processing system maps each set of the sets of words produced by the word extractor to a respective candidate intent value to produce a list of candidate intent values for the utterance. The analyzer is configured to select, from the list of candidate intent values (i.e., possible intended meanings) of the utterance, a particular candidate intent value as being representative of the intent (i.e., intended meaning) of the utterance. | 06-26-2014 |
20150026164 | Utilizing Dependency Among Internet Search Results - Techniques, systems, and articles of manufacture for utilizing dependency among internet search results. A method includes associating a user search query with a search task, identifying multiple information documents that correspond to the search task, and generating a recommended sequence of the multiple information documents to present to the user in response to the user search query, wherein the recommended sequence is based on dependency information associated with the multiple information documents. | 01-22-2015 |
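
The clustering loop summarized for application 20120150867 above (CLUSTERING A COLLECTION USING AN INVERTED INDEX OF FEATURES) can be illustrated with a minimal sketch. This is not the applicants' implementation. The abstract does not say how features are ranked, so ranking by the number of elements containing each feature (ties broken alphabetically) is an assumption made here, and the names `build_inverted_index`, `cluster_by_ranked_features` and the sample `docs` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(elements):
    """Map each feature to the set of element ids that contain it."""
    index = defaultdict(set)
    for elem_id, features in elements.items():
        for feature in features:
            index[feature].add(elem_id)
    return index


def cluster_by_ranked_features(elements):
    """Create one cluster per ranked feature from the inverted index."""
    index = build_inverted_index(elements)
    # Assumption: rank features by how many elements contain them,
    # most frequent first, ties broken alphabetically.
    ranked = sorted(index, key=lambda f: (-len(index[f]), f))
    covered = set()  # elements that contain some previously selected feature
    clusters = {}
    for feature in ranked:
        # Query: elements having this feature and no earlier selected feature.
        members = index[feature] - covered
        if members:
            clusters[feature] = members
            covered |= index[feature]
    return clusters


# Hypothetical sample data: element id -> set of features.
docs = {
    "d1": {"apple", "fruit"},
    "d2": {"banana", "fruit"},
    "d3": {"car", "vehicle"},
    "d4": {"truck", "vehicle"},
    "d5": {"apple", "car"},
}
clusters = cluster_by_ranked_features(docs)
```

Because each element is removed from consideration once one of its features has been selected, every element lands in exactly one cluster, labeled by the highest-ranked feature it contains.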