Clustering

Subclass of:

704 - Data processing: speech signal processing, linguistics, language translation, and audio compression/decompression

704200000 - SPEECH SIGNAL PROCESSING

704231000 - Recognition

704243000 - Creating patterns for matching

Patent class list (only not empty are listed)

Deeper subclasses:

Class / Patent application number	Description	Number of patent applications / Date published
704245000	Clustering	33
20080319746	KEYWORD OUTPUTTING APPARATUS AND METHOD - A keyword analysis device obtains word vectors represented by the documents by analyzing keywords contained in each of documents input in a designated period. A topic cluster extraction device extracts topic clusters belonging to the same topic from a plurality of documents. A keyword extraction device extracts, as a characteristic keyword group, a predetermined number of keywords from the topic cluster in descending order of appearance frequency. A topic structurization determination device determines whether the topic can be structurized, by segmenting the topic cluster into subtopic clusters with reference to the number of documents, the variance of dates contained in the documents, or the C-value of keyword contained in the documents, as a determination criterion. And a keyword presentation device presents the characteristic keyword group in the subtopic cluster upon arranging the keyword group on the basis of the date information.	12-25-2008
20080319747	SPOKEN MAN-MACHINE INTERFACE WITH SPEAKER IDENTIFICATION - The method of operating a man-machine interface unit includes classifying at least one utterance of a speaker to be of a first type or of a second type. If the utterance is classified to be of the first type, the utterance belongs to a known speaker of a speaker data base, and if the utterance is classified to be of the second type, the utterance belongs to an unknown speaker that is not included in the speaker data base. The method also includes storing a set of utterances of the second type, clustering the set of utterances into clusters, wherein each cluster comprises utterances having similar features, and automatically adding a new speaker to the speaker data base based on utterances of one of the clusters.	12-25-2008
20090106023	Speech recognition word dictionary/language model making system, method, and program, and speech recognition system - A speech recognition word dictionary/language model making system for creating a word dictionary for recognizing a word not appearing in a learning text by selecting a word-generation-model-learning-method-by-word-class according to the word to be added which does not appear in the learning text and for making a language model. The speech recognition word dictionary/language model making system (	04-23-2009
20090112588	METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVSED AND SEMI-SUPERVISED TECHNIQUES - A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a specified number of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters within sequences of the collection.	04-30-2009
20100138223	SPEECH CLASSIFICATION APPARATUS, SPEECH CLASSIFICATION METHOD, AND SPEECH CLASSIFICATION PROGRAM - An object of the present invention is to allow classification of sequentially input speech signals with good accuracy based on similarity of speakers and environments by using a realistic memory use amount, a realistic processing speed, and an on-line operation. A speech classification probability calculation means	06-03-2010
20100217593	Program for creating Hidden Markov Model, information storage medium, system for creating Hidden Markov Model, speech recognition system, and method of speech recognition - A program for generating Hidden Markov Models to be used for speech recognition with a given speech recognition system, the information storage medium storing a program, that renders a computer to function as a scheduled-to-be-used model group storage section that stores a scheduled-to-be-used model group including a plurality of Hidden Markov Models scheduled to be used by the given speech recognition system, and a filler model generation section that generates Hidden Markov Models to be used as filler models by the given speech recognition system based on all or at least a part of the Hidden Markov Model group in the scheduled-to-be-used model group.	08-26-2010
20100241430	SYSTEM AND METHOD FOR USING META-DATA DEPENDENT LANGUAGE MODELING FOR AUTOMATIC SPEECH RECOGNITION - Disclosed are systems and methods for providing a spoken dialog system using meta-data to build language models to improve speech processing. Meta-data is generally defined as data outside received speech; for example, meta-data may be a customer profile having a name, address and purchase history of a caller to a spoken dialog system. The method comprises building tree clusters from meta-data and estimating a language model using the built tree clusters. The language model may be used by various modules in the spoken dialog system, such as the automatic speech recognition module and/or the dialog management module. Building the tree clusters from the meta-data may involve generating projections from the meta-data and further may comprise computing counts as a result of unigram tree clustering and then building both unigram trees and higher-order trees from the meta-data as well as computing node distances within the built trees that are used for estimating the language model.	09-23-2010
20110161081	Speech Recognition Language Models - Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing at queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.	06-30-2011
20120123780	Method and system for video summarization - A video summary method comprises dividing a video into a plurality of video shots, analyzing each frame in a video shot from the plurality of video shots, determining a saliency of each frame of the video shot, determining a key frame of the video shot based on the saliency of each frame of the video shot, extracting visual features from the key frame and performing shot clustering of the plurality of video shots to determine concept patterns based on the visual features. The method further comprises fusing different concept patterns using a saliency tuning method and generating a summary of the video based upon a global optimization method.	05-17-2012
20130006633	LEARNING SPEECH MODELS FOR MOBILE DEVICE USERS - Techniques are provided to recognize a speaker's voice. In one embodiment, received audio data may be separated into a plurality of signals. For each signal, the signal may be associated with value/s for one or more features (e.g., Mel-Frequency Cepstral coefficients). The received data may be clustered (e.g., by clustering features associated with the signals). A predominate voice cluster may be identified and associated with a user. A speech model (e.g., a Gaussian Mixture Model or Hidden Markov Model) may be trained based on data associated with the predominate cluster. A received audio signal may then be processed using the speech model to, e.g.: determine who was speaking; determine whether the user was speaking; determining whether anyone was speaking; and/or determine what words were said. A context of the device or the user may then be inferred based at least partly on the processed signal.	01-03-2013
20130006634	IDENTIFYING PEOPLE THAT ARE PROXIMATE TO A MOBILE DEVICE USER VIA SOCIAL GRAPHS, SPEECH MODELS, AND USER CONTEXT - Techniques are provided to improve identification of a person using speaker recognition. In one embodiment, a unique social graph may be associated with each of a plurality of defined contexts. The social graph may indicate speakers likely to be present in a particular context. Thus, an audio signal including a speech signal may be collected and processed. A context may be inferred, and a corresponding social graph may be identified. A set of potential speakers may be determined based on the social graph. The processed signal may then be compared to a restricted set of speech models, each speech model being associated with a potential speaker. By limiting the set of potential speakers, speakers may be more accurately identified.	01-03-2013
20130006635	METHOD AND SYSTEM FOR SPEAKER DIARIZATION - A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.	01-03-2013
20130006636	MEANING EXTRACTION SYSTEM, MEANING EXTRACTION METHOD, AND RECORDING MEDIUM - A meaning extraction device includes a clustering unit, an extraction rule generation unit and an extraction rule application unit. The clustering unit acquires feature vectors that transform numerical features representing the features of words having specific meanings and the surrounding words into elements, and clusters the acquired feature vectors into a plurality of clusters on the basis of the degree of similarity between feature vectors. The extraction rule generation unit performs machine learning based on the feature vectors within a cluster for each cluster, and generates extraction rules to extract words having specific meanings. The extraction rule application unit receives feature vectors generated from the words in documents which are subject to meaning extraction, specifies the optimum extraction rules for the feature vectors, and extracts the meanings of the words on the basis of which the feature vectors were generated by applying the specified extraction rules to the feature vectors.	01-03-2013
20130191126	Subword-Based Multi-Level Pronunciation Adaptation for Recognizing Accented Speech - Techniques are described for training a speech recognition model for accented speech. A subword parse table is employed that models mispronunciations at multiple subword levels, such as the syllable, position-specific cluster, and/or phone levels. Mispronunciation probability data is then generated at each level based on inputted training data, such as phone-level annotated transcripts of accented speech. Data from different levels of the subword parse table may then be combined to determine the accented speech model. Mispronunciation probability data at each subword level is based at least in part on context at that level. In some embodiments, phone-level annotated transcripts are generated using a semi-supervised method.	07-25-2013
20130253931	MODELING DEVICE AND METHOD FOR SPEAKER RECOGNITION, AND SPEAKER RECOGNITION SYSTEM - A modeling device and method for speaker recognition and a speaker recognition system are provided. The modeling device comprises a front end which receives enrollment speech data from each target speaker, a reference anchor set generation unit which generates a reference anchor set using the enrollment speech data based on an anchor space, and a voice print generation unit which generates voice prints based on the reference anchor set and the enrollment speech data. With the present disclosure, by taking the enrollment speech and speaker adaptation technique into account, anchor models with smaller size can be generated, so reliable and robust speaker recognition with smaller size reference anchor set is possible. It brings great advantages for computation speed improvement and great memory reduction.	09-26-2013
20130311183	VOICED SOUND INTERVAL DETECTION DEVICE, VOICED SOUND INTERVAL DETECTION METHOD AND VOICED SOUND INTERVAL DETECTION PROGRAM - This invention provides a voiced sound interval detection device which enables appropriate detection of a voiced sound interval of an observation signal even when a volume of sound from a sound source varies or when the number of sound sources is unknown or when different kinds of microphones are used together.	11-21-2013
20130325472	METHODS AND APPARATUS FOR PERFORMING TRANSFORMATION TECHNIQUES FOR DATA CLUSTERING AND/OR CLASSIFICATION - Some aspects include transforming data, at least a portion of which has been processed to determine frequency information associated with features in the data. Techniques include determining a first transformation based, at least in part, on the frequency information, applying at least the first transformation to the data to obtain transformed data, and fitting a plurality of clusters to the transformed data to obtain a plurality of established clusters. Some aspects include classifying input data by transforming the input data using at least the first transformation and comparing the transformed input data to the established clusters.	12-05-2013
20140195232	Methods, systems, and circuits for text independent speaker recognition with automatic learning features - Embodiments provide a method and system of text independent speaker recognition with a complexity comparable to a text dependent version. The scheme exploits the fact that speech is a quasi-stationary signal and simplifies the recognition process based on this theory. The modeling allows the speaker profile to be updated progressively with the new speech sample that is acquired during usage time.	07-10-2014
20140303978	GRAMMAR FRAGMENT ACQUISITION USING SYNTACTIC AND SEMANTIC CLUSTERING - A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these tree probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.	10-09-2014
20140316784	UPDATING POPULATION LANGUAGE MODELS BASED ON CHANGES MADE BY USER CLUSTERS - Technology for improving the predictive accuracy of input word recognition on a device by dynamically updating the lexicon of recognized words based on the word choices made by similar users. The technology collects users' vocabulary choices (e.g., words that each user uses, or adds to or removes from a word recognition dictionary), associates users who make similar choices, aggregates related vocabulary choices, filters the words, and sends words identified as likely choices for that user to the user's device. Clusters may include, for example, users in a particular location (e.g., sets of people who use words such as “Puyallup,” “Gloucester,” or “Waiheke”), users with a particular professional or hobby vocabulary, or application-specific vocabulary (e.g., word choices in map searches or email messages).	10-23-2014
20140337027	VOICE PROCESSING DEVICE, VOICE PROCESSING METHOD, AND NON-TRANSITORY RECORDING MEDIUM THAT STORES PROGRAM - A voice processing device includes: an acquirer which acquires feature quantities of vowel sections included in voice data; a classifier which classifies, among the acquired feature quantities, feature quantities corresponding to a plurality of same vowels into a plurality of clusters for respective vowels with unsupervised classification; and a determiner which determines a combination of clusters corresponding to the same speaker from clusters classified for the plurality of vowels.	11-13-2014
20140358541	Method and Apparatus for Automatic Speaker-Based Speech Clustering - Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.	12-04-2014
20150025887	Blind Diarization of Recorded Calls with Arbitrary Number of Speakers - In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.	01-22-2015
20150032452	SYSTEM AND METHOD FOR DISCOVERING AND EXPLORING CONCEPTS - A method for identifying concepts in a plurality of interactions includes: filtering, on a processor, the interactions based on intervals; creating, on the processor, a plurality of sentences from the filtered interactions; computing, on the processor, a saliency of each the sentences; pruning away, on the processor, sentences with low saliency for generating a set of informative sentences; clustering, on the processor, the sentences of the set of informative sentences for generating a plurality of sentence clusters, each of the clusters corresponding to a concept of the concepts; computing, on the processor, a saliency of each of the clusters; and naming, on the processor, each of the clusters.	01-29-2015
20150051910	Unsupervised Clustering of Dialogs Extracted from Released Application Logs - A natural language understanding system performs automatic unsupervised clustering of dialog data from a natural language dialog application. A log parser automatically extracts structured dialog data from application logs. A dialog generalizing module generalizes the extracted dialog data to generalization identifier vectors. A data clustering module automatically clusters the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold in an iterative approach based on a hierarchical ordering of the generalization.	02-19-2015
20150073798	AUTOMATIC GENERATION OF DOMAIN MODELS FOR VIRTUAL PERSONAL ASSISTANTS - Technologies for automatic domain model generation include a computing device that accesses an n-gram index of a web corpus. The computing device generates a semantic graph of the web corpus for a relevant domain using the n-gram index. The semantic graph includes one or more related entities that are related to a seed entity. The computing device performs similarity discovery to identify and rank contextual synonyms within the domain. The computing device maintains a domain model including intents representing actions in the domain and slots representing parameters of actions or entities in the domain. The computing device performs intent discovery to discover intents and intent patterns by analyzing the web corpus using the semantic graph. The computing device performs slot discovery to discover slots, slot patterns, and slot values by analyzing the web corpus using the semantic graph. Other embodiments are described and claimed.	03-12-2015
20150081298	SPEECH PROCESSING APPARATUS AND METHOD - In a speech processing apparatus, an acquisition unit is configured to acquire a speech. A separation unit is configured to separate the speech into a plurality of sections in accordance with a prescribed rule. A calculation unit is configured to calculate a degree of similarity in each combination of the sections. An estimation unit is configured to estimate, with respect to the each section, a direction of arrival of the speech. A correction unit is configured to group the sections whose directions of arrival are mutually similar into a same group and correct the degree of similarity with respect to the combination of the sections in the same group. A clustering unit is configured to cluster the sections by using the corrected degree of similarity.	03-19-2015
20150348571	SPEECH DATA PROCESSING DEVICE, SPEECH DATA PROCESSING METHOD, AND SPEECH DATA PROCESSING PROGRAM - A data processing device, method and non-transitory computer-readable storage medium are disclosed. A data processing device may include a memory storing instructions, and at least one processor configured to process the instructions to divide a first speech data into first segments based on a data structure of the first speech data, classify the first segments into first clusters through clustering, generate a first segment speech model for each of the first clusters, and calculate a similarity between the first segment speech models and a second speech data.	12-03-2015
20150356969	METHOD FOR RECOGNIZING STATISTICAL VOICE LANGUAGE - The present invention is a method for recognizing a statistical voice language using a statistical technique without using a manually tagged corpus. The method comprises: a dialog act clustering step of clustering speech utterances of sentences based on similar dialog acts; a named entity clustering step of extracting a named entity candidate group from the result of the dialog act clustering step and clustering named entities based on the neighboring contextual information of the extracted named entity candidate group; and a main act clustering step of clustering main acts for each region based on the clustered dialog acts and named entity.	12-10-2015
20160012818	SYSTEM AND METHOD FOR SEMANTICALLY EXPLORING CONCEPTS	01-14-2016
20160063995	DISPLAY APPARATUS AND METHOD FOR RECOGNIZING VOICE - A display apparatus which is capable of recognizing a voice and a method thereof are provided. The method includes receiving an uttered voice of a user, extracting a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and measuring reliability of a plurality of words based on the extracted voice information, setting a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice, and displaying at least one of the target word and a similar word list including similar words other than the target word. In this manner, a display apparatus may improve a recognition rate on an uttered voice of a user without changing an internal component related to voice recognition, such as, an acoustic model, a pronunciation dictionary, or the like.	03-03-2016
20160140959	SPEECH RECOGNITION SYSTEM ADAPTATION BASED ON NON-ACOUSTIC ATTRIBUTES - A method includes the following steps. A vicinity from which speech input to a speech recognition system originates is determined. Non-acoustic data from the vicinity of the speech is obtained using one or more non-acoustic sensors. A subject speaker is identified as the source of the speech input from the obtained non-acoustic data. One or more non-acoustic attributes of the subject speaker is analyzed. A speech recognition system is adjusted based on the one or more analyzed non-acoustic attributes.	05-19-2016
20160188712	RECIPE IDENTIFICATION METHOD AND APPARATUS - Disclosed embodiments include apparatus, method and storage medium associated with recipe identification. In embodiments, an apparatus may include a recipe identification function configured to receive or retrieve a text document, analyze the text document to identify a recipe, and return the identified recipe. Other embodiments may be described and claimed.	06-30-2016

Patent applications in class Clustering

Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees

Clustering

Subclass of:

704 - Data processing: speech signal processing, linguistics, language translation, and audio compression/decompression

704200000 - SPEECH SIGNAL PROCESSING

704231000 - Recognition

704243000 - Creating patterns for matching

Patent class list (only not empty are listed)

Deeper subclasses: