Class / Patent application number | Description | Number of patent applications / Date published |
704238000 | Distance | 14 |
20080255839 | Speech Recognition Circuit and Method - A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least one node or group of adjacent nodes of the lexical tree according to said scores. | 10-16-2008 |
20100324896 | TURN-TAKING CONFIDENCE - A method for managing interactive dialog between a machine and a user. In one embodiment, an interaction between the machine and the user is managed by determining at least one likelihood value which is dependent upon a possible speech onset of the user. In another embodiment, the likelihood value can be dependent on a model of a desire of the user for specific items, a model of an attention of the user to specific items, or a model of turn-taking cues. The values can be used to determine a mode confidence value that is used by the system to determine the nature of prompts provided to the user. | 12-23-2010 |
20110166857 | Human Voice Distinguishing Method and Device - A human voice distinguishing method and device are provided. The method involves: taking every n sampling points of the current frame of audio signals as one subsection, wherein n is a positive integer; judging whether two adjacent subsections have a transition relative to a distinguishing threshold, that is, whether the sliding maximum absolute values of the two adjacent subsections are respectively above and below the distinguishing threshold; and, if so, determining the current frame to be human voice. The sliding maximum absolute value of a subsection is obtained as follows: the maximum absolute intensity of the sampling points in the subsection is taken as the initial maximum absolute value of the subsection, and the maximum of the initial maximum absolute values of this subsection and the m subsections following it is taken as the sliding maximum absolute value of this subsection, wherein m is a positive integer. | 07-07-2011 |
20110172999 | System and Method for Building Emotional Machines - A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include differential statistics, joint statistics and distance statistics. | 07-14-2011 |
20120109650 | APPARATUS AND METHOD FOR CREATING ACOUSTIC MODEL - Disclosed herein are an apparatus and method for creating an acoustic model. The apparatus includes a binary tree creation unit, an information creation unit, and a binary tree reduction unit. The binary tree creation unit creates a binary tree by repeatedly merging a plurality of Gaussian components for each Hidden Markov Model (HMM) state of an acoustic model based on a distance measure reflecting a variation in likelihood score. The information creation unit creates information about the largest size of the acoustic model in accordance with a platform including a speech recognizer. The binary tree reduction unit reduces the binary tree in accordance with the information about the largest size of the acoustic model. | 05-03-2012 |
20120166194 | METHOD AND APPARATUS FOR RECOGNIZING SPEECH - Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model. The combination and synchronization unit combines results of the speech recognition for the frames with results of the speech recognition for the segments. | 06-28-2012 |
20120310646 | SPEECH RECOGNITION DEVICE AND SPEECH RECOGNITION METHOD - A speech recognition device and a speech recognition method thereof are disclosed. In the speech recognition method, a key phrase containing at least one key word is received. The speech recognition method comprises steps: receiving a sound source signal of a key word and generating a plurality of audio signals; transforming the audio signals into a plurality of frequency signals; receiving the frequency signals to obtain a space-frequency spectrum and an angular estimation value thereof; receiving the space-frequency spectrum to define and output at least one spatial eigenparameter, and using the angular estimation value and the frequency signals to perform spotting and evaluation and outputting a Bhattacharyya distance; and receiving the spatial eigenparameter and the Bhattacharyya distance and using corresponding thresholds to determine correctness of the key phrase. Thereby this invention robustly achieves a high speech recognition rate under very low SNR conditions. | 12-06-2012 |
20130166294 | Frame Erasure Concealment Technique for a Bitstream-Based Feature Extractor - A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique. | 06-27-2013 |
20130325468 | CONVERSATION MANAGEMENT METHOD, AND DEVICE FOR EXECUTING SAME - Disclosed are a conversation management method and a device for executing same. The device includes: a calculation unit for calculating the importance of an utterance intention, the similarity between utterance intentions, and the relative distance between utterance intentions using at least one of a plurality of utterance intentions in a corpus and an utterance intention in a sequence relationship with the at least one utterance intention; a similarity calculating unit for calculating the similarity between conversation flows by comparing a conversation flow obtained from a corpus and a conversation flow obtained from a user utterance by means of the importance and similarity of an utterance intention; and an utterance intention verifying unit for calculating an evaluation score of an utterance intention by evaluating a user utterance according to the relative distance between utterance intentions. Accordingly, the present invention enables the flow-processing of various conversations which are not processed in a typical information obtaining conversation device, and also determines the ranking of appropriate utterance intentions in each conversational situation by means of a score calculated by a ranking function in order to output the ranking. | 12-05-2013 |
20140025376 | METHOD AND APPARATUS FOR REAL TIME SALES OPTIMIZATION BASED ON AUDIO INTERACTIONS ANALYSIS - The subject matter discloses a computerized method for sales optimization comprising: receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion. | 01-23-2014 |
20140330564 | FRAME ERASURE CONCEALMENT TECHNIQUE FOR A BITSTREAM-BASED FEATURE EXTRACTOR - A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique. | 11-06-2014 |
20140330565 | Apparatus and Method for Model Adaptation for Spoken Language Understanding - An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label. | 11-06-2014 |
20160027444 | METHOD AND APPARATUS FOR DETECTING SPLICING ATTACKS ON A SPEAKER VERIFICATION SYSTEM - A method of detecting an occurrence of splicing in a speech signal includes comparing one or more discontinuities in the test speech signal to one or more reference speech signals corresponding to the test speech signal. The method may further include calculating a frame-based spectral-like representation S | 01-28-2016 |
20160133259 | DETERMINING HOTWORD SUITABILITY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, and providing a representation of the hotword suitability score for display to the user. | 05-12-2016 |
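
The frame test summarized in the abstract of application 20110166857 above can be sketched roughly as follows. This is an illustrative reading of the abstract only, not the applicants' implementation: the function names `sliding_max_abs` and `is_human_voice` are hypothetical, and two details the abstract does not specify are assumed here (a trailing subsection shorter than n samples is kept, and a subsection with fewer than m followers uses the followers that exist).

```python
from typing import List, Sequence


def sliding_max_abs(frame: Sequence[float], n: int, m: int) -> List[float]:
    """Sliding maximum absolute value of each n-sample subsection of a frame."""
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive integers")
    # Initial maximum absolute value of each subsection.
    # Assumption: a shorter trailing subsection is kept rather than dropped.
    initial = [
        max(abs(x) for x in frame[i:i + n])
        for i in range(0, len(frame), n)
    ]
    # Maximum over this subsection and the m subsections that follow it.
    # Assumption: near the end of the frame, only the existing followers count.
    return [max(initial[k:k + m + 1]) for k in range(len(initial))]


def is_human_voice(frame: Sequence[float], n: int, m: int,
                   threshold: float) -> bool:
    """True if two adjacent subsections lie on opposite sides of the threshold."""
    s = sliding_max_abs(frame, n, m)
    return any(
        (a > threshold and b < threshold) or (a < threshold and b > threshold)
        for a, b in zip(s, s[1:])
    )


if __name__ == "__main__":
    quiet_then_loud = [0, 0, 0, 0, 0, 0, 5, 5]
    print(is_human_voice(quiet_then_loud, n=2, m=1, threshold=1.0))
```

The comparison is kept strict on both sides, so a value exactly equal to the threshold does not count as a transition; the abstract leaves that case open.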