Class / Patent application number | Description | Number of patent applications / Date published |
704214000 | Voiced or unvoiced | 27 |
20080243494 | DIALOG DETECTING APPARATUS, DIALOG DETECTING METHOD, AND COMPUTER PROGRAM PRODUCT - A speech receiving unit receives a user ID, a speech obtained at a terminal, and an utterance duration, from the terminal. A proximity determining unit calculates a correlation value expressing a correlation between speeches received from plural terminals, compares the correlation value with a first threshold value, and determines that the plural terminals that receive the speeches whose correlation value is calculated are close to each other, when the correlation value is larger than the first threshold value. A dialog detecting unit determines whether a relationship between the utterance durations received from the plural terminals that are determined to be close to each other within an arbitrarily target period during which a dialog is to be detected fits a rule. When the relationship is determined to fit the rule, the dialog detecting unit detects dialog information containing the target period and the user ID. | 10-02-2008 |
20080243495 | Adaptive Voice Playout in VOP - Packetized CELP-encoded speech playout with frame truncation during silence and frame expansion method dependent upon voicing classification with voiced frame expansion maintaining phase alignment. | 10-02-2008 |
20080281586 | REAL-TIME DETECTION AND PRESERVATION OF SPEECH ONSET IN A SIGNAL - A “speech onset detector” provides a variable length frame buffer in combination with either variable transmission rate or temporal speech compression for buffered signal frames. The variable length buffer buffers frames that are not clearly identified as either speech or non-speech frames during an initial analysis. Buffering of signal frames continues until a current frame is identified as either speech or non-speech. If the current frame is identified as non-speech, buffered frames are encoded as non-speech frames. However, if the current frame is identified as a speech frame, buffered frames are searched for the actual onset point of the speech. Once that onset point is identified, the signal is either transmitted in a burst, or a time-scale modification of the buffered signal is applied for compressing buffered frames beginning with the frame in which onset point is detected. The compressed frames are then encoded as one or more speech frames. | 11-13-2008 |
20080288247 | Speech Activity Detection - A method for detecting the presence or absence of an audio signal in a communications system in which an audio signal is encoded by a delta modulation encoding algorithm, and in which a step size parameter is adapted according to characteristics of the encoded signal, the method comprising determining based on the magnitude of the step size parameter whether the encoded signal represents audio activity, and adapting the operation of the communication system based on that determination. | 11-20-2008 |
20090292532 | RECOGNITION PROCESSING OF A PLURALITY OF STREAMING VOICE SIGNALS FOR DETERMINATION OF A RESPONSIVE ACTION THERETO - Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage. In this manner, the present invention simplifies the storage requirements for contact centers and provides the opportunity to improve caller experiences by providing shorter reaction times to potentially problematic situations. | 11-26-2009 |
20090292533 | TREATMENT PROCESSING OF A PLURALITY OF STREAMING VOICE SIGNALS FOR DETERMINATION OF A RESPONSIVE ACTION THERETO - Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage. In this manner, the present invention simplifies the storage requirements for contact centers and provides the opportunity to improve caller experiences by providing shorter reaction times to potentially problematic situations. | 11-26-2009 |
20090306975 | SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS COMMUNICATION NETWORKS - A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions. | 12-10-2009 |
20090306976 | SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS COMMUNICATION NETWORKS - A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions. | 12-10-2009 |
20100198590 | VOICE AND DATA EXCHANGE OVER A PACKET BASED NETWORK WITH VOICE DETECTION - A signal processing system which discriminates between voice signals and data signals modulated by a voiceband carrier. The signal processing system includes a voice exchange, a data exchange and a call discriminator. The voice exchange is capable of exchanging voice signals between a switched circuit network and a packet based network. The signal processing system also includes a data exchange capable of exchanging data signals modulated by a voiceband carrier on the switched circuit network with unmodulated data signal packets on the packet based network. The data exchange is performed by demodulating data signals from the switched circuit network for transmission on the packet based network, and modulating data signal packets from the packet based network for transmission on the switched circuit network. The call discriminator is used to selectively enable the voice exchange and data exchange. | 08-05-2010 |
20100211385 | IMPROVED VOICE ACTIVITY DETECTOR - The present invention relates to a voice activity detector (VAD) comprising at least a first primary voice detector. The voice activity detector is configured to output a speech decision “vad_flag” indicative of the presence of speech in an input signal based on at least a primary speech decision “vad_prim_A” produced by said first primary voice detector. The voice activity detector further comprises a short term activity detector and the voice activity detector is further configured to produce a music decision “vad_music” indicative of the presence of music in the input signal based on a short term primary activity signal “vad_act_prim_A” produced by said short term activity detector based on the primary speech decision “vad_prim_A” produced by the first voice detector. The short term primary activity signal “vad_act_prim_A” is proportional to the presence of music in the input signal. The invention also relates to a node, e.g. a terminal, in a communication system comprising such a VAD. | 08-19-2010 |
20100268532 | SYSTEM, METHOD AND PROGRAM FOR VOICE DETECTION - A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined. By determining the voiced interval duration threshold value and the non-voiced interval duration threshold value, using the feature value found on a per frame basis and the threshold value for the feature value, the constraint of the shaping rule may be made weaker, or stronger in case the feature value found on a per frame basis can be regarded as being reliable or not, thereby allowing voice detection to be made without dependency upon a noise environment. | 10-21-2010 |
20100280824 | Wind Suppression/Replacement Component for use with Electronic Systems - Systems and methods to reduce the negative impact of wind on an electronic system include use of a first detector that receives a first signal and a second detector that receives a second signal. A voice activity detector (VAD) coupled to the first detector generates a VAD signal when the first signal corresponds to voiced speech. A wind detector coupled to the second detector correlates signals received at the second detector and derives from the correlation wind metrics that characterize wind noise that is acoustic disturbance corresponding to at least one of air flow and air pressure in the second detector. The wind detector controls a configuration of the second detector according to the wind metrics. The wind detector uses the wind metrics to dynamically control mixing of the first signal and the second signal to generate an output signal for transmission. | 11-04-2010 |
20100318351 | SYSTEM AND METHOD FOR OBTAINING A MESSAGE TYPE IDENTIFIER THROUGH AN IN-BAND MODEM - A system and method is provided for obtaining a message type identifier through a speech codec (in-band) such as found in a wireless communication network. A first predetermined sequence with noise-like characteristics is detected and identifies a first message type. A second predetermined sequence with noise-like characteristics is detected and identifies a second message type. | 12-16-2010 |
20100332222 | INTELLIGENT CLASSIFICATION METHOD OF VOCAL SIGNAL - An intelligent classification method is proposed. The method extracts vocal features from the temporal domain, spectral domain and statistical features for measuring the vocal signal. The measured result is grouped by comparing with the trained data with single voiced source, and then different voices can be separated from the vocal signal to be classified. The vocal features are evaluated from temporal domain and spectral domain and the statistical features, and the method can improve the accuracy of the voice classification. | 12-30-2010 |
20110004468 | HEARING AID AND HEARING-AID PROCESSING METHOD - A hearing aid for improving diminished hearing caused by reduced temporal resolution includes: a speech input unit ( | 01-06-2011 |
20110106531 | PROGRAM ENDPOINT TIME DETECTION APPARATUS AND METHOD, AND PROGRAM INFORMATION RETRIEVAL SYSTEM - This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program. In addition, this invention also provides a program information retrieval system. With present invention, program information regarding a program attended by user can be rapidly obtained. | 05-05-2011 |
20110196675 | OPERATING METHOD FOR VOICE ACTIVITY DETECTION/SILENCE SUPPRESSION SYSTEM - A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel. | 08-11-2011 |
20110257966 | SYSTEM AND METHOD OF PROVIDING VOICE UPDATES - A method of providing voice updates is disclosed and may include receiving a voice update. The method may also include scheduling a voice update window. The voice update window may be a predetermined time window in which a voice update is broadcast. | 10-20-2011 |
20120084082 | Adaptive Voice Activity Detection - Encoding audio signals with selecting an encoding mode for encoding the signal categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode and encoding at least the active segments using the selected encoding mode. | 04-05-2012 |
20120253796 | SPEECH INPUT DEVICE, METHOD AND PROGRAM, AND COMMUNICATION APPARATUS - A sound is picked up by a microphone. A speech waveform signal is generated based on the picked up sound. A speech segment or a non-speech segment is detected based on the speech waveform signal. The speech segment corresponds to a voice input period during which a voice is input. The non-speech segment corresponds to a non-voice input period during which no voice is input. A determination signal is generated that indicates whether the picked up sound is the speech segment or the non-speech segment. A detected state of the speech segment is indicated based on the determination signal. | 10-04-2012 |
20130138433 | Switching Off DTX for Music - The invention relates to a method for disabling a discontinuous transmission mode DTX of a speech encoder if a music signal is detected in a call input signal. The music signal is detected by determining an activity factor corresponding to the relation of sound signal periods relative to scheme signal periods. If the activity factor is higher than a specified activity factor, the DTX is disabled. | 05-30-2013 |
20130151246 | ADAPTIVE VOICE ACTIVITY DETECTION - Encoding audio signals with selecting an encoding mode for encoding the signal categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode and encoding at least the active segments using the selected encoding mode. | 06-13-2013 |
20130231928 | SOUND SYNTHESIZING APPARATUS, SOUND PROCESSING APPARATUS, AND SOUND SYNTHESIZING METHOD - A sound synthesizing apparatus includes a waveform storing section which stores a plurality of unit waveforms extracted from different positions, on a time axis, of a sound waveform indicating a voiced sound, and a waveform generating section which generates a synthesized waveform by arranging the plurality of unit waveforms on the time axis. | 09-05-2013 |
20140088958 | SYSTEM AND METHOD FOR SPEECH SYNTHESIS - The present invention is a method and system to convert speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal thereof. The speech signal is first segmented into non-overlapping frames using the glottal closure instant information, each frame is converted into an amplitude spectrum using a Fourier analyzer, and then using Laguerre functions to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subject to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition. | 03-27-2014 |
20150073783 | Unvoiced/Voiced Decision for Speech Processing - In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter. | 03-12-2015 |
20160189107 | APPARATUS AND METHOD FOR AUTOMATICALLY CREATING AND RECORDING MINUTES OF MEETING - A computing device for automatically acquiring and revising minutes of a meeting and a method thereof includes the steps of: identifying one or more silences or notional silences (unvoiced segments) in voice data; determining a segment as being a satisfactory unvoiced segment if the gap of silence lasts for a time period equal to or larger than a predetermined period; dividing the audio data or text representing the audio data into one or more passages of text according to the satisfactory unvoiced segment, and creating an original minutes of the meeting according to the audio data or the representative text being divided into passages and a meeting minutes template stored in the non-transitory storage medium. | 06-30-2016 |
20160203834 | METHOD AND APPARATUS FOR EXEMPLARY SEGMENT CLASSIFICATION | 07-14-2016 |
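
Illustrative note on application 20150073783 (Unvoiced/Voiced Decision for Speech Processing): the abstract describes three steps, namely a per-frame unvoicing/voicing parameter, a smoothed version carrying information from earlier frames, and a decision driven by their difference. The Python sketch below is one possible reading of those steps, not the method claimed in the application. The exponential smoothing, the `alpha` and `threshold` values, the hold-previous-decision rule, and the function name `classify_frames` are all assumptions made for illustration. The input is assumed to be a parameter that grows when a frame is more unvoiced.

```python
from typing import Iterable, List


def classify_frames(
    params: Iterable[float],
    alpha: float = 0.9,             # assumed smoothing factor (not from the abstract)
    threshold: float = 0.1,         # assumed decision margin (not from the abstract)
    initial_unvoiced: bool = True,  # assumed starting state
) -> List[bool]:
    """Return one flag per frame: True = unvoiced, False = voiced.

    `params` holds a hypothetical per-frame unvoicing parameter; larger
    values are assumed to indicate more unvoiced-like speech.
    """
    decisions: List[bool] = []
    smoothed = None
    unvoiced = initial_unvoiced
    for p in params:
        # Smoothed parameter: exponential average over current and prior frames.
        if smoothed is None:
            smoothed = p
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * p
        # The difference between the raw and smoothed values drives the decision.
        diff = p - smoothed
        if diff > threshold:
            unvoiced = True       # parameter jumped above its recent trend
        elif diff < -threshold:
            unvoiced = False      # parameter dropped below its recent trend
        # Otherwise keep the previous decision (assumed hysteresis).
        decisions.append(unvoiced)
    return decisions


if __name__ == "__main__":
    frames = [0.2, 0.2, 0.9, 0.9, 0.1, 0.1]
    print(classify_frames(frames, alpha=0.5, threshold=0.1, initial_unvoiced=False))
```

Holding the previous decision when the difference is small is a design choice of this sketch: it avoids flipping the label on every frame when the parameter hovers near its smoothed value.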