Entries |
Document | Title | Date |
20080208571 | Maximum-Likelihood Universal Speech Iconic Coding-Decoding System (MUSICS) - This application for patent describes an invention toward achieving potentially hundred- to thousand-fold enhancement in the efficiency of the utilization of frequency-bandwidth for digital transmission of speech. This invention is based on the observation that human speech can be assumed to be composed of a series of contiguous fundamental ‘phonic elements’ (“phonoms”) that could be judiciously used toward developing an extremely low bit-rate digital coding of the speech signals. A generic example of a simple implementation of this invention (the basic equipment and associated device(s), methodologies and technologies) for ultra-low bit-rate voice-telecommunications over any transmission channel is also presented. The present invention is universally applicable to any language of the world, and to voice-telecommunications employing various media and service-applications including, but not limited to, land-line copper-wire networks, satellite telephony, satellite radio, fiber-optical cables, terrestrial wireless, voice over Internet Protocols (VoIP), and similar media and services. | 08-28-2008 |
20080228471 | Intelligent solo-mute switching - Methods and apparatus are disclosed for approximating an MDCT coefficient of a block of windowed sinusoid having a defined frequency, the block being multiplied by a window sequence and having a block length and a block index. A finite trigonometric series is employed to approximate the window sequence. A window summation table is pre-computed using the finite trigonometric series and the defined frequency of the sinusoid. A block phase is computed for each block with the defined frequency, the block length and the block index. An MDCT coefficient is approximated by the dot product of a phase vector computed using the block phase with a corresponding row of the window summation table. | 09-18-2008 |
20080235007 | SYSTEM AND METHOD FOR ADDRESSING CHANNEL MISMATCH THROUGH CLASS SPECIFIC TRANSFORMS - A method and system for speaker recognition and identification includes transforming features of a speaker utterance in a first condition state to match a second condition state and provide a transformed utterance. A discriminative criterion is used to generate a transform that maps an utterance to obtain a computed result. The discriminative criterion is maximized over a plurality of speakers to obtain a best transform for recognizing speech and/or identifying a speaker under the second condition state. Speech recognition and speaker identity may be determined by employing the best transform for decoding speech to reduce channel mismatch. | 09-25-2008 |
20080249765 | Audio Signal Decoding Using Complex-Valued Data - A decoder particularly, but not exclusively, for MPEG-1 layer III data signals, in which recovered spectral coefficients are transformed into time domain signal components, the time domain signal components then being transformed, using a forward transform which is orthogonally modulated with respect to the forward transform that was used at the encoder, to produce a set of second spectral coefficients. In this way, the first and second spectral coefficients may be used as complex-valued spectral coefficients which are amenable to post-processing. In the preferred embodiment, the complex-valued frequency components are, after post-processing, transformed to the time domain using an odd-frequency modulated Discrete Fourier Transform (DFT). | 10-09-2008 |
20080249766 | Scalable Decoder And Expanded Layer Disappearance Hiding Method - A scalable decoder which does not frequently switch the band of the decoded signal even if the signal in an expanded layer in band scalable encoding disappears, and which does not give any strangeness or discomfort to the subjective quality. If frame disappearance does not occur, the signal is a signal (S | 10-09-2008 |
20080312912 | AUDIO SIGNAL ENCODING/DECODING METHOD AND APPARATUS - Provided is an audio signal encoding method including transforming an input signal from a time domain to a time/frequency domain using a first transformation method, extracting a stereo parameter from a signal of the time/frequency domain, encoding the stereo parameter, and down-mixing the signal of the time/frequency domain, transforming each of sub-bands of the down-mixed signal to a frequency domain by using a second transformation method, and encoding the signal of the frequency domain in the frequency domain. | 12-18-2008 |
20090018824 | AUDIO ENCODING DEVICE, AUDIO DECODING DEVICE, AUDIO ENCODING SYSTEM, AUDIO ENCODING METHOD, AND AUDIO DECODING METHOD - Provided is an audio encoding device for modeling a spectrum waveform and accurately restoring the spectrum waveform. The audio encoding device includes: an FFT unit ( | 01-15-2009 |
20090030676 | METHOD OF DERIVING A COMPRESSED ACOUSTIC MODEL FOR SPEECH RECOGNITION - A method of deriving a compressed acoustic model for speech recognition is disclosed herein. In a described embodiment, the method comprises transforming an acoustic model into an eigenspace at step | 01-29-2009 |
20090043566 | SPEECH PROCESSING APPARATUS AND METHOD - A speech processing apparatus includes a plurality of microphones which receive speech produced by a first sound source to obtain first speech signals for a plurality of channels having one-to-one correspondence with the plurality of microphones, a calculation unit configured to calculate a first characteristic amount indicative of an inter-channel correlation of the first speech signals, a storage unit configured to store in advance a second characteristic amount indicative of an inter-channel correlation of second speech signals for the plurality of channels obtained by receiving speech produced by a second sound source by the plurality of microphones, and a collation unit configured to collate the first characteristic amount with the second characteristic amount to determine whether the first sound source matches with the second sound source. | 02-12-2009 |
20090076804 | ASSISTIVE LISTENING SYSTEM WITH MEMORY BUFFER FOR INSTANT REPLAY AND SPEECH TO TEXT CONVERSION - A portable assistive listening system for enhancing sound for hearing impaired individuals includes a fully functional hearing aid and a separate handheld digital signal processing (DSP) device. The focus of the present invention is directed to the handheld DSP device. The DSP device includes a programmable digital signal processor, a UWB transceiver for communicating with the hearing aid and/or other wireless audio sources, an LCD display, a user input device (keypad) and at least one memory device for storing programming settings and data. Specifically, the invention focuses on a memory buffer within the DSP device, and configuration of the device to buffer an incoming audio signal, to enhance the audio for replay, and to replay that audio. The device is further configured to convert the audio signal to text for display on the handheld DSP device. The device enhances the audio and then buffers the enhanced audio stream prior to output. | 03-19-2009 |
20090094022 | APPARATUS FOR CREATING SPEAKER MODEL, AND COMPUTER PROGRAM PRODUCT - A transformation-parameter calculating unit calculates a first model parameter indicating a parameter of a speaker model that maximizes a first likelihood for a clean feature, and calculates a transformation parameter that maximizes the first likelihood. The transformation parameter transforms, for each of the speakers, a distribution of the clean feature corresponding to the identification information of the speaker to a distribution represented by the speaker model of the first model parameter. A model-parameter calculating unit transforms a noisy feature corresponding to identification information for each of the speakers by using the transformation parameter, and calculates a second model parameter indicating a parameter of the speaker model that maximizes a second likelihood for the transformed noisy feature. | 04-09-2009 |
20090132241 | METHOD AND SYSTEM FOR REDUCING A VOICE SIGNAL NOISE - A method is provided whereby, before being subjected to a low rate voice coding, an incoming digital voice signal is chronologically segmented into blocks, the blocks are broken down respectively, in chronological order, into frequency components by a transformation in the frequency range and the frequency components are multiplied by weight factors depending on the frequency and modifiable in time, a frequency component being multiplied by the last weight factor calculated for the frequency component if the factor is less than the current weight factor. | 05-21-2009 |
20090150143 | MDCT domain post-filtering apparatus and method for quality enhancement of speech - A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality. | 06-11-2009 |
20090157393 | ENCODING DEVICE AND DECODING DEVICE - An encoding device ( | 06-18-2009 |
20090210219 | APPARATUS AND METHOD FOR CODING AND DECODING RESIDUAL SIGNAL - Provided is a residual signal coding/decoding apparatus and method. The residual signal coding apparatus includes a transformer, a band splitter, a pulse searcher, and a pulse quantizer. The transformer transforms time-domain residual signals into a frequency domain to output transform coefficients. The band splitter splits the transform coefficients into bands to output the transform coefficients. The pulse searcher searches the transform coefficients for the respective bands to select optimal pulses and output parameters of the optimal pulses. The pulse quantizer quantizes the parameters of the optimal pulses. | 08-20-2009 |
20090222258 | VOICE ACTIVITY DETECTION SYSTEM, METHOD, AND PROGRAM PRODUCT - A voice activity detection method in a low SNR environment. The voice activity detection is performed by extracting a long-term spectrum variation component and a harmonic structure as feature vectors from a speech signal and increasing difference in feature vectors between speech and non-speech (i) using the long-term spectrum variation component feature or (ii) using a long-term spectrum variation component extraction and a harmonic structure feature extraction. A correct rate and an accuracy rate of the voice activity detection are improved over conventional methods by using a long-term spectrum variation component having a window length over an average phoneme duration of an utterance in the speech signal. The voice activity detection system and method provide speech processing, automatic speech recognition, and speech output capable of very accurate voice activity detection. | 09-03-2009 |
20090234644 | Low-complexity encoding/decoding of quantized MDCT spectrum in scalable speech and audio codecs - A scalable speech and audio codec is provided that implements combinatorial spectrum encoding. A residual signal is obtained from a Code Excited Linear Prediction (CELP)-based encoding layer, where the residual signal is a difference between an original audio signal and a reconstructed version of the original audio signal. The residual signal is transformed at a Discrete Cosine Transform (DCT)-type transform layer to obtain a corresponding transform spectrum having a plurality of spectral lines. The transform spectrum spectral lines are transformed using a combinatorial position coding technique. | 09-17-2009 |
20090306971 | AUDIO SIGNAL QUALITY ENHANCEMENT APPARATUS AND METHOD - An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal. | 12-10-2009 |
20090306972 | Dropout Concealment for a Multi-Channel Arrangement - A method conceals dropouts in one or more audio channels of a multi-channel arrangement. The method maps transmitted signals into a frequency domain during an error-free signal transmission of two or more channels. Magnitude spectra and spectral filter coefficients are derived. The spectral filter coefficients relate the magnitude spectrum of the audio channel to the magnitude spectrum of at least one other channel. When a dropout occurs, a replacement signal is generated through the filter coefficients and a substitution signal. The filter coefficients may be generated prior to the detection of the dropout. | 12-10-2009 |
20090313009 | Method for Trained Discrimination and Attenuation of Echoes of a Digital Signal in a Decoder and Corresponding Device - The invention concerns a method for trained discrimination and attenuation of echoes of a digital audio signal generated from a transform coding, which consists, for each current frame of the signal, in comparing (A) in real time, in at least one frequency band, a variable derived from one characteristic of the echo generating signal with that of a non-echo generating signal at a threshold value, and deducing therefrom (B) the existence or non-existence (C) of an echo derived from the transform coding, discriminating the existence of the echo and defining (D) a false alarm zone in the high-energy parts of the digital audio signal, determining an initial processing and attenuating the echoes (E) in the parts complementary to the low-energy false alarm zone and inhibiting (F) the attenuation of echoes in the false alarm zone. The invention is applicable to the technology of coders/decoders, in particular hierarchical coders/decoders. | 12-17-2009 |
20100010808 | Method, Apparatus and Computer Program for Suppressing Noise - To provide a noise suppressing method and apparatus capable of achieving high-quality noise suppression using a lower amount of operations. Noise contained in an input signal is suppressed by transforming the input signal into frequency-domain signals; integrating bands of the frequency-domain signals to determine integrated frequency-domain signals; determining estimated noise based on the integrated frequency-domain signals; determining spectral gains based on the estimated noise and said integrated frequency-domain signals; and weighting said frequency-domain signals by the spectral gains. | 01-14-2010 |
20100070268 | MULTIMODAL UNIFICATION OF ARTICULATION FOR DEVICE INTERFACING - A system for a multimodal unification of articulation includes a voice signal modality to receive a voice signal, and a control signal modality which receives an input from a user and generates a control signal from the input which is selected from predetermined inputs directly corresponding to the phonetic information. The interactive voice based phonetic input system also includes a multimodal integration system to receive and integrate the voice signal and the control signal. The multimodal integration system delimits a context of a spoken utterance of the voice signal by using the control signal to preprocess and discretize into phonetic frames. A voice recognizer analyzes the voice signal integrated with the control signal to output a voice recognition result. This new paradigm helps overcome constraints found in interfacing mobile devices. Context information facilitates the handling of the commands in the application environment. | 03-18-2010 |
20100076754 | LOW-DELAY TRANSFORM CODING USING WEIGHTING WINDOWS - The invention relates to transform coding/decoding of a digital audio signal represented by a succession of frames, using windows of different lengths. For the coding within the meaning of the invention, it is sought to detect ( | 03-25-2010 |
20100082335 | SYSTEM AND METHOD FOR TRANSMITTING AND RECEIVING WIDEBAND SPEECH SIGNALS - The system for transmitting and receiving a wideband speech signal includes an A/D converter for receiving an analog speech signal to convert it into a digital speech signal, a transmitter analysis filter for receiving the digital speech signal and dividing it into a baseband signal and an enhancement residual band signal, a standard baseband encoder for accepting the baseband signal and coding it using an ITU-T encoder, an additional baseband encoder for reducing standard coding distortion in the baseband signal, an enhancement residual band encoder for coding a signal obtained by removing the coded baseband signal from the original digital speech signal, and an IP network interface for multiplexing the coded standard and additional baseband signals and enhancement residual band signal. | 04-01-2010 |
20100161320 | METHOD AND APPARATUS FOR ADAPTIVE SUB-BAND ALLOCATION OF SPECTRAL COEFFICIENTS - An apparatus and method for adaptive sub-band allocation of spectral coefficients are disclosed. The sizes of sub-bands are determined according to the distribution of spectral coefficients transformed from an input speech/audio signal to perform more elaborate quantization in units of sub-bands. Thus, quantization noise of the spectral coefficients is reduced, and sound quality in a frequency region is enhanced, thereby improving the quality of the signal. | 06-24-2010 |
20100161321 | SAMPLING RATE CONVERSION APPARATUS, CODING APPARATUS, DECODING APPARATUS AND METHODS THEREOF - A coding apparatus reduces a circuit scale and the amount of coding processing calculation. A frequency domain conversion section performs a frequency analysis of the signal sampled at a sampling rate Fx with an analysis length of 2·Na and calculates first spectrum S | 06-24-2010 |
20100169081 | ENCODING DEVICE, DECODING DEVICE, AND METHOD THEREOF - An encoding device includes: a frequency region converter which converts an inputted audio signal into a frequency region; a band selector which selects a quantization object band from a plurality of sub bands obtained by dividing the frequency region; and a shape quantizer which quantizes the shape of the frequency region parameter of the quantization object band. When a prediction encoding presence/absence determiner determines that the number of common sub bands between the quantization object band and the quantization object band selected in the past is not smaller than a predetermined value, a gain quantizer performs prediction encoding on the gain of the frequency region parameter of the quantization object band. When the number of common sub bands is smaller than the predetermined value, the gain quantizer non-predictively encodes the gain of the frequency region parameter of the quantization object band. | 07-01-2010 |
20100169082 | Enhancing Receiver Intelligibility in Voice Communication Devices - The intelligibility of speech signals is improved in the many situations where a voice signal is communicated or stored. Means and methods are disclosed for developing a scheme with high voice signal intelligibility without sacrificing the voice quality. The disclosed method comprises certain steps, including, but not limited to: Learning the noise on near-end side and enhancing the far-end voice as a function of the noise type and noise level on the near-end side. The disclosed method and apparatus are especially useful to increase the intelligibility of the communication device's loudspeaker output. The invention includes processing of an input speech signal to generate an enhanced intelligent signal. The FFT spectrum of the speech received from the far-end is modified in accordance with the LPC spectrum of the local background noise to generate an enhanced intelligent signal. | 07-01-2010 |
20100169083 | METHODS AND APPARATUS FOR PROCESSING AND DISPLAY OF VOICE DATA - Apparatus and methods for creating a composite data source having a common data representation from disparate sources of voice data. Data transmission links are established to heterogeneous messaging data sources, requests for voice data are sent using data access protocols, the voice data is received, and a set of voice data transformation rules is selectively applied to the voice data to transform the data into a common data representation. The common data representation can also be used as a source for reporting and graphical displays to monitor the operational aspects of the sources of voice data. | 07-01-2010 |
20100185440 | TRANSCODING METHOD, TRANSCODING DEVICE AND COMMUNICATION APPARATUS - The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech. | 07-22-2010 |
20100198586 | AUDIO TRANSFORM CODING USING PITCH CORRECTION - A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation. | 08-05-2010 |
20100228541 | SUBBAND CODING APPARATUS AND METHOD OF CODING SUBBAND - A subband coding apparatus carries out subband coding which prevents deterioration in coding performance and improves audio quality of decoded signals. The subband coding apparatus includes a low-band coding section ( | 09-09-2010 |
20100228542 | Method and System for Hiding Lost Packets - A method and system for hiding lost packets are disclosed. The method includes: obtaining a time-domain signal segment from a frame prior to a lost signal and a frame subsequent to the lost signal respectively according to periodicity of pitch and phase of the signal, and performing transformation from a time domain to a frequency domain to obtain a frequency-domain coefficient of the prior frame and a frequency-domain coefficient of the subsequent frame; interpolating values into an amplitude value of the frequency-domain coefficient of the prior frame and an amplitude value of the frequency-domain coefficient of the subsequent frame to obtain an amplitude value of the frequency-domain coefficient of multiple reconstructed signals; selecting a phase most similar to the phase of the reconstructed signals from the prior frame and/or the subsequent frame as a phase value of the frequency-domain coefficient of the reconstructed signals; and performing transformation from the frequency domain to the time domain according to the amplitude value and the phase value of the frequency-domain coefficient of the reconstructed signals to obtain time-domain signals of the reconstructed signals, and superposing the time-domain signals of the reconstructed signals to recover the lost signal. | 09-09-2010 |
20100250244 | ENCODER AND DECODER - There is provided an encoder capable of improving inter-channel prediction (ICP) performance in scalable stereo sound encoding using an ICP. In the encoder, ICP analysis units ( | 09-30-2010 |
20100262421 | ENCODING DEVICE, DECODING DEVICE, AND METHOD THEREOF - Provided is an encoding device which improves the sound quality of a stereo signal while maintaining a low bit rate. The encoding device includes: an LP inverse filter ( | 10-14-2010 |
20100312551 | METHOD AND AN APPARATUS FOR PROCESSING A SIGNAL - Disclosed is a method of processing a signal, which includes receiving at least one of a first signal and a second signal, receiving mode information, and coding the at least one of the first signal and the second signal using at least one of a first coding scheme and a second coding scheme according to the mode information, wherein the mode information indicates which one of at least three modes a prescribed mode corresponds to. | 12-09-2010 |
20110015922 | Speech Intelligibility Improvement Method and Apparatus - Prevalence detection is advantageously applied to the result of specific spectral discrimination to adaptively determine prevalent frequencies existing within an audio signal containing speech. Prevalent frequencies in this audio signal so isolated are attenuated in a highly selective manner, thus reducing the masking potential of pervasive resonances and obfuscative energy within the speech itself over low energy language-imparting speech elements. | 01-20-2011 |
20110035212 | TRANSFORM CODING OF SPEECH AND AUDIO SIGNALS - In a method of perceptual transform coding of audio signals in a telecommunication system, performing the steps of determining transform coefficients representative of a time to frequency transformation of a time segmented input audio signal; determining a spectrum of perceptual sub-bands for said input audio signal based on said determined transform coefficients; determining masking thresholds for each said sub-band based on said determined spectrum; computing scale factors for each said sub-band based on said determined masking thresholds, and finally adapting said computed scale factors for each said sub-band to prevent energy loss for perceptually relevant sub-bands. | 02-10-2011 |
20110046946 | ENCODER, DECODER, AND THE METHODS THEREFOR - Provided is an encoder which can decode a high-quality stereo signal while keeping the amount of information in the bit allocation information to a minimum when a scalable coding technique is used for a stereo signal. In the encoder, a principal component analysis (PCA) converter ( | 02-24-2011 |
20110054885 | Device and Method for a Bandwidth Extension of an Audio Signal - For a bandwidth extension of an audio signal, in a signal spreader the audio signal is temporally spread by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator to decimate the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth extended audio signal. A phase vocoder in the filterbank implementation or transformation implementation may be used for signal spreading. | 03-03-2011 |
20110119054 | APPARATUS FOR ENCODING AND DECODING OF INTEGRATED SPEECH AND AUDIO - Provided is an apparatus for integrally encoding and decoding a speech signal and an audio signal. An encoding apparatus for integrally encoding a speech signal and an audio signal, may include: a module selection unit to analyze a characteristic of an input signal and to select a first encoding module for encoding a first frame of the input signal; a speech encoding unit to encode the input signal according to a selection of the module selection unit and to generate a speech bitstream; an audio encoding unit to encode the input signal according to the selection of the module selection unit and to generate an audio bitstream; and a bitstream generation unit to generate an output bitstream from the speech encoding unit or the audio encoding unit according to the selection of the module selection unit. | 05-19-2011 |
20110137643 | SPECTRAL SMOOTHING DEVICE, ENCODING DEVICE, DECODING DEVICE, COMMUNICATION TERMINAL DEVICE, BASE STATION DEVICE, AND SPECTRAL SMOOTHING METHOD - Disclosed is a spectral smoothing device with a structure whereby smoothing is performed after a nonlinear conversion has been performed for a spectrum calculated from an audio signal, and with which the amount of processing calculation is significantly reduced while maintaining excellent audio quality. With this spectral smoothing device, a sub band division unit ( | 06-09-2011 |
20110153315 | AUDIO AND SPEECH PROCESSING WITH OPTIMAL BIT-ALLOCATION FOR CONSTANT BIT RATE APPLICATIONS - Methods and apparatus for audio and speech processing including generating a plurality of frames, each of the frames comprising a plurality of transform coefficients, and allocating bits to the transform coefficients in each of the frames such that at least two of the transform coefficients in the same frame have different bit allocations and the total number of the bits allocated to the transform coefficients in at least two of the frames is equal. | 06-23-2011 |
20110208515 | SYSTEMS AND METHODS FOR GATHERING RESEARCH DATA - Methods and systems are provided for gathering research data that includes information pertaining to audio signals received on a portable device, such as a cell phone. Frequency domain data is received or produced, a signature is extracted from the frequency domain data and an ancillary code is read from the frequency domain data. | 08-25-2011 |
20110218799 | DECODER FOR AUDIO SIGNAL INCLUDING GENERIC AUDIO AND SPEECH FRAMES - A method for decoding audio frames includes producing a first frame of coded audio samples, producing at least a portion of a second frame of coded audio samples, generating audio gap filler samples based on parameters representative of a weighted segment of the first frame of coded audio samples or a weighted segment of the portion of the second frame of coded audio samples, and forming a sequence including the audio gap filler samples and the portion of the second frame of coded audio samples. | 09-08-2011 |
20110224975 | LOW-DELAY AUDIO CODER - The present invention relates to methods and devices for encoding and decoding digital audio signals, e.g. a speech signal. An audio coder and a decoder are provided wherein a modeller adds a first distribution model obtained from model parameters of past segments of the digital audio signal and a fixed distribution model, each of the models being multiplied by a weighting coefficient, for obtaining a combined distribution model. The weighting coefficients are selected to minimize a code length of a current segment of the digital audio signal. As the combined distribution model is a sum of several distribution models, wherein at least some of the models are based on the model parameters, flexibility is introduced in the signal model used to encode the digital audio signal. Thus, an audio coder and decoder providing a low bit rate on average, low bit rate variations and low error propagation are provided. | 09-15-2011 |
20110282655 | Voice band enhancement apparatus and voice band enhancement method - A voice band enhancement apparatus is used that includes a frequency transform unit to perform frequency transform on an input signal to calculate a spectrum, a mapping function calculating unit to calculate, by use of the spectrum, a mapping function for generating high-range components from low-range components of the spectrum, a wide-band spectrum generating unit to generate, in a higher range than a band of the spectrum, a high-range spectrum based on the mapping function and to integrate the generated high-range spectrum and the spectrum calculated by the frequency transform unit, thereby generating a wide-band spectrum wider than the band of the spectrum calculated by the frequency transform unit, and an inverse frequency transform unit to perform inverse frequency transform on the wide-band spectrum to calculate an output signal. | 11-17-2011 |
20110282656 | Method And Arrangement For Processing Of Audio Signals - Method and decoder for processing of audio signals. The method and decoder relate to deriving a processed vector d̂ by applying a post-filter directly on a vector d comprising quantized MDCT domain coefficients of a time segment of an audio signal. The post-filter is configured to have a transfer function H which is a compressed version of the envelope of the vector d. A signal waveform is reconstructed by performing an inverse MDCT transform on the processed vector d̂. | 11-17-2011 |
20110282657 | CODING METHOD, DECODING METHOD, AND APPARATUSES, PROGRAMS AND RECORDING MEDIA THEREFOR - An object of the present invention is to achieve high coding efficiency for a companded signal sequence and reduce the amount of codes. A coding method according to the present invention includes an analysis step and a signal sequence transformation step. The analysis step is to check whether or not there is a number that is included in a particular range but does not occur in a second signal sequence (a number sequence that indicates the magnitude (magnitude relationship) of original signals) and output information that indicates the number that does not occur. The signal sequence transformation step is to output a transformed second signal sequence (which is formed by assigning new numbers to indicate the magnitudes of original signals (the magnitude relationship among original signals) excluding the magnitude of the original signal indicated by the number that does not occur and replacing the numbers in the second signal sequence with the newly assigned numbers) in the case where it is determined in the analysis step that there is a number that does not occur. The particular range is defined as a number that indicates a positive value having a minimum absolute value and a number that indicates a negative value having a minimum absolute value, for example. | 11-17-2011 |
20120010878 | COMMUNICATION APPARATUS - Provided is a communication apparatus for direct communication between networks of different types. The communication apparatus includes a transmission data selector determining whether or not data input from a first communication network is speech data, a data processor digitizing and packetizing the data transferred from the transmission data selector, and a modem for converting the digitized and packetized data into analog data and then directly transmitting the analog data to a second communication network different from the first communication network through a speech channel. | 01-12-2012 |
20120010879 | SPEECH ENCODING/DECODING DEVICE - A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is transformed. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a band extension technique in the frequency domain represented by SBR. | 01-12-2012 |
20120016667 | Spectrum Flatness Control for Bandwidth Extension - In accordance with an embodiment, a method of decoding an encoded audio bitstream at a decoder includes receiving the audio bitstream, decoding a low band bitstream of the audio bitstream to get low band coefficients in a frequency domain, and copying a plurality of the low band coefficients to a high frequency band location to generate high band coefficients. The method further includes processing the high band coefficients to form processed high band coefficients. Processing includes modifying an energy envelope of the high band coefficients by multiplying modification gains to flatten or smooth the high band coefficients, and applying a received spectral envelope decoded from the received audio bitstream to the high band coefficients. The low band coefficients and the processed high band coefficients are then inverse-transformed to the time domain to obtain a time domain output signal. | 01-19-2012 |
20120016668 | Energy Envelope Perceptual Correction for High Band Coding - In accordance with an embodiment, a method of encoding an audio bitstream at an encoder includes encoding an original low band signal at the encoder by using a closed loop analysis-by-synthesis approach to obtain a coded low band signal, encoding an original high band signal at the encoder by using an open loop energy matching approach to obtain coded high band energy envelopes, comparing an energy of the coded low band signal with an energy of a corresponding original low band signal for a subframe, and generating an indication flag that indicates whether an energy envelope perceptual correction is needed for the subframe based on comparing the energy. | 01-19-2012 |
20120065965 | Apparatus and method for encoding and decoding signal for high frequency bandwidth extension - An apparatus and method for encoding and decoding a signal for high frequency bandwidth extension are provided. An encoding apparatus may down-sample a time domain input signal, may core-encode the down-sampled time domain input signal, may transform the core-encoded time domain input signal to a frequency domain input signal, and may perform bandwidth extension encoding using a basic signal of the frequency domain input signal. | 03-15-2012 |
20120089389 | Flexible and Scalable Combined Innovation Codebook for Use in CELP Coder and Decoder - In a CELP coder, a combined innovation codebook coding device comprises a pre-quantizer of a first, adaptive-codebook excitation residual, and a CELP innovation-codebook search module responsive to a second excitation residual produced from the first, adaptive-codebook excitation residual. In a CELP decoder, a combined innovation codebook comprises a de-quantizer of pre-quantized coding parameters into a first excitation contribution, and a CELP innovation-codebook structure responsive to CELP innovation-codebook parameters to produce a second excitation contribution. | 04-12-2012 |
20120095754 | METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL USING LAYERED SINUSOIDAL PULSE CODING - Provided are a method and an apparatus for encoding and decoding an audio signal. A method for encoding an audio signal includes receiving a transformed audio signal, dividing the transformed audio signal into a plurality of subbands, performing a first sinusoidal pulse coding operation on the subbands, determining a performance region of a second sinusoidal pulse coding operation among the subbands on the basis of coding information of the first sinusoidal pulse coding operation, and performing the second sinusoidal pulse coding operation on the determined performance region, wherein the first sinusoidal pulse coding operation is performed variably according to the coding information. Accordingly, it is possible to further improve the quality of a synthesized signal by considering the sinusoidal pulse coding of a lower layer when encoding or decoding an audio signal in an upper layer by a layered sinusoidal pulse coding scheme. | 04-19-2012 |
20120116753 | METHOD AND DEVICE FOR REDUCING INTERFERENCE IN AN AUDIO SIGNAL DURING A CALL - In order to reduce interference in an audio signal during a call on a mobile communication device, a plurality of transforms of the audio signal is performed, each transform containing phase information and amplitude information of corresponding samples of the audio signal. The results of the transforms are then averaged in order to generate a compensation signal that can be subtracted from the audio signal. | 05-10-2012 |
20120136653 | TRANSFORM CODER AND TRANSFORM CODING METHOD - A transform coding apparatus includes an input scale factor calculating section that calculates an input scale factor having a predetermined number of scale factors associated with an input spectrum as an element, and a codebook that stores a plurality of scale factor candidates having a predetermined number of elements and outputs one scale factor candidate. The transform coding apparatus also includes an error calculating section that calculates an error on a per element basis, a weighted error calculating section that determines a weight on a per element basis and calculates a sum of products of the error and the weight to calculate a weighted error, and a searching section that searches for a scale factor candidate that minimizes the weighted error in the codebook. | 05-31-2012 |
20120179458 | APPARATUS AND METHOD FOR ESTIMATING NOISE BY NOISE REGION DISCRIMINATION - Provided are an apparatus and method for estimating noise that changes with time. The apparatus may calculate a speech absence probability that indicates the possibility of the absence of speech in each frequency component of an input acoustic signal, may discriminate between a speech-dominant region and a noise region from the acoustic signals based on the speech absence probability, and may estimate noise according to the discrimination result. | 07-12-2012 |
20120179459 | METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNALS - A method of pre-processing an audio signal transmitted to a user terminal via a communication network and an apparatus using the method are provided. The method of pre-processing the audio signal may prevent deterioration of a sound quality of the audio signal transmitted to the user terminal by pre-processing the audio signal, and by enabling a codec module, encoding the audio signal, to determine the audio signal as a speech signal. Also, the method of pre-processing the audio signal may improve a probability that the codec module may determine a corresponding audio signal as a speech when the audio signal is transmitted via the communication network by pre-processing the audio signal using a speech codec. | 07-12-2012 |
20120185242 | EXCITATION VECTOR GENERATOR, SPEECH CODER AND SPEECH DECODER - A noise estimating apparatus estimates two types of noise spectra for removing a noise component using the two types of noise spectra. The noise estimating apparatus includes an A/D converter that converts an input speech signal to a digital signal, and a Fourier transformer that performs a discrete Fourier transform on the digital signal having a predetermined time length to obtain an input spectrum and a complex spectrum. The noise estimating apparatus also includes a noise spectrum storage device that stores the two types of noise spectra, including a mean noise spectrum and a compensation noise spectrum, and a noise estimator that estimates a new compensation noise spectrum and a new mean noise spectrum as new two types of noise spectra. | 07-19-2012 |
20120215524 | TONE DETERMINATION DEVICE AND METHOD - A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. Therein a frequency conversion unit ( | 08-23-2012 |
20120239387 | VOICE TRANSFORMATION WITH ENCODED INFORMATION - Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech. | 09-20-2012 |
20120245927 | SYSTEM AND METHOD FOR MONAURAL AUDIO PROCESSING BASED PRESERVING SPEECH INFORMATION - A method, system and machine readable medium for noise reduction is provided. The method includes: (1) receiving a noise corrupted signal; (2) transforming the noise corrupted signal to a time-frequency domain representation; (3) determining probabilistic bases for operation, the probabilistic bases being priors in a multitude of frequency bands calculated online; (4) adapting longer term internal states of the method; (5) calculating present distributions that fit data; (6) generating non-linear filters that minimize entropy of speech and maximize entropy of noise, thereby reducing the impact of noise while enhancing speech; (7) applying the filters to create a primary output in a frequency domain; and (8) transforming the primary output to the time domain and outputting a noise suppressed signal. | 09-27-2012 |
20130030795 | ENCODING METHOD AND APPARATUS, AND DECODING METHOD AND APPARATUS - An encoding method of an encoder is provided. The encoder generates first MDCT coefficients by transforming an input signal, and generates MDCT indices by quantizing the first MDCT coefficients. The encoder generates second MDCT coefficients by dequantizing the MDCT indices, and calculates MDCT residual coefficients using differences between the first MDCT coefficients and the second MDCT coefficients. The encoder generates a residual index by encoding the MDCT residual coefficients, and generates gain indices corresponding to gains from the first MDCT coefficients and the second MDCT coefficients. | 01-31-2013 |
20130226565 | APPARATUS AND METHOD OF ENCODING AND DECODING SIGNALS - A method of encoding an audio signal, where signals including two or more channel signals are downmixed to a mono signal, the mono signal is divided into a low-frequency signal and a high-frequency signal, the low-frequency signal is encoded through algebraic code excited linear prediction (ACELP) or transform coded excitation (TCX), and the high-frequency signal is encoded using the low-frequency signal. A method of decoding an audio signal, where a low-frequency signal encoded through ACELP or TCX is decoded, a high-frequency signal is decoded using the low-frequency signal, the low-frequency signal and the high-frequency signal are combined to generate a mono signal, and the mono signal is upmixed by decoding spatial parameters regarding signals including two or more channel signals. | 08-29-2013 |
20130226566 | METHOD AND APPARATUS FOR ENCODING AND DECODING HIGH FREQUENCY SIGNAL - Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal. | 08-29-2013 |
20130246054 | SPEECH SIGNAL ENCODING METHOD AND SPEECH SIGNAL DECODING METHOD - A speech signal encoding method and a speech signal decoding method are provided. The speech signal encoding method includes the steps of specifying an analysis frame in an input signal; generating a modified input based on the analysis frame; applying a window to the modified input; generating a transform coefficient by performing an MDCT (Modified Discrete Cosine Transform) on the modified input to which the window has been applied; and encoding the transform coefficient. The modified input includes the analysis frame and a self replication of all or a part of the analysis frame. | 09-19-2013 |
20130268264 | SIGNAL ANALYZER, SIGNAL ANALYZING METHOD, SIGNAL SYNTHESIZER, SIGNAL SYNTHESIZING, WINDOWER, TRANSFORMER AND INVERSE TRANSFORMER - The present disclosure relates to a signal analyzer for processing an overlapped input signal frame comprising 2N subsequent input signal values. The signal analyzer comprises: a windower adapted to window the overlapped input signal frame to obtain a windowed signal, wherein the windower is adapted to zero M+N/2 subsequent input signal values of the overlapped input signal frame, wherein M is equal to or greater than 1 and smaller than N/2; and a transformer adapted to transform the remaining 3N/2−M subsequent windowed signal values of the windowed signal using N−M sets of transform parameters to obtain a transformed-domain signal comprising N−M transformed-domain signal values. | 10-10-2013 |
20130297296 | SOURCE SEPARATION BY INDEPENDENT COMPONENT ANALYSIS IN CONJUNCTION WITH SOURCE DIRECTION INFORMATION - Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract source signals from mixtures of source signals by way of independent component analysis. Source direction information is utilized in the separation process, and independent component analysis techniques described herein use multivariate probability density functions to preserve the alignment of frequency bins in the source separation process. | 11-07-2013 |
20130317811 | Efficient Encoding/Decoding of Audio Signals - A method for encoding of an audio signal comprises performing ( | 11-28-2013 |
20130317812 | METHOD AND DEVICE FOR BANDWIDTH EXTENSION - Method and device of extending a signal band of a voice or audio signal are provided. The bandwidth extension method includes the steps of: performing a modified discrete cosine transform (MDCT) process on an input signal to generate a first transform signal; generating a second transform signal and a third transform signal on the basis of the first transform signal; generating normalized components and energy components of the first transform signal, the second transform signal, and the third transform signal therefrom; generating an extended normalized component from the normalized components and generating an extended energy component from the energy components; generating an extended transform signal on the basis of the extended normalized component and the extended energy component; and performing an inverse MDCT (IMDCT) process on the extended transform signal. | 11-28-2013 |
20130332148 | APPARATUS AND METHOD FOR ENCODING AND DECODING AN AUDIO SIGNAL USING AN ALIGNED LOOK-AHEAD PORTION - An apparatus for encoding an audio signal having a stream of audio samples has: a windower for applying a prediction coding analysis window to the stream of audio samples to obtain windowed data for a prediction analysis and for applying a transform coding analysis window to the stream of audio samples to obtain windowed data for a transform analysis, wherein the transform coding analysis window is associated with audio samples within a current frame of audio samples and with audio samples of a predefined portion of a future frame of audio samples being a transform-coding look-ahead portion, wherein the prediction coding analysis window is associated with at least the portion of the audio samples of the current frame and with audio samples of a predefined portion of the future frame being a prediction coding look-ahead portion, wherein the transform coding look-ahead portion and the prediction coding look-ahead portion are identical to each other or are different from each other by less than 20%; and an encoding processor for generating prediction coded data or for generating transform coded data. | 12-12-2013 |
20130339010 | SPEECH DECODER, SPEECH ENCODER, SPEECH DECODING METHOD, SPEECH ENCODING METHOD, STORAGE MEDIUM FOR STORING SPEECH DECODING PROGRAM, AND STORAGE MEDIUM FOR STORING SPEECH ENCODING PROGRAM - A speech decoder includes a demultiplexing unit, a low frequency band decoding unit, a band splitting filter bank unit, a coded sequence analysis unit, a coded sequence decoding/dequantization unit, a high frequency band generation unit, low frequency band time envelope calculation units that acquire a plurality of low frequency band time envelopes, a time envelope calculation unit that calculates high frequency band time envelopes using time envelope information and the plurality of low frequency band time envelopes, a time envelope adjustment unit that adjusts the time envelope of high frequency band components using the time envelopes obtained by the time envelope calculation unit, and a band synthesis filter bank unit. | 12-19-2013 |
20140025374 | SPEECH ENHANCEMENT TO IMPROVE SPEECH INTELLIGIBILITY AND AUTOMATIC SPEECH RECOGNITION - The present invention provides a system and method to enhance speech intelligibility and improve the detection rate of an automatic speech recognizer in noisy environments. The present invention reduces an acoustically coupled loudspeaker signal from a plurality of microphone signals to enhance a near end user speech signal. A decision unit checks a system configuration parameter to determine if the cleaned speech is intended for human communication and/or Automatic Speech Recognition (ASR). A formant emphasis filter and a spectrum band reconstruction unit are used to further enhance the speech quality and improve the ASR recognition rate. The present invention can also apply to devices which have a foreground microphone(s) and a background microphone(s). | 01-23-2014 |
20140074459 | AUTOMATIC CONVERSION OF SPEECH INTO SONG, RAP OR OTHER AUDIBLE EXPRESSION HAVING TARGET METER OR RHYTHM - Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme. | 03-13-2014 |
20140114650 | Method for Transforming Non-Stationary Signals Using a Dynamic Model - An input signal, in the form of a sequence of feature vectors, is transformed to an output signal by first storing parameters of a model of the input signal in a memory. Using the vectors and the parameters, a sequence of vectors of hidden variables is inferred. There is at least one vector h | 04-24-2014 |
20140114651 | DEVICE AND METHOD FOR EXECUTION OF HUFFMAN CODING - In this invention, the design of the Huffman table can be done offline with a large input sequence database. The range of the quantization indices (or differential indices) for Huffman coding is identified. For each value of range, all the input signals which have the same range are gathered and the probability distribution of each value of the quantization indices (or differential indices) within the range is calculated. For each value of range, one Huffman table is designed according to the probability. In order to improve the bit efficiency of the Huffman coding, apparatus and methods to reduce the range of the quantization indices (or differential indices) are also introduced. | 04-24-2014 |
20140142930 | ADAPTATIONS OF ANALYSIS OR SYNTHESIS WEIGHTING WINDOWS FOR TRANSFORM CODING OR DECODING - A method and device are provided for coding or decoding a digital audio signal by transform using analysis or synthesis weighting windows applied to sample frames. The method includes an irregular sampling of an initial window provided for a transform of given initial size N, to apply a secondary transform of size M different from N. | 05-22-2014 |
20140236581 | VOICE SIGNAL ENCODING METHOD, VOICE SIGNAL DECODING METHOD, AND APPARATUS USING SAME - The present invention relates to a method and apparatus for processing a voice signal, and the voice signal encoding method according to the present invention comprises the steps of: generating transform coefficients of sine wave components forming an input voice signal by transforming the sine wave components; determining transform coefficients to be encoded from the generated transform coefficients; and transmitting indication information indicating the determined transform coefficients, wherein the indication information may include position information, magnitude information, and sign information of the transform coefficients. | 08-21-2014 |
20140244244 | APPARATUS AND METHOD FOR PROCESSING FREQUENCY SPECTRUM USING SOURCE FILTER - A frequency spectrum processing apparatus and method using a source filter are disclosed. The frequency spectrum processing apparatus may include a first excitation spectrum generation unit to generate a first excitation spectrum using a tonal excitation spectrum according to an input signal and a gain of the tonal excitation spectrum, a second excitation spectrum generation unit to generate a second excitation spectrum using a non-tonal excitation spectrum according to the input signal and a gain of the non-tonal excitation spectrum, and an output spectrum generation unit to generate an output spectrum using the first excitation spectrum and the second excitation spectrum. | 08-28-2014 |
20140324417 | METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL USING LAYERED SINUSOIDAL PULSE CODING - Provided are a method and an apparatus for encoding and decoding an audio signal. A method for encoding an audio signal includes receiving a transformed audio signal, dividing the transformed audio signal into a plurality of subbands, performing a first sinusoidal pulse coding operation on the subbands, determining a performance region of a second sinusoidal pulse coding operation among the subbands on the basis of coding information of the first sinusoidal pulse coding operation, and performing the second sinusoidal pulse coding operation on the determined performance region, wherein the first sinusoidal pulse coding operation is performed variably according to the coding information. Accordingly, it is possible to further improve the quality of a synthesized signal by considering the sinusoidal pulse coding of a lower layer when encoding or decoding an audio signal in an upper layer by a layered sinusoidal pulse coding scheme. | 10-30-2014 |
20140343931 | ROBUST SIGNATURES DERIVED FROM LOCAL NONLINEAR FILTERS - Content signal recognition is based on a multi-axis filtering of the content signal. The signatures are calculated, formed into data structures and organized in a database for quick searching and matching operations used in content recognition. For content recognition, signals are sampled and transformed into signatures using the multi axis filter. The database is searched to recognize the signals as part of a content item in the database. Using the content identification, content metadata is retrieved and provided for a variety of applications. In one application, the metadata is provided in response to a content identification request. | 11-20-2014 |
20140358527 | Inactive Sound Signal Parameter Estimation Method and Comfort Noise Generation Method and System - A parameter estimation method for inactive voice signals and a system thereof and comfort noise generation method and system are disclosed. The method includes: for an inactive voice signal frame, performing time-frequency transform on a sequence of time domain signals containing the inactive voice signal frame to obtain a frequency spectrum sequence, calculating frequency spectrum coefficients according to the frequency spectrum sequence, performing smooth processing on the frequency spectrum coefficients, obtaining a smoothly processed frequency spectrum sequence according to the smoothly processed frequency spectrum coefficients, performing inverse time-frequency transform on the smoothly processed frequency spectrum sequence to obtain a reconstructed time domain signal, and estimating an inactive voice signal parameter according to the reconstructed time domain signal to obtain a frequency spectrum parameter and an energy parameter. The present solution can provide stable background noise parameters in a comfort noise generation system at decoding. | 12-04-2014 |
20150066486 | METHODS AND SYSTEMS FOR IMPROVED SIGNAL DECOMPOSITION - A method for improving decomposition of digital signals using training sequences is presented. A method for improving decomposition of digital signals using initialization is also provided. A method for sorting digital signals using frames based upon energy content in the frame is further presented. A method for utilizing user input for combining parts of a decomposed signal is also presented. | 03-05-2015 |
20150112669 | REGULARIZED FEATURE SPACE DISCRIMINATION ADAPTATION - A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector. The transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix of the feature vector, and updating the acoustic model stored in a memory device using the transformed feature vector. | 04-23-2015 |
20150120285 | Systems and Methods for Reconstructing an Audio Signal from Transformed Audio Information - A system and method may be configured to reconstruct an audio signal from transformed audio information. The audio signal may be resynthesized based on individual harmonics and corresponding pitches determined from the transformed audio information. Noise may be subtracted from the transformed audio information by interpolating across peak points and across trough points of harmonic pitch paths through the transformed audio information, and subtracting values associated with the trough point interpolations from values associated with the peak point interpolations. Noise between harmonics of the sound may be suppressed in the transformed audio information by centering functions at individual harmonics in the transformed audio information, the functions serving to suppress noise between the harmonics. | 04-30-2015 |
20150371647 | IMPROVED CORRECTION OF FRAME LOSS DURING SIGNAL DECODING - A signal processing device, media, and method are provided, where a signal comprises a succession of samples distributed in successive frames. The processing is implemented during decoding of such a signal in order to replace at least one signal frame lost in decoding, and comprising in particular: a) searching, in a valid signal available to the decoder, for a signal segment of length corresponding to a period set as a function of the valid signal; b) analyzing a spectrum of the segment in order to determine spectral components of the segment; and c) synthesizing at least one replacement frame for the lost frame by construction of a synthesized signal from at least a portion of the spectral components. | 12-24-2015 |
20150379998 | FRAME ERROR CONCEALMENT - A frame error concealment method based on frames including transform coefficient vectors including the following steps: It tracks (S | 12-31-2015 |
20160012807 | AUDIO MATCHING WITH SUPPLEMENTAL SEMANTIC AUDIO RECOGNITION AND REPORT GENERATION | 01-14-2016 |
20160035354 | BIT ALLOCATING, AUDIO ENCODING AND DECODING - A bit allocating method is provided that includes determining the allocated number of bits in decimal point units based on each frequency band so that a Signal-to-Noise Ratio (SNR) of a spectrum existing in a predetermined frequency band is maximized within a range of the allowable number of bits for a given frame; and adjusting the allocated number of bits based on each frequency band. | 02-04-2016 |
20160035361 | Harmonic Transposition in an Audio Coding Method and System - The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length L | 02-04-2016 |
20160042743 | REDUCED DIGITAL AUDIO SAMPLING RATES IN DIGITAL AUDIO PROCESSING CHAIN - Reduced digital audio sampling rates are described in a digital audio processing chain. In one embodiment, an audio signal is received. A convolution operation is performed on the received audio signal. The convolved audio signal is sampled. An interrupt is received to process the audio sample, and the sample is processed in response to the interrupt. The processed samples are collected to form a frame and the frame is transmitted to a remote device. | 02-11-2016 |
20160055863 | SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, SIGNAL PROCESSING PROGRAM - This invention provides a signal processing apparatus for changing an input sound into an easy-to-hear sound. The signal processing apparatus includes a transformer that transforms an input signal into an amplitude component signal in a frequency domain, a stationary component estimator that estimates a stationary component signal having a frequency spectrum with a stationary characteristic based on the amplitude component signal in the frequency domain, a replacement unit that generates a new amplitude component signal using the amplitude component signal obtained by the transformer and the stationary component signal, and replaces the amplitude component signal by the new amplitude component signal, and an inverse transformer that inversely transforms the new amplitude component signal into an enhanced signal. | 02-25-2016 |
20160064007 | AUDIO ENCODER AND DECODER - The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to a transform-based audio codec system which is particularly well suited for voice encoding/decoding. A transform-based speech encoder ( | 03-03-2016 |
20160078873 | SIGNAL ENCODING METHOD AND DEVICE - A signal encoding method and device are disclosed. The method includes, when an encoding manner of a previous frame of a currently-input frame is a continuous encoding manner, predicting a comfort noise that is generated by a decoder according to the currently-input frame when the currently-input frame is encoded into an SID frame, determining an actual silence signal, determining a deviation degree between the comfort noise and the actual silence signal, determining an encoding manner of the currently-input frame according to the deviation degree, and encoding the currently-input frame according to the encoding manner of the currently-input frame. It is determined, according to the deviation degree between the comfort noise and the actual silence signal, that the encoding manner of the currently-input frame is the hangover frame encoding manner or the SID frame encoding manner, which can save communication bandwidth. | 03-17-2016 |
20160133272 | ADAPTIVE INTERCHANNEL DISCRIMINATIVE RESCALING FILTER - A method for adjusting a degree of filtering applied to an audio signal includes modeling a probability density function (PDF) of a fast Fourier transform (FFT) coefficient of a primary channel and a reference channel of the audio signal; and maximizing at least one of the PDFs to provide a discriminative relevance difference (DRD) between a noise magnitude estimate of the reference channel and a noise magnitude estimate of the primary channel. The method further includes emphasizing the primary channel when the spectral magnitude of the primary channel is stronger than the spectral magnitude of the reference channel; and deemphasizing the primary channel when the spectral magnitude of the reference channel is stronger than the spectral magnitude of the primary channel. The emphasizing and deemphasizing include computing a multiplicative rescaling factor and applying the multiplicative rescaling factor to a gain computed in a prior stage of a speech enhancement filter chain when there is a prior stage, and directly applying a gain when there is no prior stage. | 05-12-2016 |
20160140975 | LINEAR PREDICTION ANALYSIS DEVICE, METHOD, PROGRAM, AND STORAGE MEDIUM - An autocorrelation calculation unit | 05-19-2016 |
20160155451 | APPARATUS AND METHOD FOR AUDIO SIGNAL ENVELOPE ENCODING, PROCESSING, AND DECODING BY MODELLING A CUMULATIVE SUM REPRESENTATION EMPLOYING DISTRIBUTION QUANTIZATION AND CODING | 06-02-2016 |