Class / Patent application number | Description | Number of patent applications / Date published |
704200100 | Psychoacoustic | 53 |
20080221875 | Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking - The present invention relates to a method for encoding an audio signal. In a first embodiment a model relating to temporal masking of sound provided to a human ear is provided. A temporal masking index is determined in dependence upon a received audio signal and the model using a forward and a backward masking function. Using a psychoacoustic model a masking threshold is determined in dependence upon the temporal masking index. Finally, the audio signal is encoded in dependence upon the masking threshold. The method has been implemented using the MPEG- | 09-11-2008 |
20080270123 | System for Indicating Emotional Attitudes Through Intonation Analysis and Methods Thereof - The present invention discloses means and method for indicating emotional attitudes of a speaker, either human or animal, according to voice intonation. The invention also discloses a method for advertising, marketing, educating, or lie detecting by indicating emotional attitudes of a speaker and a method of providing remote service by a group comprising at least one observer to at least one speaker. The invention also discloses a system for indicating emotional attitudes of a speaker comprising a glossary of intonations relating intonations to emotional attitudes. | 10-30-2008 |
20080319739 | LOW COMPLEXITY DECODER FOR COMPLEX TRANSFORM CODING OF MULTI-CHANNEL SOUND - A multi-channel audio decoder provides a reduced complexity processing to reconstruct multi-channel audio from an encoded bitstream in which the multi-channel audio is represented as a coded subset of the channels along with a complex channel correlation matrix parameterization. The decoder translates the complex channel correlation matrix parameterization to a real transform that satisfies the magnitude of the complex channel correlation matrix. The multi-channel audio is derived from the coded subset of channels via channel extension processing using a real value effect signal and real number scaling. | 12-25-2008 |
20090006081 | METHOD, MEDIUM AND APPARATUS FOR ENCODING AND/OR DECODING SIGNAL - Provided are a method and apparatus for encoding or decoding an audio signal or a speech signal. In the encoding method, encoding is performed by performing domain transformation on a received signal in units of frequency bands by applying a psychoacoustic model, encoding the transformation result with respect to predetermined one or more frequency bands by using a high temporal resolution coding tool, and then quantizing the encoding result. In the decoding method, decoding is performed by inversely quantizing signals obtained by encoding in units of frequency bands, decoding one or more signals from among the inversely quantized signals, which are allocated to one or more frequency bands which have a predetermined domain resolution, determined by applying the psychoacoustic model, that is greater than a predetermined value, according to a predetermined method, and then inversely transforming either the inversely quantized or the one or more decoded signals. | 01-01-2009 |
20090037166 | AUDIO ENCODING METHOD WITH FUNCTION OF ACCELERATING A QUANTIZATION ITERATIVE LOOP PROCESS - An audio encoding method previously estimates better initial iterative values of global-gain and scalefactor for avoiding heavy calculation. The estimating process of the encoding method includes calculating the bit allocation of one frequency sample based on a sampling rate, a bit rate, and the number of audio channels according to an input frame, and the psychoacoustic model, searching one frequency sample having the greatest sample energy in each of a plurality of scalefactor bands, quantizing the frequency sample to comply with the bit allocation and to generate a corresponding scalefactor, searching a maximum scalefactor of all scalefactor bands corresponding to the input frame, and setting initial values of scalefactors and an initial value of global-gain for the quantization iterative loop process according to the corresponding scalefactor and the maximum scalefactor. | 02-05-2009 |
20090048826 | ENCODING METHOD AND APPARATUS FOR EFFICIENTLY ENCODING SINUSOIDAL SIGNAL WHOSE MAGNITUDE IS LESS THAN MASKING VALUE ACCORDING TO PSYCHOACOUSTIC MODEL AND DECODING METHOD AND APPARATUS FOR DECODING ENCODED SINUSOIDAL SIGNAL - Provided are an encoding method and apparatus for efficiently encoding a sinusoidal signal whose magnitude is less than a masking value according to a psychoacoustic model, a decoding method and apparatus for decoding an encoded sinusoidal signal, and a computer-readable recording medium having recorded thereon a program for executing the encoding method/the decoding method. By using a particular code indicating that the magnitude of a first sinusoidal signal is less than a masking value according to a psychoacoustic model to encode the first sinusoidal signal, difference coding for a third sinusoidal signal of a next frame, which is connected to the first sinusoidal signal, is performed using a sinusoidal signal or sinusoidal signals selected according to a method to use the particular code, and a decoding apparatus obtains a sum with a transmitted difference using the selected sinusoidal signal(s). | 02-19-2009 |
20090063137 | Method and Apparatus of Low-Complexity Psychoacoustic Model Applicable for Advanced Audio Coding Encoders - A method and an apparatus of a low-complexity psychoacoustic model applicable for advanced audio coding encoders use a modified discrete cosine transform based (MDCT-based) psychoacoustic model and a simplified look-up table to compute the MDCT-based psychoacoustic model by a logarithm based logarithmic method to simplify the computational complexity, and then computing a quantization loop (Q loop) by the logarithm based logarithmic method to further reduce the computational quantity of the MDCT-based psychoacoustic model, so as to achieve the real-time playback effect by a very low operating frequency. | 03-05-2009 |
20090076801 | Method and Apparatus for Introducing Information into a Data Stream and Method and Apparatus for Encoding an Audio Signal - An inventive method for introducing information into a data stream including data about spectral values representing a short-term spectrum of an audio signal first performs a processing of the data stream to obtain the spectral values of the short-term spectrum of the audio signal. Apart from that, the information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated which will then be weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal will then be summed and afterwards processed again to obtain a processed data stream including both audio information and information to be introduced. Because the information to be introduced is introduced into the data stream without changing to the time domain, the block rastering underlying the short-term spectrum will not be touched, so that introducing a watermark will not lead to tandem encoding effects. | 03-19-2009 |
20090089049 | METHOD AND APPARATUS FOR ADAPTIVELY DETERMINING QUANTIZATION STEP ACCORDING TO MASKING EFFECT IN PSYCHOACOUSTICS MODEL AND ENCODING/DECODING AUDIO SIGNAL BY USING DETERMINED QUANTIZATION STEP - Provided are a method of adaptively determining a quantization step according to a masking effect in a psychoacoustics model and a method of encoding/decoding an audio signal by using the determined quantization step. The method of adaptively determining a quantization step includes calculating a first ratio value indicating an intensity of an input audio signal with respect to a masking threshold; and determining the maximum value of the quantization step in a range in which noise generated when the audio signal is quantized is masked, according to the first ratio value. According to the present invention, quantization noise may be removed and the number of bits required to encode an audio signal may be reduced, by using auditory characteristics of humans. | 04-02-2009 |
20090099843 | METHOD AND SYSTEM FOR THE INTEGRAL AND DIAGNOSTIC ASSESSMENT OF LISTENING SPEECH QUALITY - A method for determining a speech quality measure of an output speech signal with respect to an input speech signal, wherein the input signal passes through a signal path of a data transmission system resulting in the output signal, includes the steps of pre-processing the output signal; determining at least one of an interruption rate of the pre-processed output signal and a measure for an intensity of musical tones present in the pre-processed output signal; and determining the speech quality measure from at least one of the interruption rate and the measure for the intensity of the musical tones. | 04-16-2009 |
20090125298 | VIBRATO DETECTION MODULES IN A SYSTEM FOR AUTOMATIC TRANSCRIPTION OF SUNG OR HUMMED MELODIES - The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well. | 05-14-2009 |
20090132238 | EFFICIENT METHOD FOR REUSING SCALE FACTORS TO IMPROVE THE EFFICIENCY OF AN AUDIO ENCODER - An audio encoding system that accepts an audio signal as an input to the system. The system includes a filter bank that splits the audio signal into a plurality of frames, and a bit allocation unit that assigns a number of bits for a current frame of the plurality of frames. The system further includes a scale factor unit that calculates a scale factor, identifies a block type of a first block of a current frame, identifies a block type of a second block consecutive to the first block, and reuses a scale factor of the first block for the second block, when the block type of the first block and the block type of the second block match. The system additionally includes a quantization and coding unit that quantizes and codes the signal, and a bit rate checker that verifies whether a bit rate requirement is satisfied. | 05-21-2009 |
20090132239 | Audio Signal De-Identification - Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio. | 05-21-2009 |
20090138259 | Method and Apparatus for Introducing Information into a Data Stream and Method and Apparatus for Encoding an Audio Signal - An inventive method for introducing information into a data stream including data about spectral values representing a short-term spectrum of an audio signal first performs a processing of the data stream to obtain the spectral values of the short-term spectrum of the audio signal. Apart from that, the information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated which will then be weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal will then be summed and afterwards processed again to obtain a processed data stream including both audio information and information to be introduced. Because the information to be introduced is introduced into the data stream without changing to the time domain, the block rastering underlying the short-term spectrum will not be touched, so that introducing a watermark will not lead to tandem encoding effects. | 05-28-2009 |
20090157391 | Extraction and Matching of Characteristic Fingerprints from Audio Signals - An audio fingerprint is extracted from an audio sample, where the fingerprint contains information that is characteristic of the content in the sample. The fingerprint may be generated by computing an energy spectrum for the audio sample, resampling the energy spectrum logarithmically in the time dimension, transforming the resampled energy spectrum to produce a series of feature vectors, and computing the fingerprint using differential coding of the feature vectors. The generated fingerprint can be compared to a set of reference fingerprints in a database to identify the original audio content. | 06-18-2009 |
20090319259 | Enhancing Perceptual Performance of SBR and Related HFR Coding Methods by Adaptive Noise-Floor Addition and Noise Substitution Limiting - Methods and an apparatus for enhancement of source coding systems utilizing high frequency reconstruction (HFR) are introduced. The problem of insufficient noise contents is addressed in a reconstructed highband, by using Adaptive Noise-floor Addition. New methods are also introduced for enhanced performance by means of limiting unwanted noise, interpolation and smoothing of envelope adjustment amplification factors. The methods and apparatus used are applicable to both speech coding and natural audio coding systems. | 12-24-2009 |
20090326928 | Audio Stream Notification and Processing - Various embodiments provide techniques for allowing an application to opt out of system default audio stream behavior, as well as techniques for notifying applications on a computing device that a communication audio stream has been initiated. The techniques may differentiate between communication-related audio streams and audio streams that are not communication-related. In some embodiments, an application may register to receive notification that a communication stream has been initiated. The application may be configured to comply with system default audio stream handling policies, or it can perform custom behavior in response to the audio stream notification. In some embodiments, an application may register for filtered or unfiltered notification. In a filtered notification scenario, an application is notified that a communication stream has been initiated when an audio stream associated with the application has not already been modified in response to the initiation of a different communication stream. In an unfiltered notification scenario, an application/process is notified whenever a communication stream has been initiated. | 12-31-2009 |
20100010807 | METHOD AND APPARATUS TO ENCODE AND DECODE AN AUDIO/SPEECH SIGNAL - A method and apparatus to encode and decode an audio/speech signal is provided. An inputted audio signal or speech signal may be transformed into at least one of a high frequency resolution signal and a high temporal resolution signal. The signal may be encoded by determining an appropriate resolution, the encoded signal may be decoded, and thus the audio signal, the speech signal, and a mixed signal of the audio signal and the speech signal may be processed. | 01-14-2010 |
20100017195 | Filter Unit and Method for Generating Subband Filter Impulse Responses - A filter compressor for generating compressed subband filter impulse responses from input subband filter impulse responses corresponding to subbands, which include filter impulse response values at filter taps, includes a processor for examining the filter impulse response values from at least two input subband filter input responses to find filter impulse response values having higher values and at least one filter impulse response value having a value being lower than the higher values, and a filter impulse response constructor for constructing the compressed subband filter impulse responses using the filter impulse response values having the higher values, wherein the compressed subband filter impulse responses do not include filter impulse response values corresponding to filter taps of the at least one filter impulse response value having the lower value or include zero-valued values corresponding to filter taps of the at least one filter impulse response value having the lower value. | 01-21-2010 |
20100042406 | Audio signal processing using improved perceptual model - A perceptual model based on psychoacoustic auditory experiments is based on the (time domain) roughness of an input signal envelope in particular cochlea filter bands rather than the noise-like vs. tonal nature of the input signal. In illustrative embodiments, frequency domain techniques are used to develop envelope and envelope roughness measures, and such roughness measures are then used to derive Noise Masking Ratio (NMR) values for achieving a high level of noise masking in coder embodiments. Coder embodiments based on present inventive teachings are compatible with well-known AAC coding standards. | 02-18-2010 |
20100042407 | HIGH QUALITY TIME-SCALING AND PITCH-SCALING OF AUDIO SIGNALS - In one alternative, an audio signal is analyzed using multiple psychoacoustic criteria to identify a region of the signal in which time scaling and/or pitch shifting processing would be inaudible or minimally audible, and the signal is time scaled and/or pitch shifted within that region. In another alternative, the signal is divided into auditory events, and the signal is time scaled and/or pitch shifted within an auditory event. In a further alternative, the signal is divided into auditory events, and the auditory events are analyzed using a psychoacoustic criterion to identify those auditory events in which the time scaling and/or pitch shifting processing of the signal would be inaudible or minimally audible. Further alternatives provide for multiple channels of audio. | 02-18-2010 |
20100121632 | STEREO AUDIO ENCODING DEVICE, STEREO AUDIO DECODING DEVICE, AND THEIR METHOD - Provided is a stereo audio encoding device which can improve the ICP (Inter-channel Prediction) performance of a stereo audio signal while suppressing the bit rate. The device ( | 05-13-2010 |
20100145681 | Method and system to identify, quantify, and display acoustic transformational structures in speech - The invention for processing speech that is described herein measures the periodic changes of multiple acoustic features in a digitized utterance without regard for lexical, sublexical, or prosodic features. These measurements of periodic, simultaneous changes of multiple acoustic features are assembled into transformational structures. Various types of transformational structures are identified, quantified, and displayed by the invention. The invention is useful for the study of such speaker characteristics as cognitive, emotional, linguistic, and behavioral functioning, and may be employed in the study of other phenomena of interest to the user. | 06-10-2010 |
20100145682 | Method and Related Device for Simplifying Psychoacoustic Analysis with Spectral Flatness Characteristic Values - The present invention applies spectral flatness characteristic values to simplify psychoacoustic analysis of a sound signal. If the sound signal comprises a plurality of frames, the present invention calculates the energy of the sound signal in a frequency domain, calculates a plurality of spectral flatness, and decides to use a short-block or a long-block Modified Discrete Cosine Transform accordingly. If the sound signal comprises left and right channel signals, the present invention performs psychoacoustic analysis on the sound signal to count energy of the left and right channel signals in a frequency domain, counts spectral flatness of the left and right channel signals, and decides to use middle/side transform or left and right channel encoding to transform the left and right channel signals accordingly. | 06-10-2010 |
20100161319 | DEVICE AND METHOD FOR GENERATING A COMPLEX SPECTRAL REPRESENTATION OF A DISCRETE-TIME SIGNAL - A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients. A good approximation for a complex spectral representation of the discrete-time signal is obtained by combining two real spectral coefficients, preferably by a weighted linear combination, wherein additionally more degrees of freedom for optimizing the entire system are available. | 06-24-2010 |
20100169079 | PSYCHOACOUSTIC TIME ALIGNMENT - A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the set of disturbances to provide the measure of quality. | 07-01-2010 |
20100169080 | AUDIO ENCODING APPARATUS - An audio encoding apparatus that encodes audio signals of a plurality of channels, includes an adaptive bit allocation control unit that adaptively controls a number of encoding bits assigned to the audio signal of each channel in accordance with perceptual entropy of the audio signal of each of the channels, | 07-01-2010 |
20100185439 | SEGMENTING AUDIO SIGNALS INTO AUDITORY EVENTS - In one aspect, the invention divides an audio signal into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal, calculating the difference in spectral content between successive time blocks of the audio signal, and identifying an auditory event boundary as the boundary between successive time blocks when the difference in the spectral content between such successive time blocks exceeds a threshold. In another aspect, the invention generates a reduced-information representation of an audio signal by dividing an audio signal into auditory events, each of which tends to be perceived as separate and distinct, and formatting and storing information relating to the auditory events. Optionally, the invention may also assign a characteristic to one or more of the auditory events. Auditory events may be determined according to the first aspect of the invention or by another method. | 07-22-2010 |
20100198585 | QUANTIZATION AFTER LINEAR TRANSFORMATION COMBINING THE AUDIO SIGNALS OF A SOUND SCENE, AND RELATED CODER - The invention relates to a method for quantifying components, wherein certain components are each determined based on a plurality of audio signals and can be calculated by the application of a linear conversion on the audio signals, said method comprising: determining a quantification function to be applied to the components by testing a condition relative to an audio signal and depending on a comparison made between a psycho-acoustic masking threshold relative to the audio signal and a value determined based on the reverse linear conversion and quantification errors of the components by the function. | 08-05-2010 |
20100250242 | METHOD AND APPARATUS FOR PROCESSING AUDIO AND SPEECH SIGNALS - A method and device for processing signals representing speech or audio via a plurality of filters that approximate behaviors of the basilar membrane of human cochlea. Each of the plurality of filters is formed from a mother filter via the dilation and a shift in time and has the similar impulse response of the basilar membrane to the frequency band for which the filter represents. Any process can be conducted and any feature can be extracted in the domain of the filters' outputs for applications, such as noise reduction, speech synthesis, coding, and speech and speaker recognition. Processed signals can be synthesized back to the time domain via an inverse cochlear transform. | 09-30-2010 |
20110144979 | DEVICE AND METHOD FOR ACOUSTIC COMMUNICATION - Disclosed is an acoustic communication method that includes filtering an audio signal to attenuate a high frequency section of the audio signal; generating a residual signal which corresponds to a difference between the audio signal and the filtered signal; generating a psychoacoustic mask for the audio signal based on a predetermined psychoacoustic model; generating a psychoacoustic spectrum mask by combining the residual signal with the psychoacoustic mask; generating an acoustic communication signal by modulating digital data according to the acoustic signal spectrum mask; and combining the acoustic communication signal with the filtered signal. | 06-16-2011 |
20110153313 | Method And Apparatus For The Detection Of Impulsive Noise In Transmitted Speech Signals For Use In Speech Quality Assessment - A method and apparatus for performing speech quality assessment in a speech communication system (such as, for example, a VoIP communication system) which detects and measures the presence of impulsive noise is provided. Specifically, in one illustrative embodiment, an autoregressive (AR) model of speech (and, in particular, of the excitation of the vocal tract) is advantageously employed to estimate a short-term variance of the speech excitation, and the standard deviation of the speech excitation (i.e., the square root of the variance) is then advantageously compared to a predetermined threshold to identify whether impulsive noise is present. Then, based on a statistical analysis of any such identified impulsive noise, a speech quality assessment is generated. | 06-23-2011 |
20110153314 | METHOD FOR DYNAMICALLY ADJUSTING THE SPECTRAL CONTENT OF AN AUDIO SIGNAL - A method for dynamically adjusting the spectral content of an audio signal, which increases the harmonic content of said audio signal, said method comprising translating an encoded digital signal into data bands, creating a psychoacoustic model to identify sections of said data bands that are deficient in harmonic quality, analyzing the fundamental frequency and amplitude of said harmonically deficient data bands, creating additional higher order harmonics for said harmonically deficient data bands, adding said higher order harmonics back to said encoded digital signal to form a newly enhanced signal, inverse filtering said newly enhanced signal, and converting said inverse filtered signal to an analog waveform for consumption by the listener. | 06-23-2011 |
20110282654 | QUALITY EVALUATION METHOD AND QUALITY EVALUATION APPARATUS - Quality of industrial products is evaluated by evaluating non-stationary operation sound, which is a kind of operation sound, from an aspect of tone, using closely simulated evaluation levels of evaluation of non-stationary sound by use of a human sense of hearing. | 11-17-2011 |
20120016665 | SOUND MASKING SYSTEM AND MASKING SOUND GENERATION METHOD - In a masking sound generation apparatus, a CPU analyzes a speech utterance speed of a received sound signal. Then, the CPU copies the received sound signal into a plurality of sound signals and performs the following processing on each of the sound signals. Namely, the CPU divides each of the sound signals into frames on the basis of a frame length determined on the basis of the speech utterance speed. Reverse process is performed on each of the frames to replace a waveform of the frame with a reverse waveform, and a windowing process is performed to achieve a smooth connection between the frames. Then, the CPU randomly rearranges the order of the frames and mixes the plurality of sound signals to generate a masking sound signal. | 01-19-2012 |
20120035917 | SYSTEM AND METHOD FOR AUTOMATIC DETECTION OF ABNORMAL STRESS PATTERNS IN UNIT SELECTION SYNTHESIS - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for detecting and correcting abnormal stress patterns in unit-selection speech synthesis. A system practicing the method detects incorrect stress patterns in selected acoustic units representing speech to be synthesized, and corrects the incorrect stress patterns in the selected acoustic units to yield corrected stress patterns. The system can further synthesize speech based on the corrected stress patterns. In one aspect, the system also classifies the incorrect stress patterns using a machine learning algorithm such as a classification and regression tree, adaptive boosting, support vector machine, and maximum entropy. In this way a text-to-speech unit selection speech synthesizer can produce more natural sounding speech with suitable stress patterns regardless of the stress of units in a unit selection database. | 02-09-2012 |
20120065964 | METHOD AND APPARATUS FOR INTRODUCING INFORMATION INTO A DATA STREAM AND METHOD AND APPARATUS FOR ENCODING AN AUDIO SIGNAL - Techniques for introducing information into a data stream first obtain the spectral values of the short-term spectrum of the audio signal. Separately, information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated, then weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal are then summed and afterwards processed again to obtain a processed data stream including audio information and information to be introduced. Because the information to be introduced is introduced without changing to the time domain, the block rastering underlying the short-term spectrum is not touched, thus introducing a watermark will not lead to tandem encoding effects. | 03-15-2012 |
20120179456 | LOUDNESS MAXIMIZATION WITH CONSTRAINED LOUDSPEAKER EXCURSION - An original loudness level of an audio signal is maintained for a mobile device while maintaining sound quality as good as possible and protecting the loudspeaker used in the mobile device. The loudness of an audio (e.g., speech) signal may be maximized while controlling the excursion of the diaphragm of the loudspeaker (in a mobile device) to stay within the allowed range. In an implementation, the peak excursion is predicted (e.g., estimated) using the input signal and an excursion transfer function. The signal may then be modified to limit the excursion and to maximize loudness. | 07-12-2012 |
20120239385 | SOUND PROCESSING BASED ON A CONFIDENCE MEASURE - A method for processing sound that includes generating one or more noise component estimates relating to an electrical representation of the sound and generating an associated confidence measure for the one or more noise component estimates. The method further comprises processing the sound based on the confidence measure. | 09-20-2012 |
20130151241 | METHOD OF EMBEDDING DIGITAL INFORMATION INTO AUDIO SIGNAL, MACHINE-READABLE STORAGE MEDIUM AND COMMUNICATION TERMINAL - A method for embedding digital information into an audio signal is provided. The method includes dividing the digital information into low-priority data and high-priority data; dividing the audio signal into first and second signal parts; embedding at least one echo signal into the first signal part; embedding a communication signal modulated with low-priority data, which has a spectrum according to psychoacoustic analysis of the second signal part, into the second signal part; and combining the embedded first and second signal parts. | 06-13-2013 |
20130253917 | PSYCHOACOUSTIC FILTER DESIGN FOR RATIONAL RESAMPLERS - The present document relates to the design of anti-aliasing and/or anti-imaging filters for resamplers using rational resampling factors. In particular, the present document relates to a method for designing such filters having a reduced number of filter coefficients or an increased perceptual performance, as well as to the filters designed using such method. A method for designing a filter ( | 09-26-2013 |
20130346070 | DEVICE AND METHOD FOR ACOUSTIC COMMUNICATION - Disclosed is an acoustic communication method that includes filtering an audio signal to attenuate a high frequency section of the audio signal; generating a residual signal which corresponds to a difference between the audio signal and the filtered signal; generating a psychoacoustic mask for the audio signal based on a predetermined psychoacoustic model; generating a psychoacoustic spectrum mask by combining the residual signal with the psychoacoustic mask; generating an acoustic communication signal by modulating digital data according to the acoustic signal spectrum mask; and combining the acoustic communication signal with the filtered signal. | 12-26-2013 |
20140081627 | METHOD FOR OPTIMIZATION OF MULTIPLE PSYCHOACOUSTIC EFFECTS - A method for optimizing multiple psychoacoustic effects in a sound system includes synthesizing a high-frequency restored version of an input signal; adding the high-frequency restored version of the input signal to the input signal to create a second signal; synthesizing a third signal having enhanced spatialization from the second signal; synthesizing a fourth signal having virtual bass from the second signal; and adding the third and fourth signals, or the second, third and fourth signals, together to create an output signal. | 03-20-2014 |
20140122063 | METHOD AND SYSTEM FOR ESTIMATING PHYSIOLOGICAL PARAMETERS OF PHONATION - The invention consists of a method and computing system for recording and analyzing the voice which allows a series of parameters of phonation to be calculated. These parameters convey relevant information regarding effects caused by organic disorders (which affect the physiology of the larynx) or neurological disorders (which affect the cerebral centers of speech). The classification methods, which allow estimates of the existing dysfunction to be obtained and a personality to be allocated, are also considered an essential part of the invention. The usefulness of the invention lies in the possibility of applying the dysfunction estimation in primary care service centers to screen patients for referral to specialist care centers, simplifying examination protocols, saving costs and reducing waiting lists. This methodology can also be used for detecting the personality of a speaker by their voice, allowing access to installations or services. | 05-01-2014 |
20150106083 | AUDIO SIGNAL LOUDNESS DETERMINATION AND MODIFICATION IN THE FREQUENCY DOMAIN - Methods of, apparatuses for, and non-transitory computer readable media having instructions thereon that when executed cause carrying out methods of determining and modifying the perceived loudness of a frequency domain audio signal where the frequency resolution, and corresponding temporal coverage of the frequency domain information is not constant. The frequency (and thus temporal) resolution of the perceived loudness processing is maintained constant at the longest block size. One method includes a block combiner and a loudness modification interpolator. | 04-16-2015 |
20160005421 | LANGUAGE ANALYSIS BASED ON WORD-SELECTION, AND LANGUAGE ANALYSIS APPARATUS - The invention relates to a method for wording-based speech analysis. In order to provide a method that allows automated analysis of largely arbitrary features of a person from whom a voice file that needs to be analysed comes, the invention detaches itself from the known concept of evaluating static keyword lists for the personality type. The method according to the invention comprises the preparation of a computer system by formation of a reference sample that allows the comparison that is necessary for feature recognition with other persons. The preparation of the computer system involves the recording and storage of a further voice file in addition to the voice files of the reference sample, the analysis of the additionally recorded voice file and the output of the recognized features using at least one output unit connected to the computer system. Furthermore, the invention relates to a speech analysis device for carrying out the method. | 01-07-2016 |
20160104497 | APPARATUS AND METHOD FOR GENERATING AN ADAPTIVE SPECTRAL SHAPE OF COMFORT NOISE - An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal is provided, having: a receiving interface for receiving one or more frames, a coefficient generator, and a signal reconstructor. The coefficient generator is configured to determine one or more first audio signal coefficients, and one or more noise coefficients. Moreover, the coefficient generator is configured to generate one or more second audio signal coefficients, depending on the one or more first audio signal coefficients and depending on the one or more noise coefficients. The audio signal reconstructor is configured to reconstruct a first portion of the reconstructed audio signal depending on the one or more first audio signal coefficients and the audio signal reconstructor is configured to reconstruct a second portion of the reconstructed audio signal depending on the one or more second audio signal coefficients, if the current frame is not received by the receiving interface or if the current frame being received by the receiving interface is corrupted. | 04-14-2016 |
20160140973 | APPARATUS AND METHOD FOR DECODING AND ENCODING AN AUDIO SIGNAL USING ADAPTIVE SPECTRAL TILE SELECTION - An apparatus for decoding an encoded signal includes: an audio decoder for decoding an encoded representation of a first set of first spectral portions to obtain a decoded first set of first spectral portions; a parametric decoder for decoding an encoded parametric representation of a second set of second spectral portions to obtain a decoded representation of the parametric representation, wherein the parametric information includes, for each target frequency tile, a source region identification as a matching information; and a frequency regenerator for regenerating a target frequency tile using a source region from the first set of first spectral portions identified by the matching information. | 05-19-2016 |
20160140979 | APPARATUS AND METHOD FOR DECODING AN ENCODED AUDIO SIGNAL USING A CROSS-OVER FILTER AROUND A TRANSITION FREQUENCY - Apparatus for decoding an encoded audio signal including an encoded core signal, including: a core decoder for decoding the encoded core signal to obtain a decoded core signal; a tile generator for generating one or more spectral tiles having frequencies not included in the decoded core signal using a spectral portion of the decoded core signal; and a cross-over filter for spectrally cross-over filtering the decoded core signal and a first frequency tile having frequencies extending from a gap filling frequency to an upper border frequency or for spectrally cross-over filtering a first frequency tile and a second frequency tile. | 05-19-2016 |
20160140980 | APPARATUS FOR DECODING AN ENCODED AUDIO SIGNAL WITH FREQUENCY TILE ADAPTION - Apparatus for decoding an encoded audio signal including an encoded core signal and parametric data, including: a core decoder for decoding the encoded core signal to obtain a decoded core signal; an analyzer for analyzing the decoded core signal before or after performing a frequency regeneration operation to provide an analysis result; and a frequency regenerator for regenerating spectral portions not included in the decoded core signal using a spectral portion of the decoded core signal, the parametric data, and the analysis result. | 05-19-2016 |
20160140981 | APPARATUS AND METHOD FOR DECODING OR ENCODING AN AUDIO SIGNAL USING ENERGY INFORMATION VALUES FOR A RECONSTRUCTION BAND - An apparatus for decoding an encoded audio signal having an encoded representation of a first set of first spectral portions and an encoded representation of parametric data indicating spectral energies for a second set of second spectral portions, has: an audio decoder for decoding the encoded representation of the first set of the first spectral portions to obtain a first set of first spectral portions and for decoding the encoded representation of the parametric data to obtain a decoded parametric data for the second set of second spectral portions indicating, for individual reconstruction bands, individual energies; a frequency regenerator for reconstructing spectral values in a reconstruction band having a second spectral portion using a first spectral portion of the first set of the first spectral portions and an individual energy for the reconstruction band, the reconstruction band having a first spectral portion and the second spectral portion. | 05-19-2016 |
20160196826 | METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL | 07-07-2016 |
20160254005 | METHOD AND APPARATUS TO ENCODE AND DECODE AN AUDIO/SPEECH SIGNAL | 09-01-2016 |