Patent application number | Description | Published |
--- | --- | --- |
20080236364 | Tone processing apparatus and method - For at least one music piece, a storage section stores tone data of each of a plurality of fragments segmented from the music piece and stores a first descriptor indicative of a musical character of each of the fragments in association with the fragment. A descriptor generation section receives input data based on operation by a user and generates a second descriptor, indicative of a musical character, on the basis of the received input data. A determination section determines similarity between the second descriptor and the first descriptor of each of the fragments. A selection section selects the tone data of at least one fragment on the basis of a result of the similarity determination by the determination section. On the basis of the tone data of the selected at least one fragment, a data generation section generates tone data to be outputted. | 10-02-2008 |
20080300702 | MUSIC SIMILARITY SYSTEMS AND METHODS USING DESCRIPTORS - Systems and methods for determining similarity between two or more audio pieces are disclosed. An illustrative method for determining musical similarities includes extracting one or more descriptors from each audio piece, generating a vector for each of the audio pieces, extracting one or more audio features from each of the audio pieces, calculating and normalizing values for each audio feature, calculating a distance between the vector containing the normalized values and the vectors generated for the other audio pieces, and outputting a response to a user or process indicating the similarity between the audio pieces. The descriptors can be used in performing content-based audio classification and for determining similarities between music. The descriptors that can be extracted from each audio piece can include tonal descriptors, dissonance descriptors, rhythm descriptors, and spatial descriptors. | 12-04-2008 |
20090019996 | Music piece processing apparatus and method - A storage section stores music piece data sets of a plurality of music pieces, each of the music piece data sets including respective tone data of a plurality of fragments of the music piece and respective character values indicative of musical characters of the fragments. Each of the fragments of a selected main music piece is selected as a main fragment, and each of the fragments of two or more music pieces, other than the selected main fragment, is selected as a sub fragment. A similarity index value indicative of a degree of similarity between the character value of the main fragment and the character value of each sub fragment is calculated. For each of the main fragments, a sub fragment presenting a similarity index value that satisfies a predetermined selection condition is selected for processing the tone data of the main music piece. | 01-22-2009 |
20090095145 | Fragment search apparatus and method - An analysis section divides waveform data of a given music piece into waveform data of a plurality of fragments, divides the waveform data of each of the fragments into one or more events of sound, and obtains a character value indicative of a character of the waveform data pertaining to each of the divided events. A storage section stores respective music piece data and music piece composing data of one or more music pieces. The music piece composing data include a character value indicative of a character of the waveform data pertaining to each of the events of each of the fragments. A search section retrieves, from among the stored music piece composing data, one event or a plurality of successive events having a character value of a high degree of similarity to one or more events included in a designated fragment. | 04-16-2009 |
20110000360 | Apparatus and Method for Creating Singing Synthesizing Database, and Pitch Curve Generation Apparatus and Method - Waveform data representative of singing voices of a singing music piece are analyzed to generate melody component data representative of variation over time in fundamental frequency component presumed to represent a melody in the singing voices. Then, through machine learning that uses score data representative of a musical score of the singing music piece and the melody component data, a melody component model, representative of a variation component presumed to represent the melody among the variation over time in fundamental frequency component, is generated for each combination of notes. Parameters defining the melody component models and note identifiers indicative of the combinations of notes whose variation over time in fundamental frequency component is represented by the melody component models are stored into a pitch curve generating database in association with each other. | 01-06-2011 |
20110004476 | Apparatus and Method for Creating Singing Synthesizing Database, and Pitch Curve Generation Apparatus and Method - Variation over time in fundamental frequency in singing voices is separated into a melody-dependent component and a phoneme-dependent component, modeled for each of the components, and stored into a singing synthesizing database. In execution of singing synthesis, a pitch curve indicative of variation over time in fundamental frequency of the melody is synthesized in accordance with an arrangement of notes represented by a singing synthesizing score and the melody-dependent component, and the pitch curve is corrected, for each of the pitch curve sections corresponding to phonemes constituting the lyrics, using a phoneme-dependent component model corresponding to the phoneme. Such arrangements can accurately model a singing expression, unique to a singing person and appearing in a melody singing style of the person, while taking into account phoneme-dependent pitch variation, and thereby permit synthesis of singing voices that sound more natural. | 01-06-2011 |
20120106746 | Technique for Estimating Particular Audio Component - Candidate frequencies per unit segment of an audio signal are identified. A first processing section identifies an estimated train that is a time series of candidate frequencies, each selected for a different one of the unit segments, arranged over a plurality of the unit segments and that has a high likelihood of corresponding to a time series of fundamental frequencies of a target component. A second processing section identifies a state train of states, each indicative of one of sound-generating and non-sound-generating states of the target component in a different one of the unit segments, arranged over the unit segments. Frequency information which designates, as a fundamental frequency of the target component, a candidate frequency corresponding to the unit segment in the estimated train is generated for each unit segment corresponding to the sound-generating state. Frequency information indicative of no sound generation is generated for each unit segment corresponding to the non-sound-generating state. | 05-03-2012 |
20120106758 | Technique for Suppressing Particular Audio Component - A coefficient train processing section, which sequentially generates per unit segment a processing coefficient train for suppressing a target component of an audio signal, includes a basic coefficient train generation section and a coefficient train processing section. The basic coefficient train generation section generates a basic coefficient train in which coefficient values corresponding to frequencies within a particular frequency band range are each set at a suppression value that suppresses the audio signal, while coefficient values corresponding to frequencies outside the particular frequency band range are each set at a pass value that maintains the audio signal. The coefficient train processing section generates the processing coefficient train, per unit segment, by changing, to the pass value, each of the coefficient values corresponding to frequencies other than the target component among the coefficient values corresponding to the frequencies within the particular frequency band range. | 05-03-2012 |
20120201385 | Graphical Audio Signal Control - A signal processing section of a terminal converts acquired audio signals of a plurality of channels into sets of frequency spectra, calculates sound image positions corresponding to individual frequency components, and displays the calculated sound image positions on a display screen by use of a coordinate system having coordinate axes of frequency and sound image position. A user-designated partial region of the coordinate system is set as a designated region and an amplitude-level adjusting amount is set for the designated region, so that the signal processing section adjusts amplitude levels of frequency components included in the frequency spectra and in the designated region, converts the adjusted frequency components into audio signals, and outputs the converted audio signals. | 08-09-2012 |
20120310650 | VOICE SYNTHESIS APPARATUS - In a voice synthesis apparatus, a phoneme piece interpolator acquires first phoneme piece data corresponding to a first value of sound characteristic, and second phoneme piece data corresponding to a second value of the sound characteristic. The first and second phoneme piece data indicate a spectrum of each frame of a phoneme piece. The phoneme piece interpolator interpolates between each frame of the first phoneme piece data and each frame of the second phoneme piece data so as to create phoneme piece data of the phoneme piece corresponding to a target value of the sound characteristic which is different from either of the first and second values of the sound characteristic. A voice synthesizer generates a voice signal having the target value of the sound characteristic based on the created phoneme piece data. | 12-06-2012 |
20130311189 | VOICE PROCESSING APPARATUS - In a voice processing apparatus, a processor performs generating a converted feature by applying a source feature of source voice to a conversion function, generating an estimated feature based on a probability that the source feature belongs to each element distribution of a mixture distribution model that approximates distribution of features of voices having different characteristics, generating a first conversion filter based on a difference between a first spectrum corresponding to the converted feature and an estimated spectrum corresponding to the estimated feature, generating a second spectrum by applying the first conversion filter to a source spectrum corresponding to the source feature, generating a second conversion filter based on a difference between the first spectrum and the second spectrum, and generating target voice by applying the first conversion filter and the second conversion filter to the source spectrum. | 11-21-2013 |
20140006018 | VOICE PROCESSING APPARATUS | 01-02-2014 |
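Several of the listed applications (20080236364, 20080300702, 20090019996) hinge on the same core step: compare feature-descriptor vectors and pick the closest match. A minimal sketch of that idea follows; the function names, the z-score normalization, and the plain Euclidean distance are illustrative assumptions, not details taken from the patents.

```python
import math

def normalize(values):
    """Z-score a list of feature values (illustrative choice of normalization)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant vectors
    return [(v - mean) / std for v in values]

def euclidean(a, b):
    """Distance between two descriptor vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar(query, catalog):
    """Return the key of the catalog entry whose descriptor is closest to the query."""
    return min(catalog, key=lambda k: euclidean(query, catalog[k]))
```

With a catalog mapping fragment or piece identifiers to descriptor vectors, `most_similar` plays the role of the determination/selection sections described in the abstracts.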
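Application 20120106758 describes a two-stage coefficient train: suppress everything in a band, then restore pass values at bins that do not belong to the target component. A toy sketch over discrete frequency bins, assuming binary suppress/pass coefficients and a half-open band interval (both assumptions for illustration):

```python
def coefficient_train(num_bins, band, target_bins, suppress=0.0, pass_val=1.0):
    """Build a per-segment processing coefficient train.

    Stage 1 (basic train): suppression value inside `band`, pass value outside.
    Stage 2: inside the band, restore the pass value at every bin that is
    not part of the target component (`target_bins`).
    """
    lo, hi = band  # half-open bin range [lo, hi)
    coeffs = [suppress if lo <= k < hi else pass_val for k in range(num_bins)]
    for k in range(lo, hi):
        if k not in target_bins:
            coeffs[k] = pass_val
    return coeffs
```

Multiplying a magnitude spectrum by such a train attenuates only the target component's bins; in practice the coefficients would be real-valued and updated per unit segment.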
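Application 20120310650 interpolates frame-by-frame between two phoneme-piece spectra recorded at different values of a sound characteristic to reach a target value in between. A linear-interpolation sketch, assuming equal frame counts and per-bin weighting (the weighting scheme is an assumption; the patent does not specify it here):

```python
def interpolate_frames(spec_a, spec_b, val_a, val_b, target):
    """Blend two per-frame spectra toward a target characteristic value.

    spec_a / spec_b: lists of frames, each frame a list of spectral bin values,
    recorded at characteristic values val_a and val_b respectively.
    """
    w = (target - val_a) / (val_b - val_a)  # 0 at val_a, 1 at val_b
    return [[(1 - w) * x + w * y for x, y in zip(fa, fb)]
            for fa, fb in zip(spec_a, spec_b)]
```

Feeding the interpolated frames to a synthesizer would yield a voice signal at a characteristic value lying between the two recorded ones, as the abstract describes.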