Entries |
Document | Title | Date |
20080201135 | Spoken Dialog System and Method - A spoken dialog system stores a history of dialog states in a memory, outputs a system response in a current dialog state, inputs a user utterance, performs speech recognition of the user utterance to obtain one or a plurality of recognition candidates of the user utterance and likelihoods thereof with respect to the user utterance, calculates a degree of state conformance of each of the current and the preceding dialog states stored in the memory with respect to the user utterance, selects one of the current and the preceding dialog states and one of the recognition candidates based on a combination of the degree of state conformance of each dialog state and the likelihood of each recognition candidate, and performs transition from the current dialog state to a new dialog state based on the selected dialog state and the selected recognition candidate. | 08-21-2008 |
20080201136 | Apparatus and Method for Speech Recognition - A speech recognition apparatus includes a first storing unit configured to store a first acoustic model invariable regardless of speaker and environment, a second storing unit configured to store a classification model that has shared parameters and non-shared parameters with the first acoustic model to classify second acoustic models, a recognizing unit configured to calculate a first likelihood with regard to the input speech by applying the first acoustic model to the input speech and obtain calculation result on the shared parameter and a plurality of candidate words that have relatively large values as the first likelihood, and a calculating unit configured to calculate a second likelihood for each of the groups with regard to the input speech by use of the calculation result on the shared parameters and the non-shared parameters of the classification model. | 08-21-2008 |
20080208570 | Methods and Apparatus for Blind Separation of Multichannel Convolutive Mixtures in the Frequency Domain - A method and apparatus performing blind source separation using frequency-domain normalized multichannel blind deconvolution. Multichannel mixed signals are frames of N samples including r consecutive blocks of M samples. The frames are separated using separating filters in frequency domain in an overlap-save manner by discrete Fourier transform (DFT). The separated signals are then converted back into time domain using inverse DFT applied to a nonlinear function. Cross-power spectra between separated signals and nonlinear-transformed signals are computed and normalized by power spectra of both separated signals and nonlinear-transformed signals to have flat spectra. Time domain constraint is then applied to preserve first L cross-correlations. These alias-free normalized cross-power spectra are further constrained by nonholonomic constraints. Then, natural gradient is computed by convolving alias-free normalized cross-power spectra with separating filters. After the separating filters length is constrained to L, the separating filters are updated using the natural gradient and normalized to have unit norm. Terminating conditions are checked to determine if separating filters converged. | 08-28-2008 |
20080221876 | Method for processing audio data into a condensed version - Recorded audio data is compressed to obtain a condensed version, by first selecting a number of subsequent non-overlapping segments of the audio data, then reducing each segment by temporal compression and combining the reduced segments into a shortened version which can be output. The temporal compression may be made with a local compression factor which varies between the segments. The segmenting may be chosen based on an innovation signal derived from the audio data itself to indicate a content change rate in the audio data. | 09-11-2008 |
20080235006 | Method and Apparatus for Decoding an Audio Signal - An apparatus for decoding an audio signal and method thereof are disclosed. The present invention includes receiving the audio signal and spatial information, identifying a type of modified spatial information, generating the modified spatial information using the spatial information, and decoding the audio signal using the modified spatial information, wherein the type of the modified spatial information includes at least one of partial spatial information, combined spatial information and expanded spatial information. Accordingly, an audio signal can be decoded into a configuration different from a configuration decided by an encoding apparatus. Even if the number of speakers is smaller or greater than the number of channels before downmixing, output channels equal in number to the speakers can be generated from a downmix audio signal. | 09-25-2008 |
20080243489 | Multiple stream decoder - A method is provided for decoding data streams in a voice communication system. The method includes: receiving two or more data streams having voice data encoded therein; decoding each data stream into a set of speech coding parameters; forming a set of combined speech coding parameters by combining the sets of decoded speech coding parameters, where speech coding parameters of a given type are combined with speech coding parameters of the same type; and inputting the set of combined speech coding parameters into a speech synthesizer. | 10-02-2008 |
20080243490 | MULTI-LANGUAGE TEXT FRAGMENT TRANSCODING AND FEATURIZATION - Embodiments of the present invention provide methods and apparatus for transcoding received text fragments and documents. A featurization configuration is produced to create token components for evaluating the content of the text fragment. Other embodiments may be described and claimed. | 10-02-2008 |
20080243491 | Modulation Device, Modulation Method, Demodulation Device, and Demodulation Method - A modulation device including: a modulation unit for modulating a carrier in an audible sound range by an encoded transmission signal to generate a modulated signal; a masker sound generation unit for generating a masker signal outputted as a masker sound for making the modulated signal harder to hear when transmitted with the modulated signal; and an acoustic signal generation unit for inserting the masker signal in the modulated signal to generate an acoustic signal. | 10-02-2008 |
20080255827 | Voice Conversion Training and Data Collection - It may be desirable to provide a way to collect high quality speech training data without undue burden to the user. Speech training data may be collected during normal usage of a device. In this way, the collection of speech training data may be effectively transparent to the user, without the need for a distinct collection mode from the user's point of view. For example, where the device is or includes a phone (such as a cellular phone), when the user makes or receives a phone call to/from another party, speech training data may be automatically collected from one or both of the parties during the phone call. | 10-16-2008 |
20080255828 | DATA COMMUNICATION VIA A VOICE CHANNEL OF A WIRELESS COMMUNICATION NETWORK USING DISCONTINUITIES - A system and method for data communication over a cellular communications network that allows the transmission of digital data over a voice channel using a vocoder that operates in different modes depending upon characteristics of the inputted signal it receives. To prepare the digital data for transmission, one or more carrier signals are encoded with the digital data using one of a number of modulation schemes that utilize differential phase shift keying to give the modulated carrier signal certain periodicity and energy characteristics that allow it to be transmitted by the vocoder at full rate. The modulation schemes include DPSK using either a single or multiple frequency carriers, combined FSK-DPSK modulation, combined ASK-DPSK, PSK with a phase tracker in the demodulator, as well as continuous signal modulation (ASK or FSK) with inserted discontinuities that can be independent of the digital data. | 10-16-2008 |
20080255829 | Method and Test Signal for Measuring Speech Intelligibility - A method and apparatus for estimating speech intelligibility in a mobile communications network component handling two-way communication between two ends of a signal path. Test signals adapted for speech intelligibility measurements are inserted into the signal path to simulate two-way communication. Double-talk is detected during the communication, and speech intelligibility measurements are performed only during periods of double-talk. This enables the effect of echo to be taken into account while avoiding undesirable effects from non-linear processing, and comfort noise if present, in the signal path. Voice enhancement devices may then be adjusted in response to the estimated speech intelligibility. | 10-16-2008 |
20080281585 | Voice, location finder, modulation format selectable Wi-Fi, cellular mobile systems - Voice and location finder Analog to digital (A/D) converted signal is processed and provided to Orthogonal Frequency Division Multiplexed (OFDM), Orthogonal Frequency Division Multiple Access (OFDMA), Time Division Multiple Access (TDMA), Global Mobile System (GSM), spread spectrum and Wideband Code Division Multiple Access (WCDMA) baseband processor, filter, modulation format selectable modulator and transmitter. Receiver and modulation format selectable demodulator for location finder satellite and ground based location finder, Wireless Local Area Network (WLAN), wireless fidelity (Wi-Fi), world wide web (www) and cellular network provided signals. Modulator and a transmitter for modulation and transmission of selected or combined signal having two distinct transmitters operated in separate radio frequency (RF) bands. Receiver with two antennas for receiving transmitted signal, a demodulator, receive filter and receiver processor for demodulation, filtering and processing of received TDMA signal. Receive filter for filtering of the TDMA signal is mismatched to the transmit TDMA baseband filter. Receive processor provides received baseband mis-match filtered cross-correlated in-phase and quadrature-phase TDMA signal. | 11-13-2008 |
20080288245 | TANDEM-FREE INTERSYSTEM VOICE COMMUNICATION - Techniques are presented herein to provide tandem-free operation between two wireless terminals through two otherwise incompatible wireless networks. Specifically, embodiments provide tandem-free operation between a wireless terminal communicating through a continuous transmission (CTX) wireless channel to a wireless terminal communicating through a discontinuous transmission (DTX) wireless channel. In a first aspect, inactive speech frames are translated between DTX and CTX formats. In a second aspect, each wireless terminal includes an active speech decoder that is compatible with the active speech encoder on the opposite end of the mobile-to-mobile connection. | 11-20-2008 |
20090006082 | ACTIVITY-WARE FOR NON-TEXTUAL OBJECTS - Providing for summarization and analysis of audio content is described herein. By way of example, an oral conversation can be analyzed, such that points of interest within the oral conversation can be identified and file locations related to such points of interest can be marked. Points of interest can be inferred based on a level of energy, e.g., excitement, pitch, tone, pace, or the like, associated with one or more speakers. Alternatively, or in addition, speaker and/or reviewer activity can form the basis for identifying points of interest within the conversation. Moreover, a compilation of the identified points of interest and portions of the original oral conversation related thereto can be assembled. As described herein, audio content can be succinctly summarized with respect to inferred and/or indicated points of interest, to facilitate an efficient and pertinent review of such content. | 01-01-2009 |
20090006083 | Systems And Methods For Spoken Information - A communication system, according to various aspects of the present invention, communicates via a network with a person who operates a voice input/output device. The communication system provides collecting and reporting services for journals and surveys all via spoken information. The communication system includes a meta-data database, a constructing engine, and a conversing engine. The constructing engine directs collecting of meta-data and directs constructing a journal in accordance with the meta-data database. The conversing engine collects the meta-data from the person for storage in the meta-data database and collects data from the person for storage in the journal. The conversing engine collects further spoken information for a survey. | 01-01-2009 |
20090018823 | Speech coding - A method of encoding a speech signal for transmission in a communications network involves transforming the signal into a sequence of frames, each frame including a plurality of coefficients; dividing the frame into a set of sub-bands each containing a sub-set of the plurality of coefficients; applying an optimization function to calculate respective test values corresponding to respective candidate sets of pulses representing a coded form of at least some of the coefficients; and selecting a set of pulses having a test value which meets a selectability criterion. If the optimization function is an error function, the selectability criterion is minimization of the function. If the optimization function is an iterative function, the selectability criterion is selecting an iteration in which a certain condition is reached. | 01-15-2009 |
20090024386 | Multi-mode speech encoding system - A method comprises analyzing each frame of a plurality of frames of the speech signal to determine one or more speech parameters for the speech signal; deciding, for each frame of the plurality of frames of the speech signal, based on the one or more speech parameters of the speech signal, to select one of a plurality of encoding modes including a first encoding mode and a second encoding mode for encoding each frame of the plurality of frames of the speech signal; encoding each frame of the plurality of frames of the speech signal according to the selected one of the plurality of encoding modes for each frame of the plurality of frames in the deciding; the first encoding mode supports a first encoding rate and the second encoding mode supports a second encoding rate, wherein the first encoding rate is the same encoding rate as the second encoding rate. | 01-22-2009 |
20090030675 | Apparatus and method of encoding and decoding audio signal - In one embodiment, the method includes receiving the audio signal having a plurality of random access units. The random access unit includes one or more frames and at least one of the frames is a random access frame. The random access frame is a frame encoded such that previous frames are not necessary to decode the random access frame. The embodiment further includes reading location information from the audio signal. The location information indicates whether random access unit size information is stored or not in the audio signal. If the random access unit size information is stored, the location information further indicates a location where the random access unit size information is stored in the audio signal. Random access unit size information is read according to the location information. The random access unit size information indicates a distance between random access frames in bytes. The random access units are decoded based on the random access size information. | 01-29-2009 |
20090076802 | WIDEBAND CODEC NEGOTIATION - The invention proposes several methods for codec handling. Specifically, methods involving providing a supported codec list of a Call Control Server are described. A node receives information on whether a terminal supports a wideband codec, wherein the information is received in call set up signaling from the terminal of the subscriber. Furthermore, configuration information is retrieved on whether a Radio Access Node supports the wideband codec. Additionally, information is retrieved on whether a media gateway supports the wideband codec, wherein the information is either provided by the operator or retrieved from the media gateway (MGW1, MGW2, MGWx). The information is analyzed and in response to the analysis a supported codec list is provided. Furthermore, alternative embodiments and devices adapted for the methods are disclosed. | 03-19-2009 |
20090076803 | WIRED AND MOBILE WI-FI NETWORKS, CELLULAR, GPS AND OTHER POSITION FINDING SYSTEMS - A voice signal is processed and connected by wire to a mobile wireless unit for further processing into time division multiple access (TDMA) and into spread spectrum signals. The wireless unit is processing a data signal into orthogonal frequency division multiplex (OFDM) signal. The wireless unit receives and processes a position finder signal from Global Positioning System (GPS) satellite and from land based transmitter and provides processed position finder signals. The wireless unit generates a processed touch screen control signal and processes the touch screen control signal with processed position finder, TDMA and spread spectrum signal or with processed OFDM signal and provides these processed signals to a transmitter for wireless signal transmission. The processed OFDM signal is used in a Wi-Fi wireless network and the TDMA or spectrum signal is used in a cellular system, wherein the wireless network and the cellular system are distinct. Processing of position finder signal incorporates step of receiving and processing a signal from Global Positioning System (GPS) satellite and from land based transmitter and for providing processed position finder signals received from GPS satellite and from land based transmitters for transmission. Processed TDMA or spread spectrum signals include code division multiple access (CDMA) and code selectable cross-correlated in-phase and quadrature-phase baseband signals. Modulation and amplification structures and methods of the hybrid wired and wireless systems include amplification by non-linearly amplified (NLA) and by linearly amplified transmitters. | 03-19-2009 |
20090125299 | Speech recognition system - A speech recognition system comprises at least a speech recognition engine and a display device that contains a signal status interface and a textual interface. The signal status interface is used to show a recording status, a speech processing status, or a complete speech recognition status based on a waveform display. The textual interface is used to show word units of the speech recognition results. Two sets of commands are connected with each waveform unit on the signal status interface and each word unit on the textual interface, respectively, in order to allow users to correct the recognition errors or to adjust the speech recognition system. | 05-14-2009 |
20090132240 | METHOD AND APPARATUS FOR MANAGING SPEECH DECODERS - A method and apparatus that manages speech decoders in a communication device is disclosed. The method may include detecting a change in transmission rate from a higher rate to a lower rate, decoding and shifting a first, second and third received first decoder set of frame parameters, generating a first decoder output audio frame from the previously shifted frame parameters, generating a first, second and third second decoder audio fill frame, the second decoder being a higher rate decoder than first decoder, outputting a first and second second decoder audio fill frame, combining the first decoder audio frame and the third second decoder audio fill frame with overlapping triangular windows, and outputting combined first decoder and second decoder frames to an audio buffer for subsequent transmission to a user of the communication device. In an alternative embodiment, another method may include detecting and processing a change in transmission rate from a lower rate to a higher rate. | 05-21-2009 |
20090157392 | PROVIDING SPEECH RECOGNITION DATA TO A SPEECH ENABLED DEVICE WHEN PROVIDING A NEW ENTRY THAT IS SELECTABLE VIA A SPEECH RECOGNITION INTERFACE OF THE DEVICE - The present invention discloses a solution for providing a phonetic representation for a content item along with a content item delivered to a speech enabled computing device. The phonetic representation can be specified in a manner that enables it to be added to a speech recognition grammar of the speech enabled computing device. Thus, the device can recognize speech commands using the newly added phonetic representation that involve the content item. Current implementations of speech recognition systems of this type rely on internal generation of speech recognition data that is added to the speech recognition grammar. Generation of speech recognition data can, however, be resource intensive, which can be particularly problematic when the speech enabled device is resource limited. The disclosed solution offloads the task of providing the speech recognition data to an external device, such as a relatively resource rich server or a desktop device. | 06-18-2009 |
20090164209 | DEVICE AND METHOD FOR CAPTURING AND FORWARDING VERBALIZED COMMENTS TO A REMOTE LOCATION - Disclosed is a device for sending a verbalized comment to a remote computer server. The device includes a processor that executes and operates the various software and hardware components. A microphone is utilized to record a comment. Temporary storage buffers the recorded comment. An auto-dialing application is utilized to automatically dial a telephone number associated with the remote computer server. The mode of transmission can include an RF module for automatically establishing a voice connection with an external mobile network, a WiFi module for automatically establishing a voice connection with an external IP network, and a plain old telephone service (POTS) interface module for automatically establishing a voice connection with an external legacy telephone network. The comment can be sent via email over an IP network, via MMS over a mobile network, or directly over a telephone connection including POTS, cellular (mobile), and VoIP. | 06-25-2009 |
20090182555 | Speech Enhancement Device and Method for the Same - A speech enhancement device and a method for the same are disclosed. The device includes a down-converter, a speech enhancement processor, and an up-converter. The method includes steps of down-converting audio signals to generate down-converted audio signals; performing speech enhancement on the down-converted audio signals to generate speech-enhanced audio signals; and up-converting the speech-enhanced audio signals to generate up-converted audio signals. | 07-16-2009 |
20090204393 | Systems and Methods For Adaptive Multi-Rate Protocol Enhancement - A method of processing a codec sample is provided. The method includes: removing from a first portion of the codec sample, a first number of first information bits. The first information bits are indicative of frame information associated with the codec sample. The method also includes inserting at the first portion of the codec sample from a second portion of the codec sample, a second number of data bits. The first number of the first information bits is greater than or equal to the second number of the data bits. The method also includes removing the second portion of the codec sample. The method may also include encrypting and decrypting the codec sample. In some embodiments, the codec sample is an adaptive multi-rate codec sample. In some embodiments, the adaptive multi-rate codec sample is a 5.15 mode adaptive multi-rate codec sample. | 08-13-2009 |
20090234643 | TRANSCRIPTION SYSTEM AND METHOD - A transcription system and method for facilitating the transcription of audio messages are disclosed. The transcription system may include a telephony server for receiving an audio message from a customer, an audio broadcast server coupled to the telephony server for streaming the message from the customer in real-time, at least one agent transcriber for receiving the streamed audio message from the audio broadcast server for facilitating the transcribing of the streamed audio message into a transcription text file in real time, and a computer server for providing the customer access to the transcribed text file. | 09-17-2009 |
20090248402 | VOICE MIXING METHOD AND MULTIPOINT CONFERENCE SERVER AND PROGRAM USING THE SAME METHOD - The voice mixing method includes a first step for selecting voice information from a plurality of voice information, a second step for adding up all the selected voice information, a third step for obtaining a voice signal totaling the voice signals other than one voice signal, of the selected voice signals, a fourth step for encoding the voice information obtained in the second step, a fifth step for encoding the voice signal obtained in the third step, and a sixth step for copying the encoded information obtained in the fourth step into the encoded information in the fifth step. | 10-01-2009 |
20090265165 | AUTOMATIC META-DATA TAGGING PICTURES AND VIDEO RECORDS - A method and apparatus for labeling an image recorded by a portable electronic device with descriptive tags is disclosed. Sounds in the vicinity of the portable electronic device are recorded. When the image is captured, the audio record of recorded sounds from a first predetermined period of time prior to the capture of the image until a second predetermined period of time after the capture of the image is retrieved. The retrieved audio record is processed to create a list of recognizable words in the retrieved audio record. The list of recognizable words is then stored in a metatag field associated with the captured image. | 10-22-2009 |
20090265166 | BOUNDARY ESTIMATION APPARATUS AND METHOD - A boundary estimation apparatus includes a boundary estimation unit which estimates a first boundary separating a speech into first meaning units, a boundary estimation unit configured to estimate a second boundary separating a speech, related to the speech, into second meaning units related to the first meaning units, a pattern generating unit configured to generate a representative pattern showing a representative characteristic in the analysis interval, a similarity calculation unit configured to calculate a similarity between the representative pattern and a characteristic pattern showing a feature in a calculation interval for calculating the similarity in the speech, and the boundary estimation unit estimates the second boundary based on the calculation interval in which the similarity is higher than a threshold value or relatively high. | 10-22-2009 |
20090281793 | Time varying processing of repeated digital audio samples in accordance with a user defined effect - A programmed “Stutter Edit” creates, stores and triggers combinations of effects to be used on a repeated short sample (“slice”) of recorded audio. The combination of effects (“gesture”) acts on the sample over a specified duration (“gesture length”), with the change in parameters for each effect over the gesture length being dictated by user-defined curves. Such a system affords wide manipulation of audio recorded on-the-fly, perfectly suited for live performance. These effects preferably include not only stuttering but also imposing an amplitude envelope on the slice being triggered, sample rate and bit rate manipulation, panning (interpolation between pre-defined spatial positions), high- and low-pass filters and compression. Destructive edits, such as reversing, pitch shifting, and fading may also alter the way the Stutter Edit is heard. More advanced techniques, including the use of filters, FX processors, and other plug-ins, can increase the detail and uniqueness of a particular Stutter Edit effect. | 11-12-2009 |
20090281794 | METHOD AND SYSTEM FOR ORDERING A GIFT WITH A PERSONALIZED CELEBRITY AUDIBLE MESSAGE - A method for providing a gift with a personalized celebrity message is disclosed. The method includes providing a database of prerecorded audio celebrity messages; offering a customer to select a celebrity message from the database of prerecorded audio celebrity messages; offering the customer to produce a personalized audio message; combining the personalized audio message with the celebrity message into an incorporated audio message; saving the incorporated audio message on a storage medium incorporated with a playback device; and coupling the playback device with the saved incorporated audio message with a gift. | 11-12-2009 |
20090287477 | System and method for providing network coordinated conversational services - A system and method for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having a first and second network device, the first and second network device each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second network device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service. | 11-19-2009 |
20090292531 | SYSTEM FOR HANDLING A PLURALITY OF STREAMING VOICE SIGNALS FOR DETERMINATION OF RESPONSIVE ACTION THERETO - Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage. In this manner, the present invention simplifies the storage requirements for contact centers and provides the opportunity to improve caller experiences by providing shorter reaction times to potentially problematic situations. | 11-26-2009 |
20090299733 | METHODS AND SYSTEM FOR CREATING AND EDITING AN XML-BASED SPEECH SYNTHESIS DOCUMENT - A method for creating and editing an XML-based speech synthesis document for input to a text-to-speech engine is provided. The method includes recording voice utterances of a user reading a pre-selected text and parsing the recorded voice utterances into individual words and periods of silence. The method also includes recording a synthesized speech output generated by a text-to-speech engine, the synthesized speech output being an audible rendering of the pre-selected text, and parsing the synthesized speech output into individual words and periods of silence. The method further includes annotating the XML-based speech synthesis document based upon a comparison of the recorded voice utterances and the recorded synthesized speech output. | 12-03-2009 |
20090299734 | STEREO AUDIO ENCODING DEVICE, STEREO AUDIO DECODING DEVICE, AND METHOD THEREOF - Disclosed is a stereo audio encoding device capable of improving a spatial image of a decoded audio in stereo audio encoding. In this device, an original cross correlation calculation unit ( | 12-03-2009 |
20090299735 | Method for Transferring an Audio Stream Between a Plurality of Terminals - A method of transferring an audio stream between at least two terminals, comprising the following steps: a step of connecting a first device and at least a second device; and a step of transferring an audio stream from the first device to the second device; the method being characterized in that it further comprises the following steps: a determination step during which it is determined that the first device or a first network to which the first device is connected or a second network to which the second device is connected is adapted to produce or transfer an audio stream comprising N channels and that the second device includes an electroacoustic transducer adapted to receive an audio stream comprising P channels; and a conversion step during which an audio stream sent by the first device or transmitted by the first network or transmitted by the second network is converted into an audio stream comprising P channels. An associated transfer system is also disclosed. Application to telephone or videophone communication in IP networks. | 12-03-2009 |
20090306970 | SYSTEM AND METHOD OF AN IN-BAND MODEM FOR DATA COMMUNICATIONS OVER DIGITAL WIRELESS COMMUNICATION NETWORKS - A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions. | 12-10-2009 |
20090319260 | METHOD AND SYSTEM FOR AUDIO TRANSMIT PROCESSING IN AN AUDIO CODEC - Methods and systems for audio transmit processing in an audio CODEC are disclosed and may comprise receiving one or more analog and/or digital audio signals, and simultaneously processing the received one or more analog audio and/or digital audio signals via a plurality of processing paths of the audio CODEC. The digital audio signals may be generated via a digital microphone, which may comprise a microelectromechanical (MEMS) microphone, and may be utilized for audio beamforming. The received analog and digital signals may be processed at one or more sampling rates, and may be filtered via decimation filters. The received analog signals may be converted to digital signals. The processing may comprise converting a sampling rate of the received digital signals and the converted analog signals. The processing may comprise filtering of the received digital signals and the converted analog signals via infinite impulse response (IIR) filters. | 12-24-2009 |
20100004926 | APPARATUS AND METHOD FOR CLASSIFICATION AND SEGMENTATION OF AUDIO CONTENT, BASED ON THE AUDIO SIGNAL - An apparatus for classifying an input audio signal into audio contents of a first and second class, comprising an audio segmentation module adapted to segment said input audio signal into segments of a predetermined length; a feature computation module adapted to calculate for the segments features characterizing said audio input signal; a threshold comparison module adapted to generate a feature vector for each of said one or more segments based on a plurality of predetermined thresholds, the thresholds including for each of the audio contents of the first class and of the second class a substantially near certainty threshold, a substantially high certainty threshold, and a substantially low certainty threshold; and a classification module adapted to analyze the feature vector and classify each one of said one or more segments as audio contents of the first class, of the second class, or as non-decisive audio contents. | 01-07-2010 |
20100017196 | METHOD, SYSTEM, AND APPARATUS FOR COMPRESSION OR DECOMPRESSION OF DIGITAL SIGNALS - Embodiments of methods, apparatuses, devices and systems associated with compression and decompression of digital signals are disclosed. | 01-21-2010 |
20100057444 | METHOD AND SYSTEM OF EXTENDING BATTERY LIFE OF A WIRELESS MICROPHONE UNIT - A method of extending battery life of a wireless microphone unit includes muting the wireless microphone unit responsive to a mute signal from a base station unit, transmitting, by the wireless microphone unit, compressed muted audio data, wherein the compressed muted audio data is compressed via a first compression scheme, determining, by the wireless microphone unit, whether an unmute signal has been received from the base station unit, and responsive to a determination that the unmute signal has been received, unmuting the wireless microphone unit. The method further includes discontinuing transmission of the compressed muted audio data and transmitting compressed audio data via a second compression scheme, wherein the first transmitting step causes the wireless microphone unit to consume less power per unit of transmission time than the second transmitting step. | 03-04-2010 |
20100057445 | System And Method For Automatically Adjusting Floor Controls For A Conversation - A system and method for automatically adjusting floor controls for a conversation is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls. | 03-04-2010 |
20100063801 | Postfilter For Layered Codecs - A scalable decoder device ( | 03-11-2010 |
20100063802 | Adaptive Frequency Prediction - In one embodiment, a method of transceiving an audio signal is disclosed. The method includes providing low band spectral information having a plurality of spectrum coefficients and predicting a high band extended spectral fine structure from the low band spectral information for at least one subband, where the high band extended spectral fine structure is made of a plurality of spectrum coefficients. The predicting includes preparing the spectrum coefficients of the low band spectral information, defining prediction parameters for the high band extended spectral fine structure and index ranges of the prediction parameters, and determining possible best indices of the prediction parameters, where determining includes minimizing a prediction error between a reference subband in high band and a predicted subband that is selected and composed from an available low band. The possible best indices of the prediction parameters are transmitted. | 03-11-2010 |
20100070266 | Performance metrics for telephone-intensive personnel - Systems and methods for generating performance metrics to monitor and/or enhance the performance of telephone-intensive personnel are disclosed. The method generally includes detecting voice activity on a receive and/or a transmit channel in a communications system, outputting voicing decision outputs based on the detecting, storing the voicing decision outputs over a period of time to memory, and generating voice activity performance metrics based on the voicing decision output stored in the memory. The generating may include generating a running average ratio of duration of voice activity on the transmit channel to duration of voice activity on the receive channel (talk-listen ratio) over a certain period of time for one or more agents. The talk-listen ratio may be compared to a target ratio. The system may generally include a voice activity detector (VAD) configured to detect voice activity on a receive and/or transmit channel in a communications system, a memory to store outputs from the VAD, and a voice activity analyzer configured to generate performance metrics based on the VAD outputs stored in the memory. | 03-18-2010 |
20100070267 | METHOD AND APPARATUS FOR QOS IMPROVEMENT WITH PACKET VOICE TRANSMISSION OVER WIRELESS LANS - A method for improving packetized speech transmitted over a wireless LAN is disclosed. Speech packets transmitted over the wireless LAN are monitored for errors. Any of the speech packets found to have errors are replaced with synthesized speech packets. The synthesized speech packets may be created from a vocal tract model generated from the received speech packets during periods of time when there are no errors. | 03-18-2010 |
20100076753 | DIALOGUE GENERATION APPARATUS AND DIALOGUE GENERATION METHOD - A dialogue generation apparatus includes a reception unit configured to receive a first text from a dialogue partner, an information storage unit configured to store profile information specific to a person who can be the dialogue partner and a fixed-pattern text associated with the person, a presentation unit configured to present the first text to a user, a speech recognition unit configured to perform speech recognition on speech the user has uttered about the first text presented to the user, and generate a speech recognition result showing the content of the speech, a generation unit configured to generate a second text from the profile information about the dialogue partner, fixed-pattern text about the dialogue partner, and the speech recognition result, and a transmission unit configured to transmit the second text to the dialogue partner. | 03-25-2010 |
20100082334 | SYSTEM AND METHOD FOR VOICE USER INTERFACE NAVIGATION - A Voice User Interface (VUI) or Interactive Voice Response (IVR) system utilizes three levels of navigation (e.g. Main Menu, Services, and Helper Commands) in presenting information units arranged in sets. The units are “spoken” by a system in a group to a human user and the group of information at each level is preceded by a tone that is unique to the level. When navigating the levels, the tones of the levels are in a musical progression, e.g. the three-note blues progression I, IV, V, for preceding the groups of information, respectively. The musical progression returns to the tonic of the musical key when the navigation returns to the level one of the first group of information. | 04-01-2010 |
20100088088 | CUSTOMIZABLE METHOD AND SYSTEM FOR EMOTIONAL RECOGNITION - An automated emotional recognition system is adapted to determine emotional states of a speaker based on the analysis of a speech signal. The emotional recognition system includes at least one server function and at least one client function in communication with the at least one server function for receiving assistance in determining the emotional states of the speaker. The at least one client function includes an emotional features calculator adapted to receive the speech signal and to extract therefrom a set of speech features indicative of the emotional state of the speaker. The emotional state recognition system further includes at least one emotional state decider adapted to determine the emotional state of the speaker exploiting the set of speech features based on a decision model. The server function includes at least a decision model trainer adapted to update the selected decision model according to the speech signal. The decision model to be used by the emotional state decider for determining the emotional state of the speaker is selectable based on a context of use of the recognition system. | 04-08-2010 |
20100100372 | STEREO ENCODING DEVICE, STEREO DECODING DEVICE, AND THEIR METHOD - Disclosed is a stereo encoding device which can improve critical channel encoding accuracy without increasing the encoding information amount. The device includes: a monaural signal synthesis unit ( | 04-22-2010 |
20100114565 | AUDIBLE ERRORS DETECTION AND PREVENTION FOR SPEECH DECODING, AUDIBLE ERRORS CONCEALING - A method and apparatus of providing an audio output to a user in a communications system in which the audio to be output to a user, preferably an audio frame, is assessed before it is broadcast to the user, and then selectively changed on the basis of the assessment. The assessment may be carried out in the audio encoding process, in the audio decoding process and/or after the audio decoding process. The selective changing of the audio output may comprise selectively replacing the audio output and/or re-encoding of the audio output. | 05-06-2010 |
20100145683 | METHOD OF PROVIDING DYNAMIC SPEECH PROCESSING SERVICES DURING VARIABLE NETWORK CONNECTIVITY - A device for providing dynamic speech processing services during variable network connectivity with a network server includes a connection determiner that determines the level of network connectivity of the client device and the network server; and a simplified speech processor that processes speech data and is initiated based on the determination from the connection determiner that the network connectivity is impaired or unavailable. The device further includes a speech data storage that stores processed speech data from the simplified speech processor; and a transition unit that determines when to transmit the stored speech data and connects with the network server, based on the determination of the connection determiner. | 06-10-2010 |
20100145684 | Regeneration of wideband speech - A system and method for processing a narrowband speech signal comprising speech samples in a first range of frequencies. The method comprises: generating from the narrowband speech signal a highband speech signal in a second range of frequencies above the first range of frequencies; determining a pitch of the highband speech signal; using the pitch to generate a pitch-dependent tonality measure from samples of the highband speech signal; and filtering the speech samples using a gain factor derived from the tonality measure and selected to reduce the amplitude of harmonics in the highband speech signal. | 06-10-2010 |
20100153097 | MULTI-CHANNEL AUDIO CODING - A multi-channel audio encoder ( | 06-17-2010 |
20100174530 | ELECTRONIC AUDIO PLAYING APPARATUS WITH AN INTERACTIVE FUNCTION AND METHOD THEREOF - An audio playing apparatus with an interactive function is provided. An interactive file stored in a data storage of the audio playing apparatus includes controlling data, a main audio, and at least one question audio. The controlling data is for controlling the playing of the main audio and the question audios. After each question audio is played, the audio playing apparatus outputs a voice prompt to give the user a reference answer. | 07-08-2010 |
20100174531 | Speech coding - A method of encoding one or more parent blocks of values, the number of values being the length of each block, the method comprising for each parent block: | 07-08-2010 |
20100191522 | Apparatus and method for noise generation - The disclosure provides a method for noise generation, including: determining an initial value of a reconstructed parameter; determining a random value range based on the initial value of the reconstructed parameter; taking a value in the random value range randomly as a reconstructed noise parameter; and generating noise by using the reconstructed noise parameter. The disclosure also provides an apparatus for noise generation. | 07-29-2010 |
20100241421 | LANGUAGE PROCESSOR - A language processor according to the present invention includes a probability calculating section ( | 09-23-2010 |
20100241422 | SYNCHRONIZING A CHANNEL CODEC AND VOCODER OF A MOBILE STATION - In one embodiment, the present invention includes a method for maintaining a vocoder and channel codec in substantial synchronization. The method may include receiving a configuration message that includes rate information and an effective radio block identifier at a mobile station, coding a current radio block via a vocoder and channel codec, configuring an encoding portion of the vocoder and channel codec with the rate information after performing the coding, and then coding the effective radio block using the rate information. Other embodiments are described and claimed. | 09-23-2010 |
20100250243 | Service Oriented Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle User Interfaces Requiring Minimal Cognitive Driver Processing for Same - A system and method for implementing a server-based speech recognition system for multi-modal automated interaction in a vehicle includes receiving, by a vehicle driver, audio prompts from an on-board human-to-machine interface and responding with speech to complete tasks such as creating and sending text messages, web browsing, navigation, etc. This service-oriented architecture is utilized to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound. | 09-30-2010 |
20100262420 | AUDIO ENCODER FOR ENCODING AN AUDIO SIGNAL HAVING AN IMPULSE-LIKE PORTION AND STATIONARY PORTION, ENCODING METHODS, DECODER, DECODING METHOD, AND ENCODING AUDIO SIGNAL - An audio encoder for encoding an audio signal includes an impulse extractor for extracting an impulse-like portion from the audio signal. This impulse-like portion is encoded and forwarded to an output interface. Furthermore, the audio encoder includes a signal encoder which encodes a residual signal derived from the original audio signal so that the impulse-like portion is reduced or eliminated in the residual audio signal. The output interface forwards both encoded signals, i.e., the encoded impulse signal and the encoded residual signal, for transmission or storage. On the decoder side, both signal portions are separately decoded and then combined to obtain a decoded audio signal. | 10-14-2010 |
20100268529 | VOICE COMMUNICATION APPARATUS - An art capable of transmitting a voice separated for each speaker when voice communications are conducted in a state that a plurality of communication terminals are connected in a cascade mode is provided. When a conference is started, each participant using each terminal | 10-21-2010 |
20100274554 | SPEECH ANALYSIS SYSTEM - A speech analysis system, including a kurtosis module for processing a coded sound signal to generate kurtosis measure data; a wavelet module for processing the coded sound signal to generate wavelet coefficients; and a classification module for processing the wavelet coefficients and the kurtosis measure data to generate label data representing a classification for the coded sound signal. The sound signal is classified as environmental noise, silence, speech from a single speaker, speech from multiple speakers, speech from a single speaker plus environmental noise, or speech from multiple speakers plus environmental noise. Speech is further classified as voiced or unvoiced. | 10-28-2010 |
20100274555 | Audio Coding Apparatus and Method Thereof - An apparatus comprising at least one processor and at least one memory including computer program code the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to determine at least one characteristic of the audio signal; divide the audio signal into at least a low frequency portion and a high frequency portion, and generate from the high frequency portion a plurality of high frequency band signals dependent on the at least one characteristic of the audio signal; and determine for each of the plurality of high frequency band signals at least part of the low frequency portion which can represent the high frequency band signal. | 10-28-2010 |
20100274556 | VECTOR QUANTIZER, VECTOR INVERSE QUANTIZER, AND METHODS THEREFOR - Disclosed is a vector quantizer in which, in multistage vector quantization, the vector quantization of the following stage can be performed adaptively to the result of the vector quantization of the preceding stage to improve the accuracy of the quantization at less calculation amount and bit rate. The quantizer comprises a product set circle calculating section ( | 10-28-2010 |
20100280822 | STEREO SOUND DECODING APPARATUS, STEREO SOUND ENCODING APPARATUS AND LOST-FRAME COMPENSATING METHOD - A stereo sound decoding apparatus wherein lost-frame compensation performance has been improved to enhance the quality of decoded sounds. In this stereo sound decoding apparatus, a sound decoding part ( | 11-04-2010 |
20100280823 | Method and Apparatus for Encoding and Decoding - An encoding method includes extracting background noise characteristic parameters within a hangover period, for a first superframe after the hangover period, performing background noise encoding based on the extracted background noise characteristic parameters, for superframes after the first superframe, performing background noise characteristic parameter extraction and DTX decision for each frame in the superframes after the first superframe, and for the superframes after the first superframe, performing background noise encoding based on extracted background noise characteristic parameters of the current superframe, background noise characteristic parameters of a plurality of superframes previous to the current superframe, and a final DTX decision. Also, a decoding method and apparatus and an encoding apparatus are disclosed. Bandwidth occupancy may be reduced substantially while the signal quality may be guaranteed. | 11-04-2010 |
20100286980 | METHOD AND APPARATUS FOR SPEECH CODING - A method and apparatus for prediction in a speech-coding system extends a 1 | 11-11-2010 |
20100305943 | Method and node for the control of a connection in a communication network - A method, a control node and a program unit for controlling the establishment or modification of a connection for a subscriber having a subscription in a communication network are disclosed. The connection is to be established or modified between nodes that are adapted to employ a coding scheme selected from a plurality of supported coding schemes potentially affecting connection quality. In accordance with the invention a subscribed quality level indicating a target quality level for the subscriber associated with the subscription is determined, and a node controlling the connection checks the subscribed quality level when it selects a coding scheme to be employed for the connection. | 12-02-2010 |
20100324890 | Method and Apparatus For Selecting An Audio Stream - An active stream is selected from one of a plurality of audio streams generated in a common acoustic environment by obtaining, for each stream obtaining, at a series of measurement instants t | 12-23-2010 |
20100332220 | Conversation Recording with Real-Time Notification for Users of Communication Terminals - A recording device provides conversation recording with real-time notification between users of communication terminals engaged in a conversation. The recording device provides a recording start notification to a second communication terminal in response to receiving an initiate recording request from a first communication terminal, and initiates recording of the conversation. The recording device terminates recording of the conversation in response to receiving a terminate recording request from the first communication terminal, provides a recording stop notification to the second communication terminal, and saves the recorded conversation to a file in a file storage medium. The recording start and recording stop notifications can be either audible or electronic notifications. The first communication terminal may be muted prior to providing a notification, and un-muted subsequent to the notification. The recording device may obtain permission from the second communication terminal to record the conversation. | 12-30-2010 |
20110004466 | STEREO SIGNAL ENCODING DEVICE, STEREO SIGNAL DECODING DEVICE AND METHODS FOR THEM - A technique of improving the degree of freedom of controlling the accuracy of encoding a stereo signal. In a stereo signal encoding device ( | 01-06-2011 |
20110010166 | MOBILE COMMUNICATION TERMINAL CONNECTABLE TO NETWORK - A mobile communication terminal includes a first communication unit connectable to a first communication apparatus via a first network using a first wireless communication protocol, a second communication unit connectable to a second communication apparatus via a second network using a second wireless communication protocol different from the first wireless communication protocol, a controller configured to control the first and second communication units, a microphone, and a speaker. The controller transmits first data, addressed to the second communication apparatus, to the second network, transmits second data, addressed to the first communication apparatus, to the first network, causes the speaker to convert into corresponding voice a second voice signal addressed to the mobile communication terminal, and transmits a first voice signal. | 01-13-2011 |
20110010167 | METHOD FOR GENERATING BACKGROUND NOISE AND NOISE PROCESSING APPARATUS - A method for generating background noise and a noise processing apparatus are provided in order to improve user experience. The method includes: if an obtained signal frame is a noise frame, a high band noise encoding parameter is obtained from the noise frame; weighting and/or smoothing is performed on the high band noise encoding parameter to obtain a second high band noise encoding parameter; and a high band background noise signal is generated according to the second high band noise encoding parameter. A noise processing apparatus is also provided. | 01-13-2011 |
20110046945 | METHOD AND DEVICE OF BITRATE DISTRIBUTION/TRUNCATION FOR SCALABLE AUDIO CODING - Embodiments of the invention provide a method and device for assigning bitrates to a plurality of channels in a scalable audio encoding/truncation process. Different bitrates are assigned to different channels in the scalable audio encoding/truncation process. | 02-24-2011 |
20110077938 | DATA REPRODUCTION METHOD AND DATA REPRODUCTION APPARATUS - A reproduction apparatus that reproduces compressed audio data recorded in a recording medium inserts dummy data between data to be concatenated and reproduces the data when performing a specific reproduction of the data obtained by concatenating data which are discontinuously read from the recording medium. | 03-31-2011 |
20110082689 | VAMOS - DARP receiver switching for mobile receivers - Embodiments of the invention include apparatuses, systems, computer readable media, and methods for processing speech signals in a manner that enhances capacity, efficiency and hardware utilization of a communications network. A method, according to one embodiment, includes receiving speech signals, determining a subchannel power imbalance ratio of at least two subchannels, and selecting a receiver architecture for processing the speech signals in accordance with the determined subchannel power imbalance ratio. | 04-07-2011 |
20110082690 | SOUND MONITORING SYSTEM AND SPEECH COLLECTION SYSTEM - Monitoring accuracy degrades due to a noise in an environment where there are many sound sources other than those to be monitored. Easy initialization is required for an environment where many apparatuses operate. A sound monitoring system includes a microphone array having multiple microphones and a location-based abnormal sound monitoring section as a processing section. The location-based abnormal sound monitoring section is supplied with an input signal from the microphone array via a waveform acquisition section and a network. Using the input signal, the location-based abnormal sound monitoring section detects a temporal change in a sound source direction histogram. Based on a detected change result, the location-based abnormal sound monitoring section checks for abnormality in a sound field and outputs a monitoring result. The processing section searches for a microphone array near the sound source to be monitored. The processing section selects a sound field monitoring function for the sound source to be monitored based on various data concerning a microphone belonging to the searched microphone array. | 04-07-2011 |
20110087487 | METHOD AND SYSTEM FOR MEMORY USAGE IN REAL-TIME AUDIO SYSTEMS - System and method for encoding, transmitting and decoding audio data. Audio bit stream syntax is re-organized to allow system optimizations that work well with memory latency and memory burst operations. Multiple small entropy coding tables are stored in RAM and loaded to on-chip memory as needed. Audio prediction is pipelined in the bitstream syntax. Intra frames, independent of other frames in the bitstream, are included in the bitstream for error recovery and channel change. New algorithms are implemented in legacy syntax by including the new information in the user data space of the audio frame. The new decoder can use projection to determine where the new information is and read ahead in the stream. Audio prediction from the immediately previous frame is restricted. Audio prediction is performed across channels within a single audio frame. A variable re-order function comprises storing channels of data to DRAM in the order they are decoded and reading them out in presentation order. | 04-14-2011 |
20110093260 | SIGNAL CLASSIFYING METHOD AND APPARATUS - A signal classifying method and apparatus are disclosed. The signal classifying method includes: obtaining a spectrum fluctuation parameter of a current signal frame determined as a foreground frame, and buffering the spectrum fluctuation parameter; obtaining a spectrum fluctuation variance of the current signal frame according to spectrum fluctuation parameters of all buffered signal frames, and buffering the spectrum fluctuation variance; and calculating a ratio of signal frames whose spectrum fluctuation variance is above or equal to a first threshold to all the buffered signal frames, and determining the current signal frame as a speech frame if the ratio is above or equal to a second threshold or determining the current signal frame as a music frame if the ratio is below the second threshold. In the embodiments of the present disclosure, the spectrum fluctuation variance of the signal is used as a parameter for classifying the signals, and a local statistical method is applied to decide the type of the signal. Therefore, the signals are classified with few parameters, simple logical relations and low complexity. | 04-21-2011 |
20110119053 | System And Method For Leaving And Transmitting Speech Messages - A system for leaving and transmitting speech messages automatically analyzes input speech of at least a reminder, fetches a plurality of items of tag information, and transmits a speech message to at least a message receiver, according to the transmission criteria of the reminder. A command or message parser parses the tag information, which includes at least a reminder ID, at least a transmitted command and at least a speech message. The tag information is sent to a message composer to be synthesized into a transmitted message. A transmitting controller controls a device switch according to the reminder ID and the transmitted command, to allow the transmitted message to be sent to the message receiver via a transmitting device. | 05-19-2011 |
20110125488 | ADAPTIVE DATA TRANSMISSION FOR A DIGITAL IN-BAND MODEM OPERATING OVER A VOICE CHANNEL - In one example, a mobile device encodes a digital bitstream using a particular set of modulation parameters to generate an audio signal that has different audio tones selected to pass through a vocoder of the mobile device. The particular set of modulation parameters is optimized for a subset of a plurality of vocoding modes without a priori knowledge of which one of the vocoding modes is currently operated by the vocoder. The mobile device conducts transmissions over the wireless telecommunications network through the vocoder using the particular set of modulation parameters, and monitors these transmissions for errors. If the errors reach a threshold, then the vocoder may be using one of the vocoding modes that are not included in the subset for which the particular set of modulation parameters is optimized, and accordingly, the modulation device switches from the particular set of modulation parameters to a different set of modulation parameters. | 05-26-2011 |
20110144980 | SYSTEM AND METHOD FOR UPDATING INFORMATION IN ELECTRONIC CALENDARS - Systems and methods for updating electronic calendar information. Speech is received from a user at a vehicle telematics unit (VTU), wherein the speech is representative of information related to a particular vehicle trip. The received speech is recorded in the VTU as a voice memo, and data associated with the voice memo is communicated from the VTU to a computer running a calendaring application. The data is associated with a field of the calendaring application, and stored in association with the calendaring application field. | 06-16-2011 |
20110161074 | REMOTE CONFERENCING CENTER - Certain embodiments disclosed herein relate to systems and methods for recording audio and video. In particular, in one embodiment, a method of recording audio signals is provided. The method includes recording audio signals with a plurality of distributed audio transducers to create multiple recordings of the audio signals and providing each of the multiple recordings of the audio signals to a computing device. The computing device combines each of the multiple recordings into a master recording and determines a source for each audio signal in the master recording. Additionally, the computing device stores each audio signal in separate audio files according to the determined source of each audio signal. | 06-30-2011 |
20110161075 | REAL-TIME VOICE RECOGNITION ON A HANDHELD DEVICE - A method and apparatus for implementation of real-time speech recognition using a handheld computing apparatus are provided. The handheld computing apparatus receives an audio signal, such as a user's voice. The handheld computing apparatus ultimately transmits the voice data to a remote or distal computing device with greater processing power and operating a speech recognition software application. The speech recognition software application processes the signal and outputs a set of instructions for implementation either by the computing device or the handheld apparatus. The instructions can include a variety of items including instructing the presentation of a textual representation of dictation, or a function or command to be executed by the handheld device (such as linking to a website, opening a file, cutting, pasting, saving, or other file menu type functionalities), or by the computing device itself. | 06-30-2011 |
20110172992 | METHOD FOR EMOTION COMMUNICATION BETWEEN EMOTION SIGNAL SENSING DEVICE AND EMOTION SERVICE PROVIDING DEVICE - Provided is a method for emotion communication to share a user's emotions between an emotion signal sensing device and an emotion service providing device. The method for emotion communication includes: sensing, by the emotion signal sensing device, biological and environmental information of the user and generating an emotion signal and emotion information of the user based on the biological and environmental information; establishing an emotion communication connection with the emotion service providing device; transmitting the emotion signal and the emotion information to the emotion service providing device over the established emotion communication connection; and breaking the connection with the emotion service providing device. | 07-14-2011 |
20110184730 | MULTI-DIMENSIONAL DISAMBIGUATION OF VOICE COMMANDS - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing voice commands. In one aspect, a method includes receiving an audio signal at a server, performing, by the server, speech recognition on the audio signal to identify one or more candidate terms that match one or more portions of the audio signal, identifying one or more possible intended actions for each candidate term, providing information for display on a client device, the information specifying the candidate terms and the actions for each candidate term, receiving from the client device an indication of an action selected by a user, where the action was selected from among the actions included in the provided information, and invoking the action selected by the user. | 07-28-2011 |
20110208514 | DATA EMBEDDING DEVICE AND DATA EXTRACTION DEVICE - A data embedding device for embedding data in a speech code obtained by encoding speech in accordance with a speech encoding method based on the human voice generation process includes an embedding judgment unit that judges, for every speech code, whether or not data should be embedded in the speech code, and an embedding unit that embeds data in two or more parameter codes of a plurality of parameter codes constituting each speech code for which the embedding judgment unit has judged that the data should be embedded. | 08-25-2011 |
20110218798 | OBFUSCATING SENSITIVE CONTENT IN AUDIO SOURCES - Techniques implemented as systems, methods, and apparatuses, including computer program products, for obfuscating sensitive content in an audio source representative of an interaction between a contact center caller and a contact center agent. The techniques include performing, by an analysis engine of a contact center system, a context-sensitive content analysis of the audio source to identify each audio source segment that includes content determined by the analysis engine to be sensitive content based on its context; and processing, by an obfuscation engine of the contact center system, one or more identified audio source segments to generate corresponding altered audio source segments each including obfuscated sensitive content. | 09-08-2011 |
20110224974 | SPEECH RECOGNITION AND TRANSCRIPTION AMONG USERS HAVING HETEROGENEOUS PROTOCOLS - A system is disclosed for facilitating speech recognition and transcription among users employing incompatible protocols for generating, transcribing, and exchanging speech. The system includes a system transaction manager that receives a speech information request from at least one of the users. The speech information request includes formatted spoken text generated using a first protocol. The system also includes a speech recognition and transcription engine, which communicates with the system transaction manager. The speech recognition and transcription engine receives the speech information request from the system transaction manager and generates a transcribed response, which includes a formatted transcription of the formatted speech. The system transmits the response to the system transaction manager, which routes the response to one or more of the users. The latter users employ a second protocol to handle the response, which may be the same as or different than the first protocol. The system transaction manager utilizes a uniform system protocol for handling the speech information request and the response. | 09-15-2011 |
20110231184 | Correlation of transcribed text with corresponding audio - In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed. | 09-22-2011 |
20110238414 | TELEPHONY SERVICE INTERACTION MANAGEMENT - A method for managing an interaction of a calling party to a communication partner is provided. The method includes automatically determining if the communication partner expects DTMF input. The method also includes translating speech input to one or more DTMF tones and communicating the one or more DTMF tones to the communication partner, if the communication partner expects DTMF input. | 09-29-2011 |
20110246186 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM - There is provided an information processing device including a storage unit that stores music data for playing music and lyrics data indicating lyrics of the music, a display control unit that displays the lyrics of the music on a screen, a playback unit that plays the music and a user interface unit that detects a user input. The lyrics data includes a plurality of blocks each having lyrics of at least one character. The display control unit displays the lyrics of the music on the screen in such a way that each block included in the lyrics data is identifiable to a user while the music is played by the playback unit. The user interface unit detects timing corresponding to a boundary of each section of the music corresponding to each displayed block in response to a first user input. | 10-06-2011 |
20110246187 | SPEECH SIGNAL PROCESSING - A speech signal processing system comprises an audio processor ( | 10-06-2011 |
20110257964 | Minimizing Speech Delay in Communication Devices - Methods and apparatus for coordinating audio data processing and network communication processing in a communication device. In an exemplary method lower and upper threshold values for use by a network communication processing circuit are set, the lower and upper threshold values defining a window of timing offsets relative to each of a series of periodic network communications frame boundaries. A series of encoded audio data frames are sent to the network communication processing circuit for transmission over the network communications link. The delivery of encoded audio data to the network communication processing circuit outside of the corresponding time window defined by the threshold values will trigger an event report. This event report is received from the network communication processing circuit by the audio data processing circuit, and, in response, timing is adjusted for the sending of one or more of the encoded audio data frames. | 10-20-2011 |
20110264446 | METHOD, SYSTEM, AND MEDIA GATEWAY FOR REPORTING MEDIA INSTANCE INFORMATION - A method, a system, and a media gateway (MG) for reporting media instance information are disclosed. The method for reporting media instance information includes: detecting, by an MG, received media data according to a set media instance detection (MID) event; and reporting, by the MG, the MID event when the media instance information is detected. With the present invention, the MG reports the detected media instance information related to the media data to a media gateway controller (MGC) through a set MID event, so that the MG can detect media instance information related to the media data, and report the detected media instance information related to the media data to the MGC. In this way, the MGC can execute corresponding control operations according to the media instance information related to the media data, extending the applicable scope of media services. | 10-27-2011 |
20110288857 | Distributed Speech Recognition Using One Way Communication - A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition. | 11-24-2011 |
20110295596 | DIGITAL VOICE RECORDING DEVICE WITH MARKING FUNCTION AND METHOD THEREOF - A digital voice recording device includes a storage unit, a display unit, and a processing unit. The processing unit includes a recording module, a storing module, a marking module, and a playing module. The recording module converts audio into digital signals, and records the digital signals into an audio file. Each audio file is associated with a document including textual content of the audio file. The storing module stores the audio file and the document. The display unit displays the document. The marking module creates a plurality of flags for the audio file. Each flag is associated with a time point in the audio file, and is assigned an identifier. The playing module identifies an identifier of a flag to acquire a time point in response to a user input, and begins playing the audio file from the acquired time point. | 12-01-2011 |
20110295597 | SYSTEM AND METHOD FOR AUTOMATED ANALYSIS OF EMOTIONAL CONTENT OF SPEECH - A method and apparatus for automated analysis of emotional content of speech is presented. Telephony calls are routed via a network such as the public switched telephone network (PSTN) and delivered to an interactive voice response system (IVR) where prerecorded or synthesized prompts guide a caller to speech responses. Speech responses are analyzed for emotional content in real time or collected via recording and analyzed in batch. If performed in real time, results of emotional content analysis (ECA) may be used as input to IVR call processing and call routing. In some applications this might involve ECA input to an expert system process whose results interact with an IVR for prompt creation and call processing. In any case, ECA data is valuable on its own and may be culled and restated in the form of reports for business applications. | 12-01-2011 |
20110301944 | DIVER AUDIO COMMUNICATION SYSTEM - An underwater communications system is provided that transmits electromagnetic and/or magnetic signals to a remote receiver. The transmitter includes a data input. A digital data compressor compresses data to be transmitted. A modulator modulates compressed data onto a carrier signal. An electrically insulated, magnetic coupled antenna transmits the compressed, modulated signals. The receiver has an electrically insulated, magnetic coupled antenna for receiving a compressed, modulated signal. A demodulator is provided for demodulating the signal to reveal compressed data. A de-compressor de-compresses the data. An appropriate human interface is provided to present transmitted data in text/audio/visible form. Similarly, the transmit system comprises appropriate audio/visual/text entry mechanisms. | 12-08-2011 |
20110320192 | GATEWAY APPARATUS AND METHOD AND COMMUNICATION SYSTEM - A gateway apparatus receives a call control signal and/or a packet with voice data stored therein in a predetermined protocol from a packet transfer apparatus on a mobile high-speed network and converts the received protocol into a circuit-switched protocol used when an RNC connects to circuit switching equipment on a mobile circuit-switched network, for output to the circuit switching equipment. The gateway apparatus, on receipt of a call process signal and/or a voice signal from the circuit switching equipment, converts the received protocol for output to the packet transfer apparatus. | 12-29-2011 |
20110320193 | SPEECH ENCODING DEVICE, SPEECH DECODING DEVICE, SPEECH ENCODING METHOD, AND SPEECH DECODING METHOD - Provided is a speech encoding device that is capable of performing encoding in an extension encoder even when the core encoder and core decoder of each layer have been interchanged, and that is also capable of performing high precision encoding by using the appropriate codec for each situation. The speech encoding device ( | 12-29-2011 |
20120010877 | SYSTEM AND METHOD FOR PERFORMING SPEECH SYNTHESIS WITH A CACHE OF PHONEME SEQUENCES - Disclosed are systems, methods, and computer readable media for performing speech synthesis. The method embodiment comprises applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences, for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences, and adding the identified joins to a cache for use in speech synthesis. | 01-12-2012 |
20120016666 | Audiovisual (AV) Device and Control Method Thereof - According to one embodiment, an AV device comprises a receiving section, a processing section, a storage section and a control section. The receiving section receives a digital voice signal. The processing section applies a predetermined signal processing operation to the digital voice signal received by the receiving section. The storage section stores information indicating time required for the signal processing operation at the processing section, and when a voice has been set in a mute state, stores the information indicating the time required for the signal processing operation by the processing section which is rewritten into a value that cannot be taken in general. The control section outputs information stored in the storage section upon an external request. Other embodiments are also described. | 01-19-2012 |
20120029911 | METHOD AND SYSTEM FOR DISTRIBUTED AUDIO TRANSCODING IN PEER-TO-PEER SYSTEMS - A method for streaming audio data in a network, the audio data having a sequence of samples, includes encoding the sequence of samples into a plurality of coded base bitstreams, generating a plurality of enhancement streams, and transmitting the coded base bitstreams and the enhancement bitstreams to a receiver for decoding. Each of the enhancement bitstreams is generated from one of a plurality of non-overlapping portions of the sequence of samples. | 02-02-2012 |
20120035918 | METHOD AND ARRANGEMENT FOR PROVIDING A BACKWARDS COMPATIBLE PAYLOAD FORMAT - In a method of providing a backward and forward compatible speech codec payload format, the following steps are included: providing S | 02-09-2012 |
20120041759 | Mobile Replacement-Dialogue Recording System - A mobile replacement-dialogue recording system enables the creation of replacement-dialogue items by mobile users not at a media recording studio. Studio-users prepare guide media video, audio and text data which are made available to mobile users through a media server. A mobile user's mobile replacement-dialogue recording device obtains guide media and allows the user to view the guide media in rehearsal mode. The mobile replacement-dialogue recording device then records the mobile user's dialogue performance while presenting the mobile user with synchronized guide media. The mobile user can review, delete, and rerecord the resulting potential replacement dialogue, as well as create feedback media characterizing the replacement dialogue. Selected replacement dialogue items can be transmitted to the media server. A studio-module can then obtain the selected replacement dialogue items and feedback media from the media server so that they may be used in media-replacement. | 02-16-2012 |
20120041760 | VOICE RECORDING EQUIPMENT AND METHOD - In a voice recording equipment and method, voice data from a speaker is received using a microphone. Threshold values T | 02-16-2012 |
20120041761 | VOICE DECODING APPARATUS AND VOICE DECODING METHOD - Disclosed is a voice decoding apparatus wherein the processor may be continuously employed for other applications for a prescribed time but, in response to an urgent interrupt, the processor can generate synthesised sound even when being used for other applications, without interruption. In this apparatus, a packet receiving section ( | 02-16-2012 |
20120046941 | DIGITAL VOICE COMMUNICATION CONTROL DEVICE AND METHOD - A digital audio communication control apparatus includes a first mixing unit that mixes a voice input from a voice input unit and uttered by a specific speaker with a voice input from a digital audio packet receiving unit and uttered by at least one speaker except for the specific speaker, and a second mixing unit that mixes the voices mixed by the first mixing unit with the voice of the specific speaker. The voices mixed by the second mixing unit are fed back to the specific speaker. | 02-23-2012 |
20120072206 | TERMINAL APPARATUS AND SPEECH PROCESSING PROGRAM - A terminal apparatus configured to obtain positional information indicating a position of another apparatus; to obtain positional information indicating a position of the terminal apparatus; to obtain a first direction, which is a direction to the obtained position of the another apparatus and calculated using the obtained position of the terminal apparatus; to obtain a second direction, which is a direction in which the terminal apparatus is oriented; to obtain inclination information indicating whether the terminal apparatus is inclined to the right or to the left; to switch an amount of correction for a relative angle between the first direction and the second direction in accordance with whether the obtained inclination information indicates an inclination to the right or an inclination to the left; and to determine an attribute of speech output from a speech output unit in accordance with the relative angle corrected by the amount of correction. | 03-22-2012 |
20120072207 | DOWN-MIXING DEVICE, ENCODER, AND METHOD THEREFOR - Provided are a down-mixing method and an encoder, wherein a high quantization performance can be realized when a balance adjustment operation due to a balance weight coefficient and a removal operation of a main component are combined. In the encoder ( | 03-22-2012 |
20120084078 | Method And Apparatus For Voice Signature Authentication - A scalable voice signature authentication capability is provided herein. The scalable voice signature authentication capability enables authentication of varied services such as speaker identification (e.g., private banking and access to healthcare account records), voice signature as a password (e.g., secure access for remote services and document retrieval), the Internet and its various services (e.g., online shopping), and the like. | 04-05-2012 |
20120084079 | Integration of Embedded and Network Speech Recognizers - A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device. | 04-05-2012 |
20120101812 | METHODS AND APPARATUS FOR GENERATING, UPDATING AND DISTRIBUTING SPEECH RECOGNITION MODELS - Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results. | 04-26-2012 |
20120109643 | ADAPTIVE AUDIO TRANSCODING - A system and method provide an audio/video coding system for adaptively transcoding audio streams based on content characteristics of the audio streams. An audio stream metadata extraction module of the system is configured to extract metadata of a source audio stream. An audio stream classification module of the system is configured to classify the source audio stream into one of the several audio content categories based on the metadata of the source audio stream. An adaptive audio encoder of the system is configured to determine one or more transcoding parameters including target bitrate and sampling rate based on the metadata and classification of the source audio stream. An adaptive audio transcoder of the system is configured to transcode the source audio stream into an output audio stream using the transcoding parameters. | 05-03-2012 |
20120109644 | INFORMATION PROCESSING DEVICE AND MOBILE TERMINAL - There is a need to enable decompression of a speech signal even if no network synchronizing signal is output from a baseband processing portion. For this purpose, an information processing device includes a first serial interface. The first serial interface includes a notification signal generation circuit that generates a notification signal each time compressed data incorporated from the baseband processing portion reaches a predetermined data quantity, and notifies a speech processing portion of this state using the notification signal. The speech processing portion includes a synchronizing signal generation circuit that generates a network synchronizing signal based on the notification signal. A clock signal for PCM communication is generated based on the network synchronizing signal. A speech signal can be decompressed even if no network synchronizing signal is output from the baseband processing portion. | 05-03-2012 |
20120116752 | AUDIO DATA PROCESSING METHOD AND AUDIO DATA PROCESSING SYSTEM - An audio data processing method and an audio data processing system are described. The audio data processing system includes an audio collect module, a processing module, a virtual play module, a virtual collect module, and a buffer memory. The virtual play module and the virtual collect module are registered in an application interface layer of third-party software. The third-party software chooses the virtual play module and the virtual collect module. The virtual play module is configured for receiving audio data processed by the processing module and storing the processed audio data in the buffer memory. The virtual collect module is configured for collecting the processed audio data from the buffer memory and transmitting the processed audio data to the third-party software. The invention provides a universal solution suitable for any chatting tool by installing the virtual speaker and the virtual microphone. | 05-10-2012 |
20120130709 | SYSTEM AND METHOD FOR BUILDING AND EVALUATING AUTOMATIC SPEECH RECOGNITION VIA AN APPLICATION PROGRAMMER INTERFACE - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for building an automatic speech recognition system through an Internet API. A network-based automatic speech recognition server configured to practice the method receives feature streams, transcriptions, and parameter values as inputs from a network client independent of knowledge of internal operations of the server. The server processes the inputs to train an acoustic model and a language model, and transmits the acoustic model and the language model to the network client. The server can also generate a log describing the processing and transmit the log to the client. On the server side, a human expert can intervene to modify how the server processes the inputs. The inputs can include an additional feature stream generated from speech by algorithms in the client's proprietary feature extraction. | 05-24-2012 |
20120166184 | Selective Transmission of Voice Data - Systems and methods that provide for voice command devices that receive sound but do not transfer the voice data beyond the system unless certain voice-filtering criteria have been met are described herein. In addition, embodiments provide devices that support voice command operation while external voice data transmission is in mute operation mode. As such, devices according to embodiments may process voice data locally responsive to the voice data matching voice-filtering criteria. Furthermore, systems and methods are described herein involving voice command devices that capture sound and analyze it in real-time on a word-by-word basis and decide whether to handle the voice data locally, transmit it externally, or both. | 06-28-2012 |
20120166185 | SYSTEM AND METHOD FOR THE CREATION AND AUTOMATIC DEPLOYMENT OF PERSONALIZED, DYNAMIC AND INTERACTIVE VOICE SERVICES WITH CLOSED LOOP TRANSACTION PROCESSING - A method and system for accomplishing closed-loop transaction processing in conjunction with interactive, real-time, voice transmission of information to a user is disclosed. A voice-based communication between a user and a first system is established and a report is transmitted to the user. The report might comprise information and at least one request for user input based on said information. In response to the report, the user can request a transaction based on said information. The requested transaction is completed automatically by connecting to a second system for processing. | 06-28-2012 |
20120179457 | CONFIGURABLE SPEECH RECOGNITION SYSTEM USING MULTIPLE RECOGNIZERS - Techniques for combining the results of multiple recognizers in a distributed speech recognition architecture. Speech data input to a client device is encoded and processed both locally and remotely by different recognizers configured to be proficient at different speech recognition tasks. The client/server architecture is configurable to enable network providers to specify a policy directed to a trade-off between reducing recognition latency perceived by a user and usage of network resources. The results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture. | 07-12-2012 |
20120185240 | SYSTEM AND METHOD FOR GENERATING AND SENDING A SIMPLIFIED MESSAGE USING SPEECH RECOGNITION - An embodiment provides a system and method for generating and sending a simplified message using speech recognition. The system provides speech recognition software that may be utilized for receiving audio, converting the audio to text, comparing the text derived from the audio to match fields to find matches, replacing matched text with the contents of replacement fields associated with the match fields, generating an output message incorporating the replacement text into the text derived from the audio, transmitting the output message to a messaging system, and redistributing the output message to recipients. | 07-19-2012 |
20120185241 | AUDIO DECODING APPARATUS, AUDIO CODING APPARATUS, AND SYSTEM COMPRISING THE APPARATUSES - An audio decoding apparatus comprises: a plurality of decoding units; a band replicating unit which processes a decoded signal obtained when a corresponding decoding unit decodes a coded signal, according to a scheme specified by transmitted information; and an information transmitting unit which transmits, to a signal processing unit, information identifying the corresponding decoding unit from among the plurality of decoding units. | 07-19-2012 |
20120197633 | VOICE QUALITY MEASUREMENT DEVICE, METHOD AND COMPUTER READABLE MEDIUM - A voice quality measurement device that measures voice quality of a decoded voice signal outputted from a voice decoder unit. The voice quality measurement device includes a packet buffer unit and a voice information monitoring unit. The packet buffer unit accumulates voice packets that arrive non-periodically as voice information, and outputs the voice information to the voice decoder unit periodically. The voice information monitoring unit monitors continuity of the voice information inputted to the voice decoder unit, and calculates an index of voice quality of the decoded voice signal that reflects acceptability of this continuity. | 08-02-2012 |
20120197634 | VOICE CORRECTION DEVICE, VOICE CORRECTION METHOD, AND RECORDING MEDIUM STORING VOICE CORRECTION PROGRAM - A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller. | 08-02-2012 |
20120197635 | METHOD FOR GENERATING AN AUDIO SIGNAL - A method for generating an audio signal of a user is provided. According to the method, a first audio signal inside of an ear of the user and a second audio signal outside of the ear are detected. The first audio signal and the second audio signal comprise at least a voice signal component generated by the user. Depending on the first audio signal, the second audio signal is processed and output as the audio signal. | 08-02-2012 |
20120209596 | ENCODING DEVICE, DECODING DEVICE AND METHOD FOR BOTH - Disclosed are an encoding device and a decoding device which suppress the occurrence of pre-echo artifacts and post-echo artifacts caused by a high layer having a low temporal resolution, and which implement high subjective quality encoding and decoding. An encoding device ( | 08-16-2012 |
20120226494 | IDENTIFYING AN ENCODING FORMAT OF AN ENCODED VOICE SIGNAL - A digital broadcast transmitting device is described that includes a packet generation unit configured to generate packetized elementary stream (PES) data by converting an inputted voice signal into an encoded voice signal and generating a voice stream packet including the encoded voice signal; a descriptor updating unit configured to update a component descriptor to include a component type identification (ID) and a change reservation ID, the component type ID indicating an encoding format of the encoded voice signal is MPEG surround format and the change reservation ID indicating a change of a format of the encoded voice signal to the MPEG surround format; a packetizing unit configured to generate section data by packetizing the component descriptor; a multiplexing unit configured to multiplex the PES data and the section data; and a modulation unit configured to modulate and transmit multiplexed data acquired from the multiplexing unit. | 09-06-2012 |
20120239386 | METHOD AND DEVICE FOR DETERMINING A DECODING MODE OF IN-BAND SIGNALING - In the field of communications, a method and a device for determining a decoding mode of in-band signaling are provided, which improve accuracy of in-band signaling decoding. The method includes: calculating a probability of each decoding mode of in-band signaling of a received signal at a predetermined moment by using a posterior probability algorithm; and from the calculated probabilities of the decoding modes, selecting a decoding mode having a maximum probability value as a decoding mode of the in-band signaling of the received signal at the predetermined moment. The method and the device are mainly used in a process for determining a decoding mode of in-band signaling in a speech frame transmission process. | 09-20-2012 |
20120253794 | VOICE CONVERSION METHOD AND SYSTEM - A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: | 10-04-2012 |
20120253795 | AUDIO COMMENTING AND PUBLISHING SYSTEM - An audio commenting and publishing system including a storage database, media content and a computing device all coupled together via a network. The computing device comprises a processor and an application executed by the processor configured to input audio data that a user wishes to associate with the media content from an audio recording mechanism or a memory device. The application is then able to store the audio data on the storage database and use the network address of the audio data along with the network address of the media content to publish the audio data and the media content such that a viewer is able to hear and access them concurrently at a network-accessible location. | 10-04-2012 |
20120259622 | AUDIO ENCODING DEVICE AND AUDIO ENCODING METHOD - Disclosed is an audio encoding device which removes unnecessary inter-channel parameters from the subject to be encoded, improving the encoding efficiency thereby. In this audio encoding device, a principal component analysis unit ( | 10-11-2012 |
20120259623 | System and Method of Providing Generated Speech Via A Network - A system and method of operating an automatic speech recognition application over an Internet Protocol network is disclosed. The ASR application communicates over a packet network such as an Internet Protocol network or a wireless network. A grammar for recognizing received speech from a user over the IP network is selected from a plurality of grammars according to a user-selected application. A server receives information representing speech over the IP network, performs speech recognition using the selected grammar, and returns information based upon the recognized speech. Sub-grammars may be included within the grammar to recognize speech from sub-portions of a dialog with the user. | 10-11-2012 |
20120265522 | Time Scaling of Audio Frames to Adapt Audio Processing to Communications Network Timing - Methods and apparatus for coordinating audio data processing and network communication processing in a communication device by using time scaling for either inbound or outbound audio data processing, or both, in a communication device. In particular, time scaling of audio data is used to adapt timing for audio data processing to timing for modem processing, by dynamically adjusting a collection of audio samples to fit the container size required by the modem. Speech quality can be preserved while recovering and/or maintaining correct synchronization between audio processing and communication processing circuits. In an example method, it is determined that a completion time for processing a first audio data frame falls outside a pre-determined timing window. Responsive to this determination, a subsequent audio data frame is time-scaled to control the completion time for processing the subsequent audio data frame. | 10-18-2012 |
20120265523 | FRAME ERASURE CONCEALMENT FOR A MULTI RATE SPEECH AND AUDIO CODEC - An audio coding terminal and method is provided. The terminal includes a coding mode setting unit to set an operation mode, from plural operation modes, for input audio coding by a codec, configured to code the input audio based on the set operation mode such that when the set operation mode is a high frame erasure rate (FER) mode the codec codes a current frame of the input audio according to a select frame erasure concealment (FEC) mode of one or more FEC modes. Upon the setting of the operation mode to be the High FER mode, the one FEC mode is selected, from the one or more FEC modes predetermined for the High FER mode, to control the codec by incorporating redundancy within a coding of the input audio or as redundancy information separate from the coded input audio according to the selected one FEC mode. | 10-18-2012 |
20120303360 | PRESERVING AUDIO DATA COLLECTION PRIVACY IN MOBILE DEVICES - Techniques are disclosed for using the hardware and/or software of the mobile device to obscure speech in the audio data before a context determination is made by a context awareness application using the audio data. In particular, a subset of a continuous audio stream is captured such that speech (words, phrases and sentences) cannot be reliably reconstructed from the gathered audio. The subset is analyzed for audio characteristics, and a determination can be made regarding the ambient environment. | 11-29-2012 |
20120310634 | COMMUNICATION DEVICE WITH REDUCED NOISE SPEECH CODING - A communication device includes memory, an input interface, a processing module, and a transmitter. The processing module receives a digital signal from the input interface, wherein the digital signal includes a desired digital signal component and an undesired digital signal component. The processing module identifies one of a plurality of codebooks based on the undesired digital signal component. The processing module then identifies a codebook entry from the one of the plurality of codebooks based on the desired digital signal component to produce a selected codebook entry. The processing module then generates a coded signal based on the selected codebook entry, wherein the coded signal includes a substantially unattenuated representation of the desired digital signal component and an attenuated representation of the undesired digital signal component. The transmitter converts the coded signal into an outbound signal in accordance with a signaling protocol and transmits it. | 12-06-2012 |
20120316868 | Methods And Systems For Changing A Communication Quality Of A Communication Session Based On A Meaning Of Speech Data - Methods and systems are described for changing a communication quality of a communication session based on a meaning of speech data. Speech data exchanged between clients participating in a communication session is parsed. A meaning of the parsed speech data is determined to determine a communication quality of the communication session. An action is performed to change the communication quality of the communication session based on the meaning of the parsed speech data. | 12-13-2012 |
20120323567 | Packet Loss Concealment for Speech Coding - A speech coding method of significantly reducing error propagation due to voice packet loss, while still greatly profiting from a pitch prediction or Long-Term Prediction (LTP), is achieved by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for a voiced speech class; a pitch cycle length is compared to a subframe size to decide to reduce the pitch gain for the first subframe or the first two subframes within the frame. Speech coding quality loss due to the pitch gain reduction is compensated by increasing a bit rate of a second excitation component or adding one more stage of excitation component only for the first subframe or the first two subframes within the speech frame. | 12-20-2012 |
20120323568 | Source Code Adaption Based on Communication Link Quality and Source Coding Delay - Method and arrangement in a network node for adapting a property of source coding to the quality of a communication link in packet switched conversational services in a communication system. The method comprises obtaining ( | 12-20-2012 |
20130006617 | Methods and Apparatus for Efficient Vocoder Implementations - Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved. | 01-03-2013 |
20130013297 | MESSAGE SERVICE METHOD USING SPEECH RECOGNITION - A message service method using speech recognition includes a message server recognizing a speech transmitted from a transmission terminal, generating and transmitting a recognition result of the speech and N-best results based on a confusion network to the transmission terminal; if a message is selected through the recognition result and the N-best results and an evaluation result according to accuracy of the message is decided, the transmission terminal transmitting the message and the evaluation result to a reception terminal; and the reception terminal displaying the message and the evaluation result. | 01-10-2013 |
20130013298 | METHODS AND APPARATUS FOR GENERATING, UPDATING AND DISTRIBUTING SPEECH RECOGNITION MODELS - Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results. | 01-10-2013 |
20130013299 | METHOD AND APPARATUS FOR DEVELOPMENT, DEPLOYMENT, AND MAINTENANCE OF A VOICE SOFTWARE APPLICATION FOR DISTRIBUTION TO ONE OR MORE CONSUMERS - A system for developing, deploying and maintaining a voice application over a communications network to one or more recipients has a voice application server connected to a data network for storing and serving voice applications, a network communications server connected to the data network and to the communications network for routing the voice applications to their intended recipients, a computer station connected to the data network having control access to at least the voice application server, and a software application running on the computer station for creating applications and managing their states. The system is characterized in that a developer operating the software application from the computer station creates voice applications through object modeling and linking, stores them for deployment in the application server, and manages deployment and state of deployed applications including scheduled deployment and repeat deployments in terms of intended recipients. | 01-10-2013 |
20130018654 | METHOD AND APPARATUS FOR ENABLING PLAYBACK OF AD HOC CONVERSATIONS - In one embodiment, a method includes monitoring activity in an environment, and storing a snippet of the monitored activity. Monitoring the activity in the environment includes operating a device arranged to capture the activity between approximately a first time and approximately a second time. The snippet has a particular duration that is arranged to end at approximately the second time. The method also includes storing the snippet in a storage module and determining when a request to provide the snippet is obtained from a party. If it is determined that the request to provide the snippet is obtained, the method includes accessing the storage module to obtain the snippet and providing the snippet to the party. | 01-17-2013 |
20130024187 | METHOD AND APPARATUS FOR SOCIAL NETWORK COMMUNICATION OVER A MEDIA NETWORK - A system that incorporates teachings of the present disclosure may include, for example, transmitting a request to initiate a communication session with a member device of a social network, activating a speech capture element, maintaining activation of the speech capture element in accordance with a pattern of prior speech messages, detecting a speech message at the activated speech capture element, and transmitting the detected speech message, or a derivative thereof, to the member device of the social network. Other embodiments are disclosed. | 01-24-2013 |
20130024188 | Real-Time Encoding Technique - A system for encoding an audio signal includes an audio console configured to receive a voice audio signal contained within a first audio spectrum, encode the voice audio signal with a background audio signal contained within a second audio spectrum wider than the first audio spectrum, encode the voice audio signal with a monitoring code and output a combined signal including the voice audio signal encoded with the background audio signal and the monitoring code. The combined signal is contained within an audio spectrum including the first audio spectrum and the second audio spectrum. | 01-24-2013 |
20130024189 | MOBILE TERMINAL AND DISPLAY METHOD THEREOF - A mobile terminal and a control method thereof are provided. The mobile terminal includes: an audio output module; a memory storing text; and a controller configured to convert at least a portion of the text into a speech and output the speech through the audio output module, wherein the controller stores at least a portion of speech data obtained by converting the at least a portion of the text into the speech in the memory, and outputs the speech based on the stored speech data to the audio output module when a speech output signal with respect to the at least a portion of the text is obtained. When a speech output signal with respect to a portion which has been output by speech is obtained, speech is output based on the stored speech data, thereby shortening the time required for outputting the speech. | 01-24-2013 |
20130054229 | PORTABLE DEVICE AND METHOD FOR MULTIPLE RECORDING OF DATA - A portable device performs a multiple recording function by which data is recorded using different recording techniques. The device includes at least one of an input unit and a touch panel, which creates or supports an input signal for activating an audio-related function and an input signal for activating the multiple recording function while the audio-related function is performed. The device further includes a display panel configured to output a memo writing screen of a memo function in response to the activation of the multiple recording function, the memo writing screen allowing the activation of a voice recording function. The device also includes a control unit configured to control the output of the memo writing screen. | 02-28-2013 |
20130060565 | SYSTEMS AND METHODS FOR FRAME SYNCHRONIZATION | 03-07-2013 |
20130085748 | Method and device for modifying a compounded voice message - A method and device are provided for modifying a compounded voice message having at least one first voice component. The method includes a step of obtaining at least one second voice component, a step of updating at least one item of information belonging to a group of items of information associated with the compounded voice message as a function of the at least one second voice component and a step of making available the compounded voice message comprising the at least one first and second voice components, and the group of items of information associated with the compounded voice message. The compounded voice message is intended to be consulted by at least one recipient user. | 04-04-2013 |
20130085749 | SOUND PROCESSING TO PRODUCE TARGET VOLUME LEVEL - A sound process apparatus includes a processor. The processor may execute instructions, which are stored on a memory, and when executed cause the sound process apparatus to perform operations. An obtaining operation may obtain sound data in a remote site. A first determining operation may determine volume levels of voice and noise in the remote site based on the sound data. A second determining operation may determine a volume level of noise in a local site based on the sound in the local site. A third determining operation may determine a target volume level based on the volume level of the voice in the remote site, the volume level of the noise in the remote site, and the volume level of the noise in the local site. A notifying operation may notify a user of the target volume level. | 04-04-2013 |
20130085750 | COMMUNICATION SYSTEM, METHOD, AND APPARATUS - A server apparatus acquires content based on instruction information; decodes image data of the acquired content; compression encodes captured image data using a predetermined encoding scheme; decodes an audio signal and compression encodes the decoded audio signal using the predetermined encoding scheme; stores the image and the audio signal and sends the packet to a packet forwarding apparatus. A mobile terminal receives the packet, decodes and displays the compression encoded image data stored in the packet; and decodes and reproduces the compression encoded audio signal. | 04-04-2013 |
20130103392 | APPARATUS AND METHOD OF REPRODUCING AUDIO DATA USING LOW POWER - A method and apparatus for reproducing audio data using low power are provided. The apparatus may reproduce the audio data by determining a power mode based on a memory resource of an internal memory, and an amount of a memory required for reproducing the audio data, controlling a power based on the determined power mode, and decoding the audio data. | 04-25-2013 |
20130103393 | Multi-point sound mixing and distant view presentation method, apparatus and system - The disclosure provides a multi-point sound mixing and distant view presentation method, apparatus and system, wherein the multi-point sound mixing and distant view presentation method includes: receiving audio code streams from a plurality of meeting places, wherein each meeting place comprises one or more meeting sections, and each meeting section corresponds to one audio code stream; mixing the audio code streams of the meeting sections which have a corresponding relationship among the plurality of meeting places; and outputting mixed audio code streams to the meeting sections which have the corresponding relationship among the plurality of meeting places. Sounds in different sections of the distant view presentation conference system can be distinguished by technical solutions provided by the disclosure. | 04-25-2013 |
20130124196 | METHOD AND APPARATUS FOR GENERATING NOISES - A method and an apparatus for generating comfortable noises so as to improve user experience are disclosed. The method includes: if a received data frame is a noise frame, calculating a corresponding energy attenuation parameter based on the noise frame and a data frame received earlier than the noise frame; and attenuating noise energy based on the energy attenuation parameter to obtain a comfortable noise signal. An apparatus for generating comfortable noise is also provided. | 05-16-2013 |
20130124197 | MULTI-LAYERED SPEECH RECOGNITION APPARATUS AND METHOD - A multi-layered speech recognition apparatus and method, the apparatus includes a client checking whether the client recognizes the speech using a characteristic of speech to be recognized and recognizing the speech or transmitting the characteristic of the speech according to a checked result; and first through N-th servers, wherein the first server checks whether the first server recognizes the speech using the characteristic of the speech transmitted from the client, and recognizes the speech or transmits the characteristic according to a checked result, and wherein an n-th (2≦n≦N) server checks whether the n-th server recognizes the speech using the characteristic of the speech transmitted from an (n−1)-th server, and recognizes the speech or transmits the characteristic according to a checked result. | 05-16-2013 |
20130124198 | METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNALS - A method of pre-processing an audio signal transmitted to a user terminal via a communication network and an apparatus using the method are provided. The method of pre-processing the audio signal may prevent deterioration of a sound quality of the audio signal transmitted to the user terminal by pre-processing the audio signal, and by enabling a codec module, encoding the audio signal, to determine the audio signal as a speech signal. Also, the method of pre-processing the audio signal may improve a probability that the codec module may determine a corresponding audio signal as a speech when the audio signal is transmitted via the communication network by pre-processing the audio signal using a speech codec. | 05-16-2013 |
20130132074 | METHOD AND SYSTEM FOR REPRODUCING AND DISTRIBUTING SOUND SOURCE OF ELECTRONIC TERMINAL - There is provided a method of reproducing and distributing a sound source of an electronic terminal. The method includes a step of starting to simultaneously reproduce a stream of an MR (Music Recorded) sound source file and a stream of an AR (All Recorded) sound source file that a voice is recorded to be added to the MR sound source file by a reproducing unit of the electronic terminal, and outputting one stream of the streams through an output unit; and a step of controlling the reproducing unit to stop the output of the one stream that is currently being output through the output unit and to output the other stream through the output unit by a reproducing switch unit of the electronic terminal based on a selection of a user while the stream of the MR sound source file and the stream of the AR sound source file are reproduced, respectively. | 05-23-2013 |
20130138431 | SPEECH SIGNAL TRANSMISSION AND RECEPTION APPARATUSES AND SPEECH SIGNAL TRANSMISSION AND RECEPTION METHODS - A speech signal transmission apparatus includes an extractor to extract speech signals from speech source signals collected by a plurality of microphones, a power calculator to calculate powers of speech signals of multiple channels and set any one of the speech signals of the multiple channels as a reference speech signal, a synchronization adjustor to adjust synchronization of the other speech signals based on the reference speech signal, a signal generator to generate extraction signals by offsetting the reference speech signal from the other synchronization-adjusted speech signals, an encryptor to compress and encrypt the reference speech signal and the extraction signals, and a transmitter to transmit the compressed and encrypted reference speech signal and extraction signals. | 05-30-2013 |
20130144610 | ACTION GENERATION BASED ON VOICE DATA - An automated technique is disclosed for processing audio data and generating one or more actions in response thereto. In particular embodiments, the audio data can be obtained during a phone conversation and post-call actions can be provided to the user with contextually relevant entry points for completion by an associated application. Audio transcription services available on a remote server can be leveraged. The entry points can be generated based on keyword recognition in the transcription and passed to the application in the form of parameters. | 06-06-2013 |
20130151242 | Method to Select Active Channels in Audio Mixing for Multi-Party Teleconferencing - An apparatus comprising an ingress port configured to receive a signal comprising a plurality of encoded audio signals corresponding to a plurality of sources; and a processor coupled to the ingress port and configured to calculate a parameter for each of the plurality of encoded audio signals, wherein each parameter is calculated without decoding any of the encoded audio signals, select some, but not all, of the plurality of encoded audio signals according to the parameter for each of the encoded audio signals, decode the selected signals to generate a plurality of decoded audio signals, and combine the plurality of decoded audio signals into a first audio signal. | 06-13-2013 |
20130151243 | VOICE MODULATION APPARATUS AND VOICE MODULATION METHOD USING THE SAME - A voice modulation apparatus is provided. The voice modulation apparatus includes an audio signal input unit which receives an audio signal from an external source; an extraction unit which extracts property information relating to a voice from the audio signal; a storage unit which stores the extracted property information; a control unit which modulates a target voice based on the extracted property information; and an output unit which outputs the modulated target voice. | 06-13-2013 |
20130158988 | GRACEFUL DEGRADATION FOR COMMUNICATION SERVICES OVER WIRED AND WIRELESS NETWORKS - A method for gracefully extending the range and/or capacity of voice communication systems is disclosed. The method involves the persistent storage of voice media on a communication device. When the usable bit rate on the network is poor and below that necessary for conducting a live conversation, voice media is transmitted and received by the communication device at the available usable bit rate on the network. Although latency may be introduced, the persistent storage of both transmitted and received media of a conversation provides the ability to extend the useful range of wireless networks beyond what is required for live conversations. In addition, the capacity and robustness in not being affected by external interferences for both wired and wireless communications is improved. | 06-20-2013 |
20130166285 | MULTI-CORE PROCESSING FOR PARALLEL SPEECH-TO-TEXT PROCESSING - This specification describes technologies relating to multi core processing for parallel speech-to-text processing. In some implementations, a computer-implemented method is provided that includes the actions of receiving an audio file; analyzing the audio file to identify portions of the audio file as corresponding to one or more audio types; generating a time-ordered classification of the identified portions, the time-ordered classification indicating the one or more audio types and position within the audio file of each portion; generating a queue using the time-ordered classification, the queue including a plurality of jobs where each job includes one or more identifiers of a portion of the audio file classified as belonging to the one or more speech types; distributing the jobs in the queue to a plurality of processors; performing speech-to-text processing on each portion to generate a corresponding text file; and merging the corresponding text files to generate a transcription file. | 06-27-2013 |
20130173259 | Method and Apparatus for Processing Audio Frames to Transition Between Different Codecs | 07-04-2013 |
20130173260 | DEVICE FOR ASSESSING ACCURACY OF STATEMENTS AND METHOD OF OPERATION - A device receives voice and/or data from a speaker, such as a politician, and presents a signal indicative of the accuracy of the speaker's statements. The device may be a mobile device, such as a smart phone, or a fixed device, such as a TV set. The device compares a speaker segment, automatically selected from the speaker statement, with a factual segment, automatically selected from a database comprising stored facts, and presents the accuracy of the speaker statement to the user of the device. The device may be configured so that the user may manually select the speaker segment to be assessed by the device. | 07-04-2013 |
20130179156 | QR DATA PROXY AND PROTOCOL GATEWAY - A quick response (QR) proxy and protocol gateway for interfacing with a carrier network, a QR-equipped device, and a contact center and contact center database is disclosed. A data link is connected to a carrier network to receive QR codes and other data. Additional data links are connected to a contact center database and a QR-equipped device to obtain information used in determining routing and tagging instructions. A user interface is connected to the gateway to accept configurable conditions for determining routing instructions. There is a text conversion function and speech conversion function for each target enterprise contact center. Synchronization between stored user preferences to automated or semi-automated customer service routes is provided by a consumer preference template system. | 07-11-2013 |
20130179157 | COMPUTER, INTERNET AND TELECOMMUNICATIONS BASED NETWORK - A method and apparatus for a computer and telecommunication network which can receive, send and manage information from or to a subscriber of the network, based on the subscriber's configuration. The network is made up of at least one cluster containing voice servers which allow for telephony, speech recognition, text-to-speech and conferencing functions, and is accessible by the subscriber through standard telephone connections or through internet connections. The network also utilizes a database and file server allowing the subscriber to maintain and manage certain contact lists and administrative information. A web server is also connected to the cluster thereby allowing access to all functions through internet connections. | 07-11-2013 |
20130185061 | METHOD AND APPARATUS FOR MASKING SPEECH IN A PRIVATE ENVIRONMENT - A speech masking apparatus includes a microphone and a speaker. The microphone can detect a human voice. The speaker can output a masking language which can include phonemes resembling human speech. At least one component of the masking language can have a pitch, a volume, a theme, and/or a phonetic content substantially matching a pitch, a volume, a theme, and/or a phonetic content of the voice. | 07-18-2013 |
20130191116 | LANGUAGE DICTATION RECOGNITION SYSTEMS AND METHODS FOR USING THE SAME - Language dictation recognition systems and methods for using the same. In at least one exemplary system for analyzing verbal records, the system comprises a database capable of receiving a plurality of verbal records, the verbal record comprising at least one identifier and at least one verbal feature and a processor operably coupled to the database, where the processor has and executes a software program. The processor being operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset of the plurality of verbal records, process the subset of the plurality of records using the analyzed feature according to at least one reasoning approach, generate a processed verbal record using the processed subset of the plurality of records, and deliver the processed verbal record to a recipient. | 07-25-2013 |
20130197902 | SYSTEM, METHOD AND COMPUTER PROGRAM FOR SHARING AUDIBLE WORD TAGS - The invention provides a system, method and computer program for sharing audible word tags. The word may be an individual's name or information conveyed through series of words. An audible word tag may be recorded by an individual. The audible word tag may be embedded in electronic correspondence and/or documents for sharing with others, or accessed dynamically via the internet and/or other applicable network connectivity on an as-required basis. The method includes generating a profile for associating one or more words to an audible word tag. An audio recording is made of the one or more words. The audio recording is linked to the profile. The audible word tag is linked to one or more electronic correspondence or print. The audible word tag is accessible by a receiver of the correspondence to initiate the playback of the audio recording. | 08-01-2013 |
20130197903 | RECORDING SYSTEM, METHOD, AND DEVICE - An exemplary recording method receives the personal information of a speaker transmitted from a RFID tag through a RFID reader. Then the method receives the voice of the speaker through a microphone. The method next receives the personal information of the speaker and the identifier of the audio input device transmitted from the audio input device, and associates the personal information of the speaker with the received identifier of the audio input device. Then, the method receives the voice and the identifier of the audio input device transmitted from the audio input device. The method further converts the received voice to text. The method determines the personal information corresponding to the identifier of the audio input device received with the voice, and associates the converted text with the determined personal information to generate a record. | 08-01-2013 |
20130211826 | Audio Signals as Buffered Streams of Audio Signals and Metadata - Historically, most audio recording and communication control has been exerted through the use of physical buttons, slider and knobs, e.g. to start/stop recording or communicating, control speaker and microphone volume settings, etc. The present invention describes improvements to this approach, such as detecting and analyzing audio signals with human voice components, e.g. to start/stop recording and communicating, set local and remote recording and playback volumes and filters, and manage metadata associated with temporal ranges in audio streams. | 08-15-2013 |
20130218556 | VOICE PROCESSING APPARATUS AND VOICE PROCESSING METHOD - A voice processing apparatus is provided in an ADPCM (Adaptive Differential Pulse Code Modulation) voice transmission system in which voice data that is differentially quantized through an ADPCM scheme is transmitted. The voice processing apparatus includes an error detector which detects whether or not an error occurs in a transmission frame containing voice data that indicates a differential value, and an error determiner which determines a level of the error detected by the error detector when the error detector detects the error. The voice processing apparatus also includes a voice processor which corrects the voice data with a correction value depending on the level of the error detected by the error detector and an ADPCM decoder which decodes the voice data corrected by the voice processor. | 08-22-2013 |
20130226564 | Method and System for Providing an Audio Representation of a Name - A system and method for providing an audio representation of a name includes providing a list of a plurality of users of a network and respective presence information regarding each of the plurality of users; receiving a request from an endpoint to receive an audio representation of a name of a particular user of the plurality of users, and providing the audio representation to the endpoint. Moreover, the audio representation of the name at least generally approximates a pronunciation of the name as pronounced by the particular user. | 08-29-2013 |
20130246051 | METHOD AND MOBILE TERMINAL FOR REDUCING CALL CONSUMPTION OF MOBILE TERMINAL - A method for reducing call power consumption of a mobile terminal and mobile terminal are disclosed in the present invention, wherein, the method includes: in a voice call process, the mobile terminal performing voiceprint modeling on an audio signal collected by the mobile terminal itself to obtain a voiceprint model, and judging whether the obtained voiceprint model matches with a stored voiceprint model of the user; if not matching, giving up performing wireless transmission on the collected audio signal or giving up performing baseband and radio frequency processing and wireless transmission on the collected audio signal, and if matching, performing the baseband and radio frequency processing and wireless transmission on the audio signal. With the present invention, voice call power consumption of the mobile terminal is reduced, battery usage time of the mobile terminal is extended, and user experience is enhanced. | 09-19-2013 |
20130246052 | CORE NETWORK AND COMMUNICATION SYSTEM - A core network connected to a mobile communication network and establishing voice communication between communication apparatuses receives a connection request that includes an identifier identifying a terminating communication apparatus, from the mobile communication network, to which an originating mobile communication apparatus is connected, and temporarily signals a first codec candidate that should be used by the originating mobile communication apparatus to the originating mobile communication apparatus. Then, the core network determines at least one codec that can be used in the terminating communication apparatus. When the codec that can be used in the terminating network is different from the codec to be used, which is temporarily selected to be actually used in the originating mobile communication apparatus, the core network specifies a second codec candidate that should be used in the originating mobile communication apparatus according to the codec that can be used in the terminating communication apparatus and signals the second codec candidate to the originating mobile communication apparatus. | 09-19-2013 |
20130246053 | SYSTEM FOR ANALYZING INTERACTIONS AND REPORTING ANALYTIC RESULTS TO HUMAN OPERATED AND SYSTEM INTERFACES IN REAL TIME - A computerized system for advising one communicant in electronic communication between two or more communicants has apparatus monitoring and recording interaction between the communicants, software executing from a machine-readable medium and providing analytics, the software functions including rendering speech into text, and analyzing the rendered text for topics, performing communicant verification, and detecting changes in communicant emotion. Advice is offered to the one communicant during the interaction, based on results of the analytics. | 09-19-2013 |
20130253918 | SYSTEM AND METHOD FOR EXTRACTING, DECODING, AND UTILIZING HIDDEN DATA EMBEDDED IN AUDIO SIGNALS - A system and method for enabling a user to retrieve, decode, and utilize hidden data embedded in audio signals. An exemplary implementation includes a microphone structured to receive sound waves representative of an audio signal and hidden data embedded in the audio signal. The microphone then converts the received sound waves into an electrical output signal. The system also includes a processor electrically coupled to the microphone and configured to receive the electrical output signal in order to extract the hidden data and provide information represented by the hidden data as an output thereof. A user interface is also provided and is electrically coupled to the processor and configured to receive a first input from the user and activate the processor to selectively initiate extraction of the hidden data. The processor produces as an output the information represented by the hidden data. Finally, the system includes a user presentation mechanism configured to present the information to the user. | 09-26-2013 |
20130253919 | Method and System for Enrolling a Voiceprint in a Fraudster Database - Disclosed is a method for enrolling a voiceprint in a fraudster database, the method comprising: a) defining a fraud model comprising at least one hypothesis indicative of a fraudulent transaction; b) processing audio data based on the fraud model to identify at least one suspect voiceprint in the audio data suspected of belonging to a fraudster; and c) enrolling the at least one suspect voiceprint in the fraudster database. | 09-26-2013 |
20130262095 | METHOD OF PROVIDING DYNAMIC SPEECH PROCESSING SERVICES DURING VARIABLE NETWORK CONNECTIVITY - A user device provides dynamic speech processing services during variable network connectivity with a network server. The user device includes a connection determiner that monitors a level of network connectivity between the user device and the network server, a simplified speech processor that processes speech data and is initiated based on a determination by the connection determiner that the level of network connectivity between the user device and the network server is impaired, a memory that stores processed speech data processed by the simplified speech processor, and a transmitter configured to transmit the stored processed speech data. The connection determiner determines when the level of network connectivity between the user device and the network server is no longer impaired. | 10-03-2013 |
20130282366 | JITTER BUFFER ENHANCED JOINT SOURCE CHANNEL DECODING - Methods, systems, and apparatuses are provided for performing jitter buffer enhanced joint source channel decoding. Jitter buffer enhanced joint source channel decoding may be performed in a manner that exploits parameter domain correlation. A jitter buffer stores hard bits of properly channel decoded packets, and a secondary jitter buffer is implemented to store soft bits associated with packets that are improperly channel decoded. Joint source channel decoding may be delayed to perform channel decoding of a frame in the penultimate position of the jitter buffer. The soft bits stored in the secondary jitter buffer as well as hard bits stored in the jitter buffer, which may include future frames, are utilized to perform channel decoding. The delayed jitter buffer enhanced joint source channel decoding may also be extended to iteratively perform channel decoding for given frames at each position in the jitter buffer as they traverse the jitter buffer. | 10-24-2013 |
20130304457 | METHOD AND SYSTEM FOR OPERATING COMMUNICATION SERVICE - An operation method capable of adaptively operating at least one of a Speech To Text (STT) service and a Text To Speech (TTS) service according to setting or user operation and a system thereof are provided. The method includes requesting a specific type of a communication service connection to a reception side terminal by a transmission side terminal, and performing an operation of at least one of a speech to text service providing speech recognition based text and a text to speech service converting the text into speech data between the reception side terminal and the transmission side terminal, and includes one of recognizing speech data provided from the transmission side terminal and converting the speech data into a text based on a first speech process supporting device connected to the transmission side terminal. | 11-14-2013 |
20130317809 | SPEECH MASKING AND CANCELLING AND VOICE OBSCURATION - A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. | 11-28-2013 |
20130317810 | VECTOR JOINT ENCODING/DECODING METHOD AND VECTOR JOINT ENCODER/DECODER - A vector joint encoding/decoding method and a vector joint encoder/decoder are provided, more than two vectors are jointly encoded, and an encoding index of at least one vector is split and then combined between different vectors, so that encoding idle spaces of different vectors can be recombined, thereby facilitating saving of encoding bits, and because an encoding index of a vector is split and then shorter split indexes are recombined, thereby facilitating reduction of requirements for the bit width of operating parts in encoding/decoding calculation. | 11-28-2013 |
20130325446 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include acquiring indication of a speech-facilitated transaction between a particular party and a target device, receiving adaptation data correlated to the particular party, the receiving facilitated by a particular device associated with the particular party, processing audio data from the particular party at least partly using the received adaptation data correlated to the particular party, and updating the adaptation data based at least in part on a result of the processed audio data, such that the updated adaptation data is configured to be transmitted to the particular device. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325447 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include acquiring indication of a speech-facilitated transaction between a particular party and a target device, receiving adaptation data correlated to the particular party, the receiving facilitated by a particular device associated with the particular party, processing audio data from the particular party at least partly using the received adaptation data correlated to the particular party, and updating the adaptation data based at least in part on a result of the processed audio data, such that the updated adaptation data is configured to be transmitted to the particular device. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325448 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include managing adaptation data, the adaptation data is at least partly based on at least one speech interaction of a particular party, facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is to be applied to the target device to assist in execution of the speech-facilitated transaction, and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325449 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include managing adaptation data, wherein the adaptation data is correlated to at least one aspect of speech of a particular party, facilitating transmission of the adaptation data to a target device, wherein the adaptation data is configured to be applied to the target device to assist in execution of a speech-facilitated transaction, facilitating reception of adaptation result data that is based on at least one aspect of the speech-facilitated transaction between the particular party and the target device, determining whether to modify the adaptation data at least partly based on the adaptation result data, and facilitating transmission of at least a portion of modified adaptation data to a receiving device. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325450 | METHODS AND SYSTEMS FOR SPEECH ADAPTATION DATA - Computationally implemented methods and systems include detecting speech data related to a speech-facilitated transaction, acquiring adaptation data that is at least partly based on at least one speech interaction of a particular party that is discrete from the detected speech data, wherein at least a portion of the adaptation data has been stored on a particular device associated with the particular party, obtaining a destination of one or more of the adaptation data and the speech data, and transmitting one or more of the speech data and the adaptation data to the acquired destination. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325451 | METHODS AND SYSTEMS FOR SPEECH ADAPTATION DATA - Computationally implemented methods and systems include detecting speech data related to a speech-facilitated transaction, acquiring adaptation data that is at least partly based on at least one speech interaction of a particular party that is discrete from the detected speech data, wherein at least a portion of the adaptation data has been stored on a particular device associated with the particular party, obtaining a destination of one or more of the adaptation data and the speech data, and transmitting one or more of the speech data and the adaptation data to the acquired destination. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325452 | METHODS AND SYSTEMS FOR SPEECH ADAPTATION DATA - Computationally implemented methods and systems include receiving speech data correlated to one or more words spoken by a particular party, receiving adaptation data that is at least partly based on at least one speech interaction of a particular party that is discrete from the received speech data, wherein at least a portion of the adaptation data has been stored on a particular device associated with the particular party, obtaining target data regarding a target configured to process at least a portion of the received speech data, and determining whether to apply the adaptation data for processing at least a portion of the received speech data, at least partly based on the acquired target data. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325453 | METHODS AND SYSTEMS FOR SPEECH ADAPTATION DATA - Computationally implemented methods and systems include receiving speech data correlated to one or more words spoken by a particular party, receiving adaptation data that is at least partly based on at least one speech interaction of a particular party that is discrete from the received speech data, wherein at least a portion of the adaptation data has been stored on a particular device associated with the particular party, obtaining target data regarding a target configured to process at least a portion of the received speech data, and determining whether to apply the adaptation data for processing at least a portion of the received speech data, at least partly based on the acquired target data. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130325454 | METHODS AND SYSTEMS FOR MANAGING ADAPTATION DATA - Computationally implemented methods and systems include managing adaptation data, wherein the adaptation data is correlated to at least one aspect of speech of a particular party, facilitating transmission of the adaptation data to a target device, in response to an indicator related to a speech-facilitated transaction of a particular party, wherein the adaptation data is correlated to at least one aspect of speech of the particular party, and determining whether to update the adaptation data, said determination at least partly based on a result of at least a portion of the speech-facilitated transaction. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 12-05-2013 |
20130332147 | Apparatus and Methods to Update a Language Model in a Speech Recognition System - The technology of the present application provides a method and apparatus to allow for dynamically updating a language model across a large number of similarly situated users. The system identifies individual changes to user profiles and evaluates the change for a broader application, such as, a dialect correction for a speech recognition engine, as administrator for the system identifies similarly situated user profiles and downloads the profile change to effect a dynamic change to the language model of similarly situated users. | 12-12-2013 |
20130339007 | ENHANCING COMPREHENSION IN VOICE COMMUNICATIONS - Embodiments herein include receiving a request to modify an audio characteristic associated with a first user for a voice communication system. One or more suggested modified audio characteristics may be provided for the first user, based on, at least in part, one or more audio preferences established by another user. An input of one or more modified audio characteristics may be received for the first user for the voice communication system. A user-specific audio preference may be associated with the first user for voice communications on the voice communication system, the user-specific audio preference including the one or more modified audio characteristics. | 12-19-2013 |
20130339008 | DEVICE AND METHOD FOR MAINTAINING VOICE COMMUNICATION SECURITY IN TERMINAL - A device and a method for maintaining voice communication security in a terminal are provided, which enable the terminal to maintain security for a conversation during voice communication. The device includes a microphone for receiving voice data through a microphone in a voice communication mode; and a controller for making a control to decode encoded characters included in voice data received through a microphone in a voice communication mode, and then transmitting the decoded characters to a counterpart terminal communicating with the device. | 12-19-2013 |
20130339009 | CODING DEVICE, COMMUNICATION PROCESSING DEVICE, AND CODING METHOD - Provided are a coding device, a communication processing device, and a coding method, whereby processing operation load (computational load) is significantly reduced for a configuration which computes either frame energy or sub-frame energy of an input signal, using auto-correlation operations, without causing a decline in the precision of either the frame energy or the sub-frame energy. In a coding device ( | 12-19-2013 |
20140006015 | CREATING, RENDERING AND INTERACTING WITH A MULTI-FACETED AUDIO CLOUD | 01-02-2014 |
20140006016 | VOICE SIGNAL ENCODING AND DECODING METHOD, DEVICE, AND CODEC SYSTEM | 01-02-2014 |
20140012570 | DECODING WIRELESS IN-BAND ON-CHANNEL SIGNALS - Described herein are systems, methods and apparatus for decoding in-band on-channel signals and extracting audio and data signals. Memory requirements are reduced by selectively filtering a bit stream of data in the signal so that services of interest which are encoded therein are processed. A single pool of memory may be shared between physical layer and data link layer processing. Memory in this pool may be allocated dynamically between processing of data at the physical and data link layers. When the available memory is not sufficient to support the required services, the dynamic allocation allows for graceful degradation. | 01-09-2014 |
20140032211 | DIALOG SERVER FOR HANDLING CONVERSATION IN VIRTUAL SPACE METHOD AND COMPUTER PROGRAM FOR HAVING CONVERSATION IN VIRTUAL SPACE - A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars. | 01-30-2014 |
20140032212 | EVALUATION OF THE VOICE QUALITY OF A CODED SPEECH SIGNAL - A method is provided for determining an indicator evaluating the voice quality of a coded speech signal. The method includes the following steps: calculation per signal frame, of a predetermined number of coefficients of a linear prediction filter for the coded speech signal; determination per frame, of a speech signal reconstructed on the basis of the filter coefficients thus calculated; obtaining per sample, of the residual between the coded speech signal and the reconstructed speech signal; calculation of an evaluation indicator on the basis of the mean or the absolute value of the residuals obtained for all the samples. Also provided are a device for determining an indicator implementing the above method, a method of evaluating the quality or of identifying the class of coding of the coded signal using the indicator determined, as well as a measurement terminal implementing these methods. | 01-30-2014 |
20140039881 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include managing adaptation data, the adaptation data is at least partly based on at least one speech interaction of a particular party, facilitating transmission of the adaptation data to a target device when there is an indication of a speech-facilitated transaction between the target device and the particular party, such that the adaptation data is to be applied to the target device to assist in execution of the speech-facilitated transaction, and facilitating acquisition of adaptation result data that is based on at least one aspect of the speech-facilitated transaction and to be used in determining whether to modify the adaptation data. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 02-06-2014 |
20140039882 | SPEECH RECOGNITION ADAPTATION SYSTEMS BASED ON ADAPTATION DATA - The instant application includes computationally-implemented systems and methods that include managing adaptation data, wherein the adaptation data is correlated to at least one aspect of speech of a particular party, facilitating transmission of the adaptation data to a target device, wherein the adaptation data is configured to be applied to the target device to assist in execution of a speech-facilitated transaction, facilitating reception of adaptation result data that is based on at least one aspect of the speech-facilitated transaction between the particular party and the target device, determining whether to modify the adaptation data at least partly based on the adaptation result data, and facilitating transmission of at least a portion of modified adaptation data to a receiving device. In addition to the foregoing, other aspects are described in the claims, drawings, and text. | 02-06-2014 |
20140046656 | METHOD AND APPARATUS FOR AUTOMATIC COMMUNICATIONS SYSTEM INTELLIGIBILITY TESTING AND OPTIMIZATION - Systems and methods for automatic user specific, condition specific communication system intelligibility testing and optimization are provided. The intelligibility of speech for a particular user is determined using a test of intelligibility administered by an interactive voice response (IVR) application running on a communication server. The intelligibility test can be run for a particular user under different conditions. For each user and/or set of conditions, a set of speech signal adjustment parameters can be determined. A set of speech signal adjustment parameters that will enhance the intelligibility of a speech signal for a user are applied when that user is involved in a communication session. The particular set of speech signal adjustment parameters selected can depend on the communication equipment and/or environment associated with the communication session. | 02-13-2014 |
20140046657 | VOCODER PROCESSING METHOD, SEMICONDUCTOR DEVICE, AND ELECTRONIC DEVICE - In a semiconductor device, a vocoder processing unit requests, after executing a first vocoder process being one of an encoding process and a decoding process and before executing a following second vocoder process being other one of the encoding process and the decoding process, a cache memory to prefetch first program data to be used for the second vocoder process from an external memory. | 02-13-2014 |
20140052438 | MANAGING AUDIO CAPTURE FOR AUDIO APPLICATIONS - In a computer system that permits multiple audio capture applications to get an audio capture feed concurrently, an audio manager manages audio capture and/or audio playback in reaction to trigger events. For example, a trigger event indicates an application has started, stopped or otherwise changed a communication stream, or indicates an application has gained, lost or otherwise changed focus or visibility in a user interface, or indicates a user change. In response to a trigger event, the audio manager applies a set of rules to determine which audio capture application is allowed to get an audio capture feed. Based on the decisions, the audio manager manages the audio capture feed for the applications. The audio manager also sends a notification to each of the audio capture applications that has registered for notifications, so as to indicate whether the application is allowed to get the audio capture feed. | 02-20-2014 |
20140067381 | Time-Shifting Distribution Of High Definition Audio Data - A system may time-shift the distribution of high-definition (HD) audio. The system can obtain an audio stream from a specified audio source, transcode the audio stream into an HD audio stream, and store the HD audio stream in a memory. The system may later forward the stored HD audio stream to a destination device, which can be a communication device linked to the system through a local telephone network or a remote communication device. The system can also store HD audio when a local communication device receives an incoming call request that interrupts a current HD audio distribution process. The system may resume distribution of the HD audio after processing the incoming call request from a point when the distribution was interrupted. | 03-06-2014 |
20140095153 | METHODS AND APPARATUS TO PROVIDE SPEECH PRIVACY - Methods and apparatus to provide speech privacy are disclosed. An example method includes forming a sampling block based on a first received audio sample, the sampling block representing speech of a user, creating, with a processor, a mask based on the sampling block, the mask to reduce the intelligibility of the speech of the user, wherein the mask is created by converting the sampling block from a time domain to a frequency domain to form a frequency domain sampling block, identifying a first peak within the frequency domain sampling block, demodulating the frequency domain sampling block at the first peak to form a first envelope of the sampling block, distorting the first envelope to form a first distorted envelope, and emitting an acoustic representation of the mask via a speaker. | 04-03-2014 |
20140142927 | SYSTEM TO CONTROL AUDIO EFFECT PARAMETERS OF VOCAL SIGNALS - A vocal effect processing system may include an effect modification module configured to selectively and dynamically apply effects to an input audio signal in accordance with a degree of likelihood that the input audio signal includes a vocal signal and/or based on a proximate location of a source of vocal audio with respect to a vocal microphone. Determination of the degree of likelihood that the input audio signal includes a vocal signal and/or the proximate location may be based on processing of the input audio signal or a plurality of input audio signals. Determination of the proximate location may alternatively, or in addition, be estimated based on a proximity sensor. The effect modification module may dynamically and selectively adjust the effects in response to changes in the degree of likelihood that the vocal signal is included in the input audio signal and/or changes in the estimated proximate location. | 05-22-2014 |
20140142928 | SYSTEM TO SELECTIVELY MODIFY AUDIO EFFECT PARAMETERS OF VOCAL SIGNALS - A vocal effect processing system may include an effect modification module configured to selectively and dynamically apply effects to an input audio signal in accordance with a degree of likelihood that the input audio signal includes a vocal signal and/or based on a proximate location of a source of vocal audio with respect to a vocal microphone. Determination of the degree of likelihood that the input audio signal includes a vocal signal and/or the proximate location may be based on processing of the input audio signal or a plurality of input audio signals. Determination of the proximate location may alternatively, or in addition, be estimated based on a proximity sensor. The effect modification module may dynamically and selectively adjust the effects in response to changes in the degree of likelihood that the vocal signal is included in the input audio signal and/or changes in the estimated proximate location. | 05-22-2014 |
20140163970 | METHOD FOR CLASSIFYING VOICE CONFERENCE MINUTES, DEVICE, AND SYSTEM - Embodiments of the present invention provide a method, device, and system for classifying voice conference minutes. The method is: performing voice source locating according to audio data of the conference site so as to acquire a location of a voice source corresponding to the audio data, writing the location of the voice source into additional field information of the audio data, writing a voice activation flag into the additional field information, packaging the audio data as an audio code stream, and sending the audio code stream and the additional field information of the audio code stream to a recording server, so that the recording server classifies the audio data according to the additional field information and writes a participant identity that corresponds to the location of the voice source corresponding to the audio data into the additional field information of the audio code stream. | 06-12-2014 |
20140163971 | METHOD OF USING A MOBILE DEVICE AS A MICROPHONE, METHOD OF AUDIO PLAYBACK, AND RELATED DEVICE AND SYSTEM - The present disclosure discloses a method of using a mobile terminal as a microphone, an audio playback method, and related device and system. The method of using a mobile terminal as a microphone comprises: receiving identification information from a media device; establishing a data connection with the media device based on the identification information; converting a voice signal into audio data and sending the audio data to the media device, enabling the media device to output the audio data. According to the present disclosure, a mobile device and a media device can coordinate with each other. By connecting a mobile device, such as a mobile phone etc., with a media device, the mobile device can be used as a microphone. This makes it convenient for a user to use a microphone whenever and wherever, and meets the user's need. | 06-12-2014 |
20140172419 | SYSTEM AND METHOD FOR GENERATING PERSONALIZED TAG RECOMMENDATIONS FOR TAGGING AUDIO CONTENT - Systems, methods, and computer-readable storage media for generating personalized tag recommendations using speech analytics. The system first analyzes an audio stream to identify topics in the audio stream. Next, the system identifies tags related to the topics to yield identified tags. Based on the identified tags, the system then generates a tag recommendation for tagging the audio stream. The system can also send the tag recommendation to a device associated with a user for presentation to the user. | 06-19-2014 |
20140188463 | HOME APPLIANCE AND OPERATION METHOD THEREOF - A home appliance and an operation method thereof are disclosed. The operation method of the home appliance includes entering a voice recognition mode, receiving voice data through a microphone, recognizing the received voice data, and, in a case in which the recognized voice data contains information related to another home appliance, transmitting the recognized voice data to the corresponding home appliance. Consequently, sharing of voice data between home appliances is achieved. | 07-03-2014 |
20140195222 | Speech Modification for Distributed Story Reading - Various embodiments provide an interactive, shared, story-reading experience in which stories can be experienced from remote locations. Various embodiments enable augmentation or modification of audio and/or video associated with the story-reading experience. This can include augmentation and modification of a reader's voice, face, and/or other content associated with the story as the story is read. | 07-10-2014 |
20140195223 | METHOD AND SYSTEM FOR TRANSMITTING AUDIO SIGNAL - A method and a system for transmitting an audio signal are provided. The system includes a transmission device and a receiving device communicating with the transmission device via a network. The method includes receiving and sampling the audio signal, recording values of points of the sampled audio signal using the transmission device, segmenting the sampled audio signal into a plurality of frames, extracting and encoding characteristic information from each frame, to obtain a group of generated codes, transmitting each group of generated codes to a receiving device sequentially using the transmission device, and decoding each group of generated codes using the receiving device, to obtain a decoded audio signal. | 07-10-2014 |
20140207442 | Protection of Private Information in a Client/Server Automatic Speech Recognition System - A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. | 07-24-2014 |
20140214410 | DIALOGUE SYSTEM AND METHOD FOR RESPONDING TO MULTIMODAL INPUT USING CALCULATED SITUATION ADAPTABILITY - A dialogue system and a method for the same are disclosed. The dialogue system includes a multimodal input unit receiving speech and non-speech information of a user, a domain reasoner, which stores a plurality of pre-stored situations, each of which is formed by a combination of one or more speech and non-speech information, calculating each adaptability of the pre-stored situations on the basis of a generated situation based on the speech and the non-speech information received from the multimodal input unit, and determining a current domain according to the calculated adaptability, a dialogue manager to select a response corresponding to the current domain, and a multimodal output unit to output the response. The dialogue system performs domain reasoning using a situation including information combinations reflected in the domain reasoning process, current information, and a speech recognition result, and reduces the size of a dialogue search space while increasing domain reasoning accuracy. | 07-31-2014 |
20140222420 | DATA PROCESSING METHOD THAT SELECTIVELY PERFORMS ERROR CORRECTION OPERATION IN RESPONSE TO DETERMINATION BASED ON CHARACTERISTIC OF PACKETS CORRESPONDING TO SAME SET OF SPEECH DATA, AND ASSOCIATED DATA PROCESSING APPARATUS - A data processing method for performing data processing on wirelessly received data and an associated data processing apparatus are provided, where the data processing method is applied to an electronic device. The data processing method includes the steps of: wirelessly receiving a plurality of packets corresponding to a same set of speech data from another electronic device; and selectively performing an error correction operation on at least one of the plurality of packets to obtain the set of speech data, wherein whether to perform the error correction operation is determined according to at least one characteristic of the plurality of packets. More particularly, the error correction operation is selectively performed for at least one scenario of a timing critical scenario and a re-transmission limited scenario. | 08-07-2014 |
20140303965 | METHOD FOR ENCODING VOICE SIGNAL, METHOD FOR DECODING VOICE SIGNAL, AND APPARATUS USING SAME - The present invention relates to a method for encoding a voice signal, a method for decoding a voice signal, and an | 10-09-2014 |
20140303966 | COMMUNICATION SYSTEM AND TERMINAL DEVICE - A communication system according to the present invention includes a plurality of terminal devices that are able to communicate mutually. Each of the terminal devices includes a voice input conversion device, a voice transmitting device, a voice receiving device, and a voice reproducing device. When there is a plurality of voice signals whose reproduction has not been completed, the voice reproducing device arranges the voice signals so that the respective voices corresponding to the respective voice signals do not overlap, and then reproduces them. | 10-09-2014 |
20140309991 | METHODS AND APPARATUS FOR MASKING SPEECH IN A PRIVATE ENVIRONMENT - A speech masking apparatus includes a microphone and a speaker. The microphone can detect a human voice. The speaker can output a masking language which can include phonemes resembling human speech. At least one component of the masking language can have a pitch, a volume, a theme, and/or a phonetic content substantially matching a pitch, a volume, a theme, and/or a phonetic content of the voice. | 10-16-2014 |
20140316771 | SYSTEMS AND METHODS FOR SOURCE SIGNAL SEPARATION - A method includes receiving an input signal comprising an original domain signal and creating a first window data set and a second window data set from the signal, wherein an initiation of the second window data set is offset from an initiation of the first window data set, converting the first window data set and the second window data set to a frequency domain and storing the resulting data as data in a second domain different from the original domain, performing complex spectral phase evolution (CSPE) on the second domain data to estimate component frequencies of the first and second window data sets, using the component frequencies estimated in the CSPE, sampling a set of second-domain high resolution windows to select a mathematical representation comprising a second-domain high resolution window that fits at least one of the amplitude, phase, amplitude modulation and frequency modulation of a component of an underlying signal wherein the component comprises at least one oscillator peak, generating an output signal from the mathematical representation of the original signal as at least one of: an audio file; one or more audio signal components; and one or more speech vectors and outputting the output signal to an external system. | 10-23-2014 |
20140316772 | Verification of Extracted Data - Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself. | 10-23-2014 |
20140316773 | METHOD OF AND APPARATUS FOR EVALUATING INTELLIGIBILITY OF A DEGRADED SPEECH SIGNAL - The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference signal. The method comprises sampling said reference and degraded signal into frames, and forming frame pairs. For each pair one or more difference functions representing a difference between the degraded and reference signal are provided. A difference function is selected and compensated for different disturbance types, such as to provide a disturbance density function adapted to human auditory perception. An overall quality parameter is determined indicative of the intelligibility of the degraded signal. The method comprises determining a switching parameter indicative of audio power level of said degraded signal, for performing said selecting. | 10-23-2014 |
20140337014 | VOICE RECORDING AND PLAYBACK DEVICE, AND CONTROL METHOD FOR VOICE RECORDING AND PLAYBACK DEVICE - A voice recording and playback device of the present invention is capable of storage management of generated voice data, to which date information has been attached, as voice files, and comprises a display section capable of calendar display, and a control section for, at the time of retrieving voice files from a storage section, performing movable identification on a calendar display, as well as retrieving voice files that have been stored in the storage section based on date information attached to the files, and performing display of results of this retrieval indicating the existence of voice files close to day display on the calendar display, wherein the control section moves the identification position based on an instruction operation by the retrieval instructions section, and generates a notification in accordance with voice files that exist on the date of the identification position that has been moved. | 11-13-2014 |
20140337015 | METHOD AND APPARATUS FOR SOCIAL NETWORK COMMUNICATION OVER A MEDIA NETWORK - A system that incorporates teachings of the present disclosure may initiate a communication session with a member device of a social network and may activate a speech capture element based on a pattern of prior speech messages. A speech message may be detected at the speech capture element and, in turn, transmitted to the member device. | 11-13-2014 |
20140337016 | Speech Signal Enhancement Using Visual Information - Visual information is used to alter or set an operating parameter of an audio signal processor, other than a beamformer. A digital camera captures visual information about a scene that includes a human speaker and/or a listener. The visual information is analyzed to ascertain information about acoustics of a room. A distance between the speaker and a microphone may be estimated, and this distance estimate may be used to adjust an overall gain of the system. Distances among, and locations of, the speaker, the listener, the microphone, a loudspeaker and/or a sound-reflecting surface may be estimated. These estimates may be used to estimate reverberations within the room and adjust aggressiveness of an anti-reverberation filter, based on an estimated ratio of direct to indirect (reverberated) sound energy expected to reach the microphone. In addition, orientation of the speaker or the listener, relative to the microphone or the loudspeaker, can also be estimated, and this estimate may be used to adjust frequency-dependent filter weights to compensate for uneven frequency propagation of acoustic signals from a mouth, or to a human ear, about a human head. | 11-13-2014 |
20140343929 | VOICE RECORDING SYSTEM AND METHOD - An electronic device includes a camera and two microphones. The space in front of the camera is divided into a plurality of imaginary cubic areas. Each imaginary cubic area is associated with a delay parameter. The camera locates a face of a user and determines an imaginary cubic area in which the face is located from the plurality of imaginary cubic areas. A wave beam pointing to the imaginary cubic area is calculated according to the delay parameter associated with the imaginary cubic area. The two microphones record voices within a range of the wave beam. A voice recording method is also provided. | 11-20-2014 |
20140343930 | Systems and Methods for Voice Data Processing - Systems and methods are provided for voice data processing. For example, a first data packet included in voice data transmitted by a client is received; the first data packet is stored in a storage area; whether to process one or more second data packets stored in the storage area is determined based on at least information associated with a type of the first data packet and a current storage state of the storage area; in response to a determination to process the second data packets, voice resources are applied for; and the second data packets stored in the storage area are processed using the voice resources. | 11-20-2014 |
20140358525 | Dual-Band Speech Encoding - This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity. | 12-04-2014 |
20150039298 | INSTANT COMMUNICATION VOICE RECOGNITION METHOD AND TERMINAL - The present disclosure discloses a speech recognition method and a terminal, which belong to the field of communications. The method comprises: receiving speech information inputted by a user; acquiring the current environment information, and judging whether the speech information needs to be played according to the current environment information; and recognizing the speech information as text information, when it is judged that the speech information need not be played. The terminal comprises an acquisition module, a judgment module and a recognition module. The present disclosure provides the speech receiver with a speech recognition function; when the speech information of the instant messaging is received by the terminal, it can help the receiver acquire the content expressed by the speech sender in an inconvenient situation. | 02-05-2015 |
20150081282 | TRANSFERRING AUDIO FILES - Embodiments of the present invention use one or more audible tones to communicate metadata during a transfer of an audio file. Embodiments of the present invention communicate an audio file from a speaker in a recording device (e.g., a recordable book, toy, computing device) to a microphone in a receiving device. The audio file is transferred by audibly broadcasting the audio file content. The audio file may be a recording made by the user (e.g., the user singing a song, a child responding to a storybook prompt intended to elicit a response). The file transfer process uses one or more audible tones, such as dual-tone multi-frequency signaling (“DTMF”) tones to communicate metadata associated with the audio file. Audible tones may also be used to communicate commands that delineate the beginning and/or end of a file broadcast. | 03-19-2015 |
20150112666 | APPARATUSES, METHODS AND SYSTEMS FOR A DIGITAL CONVERSATION MANAGEMENT PLATFORM - The APPARATUSES, METHODS AND SYSTEMS FOR A DIGITAL CONVERSATION MANAGEMENT PLATFORM (“DCM-Platform”) transforms digital dialogue from consumers, client demands, and Internet search inputs via DCM-Platform components into tradable digital assets and client-needs-based artificial intelligence campaign plan outputs. In one implementation, the DCM-Platform may capture and examine conversations between individuals and artificial intelligence conversation agents. These agents may be viewed as assets. One can measure the value and performance of these agents by assessing their performance and ability to generate revenue from prolonging conversations and/or ability to effect sales through conversations with individuals. | 04-23-2015 |
20150112667 | METHOD FOR CONTROLLING CORDLESS TELEPHONE DEVICE, HANDSET OF CORDLESS TELEPHONE DEVICE, AND CORDLESS TELEPHONE DEVICE - Disclosed is a method for controlling a cordless telephone device for use in a system that allows remote control of a home electric appliance. The method includes a first generation step of causing a first generation unit in a handset to encode audio input via a sound receiving unit in the handset to generate a first stream, and a first transmission step of transmitting the first stream to a base unit. The first generation step includes causing the first generation unit to generate instruction bit information and a first instruction stream when a first trigger indicating a request to start the remote control is given to the first generation unit. The first transmission step includes transmitting the instruction bit information and the first instruction stream to the base unit through a multiplexing scheme that is common to transmission of a first stream generated when the first trigger is not given. | 04-23-2015 |
20150112668 | VOICE PROCESSING METHOD, APPARATUS, AND SYSTEM - Methods, apparatus, and systems for voice processing are provided herein. An exemplary method can be implemented by a terminal. A voice bit stream to be sent can be obtained. Voice control information corresponding to the voice bit stream to be sent can be obtained. The voice control information can be used for a voice server to determine a voice-mixing strategy. The voice bit stream and the voice control information can be sent to the voice server. At least one voice bit stream, returned by the voice server based on the voice-mixing strategy, can be received. The at least one voice bit stream can be outputted. | 04-23-2015 |
20150120284 | APPARATUS AND METHOD FOR IMPROVING COMMUNICATION QUALITY OF RADIO - An apparatus includes a user input unit, a display unit, a control unit, and a buffer unit. The display unit includes a speed setting menu. The control unit selects a mode from the speed setting menu in response to the selection signal of the user, and controls a compression ratio of a voice codec and a transfer rate of a modem corresponding to a transmission-side radio, and a reception rate of a modem and a restoration rate of a voice codec corresponding to a reception-side radio, based on the selected mode. The buffer unit performs a storage function if there is a difference between the compression ratio of the voice codec and the transfer rate of the modem or if there is a difference between the reception rate of the modem and the restoration rate of the voice codec. | 04-30-2015 |
20150303941 | METHOD AND SYSTEM FOR PROCESSING TEXT - A computer-implemented method at an electronic device, the method comprising: receiving plain text from one of a plurality of software applications; processing the text into compressed text, while maintaining comprehensibility of the compressed text; and returning the compressed text to the one of the plurality of software applications. | 10-22-2015 |
20150310866 | DIALOG SERVER FOR HANDLING CONVERSATION IN VIRTUAL SPACE METHOD AND COMPUTER PROGRAM FOR HAVING CONVERSATION IN VIRTUAL SPACE - A dialog server which provides dialogs made by at least one user through their respective avatars in a virtual space. A method and a computer readable article of manufacture tangibly embodying computer readable instructions for executing the steps of the method are also provided. The dialog server includes: a position storage unit which stores positional information on the avatars; an utterance receiver which receives at least one utterance of avatars and utterance strength representing an importance or attention level of the utterance; an interest level calculator which calculates interest levels between avatars based on their positional information; a message processor which generates a message based on the utterance in accordance with a value calculated from the interest levels and the utterance strength; and a message transmitter which transmits the message to the avatars. | 10-29-2015 |
20150310869 | APPARATUS ALIGNING AUDIO SIGNALS IN A SHARED AUDIO SCENE - An apparatus comprising: an input selector configured to select at least two audio signals; a segmenter configured to segment the at least two audio signals according to at least two classifications; a segment selector configured to select, for the at least two audio signals, audio signal segments based on at least one classification from the at least two classifications; and an aligner configured to align the selected audio signal segments, and further configured to align the at least two audio signals based on the alignment of the selected audio signal segments. | 10-29-2015 |
20150325247 | Semiconductor Device, Radio Communication Terminal Using the Same, and Control Method - A communication terminal according to one aspect of the present invention includes a baseband LSI that performs baseband processing for communication, an application LSI that includes a vocoder function and performs processing according to an application, an audio LSI that performs one of D/A conversion and A/D conversion on audio data, and a switch circuit that is installed in the application LSI and connects a data path between the audio processor LSI and the baseband LSI. | 11-12-2015 |
20150325255 | METHOD OF PROVIDING DYNAMIC SPEECH PROCESSING SERVICES DURING VARIABLE NETWORK CONNECTIVITY - A user device provides dynamic speech processing services during variable network connectivity with a network server. The user device includes a monitor that monitors a level of network connectivity between the user device and the network server. A user device speech processor processes speech data and is initiated based on a determination that the level of network connectivity between the user device and the network server is impaired. The monitor determines when the level of network connectivity between the user device and the network server is no longer impaired. | 11-12-2015 |
20150340047 | METHOD OF AND APPARATUS FOR EVALUATING INTELLIGIBILITY OF A DEGRADED SPEECH SIGNAL - The present invention relates to a method of evaluating intelligibility of a degraded speech signal received from an audio transmission system conveying a reference speech signal. The method comprises sampling said signals into reference and degraded signal frames, and forming frame pairs by associating reference and degraded signal frames with each other. For each frame pair a difference function representing disturbance is provided, which is then compensated for specific disturbance types for providing a disturbance density function. Based on the density function of a plurality of frame pairs, an overall quality parameter is determined. The method provides for compensating the overall quality parameter for the effect that the assessment of intelligibility of CVC words is dominated by the intelligibility of consonants. | 11-26-2015 |
20150348542 | SPEECH RECOGNITION METHOD AND SYSTEM BASED ON USER PERSONALIZED INFORMATION - The present invention relates to a speech recognition method and system based on user personalized information. The method comprises the following steps: receiving a speech signal; decoding the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is a decoding network associated with a basic name language model; if a decoding path enters a name node in the basic static decoding network, network extending is carried out on the name node according to an affiliated static decoding network of a user, wherein the affiliated static decoding network is a decoding network associated with a name language model of a particular user; and returning a recognition result after the decoding is completed. The recognition accuracy rate of user personalized information in continuous speech recognition may be raised by using the present invention. | 12-03-2015 |
20150371651 | AUTOMATIC CONSTRUCTION OF A SPEECH - A method comprising using at least one hardware processor for: identifying relations between pairs of claims of a set of claims; aggregating the claims of the set of claims into a plurality of clusters based on the identified relations; generating a plurality of arguments from the plurality of clusters, wherein each of the arguments is generated from a cluster of the plurality of clusters, and wherein each of the arguments comprises at least one claim of the set of claims; scoring each possible set of a predefined number of arguments of the plurality of arguments, based on a quality of each argument of the predefined number of arguments and on diversity between the predefined number of arguments; and generating a speech, wherein the speech comprises a top scoring possible set of the possible set of the predefined number of arguments. | 12-24-2015 |
20150371652 | Communication Devices and Methods for Temporal Analysis of Voice Calls - Headsets having corresponding audio adapters and methods comprise: a microphone configured to generate analog audio for a voice call; an analog-to-digital converter configured to convert the analog audio to digital audio; a voice activity detector configured to detect speech in the digital audio; a processor configured to i) determine a temporal characteristic of the speech, and ii) generate a message based on the temporal characteristic of the speech and a temporal characteristic of the voice call; and a transmitter configured to transmit the message. | 12-24-2015 |
20160071528 | SYSTEMS AND METHODS FOR SOURCE SIGNAL SEPARATION - A method includes receiving an input signal comprising an original domain signal and creating a first window data set and a second window data set from the signal, wherein an initiation of the second window data set is offset from an initiation of the first window data set, converting the first window data set and the second window data set to a frequency domain and storing the resulting data as data in a second domain different from the original domain, performing complex spectral phase evolution (CSPE) on the second domain data to estimate component frequencies of the first and second window data sets, using the component frequencies estimated in the CSPE, sampling a set of second-domain high resolution windows to select a mathematical representation comprising a second-domain high resolution window that fits at least one of the amplitude, phase, amplitude modulation and frequency modulation of a component of an underlying signal wherein the component comprises at least one oscillator peak, generating an output signal from the mathematical representation of the original signal as at least one of: an audio file; one or more audio signal components; and one or more speech vectors and outputting the output signal to an external system. | 03-10-2016 |
20160085498 | METHOD AND APPARATUS FOR MUTING A DEVICE - A method and apparatus for muting a device are provided herein. During operation, a device such as a two-way radio detects a user's voice and mutes the radio in response to the voice being detected. While the device is muted, all received transmissions are stored by the radio. These transmissions are played back to the user when voice activity has ceased for a predetermined amount of time. In a second embodiment of the present invention, the device is muted only when a particular identified user's voice is detected. | 03-24-2016 |
20160098999 | AUDIO SEARCH USING CODEC FRAMES - To detect events in an audio stream, frames of an audio signal (e.g., frames generated by a codec for a voice call or music stream) are received. Based on information in the frames, an index is used to look up an entry in a table associated with the codec. Each entry in the table indicates a likelihood that a frame matches a sound model element. The likelihood is used in the search for a sound bite, word, and/or phrase in the audio signal. Dynamic programming is used to find the combined likelihood for a match of the word, phrase, and/or sound bite to a region of the audio stream. Upon detection of the word, phrase, and/or sound bite in the audio stream, an event is generated, such as notifying a person or logging the event in a database. | 04-07-2016 |
20160111133 | RADIO COMMUNICATION DEVICE - A reception unit is configured to demodulate a received audio signal from a received signal. A first detector is configured to detect that a state of the received signal has changed when the reception unit receives a signal transmitted wirelessly from a distant station. A first chapter generator is configured to generate a first chapter when the first detector has detected that the state of the received signal has changed. The first chapter indicates timing when the state of the received signal has changed. A recording data generator is configured to convert the received audio signal into voice data with a predetermined format, add the first chapter to the voice data, and generate recording data with a predetermined format. A recording controller is configured to control to record the recording data in a recording medium. | 04-21-2016 |
20160133268 | METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT - According to one embodiment, a method of an electronic device for outputting a sound from loudspeakers includes: recording an audio signal comprising voice sections; displaying the voice sections, wherein speakers of the voice sections are visually distinguishable; designating a first voice section of a first speaker; designating a second voice section of a second speaker; outputting signals of the first voice section from the loudspeakers in a first output form; and outputting signals of the second voice section from the loudspeakers in a second output form different from the first output form. | 05-12-2016 |
20160189709 | SPEECH RECOGNITION SYSTEMS AND METHODS FOR MAINTENANCE REPAIR AND OVERHAUL - Methods and systems are provided for capturing information associated with a component of a system during a maintenance procedure. In one embodiment, a method includes: managing a dialog with a user via a wearable device based on a pre-defined dialog file, wherein the pre-defined dialog file is defined for at least one of a component and a procedure; receiving speech signals at the wearable device based on the dialog; processing the speech signals by the wearable device to identify component information; and transmitting the component information from the wearable device to a host component for use by a maintenance application. The managing, receiving, and transmitting are performed during a maintenance procedure. | 06-30-2016 |
20160196824 | VEHICULAR APPARATUS AND SPEECH SWITCHOVER CONTROL PROGRAM | 07-07-2016 |
20160196832 | SYSTEM ENABLING A PERSON TO SPEAK PRIVATELY IN A CONFINED SPACE | 07-07-2016 |
20160203832 | SYSTEMS AND METHODS FOR AN AUTOMATIC LANGUAGE CHARACTERISTIC RECOGNITION SYSTEM | 07-14-2016 |