Patent application number | Description | Published |
20090252344 | GAMING HEADSET AND CHARGING METHOD - An audio headset may comprise a case, a near-field microphone, and a far-field microphone. A speaker, processor, memory, battery, charging interface, and cradle detection circuit may be mounted to the case. Processor-executable instructions embodied in the memory may be configured to implement a battery charging method. The headset may be shut off in response to placement of the headset in a charging cradle; the far-field microphone is turned on but not the near-field microphone. The battery may then be charged from the cradle. A headset having near-field and far-field microphones may be used to distinguish between user speech and competing sounds by generating signals from the sounds detected by each microphone and comparing the strengths of the signals. The signals may be processed as user speech if they are of comparable strength. Otherwise, the near-field signal may be processed as user speech and the far-field signal as competing sounds. | 10-08-2009 |
20100121640 | METHOD AND SYSTEM FOR MODELING A COMMON-LANGUAGE SPEECH RECOGNITION, BY A COMPUTER, UNDER THE INFLUENCE OF A PLURALITY OF DIALECTS - The present invention relates to a method for modeling common-language speech recognition by a computer under the influence of multiple dialects, in the technical field of computer speech recognition. In this method, a triphone standard common-language model is first generated from training data of the standard common language, and first and second monophone dialectal-accented common-language models are generated from development data of dialectal-accented common languages of the first and second kinds, respectively. A temporary merged model is then obtained by merging the first dialectal-accented common-language model into the standard common-language model according to a first confusion matrix, which is obtained by recognizing the development data of the first dialectal-accented common language with the standard common-language model. Finally, a recognition model is obtained by merging the second dialectal-accented common-language model into the temporary merged model according to a second confusion matrix, which is generated by recognizing the development data of the second dialectal-accented common language with the temporary merged model. This method effectively enhances operating efficiency and markedly raises the recognition rate for the dialectal-accented common language. The recognition rate for the standard common language is also raised. | 05-13-2010 |
20100211376 | MULTIPLE LANGUAGE VOICE RECOGNITION - Computer implemented speech processing generates one or more pronunciations of an input word in a first language by a non-native speaker of the first language who is a native speaker of a second language. The input word is converted into one or more pronunciations. Each pronunciation includes one or more phonemes selected from a set of phonemes associated with the second language. Each pronunciation is associated with the input word in an entry in a computer database. Each pronunciation in the database is associated with information identifying a pronunciation language and/or a phoneme language. | 08-19-2010 |
20100211387 | SPEECH PROCESSING WITH SOURCE LOCATION ESTIMATION USING SIGNALS FROM TWO OR MORE MICROPHONES - Computer implemented speech processing is disclosed. First and second voice segments are extracted from first and second microphone signals originating from first and second microphones. The first and second voice segments correspond to a voice sound originating from a common source. An estimated source location is generated based on a relative energy of the first and second voice segments and/or a correlation of the first and second voice segments. A determination whether the voice segment is desired or undesired may be made based on the estimated source location. | 08-19-2010 |
20100211391 | AUTOMATIC COMPUTATION STREAMING PARTITION FOR VOICE RECOGNITION ON MULTIPLE PROCESSORS WITH LIMITED MEMORY - Speech processing is disclosed for an apparatus having a main processing unit, a memory unit, and one or more co-processors. Memory maintenance and voice recognition result retrievals upon execution are performed with a first main processor thread. Voice detection and initial feature extraction on the raw data are performed with a first co-processor. A second co-processor thread receives feature data derived for one or more features extracted by the first co-processor thread and information for locating probability density functions needed for probability computation by a speech recognition model and computes a probability that the one or more features correspond to a known sub-unit of speech using the probability density functions and the feature data. At least a portion of a path probability that a sequence of sub-units of speech correspond to a known speech unit is computed with a third co-processor thread. | 08-19-2010 |
20100324898 | VOICE RECOGNITION WITH DYNAMIC FILTER BANK ADJUSTMENT BASED ON SPEAKER CATEGORIZATION - Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency f | 12-23-2010 |
20110288869 | ROBUSTNESS TO ENVIRONMENTAL CHANGES OF A CONTEXT DEPENDENT SPEECH RECOGNIZER - An apparatus to improve the robustness to environmental changes of a context-dependent speech recognizer for an application includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple-state Hidden Markov Models (HMMs) using the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster, and correspondingly reduce the number of observation distributions for, those states of each HMM that are less empirically affected by one or more contextual dependencies. | 11-24-2011 |
20120075462 | BLOW TRACKING USER INTERFACE SYSTEM AND METHOD - A blow tracking user interface method and apparatus may detect an orientation of blowing of a user's breath and a magnitude of blowing of the user's breath. A blow vector may be generated from the orientation and magnitude of the blowing of the user's breath. The blow vector may be used as a control input in a computer program. | 03-29-2012 |
20120075463 | USER INTERFACE SYSTEM AND METHOD USING THERMAL IMAGING - A thermal imaging interface for control of a computer program may obtain one or more thermal infrared images of one or more objects with one or more thermographic cameras. The images may be analyzed to identify one or more characteristics of the objects. Such characteristics may be used as a control input in the computer program. | 03-29-2012 |
20120110447 | CONTROL OF VIRTUAL OBJECT USING DEVICE TOUCH INTERFACE FUNCTIONALITY - A virtual object can be controlled using one or more touch interfaces. A location for a first touch input can be determined on a first touch interface. A location for a second touch input can be determined on a second touch interface. A three-dimensional segment can be generated using the location of the first touch input, the location of the second touch input, and a pre-determined spatial relationship between the first touch interface and the second touch interface. The virtual object can be manipulated using the three-dimensional segment as a control input. The manipulated virtual object can be displayed on a display. | 05-03-2012 |
20120253812 | SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES - In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm. | 10-04-2012 |
20120259554 | TONGUE TRACKING INTERFACE APPARATUS AND METHOD FOR CONTROLLING A COMPUTER PROGRAM - A tongue tracking interface apparatus for control of a computer program may include a mouthpiece configured to be worn over one or more teeth of a user of the computer program. The mouthpiece can include one or more sensors configured to determine one or more tongue orientation characteristics of the user. Other sensors such as microphones, pressure sensors, etc. located around the head, face, and neck, can also be used for determining tongue orientation characteristics. | 10-11-2012 |
20120268359 | CONTROL OF ELECTRONIC DEVICE USING NERVE ANALYSIS - An electronic device may be controlled using nerve analysis by measuring a nerve activity level for one or more body parts of a user of the device using one or more nerve sensors associated with the electronic device. A relationship can be determined between the user's one or more body parts and an intended interaction by the user with one or more components of the electronic device using each nerve activity level determined. A control input or reduced set of likely actions can be established for the electronic device based on the relationship determined. | 10-25-2012 |
20120281181 | INTERFACE USING EYE TRACKING CONTACT LENSES - Methods of eye gaze tracking are provided using magnetized contact lenses tracked by magnetic sensors and/or reflecting contact lenses tracked by video-based sensors. Tracking information of contact lenses from magnetic sensors and video-based sensors may be used to improve eye tracking and/or combined with other sensor data to improve accuracy. Furthermore, reflective contact lenses improve blink detection while eye gaze tracking is otherwise unimpeded by magnetized contact lenses. Additionally, contact lenses may be adapted for viewing 3D information. | 11-08-2012 |
20120295708 | Interface with Gaze Detection and Voice Input - Methods, computer programs, and systems for interfacing a user with a computer program, utilizing gaze detection and voice recognition, are provided. One method includes an operation for determining if a gaze of a user is directed towards a target associated with the computer program. The computer program is set to operate in a first state when the gaze is determined to be on the target, and set to operate in a second state when the gaze is determined to be away from the target. When operating in the first state, the computer program processes voice commands from the user, and, when operating in the second state, the computer program omits processing of voice commands. | 11-22-2012 |
20130268272 | TEXT DEPENDENT SPEAKER RECOGNITION WITH LONG-TERM FEATURE BASED ON FUNCTIONAL DATA ANALYSIS - One or more test features are extracted from a time domain signal. The test features are represented by discrete data. The discrete data for each of the one or more test features is represented by a corresponding one or more fitting functions, which are defined in terms of a finite number of continuous basis functions and a corresponding finite number of expansion coefficients. Each fitting function is compressed through Functional Principal Component Analysis (FPCA) to generate corresponding sets of principal components. Each principal component for a given test feature is uncorrelated with each other principal component for that test feature. A distance between a set of principal components for the given test feature and a set of principal components for one or more training features is calculated with the processing system. The test feature is classified according to the calculated distance. | 10-10-2013 |
20130294608 | SOURCE SEPARATION BY INDEPENDENT COMPONENT ANALYSIS WITH MOVING CONSTRAINT - Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract moving source signals from mixtures of source signals by way of independent component analysis. Source motion is modeled by direct to reverberant ratio in the separation process, and independent component analysis techniques described herein use multivariate probability density functions to preserve the alignment of frequency bins in the source separation process. | 11-07-2013 |
20130294611 | SOURCE SEPARATION BY INDEPENDENT COMPONENT ANALYSIS IN CONJUNCTION WITH OPTIMIZATION OF ACOUSTIC ECHO CANCELLATION - Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract source signals from mixtures of source signals and perform acoustic echo cancellation. Independent component analysis may be used to perform the source separation in conjunction with acoustic echo cancellation on the time-frequency domain mixed signals to generate at least one estimated source signal corresponding to at least one of the original source signals. | 11-07-2013 |
20130297296 | SOURCE SEPARATION BY INDEPENDENT COMPONENT ANALYSIS IN CONJUNCTION WITH SOURCE DIRECTION INFORMATION - Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract source signals from mixtures of source signals by way of independent component analysis. Source direction information is utilized in the separation process, and independent component analysis techniques described herein use multivariate probability density functions to preserve the alignment of frequency bins in the source separation process. | 11-07-2013 |
20130297298 | SOURCE SEPARATION USING INDEPENDENT COMPONENT ANALYSIS WITH MIXED MULTI-VARIATE PROBABILITY DENSITY FUNCTION - Methods and apparatus for signal processing are disclosed. Source separation can be performed to extract source signals from mixtures of source signals by way of independent component analysis. Source separation described herein involves mixed multivariate probability density functions that are mixtures of component density functions having different parameters corresponding to frequency components of different sources, different time segments, or some combination thereof. | 11-07-2013 |
20140059484 | VOICE AND VIDEO CONTROL OF INTERACTIVE ELECTRONICALLY SIMULATED ENVIRONMENT - A method of moving objects in a graphical user interface, includes obtaining a video image of a user of the interface; displaying the video image on a display such that the video image is superposed with one or more objects displayed on the display; and moving one or more objects displayed on the display based on recognition of motions of the video image of the user. Recognition of motions of the video image may include recognition of motions of an image of the user's hand. | 02-27-2014 |
20140121002 | SYSTEM AND METHOD FOR DETECTING USER ATTENTION - A system and method for conditioning execution of a control function on a determination of whether or not a person's attention is directed toward a predetermined device. The method involves acquiring data concerning the activity of a person who is in the proximity of the device, the data being in the form of one or more temporal samples. One or more of the temporal samples is then analyzed to determine if the person's activity during the time of the analyzed samples indicates that the person's attention is not directed toward the device. The results of the determination are used to ascertain whether or not the control function should be performed. | 05-01-2014 |
20140198382 | INTERFACE USING EYE TRACKING CONTACT LENSES - Methods of eye gaze tracking are provided using magnetized contact lenses tracked by magnetic sensors and/or reflecting contact lenses tracked by video-based sensors. Tracking information of contact lenses from magnetic sensors and video-based sensors may be used to improve eye tracking and/or combined with other sensor data to improve accuracy. Furthermore, reflective contact lenses improve blink detection while eye gaze tracking is otherwise unimpeded by magnetized contact lenses. Additionally, contact lenses may be adapted for viewing 3D information. | 07-17-2014 |
20140237277 | HYBRID PERFORMANCE SCALING FOR SPEECH RECOGNITION - Aspects of the present disclosure describe methods and apparatuses for executing operations on a client device platform that is operating in a low-power state. A first analysis may be used to assign a first confidence score to a recorded non-tactile input. When the first confidence score is above a first threshold, an intermediate-power state may be activated. A second, more detailed analysis may then assign a second confidence score to the non-tactile input. When the second confidence score is above a second threshold, the operation is initiated. | 08-21-2014 |
20140347272 | AUDIO, VIDEO, SIMULATION, AND USER INTERFACE PARADIGMS - Consumer electronic devices have been developed with enormous information processing capabilities, high quality audio and video outputs, large amounts of memory, and may also include wired and/or wireless networking capabilities. Additionally, relatively unsophisticated and inexpensive sensors, such as microphones, video cameras, GPS or other position sensors, when coupled with devices having these enhanced capabilities, can be used to detect subtle features about users and their environments. A variety of audio, video, simulation and user interface paradigms have been developed to utilize the enhanced capabilities of these devices. These paradigms can be used separately or together in any combination. One paradigm automatically creates user identities using speaker identification. Another paradigm includes a control button with 3-axis pressure sensitivity for use with game controllers and other input devices. | 11-27-2014 |
20150073794 | SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES - In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm. | 03-12-2015 |
20150331577 | CONTROL OF VIRTUAL OBJECT USING DEVICE TOUCH INTERFACE FUNCTIONALITY - A virtual object can be controlled using one or more touch interfaces. A location for a first touch input can be determined on a first touch interface. A location for a second touch input can be determined on a second touch interface. A three-dimensional segment can be generated using the location of the first touch input, the location of the second touch input, and a pre-determined spatial relationship between the first touch interface and the second touch interface. The virtual object can be manipulated using the three-dimensional segment as a control input. | 11-19-2015 |
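Several of the abstracts above (e.g. 20090252344 and 20100211387) turn on the same basic idea: comparing the relative energy and the correlation of signals from two microphones to decide whether a voice segment comes from a desired nearby source. A minimal sketch of that decision, assuming illustrative threshold values and a function name not taken from any of the patents:

```python
import numpy as np

def estimate_source(seg1, seg2, ratio_threshold=2.0, corr_threshold=0.5):
    """Decide whether a voice segment comes from a nearby (desired) source.

    Uses the relative energy of two microphone segments and their
    normalized correlation, in the spirit of the abstracts above.
    The thresholds and decision rule here are illustrative assumptions.
    """
    # Energy of each segment (sum of squared samples)
    e1 = float(np.sum(seg1 ** 2))
    e2 = float(np.sum(seg2 ** 2))
    ratio = max(e1, e2) / max(min(e1, e2), 1e-12)
    # Normalized correlation at zero lag: high when both microphones
    # pick up the same common source
    corr = float(np.dot(seg1, seg2) /
                 (np.linalg.norm(seg1) * np.linalg.norm(seg2) + 1e-12))
    nearer_mic = 1 if e1 >= e2 else 2
    # Desired (near-field) if one mic is clearly louder yet both signals
    # are well correlated, i.e. the same source dominates both channels
    desired = ratio >= ratio_threshold and corr >= corr_threshold
    return nearer_mic, ratio, corr, desired
```

For example, a signal that arrives at the second microphone attenuated but otherwise unchanged yields a high energy ratio and a correlation near 1, so it is classified as a desired near-field source at microphone 1.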