Patent application number | Description | Published |
20090306985 | SYSTEM AND METHOD FOR SYNTHETICALLY GENERATED SPEECH DESCRIBING MEDIA CONTENT - Disclosed herein are systems, methods, and computer readable-media for providing an automatic synthetically generated voice describing media content, the method comprising receiving one or more pieces of metadata for a primary media content, selecting at least one piece of metadata for output, and outputting the at least one piece of metadata as synthetically generated speech with the primary media content. Other aspects of the invention involve alternative output, output speech simultaneously with the primary media content, output speech during gaps in the primary media content, translate metadata in foreign language, tailor voice, accent, and language to match the metadata and/or primary media content. A user may control output via a user interface or output may be customized based on preferences in a user profile. | 12-10-2009 |
20110065428 | SYSTEMS AND METHODS FOR SELECTING AN OUTPUT MODALITY IN A MOBILE DEVICE - A method and system for selecting the output modality of an application in a mobile device from attributes of the mobile device includes a detection of at least one attribute of the mobile device, automatically identifying available modalities for the output of the application based on the attribute, and selecting a preferred output modality from the available modalities. The output of the application is converted to the output modality and transmitted through an output interface selected based on the attributes measured. | 03-17-2011 |
20120130709 | SYSTEM AND METHOD FOR BUILDING AND EVALUATING AUTOMATIC SPEECH RECOGNITION VIA AN APPLICATION PROGRAMMER INTERFACE - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for building an automatic speech recognition system through an Internet API. A network-based automatic speech recognition server configured to practice the method receives feature streams, transcriptions, and parameter values as inputs from a network client independent of knowledge of internal operations of the server. The server processes the inputs to train an acoustic model and a language model, and transmits the acoustic model and the language model to the network client. The server can also generate a log describing the processing and transmit the log to the client. On the server side, a human expert can intervene to modify how the server processes the inputs. The inputs can include an additional feature stream generated from speech by algorithms in the client's proprietary feature extraction. | 05-24-2012 |
20120134507 | Methods, Systems, and Products for Voice Control - Methods, systems, and computer program products provide voice control of electronic devices. Speech and a beacon signal are received. A directional microphone is aligned to a source of the beacon signal. A voice command in the speech is received and executed. | 05-31-2012 |
20130317824 | System and Method for Detecting Synthetic Speaker Verification - Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received. | 11-28-2013 |
20140099594 | Methods, Systems, and Products for Monitoring Health - Methods, systems, and products monitor a person's regimen for medicinal and dietary restrictions. When the person's regimen requires a liquid medication or supplement, an oral instrument is commanded to dispense a dosage of fluid. The oral instrument stores a reservoir of the fluid. If the oral instrument is a spoon, for example, the spoon may automatically dispense cough syrup or other medicine. A toothbrush, likewise, may automatically dispense mouthwash. A sensor may confirm presence of the oral instrument in the person's mouth, thus ensuring the dosage of fluid is ingested. | 04-10-2014 |
20140101084 | Methods, Systems, and Products for Interfacing with Neurological and Biological Networks - Methods, systems, and products provide interfaces between intrahost networks and interhost networks within biological hosts. Neuroregional translations are performed to route communications to and from the biological hosts. Bioregional translations may also be performed to route communications to and from the biological hosts. | 04-10-2014 |
20140101296 | Methods, Systems, and Products for Prediction of Mood - Methods, systems, and products predict emotional moods. Predicted moods may then be used to configure devices and machinery. A communications device may be configured to a mood of a user. A car may adjust to the mood of an operator. Even assembly lines may be configured, based on the mood of operators. Machinery and equipment may thus adopt performance and safety precautions that account for varying moods. | 04-10-2014 |
20140101740 | Methods, Systems, and Products for Authentication of Users - Methods, systems, and products authenticate users for access to devices, applications, and services. Skills of a user are learned over time, such that an electronic model of random subject matter may be generated. The user is prompted to interpret the random subject matter, such as with a drawing, physical arrangement, or performance. The user's interpretation is then compared to the electronic model of the random subject matter. If the user is truly who they purport to be, their interpretation will match the electronic model, thus authenticating the user. If interpretation fails to match the electronic model, authentication may be denied. | 04-10-2014 |
20140126741 | Methods, Systems, and Products for Personalized Feedback - Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data and audio data to a server for analysis. The server analyzes the image data to recognize people of interest. The server also analyzes the audio data to generate audible feedback. Because the server performs image recognition and audio processing, the client device is relieved of these intensive operations. | 05-08-2014 |
20140145873 | Electromagnetic Reflection Profiles - Methods, systems, and products determine electromagnetic reflective characteristics of ambient environments. A wireless communications device sends a cellular impulse and receives reflections of the cellular impulse. The cellular impulse and the reflections of the cellular impulse may be compared to determine the electromagnetic reflective characteristics of an ambient environment. | 05-29-2014 |
20140156697 | Methods, Systems, and Products for Recalling and Retrieving Documentary Evidence - Methods, systems, and products help users recall memories and search for content of those memories. When a user cannot recall a memory, the user is prompted with questions to help recall the memory. As the user answers the questions, a virtual recollection of the memory is synthesized from the answers to the questions. When the user is satisfied with the virtual recollection of the memory, a database of content may be searched for the virtual recollection of the memory. Video data, for example, may be retrieved that matches the virtual recollection of the memory. The video data is thus historical data documenting past events. | 06-05-2014 |
20140162607 | System and Method for Answering a Communication Notification - Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves notification assigned an importance level and repeat attempts at notification if it is of high importance. | 06-12-2014 |
20140163960 | REAL-TIME EMOTION TRACKING SYSTEM - Devices, systems, methods, media, and programs for detecting an emotional state change in an audio signal are provided. A plurality of segments of the audio signal is received, with the plurality of segments being sequential. Each segment of the plurality of segments is analyzed, and, for each segment, an emotional state and a confidence score of the emotional state are determined. The emotional state and the confidence score of each segment are sequentially analyzed, and a current emotional state of the audio signal is tracked throughout each of the plurality of segments. For each segment, it is determined whether the current emotional state of the audio signal changes to another emotional state based on the emotional state and the confidence score of the segment. | 06-12-2014 |
20140237577 | Methods, Systems, and Products for Authentication of Users - Methods, systems, and products authenticate users for access to devices, applications, and services. Skills of a user are learned over time, such that an electronic model of random subject matter may be generated. The user is prompted to interpret the random subject matter, such as with a drawing, physical arrangement, or performance. The user's interpretation is then compared to the electronic model of the random subject matter. If the user is truly who they purport to be, their interpretation will match the electronic model, thus authenticating the user. If interpretation fails to match the electronic model, authentication may be denied. | 08-21-2014 |
20140350938 | SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION - Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received. | 11-27-2014 |
20140379350 | System and Method for Synthetically Generated Speech Describing Media Content - Disclosed herein are systems, methods, and computer readable-media for providing an automatic synthetically generated voice describing media content, the method comprising receiving one or more pieces of metadata for a primary media content, selecting at least one piece of metadata for output, and outputting the at least one piece of metadata as synthetically generated speech with the primary media content. Other aspects of the invention involve alternative output, output speech simultaneously with the primary media content, output speech during gaps in the primary media content, translate metadata in foreign language, tailor voice, accent, and language to match the metadata and/or primary media content. A user may control output via a user interface or output may be customized based on preferences in a user profile. | 12-25-2014 |
20150046160 | Systems, Computer-Implemented Methods, and Tangible Computer-Readable Storage Media For Transcription Alignment - Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for captioning a media presentation. The method includes receiving automatic speech recognition (ASR) output from a media presentation and a transcription of the media presentation. The method includes selecting via a processor a pair of anchor words in the media presentation based on the ASR output and transcription and generating captions by aligning the transcription with the ASR output between the selected pair of anchor words. The transcription can be human-generated. Selecting pairs of anchor words can be based on a similarity threshold between the ASR output and the transcription. In one variation, commonly used words on a stop list are ineligible as anchor words. The method includes outputting the media presentation with the generated captions. The presentation can be a recording of a live event. | 02-12-2015 |
20150072739 | System and Method for Answering a Communication Notification - Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves notification assigned an importance level and repeat attempts at notification if it is of high importance. | 03-12-2015 |
20150073805 | SYSTEM AND METHOD FOR DISTRIBUTED VOICE MODELS ACROSS CLOUD AND DEVICE FOR EMBEDDED TEXT-TO-SPEECH - Disclosed herein are systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify a speech synthesis context, and determine, based on a local cache of text-to-speech units for a text-to-speech voice and based on the speech synthesis context, additional text-to-speech units which are not in the local cache. The system can request from a server the additional text-to-speech units, and store the additional text-to-speech units in the local cache. The system can then synthesize speech using the text-to-speech units and the additional text-to-speech units in the local cache. The system can prune the cache as the context changes, based on availability of local storage, or after synthesizing the speech. The local cache can store a core set of text-to-speech units associated with the text-to-speech voice that cannot be pruned from the local cache. | 03-12-2015 |
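
The caching behavior summarized in the abstract for application 20150073805 can be illustrated with a minimal sketch. This is not the implementation from the application: the class name `UnitCache`, the `fetch` callback standing in for the server request, and the use of byte strings as speech units are all assumptions made for illustration. It shows only the steps the abstract names: finding units missing from the local cache, requesting them, synthesizing from the cache, and pruning while protecting a core set.

```python
from typing import Callable, Dict, Iterable, Set


class UnitCache:
    """Hypothetical local cache of text-to-speech units with a protected core set."""

    def __init__(
        self,
        core_units: Dict[str, bytes],
        fetch: Callable[[Set[str]], Dict[str, bytes]],
    ) -> None:
        self._core: Set[str] = set(core_units)         # unit ids that are never pruned
        self._units: Dict[str, bytes] = dict(core_units)
        self._fetch = fetch                            # assumed stand-in for the server request

    def ensure(self, needed: Iterable[str]) -> Set[str]:
        """Request units the current context needs that are not cached yet."""
        missing = set(needed).difference(self._units)
        if missing:
            self._units.update(self._fetch(missing))
        return missing

    def synthesize(self, unit_ids: Iterable[str]) -> bytes:
        """Concatenate cached units; a placeholder for real synthesis."""
        return b"".join(self._units[u] for u in unit_ids)

    def prune(self, still_needed: Iterable[str]) -> None:
        """Drop non-core units that the current context no longer needs."""
        keep = self._core | set(still_needed)
        self._units = {k: v for k, v in self._units.items() if k in keep}

    def cached_ids(self) -> Set[str]:
        """Return the ids of all units currently held locally."""
        return set(self._units)


def fake_server(unit_ids: Set[str]) -> Dict[str, bytes]:
    """Hypothetical server: returns each requested id encoded as its own payload."""
    return {unit_id: unit_id.encode() for unit_id in unit_ids}
```

In this sketch the core set is fixed at construction time, so `prune` can never remove it regardless of what the context requests; a real system would also have to account for available local storage, which the sketch leaves out.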