Patent application number | Description | Published |
20100076757 | ADAPTING A COMPRESSED MODEL FOR USE IN SPEECH RECOGNITION - A speech recognition system includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an adaptor component that selectively adapts parameters of a compressed model used to recognize at least a portion of the distorted speech utterance, wherein the adaptor component selectively adapts the parameters of the compressed model based at least in part upon the received distorted speech utterance. | 03-25-2010 |
20100076758 | PHASE SENSITIVE MODEL ADAPTATION FOR NOISY SPEECH RECOGNITION - A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates, based on a phase-sensitive model, of distortions in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model. | 03-25-2010 |
20100188404 | SINGLE-PASS BOUNDING BOX CALCULATION - Embodiments for single-pass bounding box calculation are disclosed. In accordance with one embodiment, the single-pass bounding box calculation includes rendering a first target to a 2-dimensional screen space, whereby the first target includes at least six pixels. The calculation further includes producing transformed vertices in a set of geometry primitives based on an application-specified transformation. The calculation also includes generating six new points for each transformed vertex in the set of geometry primitives. The calculation additionally includes producing an initial third coordinate value for each pixel by rendering the at least six new points generated for each pixel to each corresponding pixel. The calculation further includes producing a post-rasterization value for each pixel by rasterizing the at least six new points rendered to each pixel with each corresponding pixel. Finally, the calculation includes computing bounding box information for the set of geometry primitives based on the produced third coordinate values. | 07-29-2010 |
20100188412 | CONTENT BASED CACHE FOR GRAPHICS RESOURCE MANAGEMENT - A content-based cache for graphics resource management is disclosed herein. In some aspects, a portion of a shadow copy of graphics resources is updated from an original copy of the graphics resources when a requested resource is not current. The shadow copy may be dedicated to a graphics processing unit (GPU) while the original copy may be maintained by a central processing unit (CPU). In further aspects, the requested graphics resource in the shadow copy may be compared to a corresponding graphics resource in the original copy when the GPU requests the graphics resource. The comparison may be performed by comparing hashes of each graphics resource and/or by comparing at least a portion of the graphics resources. | 07-29-2010 |
20100201691 | SHADER-BASED FINITE STATE MACHINE FRAME DETECTION - Embodiments for shader-based finite state machine frame detection for implementing alternative graphical processing on an animation scenario are disclosed. In accordance with one embodiment, the embodiment includes assigning an identifier to each shader used to render animation scenarios. The embodiment also includes defining a finite state machine for a key frame in each of the animation scenarios, whereby each finite state machine represents a plurality of shaders that render the key frame in each animation scenario. The embodiment further includes deriving a shader ID sequence for each finite state machine based on the identifier assigned to each shader. The embodiment additionally includes comparing an input shader ID sequence of a new frame of a new animation scenario to each of the derived shader ID sequences. Finally, the embodiment includes executing alternative graphics processing on the new animation scenario when the input shader ID sequence matches one of the derived shader ID sequences. | 08-12-2010 |
20100214294 | METHOD FOR TESSELLATION ON GRAPHICS HARDWARE - An exemplary method for tessellating a primitive of a graphical object includes receiving information for a primitive of a graphical object where the information includes vertex information and an edge factor for each edge of the primitive; based on the received information, dividing the primitive into parts where each part corresponds to at least a portion of an edge of the primitive and at least one vertex of the primitive and where each part has an association with the edge factor of the corresponding edge; for each of the parts, executing a geometry shader on a graphics processing unit (GPU) where the executing includes determining barycentric coordinates for a respective part based in part on its associated edge factor; for each of the parts, outputting the barycentric coordinates to a vertex buffer; and generating a tessellated mesh for the primitive based on the vertex information and the barycentric coordinates of the vertex buffer where the generating includes invoking a draw function of the GPU. Other methods, devices and systems are also disclosed. | 08-26-2010 |
20100214301 | VGPU: A real-time GPU emulator - An exemplary method for emulating a graphics processing unit (GPU) includes executing a graphics application on a host computing system to generate commands for a target GPU, wherein the host computing system includes host system memory and a different host GPU; converting the generated commands into intermediate commands; based on one or more generated commands that call for one or more shaders, caching one or more corresponding shaders in a shader cache in the host system memory; based on one or more generated commands that call for one or more resources, caching one or more corresponding resources in a resource cache in the host system memory; based on the intermediate commands, outputting commands for the host GPU; and based on the output commands for the host GPU, rendering graphics using the host GPU where output commands that call for one or more shaders access the one or more corresponding shaders in the shader cache and where output commands that call for one or more resources access the one or more corresponding resources in the resource cache. Other methods, devices and systems are also disclosed. | 08-26-2010 |
20100217579 | EMULATING LEGACY HARDWARE USING IEEE 754 COMPLIANT HARDWARE - Emulating legacy hardware using IEEE 754 compliant hardware is disclosed herein. In some aspects, the emulation includes locating an instruction that includes NaN (not a number) as at least one of an operand or a resultant. The emulation adjusts the resultant of the instruction, via additional code, to produce a final resultant of non-compliant (legacy) hardware. Legacy software, which was written in anticipation of processing by legacy hardware, may then be processed using compliant hardware. | 08-26-2010 |
20120130710 | ONLINE DISTORTED SPEECH ESTIMATION WITHIN AN UNSCENTED TRANSFORMATION FRAMEWORK - Noise and channel distortion parameters in the vectorized logarithmic or the cepstral domain for an utterance may be estimated, and subsequently the distorted speech parameters in the same domain may be updated using an unscented transformation framework during online automatic speech recognition. An utterance, including speech generated from a transmission source for delivery to a receiver, may be received by a computing device. The computing device may execute instructions for applying the unscented transformation framework to speech feature vectors, representative of the speech, in order to estimate, in a sequential or online manner, static noise and channel distortion parameters and dynamic noise distortion parameters in the unscented transformation framework. The static and dynamic parameters for the distorted speech in the utterance may then be updated from clean speech parameters and the noise and channel distortion parameters using non-linear mapping. | 05-24-2012 |
20140067387 | Utilizing Scalar Operations for Recognizing Utterances During Automatic Speech Recognition in Noisy Environments - Scalar operations for model adaptation or feature enhancement may be utilized for recognizing an utterance during automatic speech recognition in a noisy environment. An utterance, including distorted speech generated from a transmission source for delivery to a receiver, may be received by a computer. The distorted speech may be caused by the noisy environment and channel distortion. Computations using scalar operations in the form of an algorithm may then be performed for recognizing the utterance. As a result of performing all of the computations with scalar operations, the computational complexity is very small in comparison to that of matrix and vector operations. Vector Taylor Series with diagonal Jacobian approximation may also be utilized as a distortion-model-based noise robust algorithm with scalar operations. | 03-06-2014 |
20140257804 | EXPLOITING HETEROGENEOUS DATA IN DEEP NEURAL NETWORK-BASED SPEECH RECOGNITION SYSTEMS - Technologies pertaining to training a deep neural network (DNN) for use in a recognition system are described herein. The DNN is trained using heterogeneous data, the heterogeneous data including narrowband signals and wideband signals. The DNN, subsequent to being trained, receives an input signal that can be either a wideband signal or narrowband signal. The DNN estimates the class posterior probability of the input signal regardless of whether the input signal is the wideband signal or the narrowband signal. | 09-11-2014 |
20140257805 | MULTILINGUAL DEEP NEURAL NETWORK - Described herein are various technologies pertaining to a multilingual deep neural network (MDNN). The MDNN includes a plurality of hidden layers, wherein values for weight parameters of the plurality of hidden layers are learned during a training phase based upon raw acoustic features for multiple languages. The MDNN further includes softmax layers that are trained for each target language separately, making use of the hidden layer values trained jointly with multiple source languages. The MDNN is adaptable, such that a new softmax layer may be added on top of the existing hidden layers, where the new softmax layer corresponds to a new target language. | 09-11-2014 |
20140257814 | Posterior-Based Feature with Partial Distance Elimination for Speech Recognition - A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too large, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when fewer than all of the dimensions in the group are sequentially added. | 09-11-2014 |
20140372112 | RESTRUCTURING DEEP NEURAL NETWORK ACOUSTIC MODELS - A Deep Neural Network (DNN) model used in an Automatic Speech Recognition (ASR) system is restructured. A restructured DNN model may include fewer parameters compared to the original DNN model. The restructured DNN model may include a monophone state output layer in addition to the senone output layer of the original DNN model. Singular value decomposition (SVD) can be applied to one or more weight matrices of the DNN model to reduce the size of the DNN model. The output layer of the DNN model may be restructured to include monophone states in addition to the senones (tied triphone states) which are included in the original DNN model. When the monophone states are included in the restructured DNN model, the posteriors of monophone states are used to select a small subset of the senones to be evaluated. | 12-18-2014 |
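The partial-distance-elimination abstract above (20140257814) describes a concrete, self-contained computation: for a diagonal-covariance Gaussian, the negative log likelihood is (up to a constant) a per-dimension sum of weighted squared distances, so the sum can be accumulated one dimension at a time and abandoned early. The sketch below is a simplified, hypothetical reading of that idea; the function name and threshold convention are illustrative assumptions, not taken from the patent text.

```python
# A minimal sketch of partial distance elimination, assuming pruning
# happens when the accumulated distance crosses a fixed threshold
# (the threshold policy is an assumption for illustration).
def partial_distance(x, mean, var, threshold):
    """Accumulate the weighted squared distance dimension by dimension;
    return None (posterior treated as zero) as soon as the partial sum
    exceeds `threshold`, skipping the remaining dimensions."""
    dist = 0.0
    for xd, md, vd in zip(x, mean, var):
        dist += (xd - md) ** 2 / vd
        if dist > threshold:
            return None  # pruned: this Gaussian gets a zero posterior
    return dist

# A speech frame close to one Gaussian mean and far from another:
frame = [0.1, 0.2, 5.0, 0.0]
kept = partial_distance(frame, [0.0, 0.0, 5.0, 0.0], [1.0] * 4, threshold=4.0)
pruned = partial_distance(frame, [3.0, 3.0, 0.0, 0.0], [1.0] * 4, threshold=4.0)
print(kept)    # small distance: full evaluation completed
print(pruned)  # None: eliminated after the first dimension
```

The early return is where the savings come from: for the distant Gaussian above, only one of the four dimensions is ever touched, and in the high-dimensional setting the abstract describes, most Gaussians would be pruned after a small fraction of their dimensions.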