DOLBY LABORATORIES LICENSING CORPORATION Patent applications |
Patent application number | Title | Published |
20160142851 | Method for Generating a Surround Sound Field, Apparatus and Computer Program Product Thereof - Embodiments of the present invention relate to adaptive audio content generation. Specifically, a method for generating adaptive audio content is provided. The method comprises extracting at least one audio object from channel-based source audio content, and generating the adaptive audio content at least partially based on the at least one audio object. Corresponding system and computer program product are also disclosed. | 05-19-2016 |
20160142844 | IMPROVED RENDERING OF AUDIO OBJECTS USING DISCONTINUOUS RENDERING-MATRIX UPDATES - An audio playback system generates output signals for multiple channels of acoustic transducers by applying a rendering matrix to data representing the aural content and spatial characteristics of audio objects, so that the resulting sound field creates accurate listener impressions of the spatial characteristics. Matrix coefficients are updated to render moving objects. Discontinuous updates of the rendering matrix coefficients are controlled according to psychoacoustic principles to reduce audible artifacts. The updates may also be managed to control the amount of data needed to perform the updates. | 05-19-2016 |
20160142709 | Optimized Filter Selection for Reference Picture Processing - Reference processing may be used in a video encoder or decoder to derive reference pictures that are better correlated with a source image to be encoded or decoded, which generally yields better coding efficiency. Methods for filter selection for a reference processing unit adapted for use in a video codec system are discussed. Specifically, methods for filter selection based on performing motion estimation and obtaining distortion/cost information by comparing reference pictures, either processed or non-processed, with the source image to be encoded are discussed. | 05-19-2016 |
20160139560 | PROJECTOR DISPLAY SYSTEMS HAVING NON-MECHANICAL MIRROR BEAM STEERING - Dual- or multi-modulation display systems are disclosed that comprise projector systems with at least one modulator that may employ non-mechanical beam steering modulation. Many embodiments disclosed herein employ non-mechanical beam steering and/or a polarizer to provide for a highlights modulator. | 05-19-2016 |
20160134873 | ENCODING, DECODING, AND REPRESENTING HIGH DYNAMIC RANGE IMAGES - Techniques are provided to encode and decode image data comprising a tone mapped (TM) image with HDR reconstruction data in the form of luminance ratios and color residual values. In an example embodiment, luminance ratio values and residual values in color channels of a color space are generated on an individual pixel basis based on a high dynamic range (HDR) image and a derivative tone-mapped (TM) image that comprises one or more color alterations that would not be recoverable from the TM image with a luminance ratio image. The TM image with HDR reconstruction data derived from the luminance ratio values and the color-channel residual values may be outputted in an image file to a downstream device, for example, for decoding, rendering, and/or storing. The image file may be decoded to generate a restored HDR image free of the color alterations. | 05-12-2016 |
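The ratio-plus-residual idea in this abstract can be illustrated with a small sketch. This is a hypothetical reading of the scheme, not the patented method: per-pixel luminance ratios relate the HDR and tone-mapped luma, and per-channel residuals capture the color alterations the ratio alone cannot recover, so the round trip below is exact by construction.

```python
import numpy as np

def encode_hdr_reconstruction_data(hdr_luma, hdr_chroma, tm_luma, tm_chroma):
    """Per-pixel luminance ratios plus color-channel residuals (sketch)."""
    ratio = hdr_luma / np.maximum(tm_luma, 1e-6)          # luminance ratio image
    residual = hdr_chroma - tm_chroma * ratio[..., None]  # what the ratio misses
    return ratio, residual

def decode_hdr(tm_luma, tm_chroma, ratio, residual):
    """Restore the HDR image from the TM image plus reconstruction data."""
    hdr_luma = tm_luma * ratio
    hdr_chroma = tm_chroma * ratio[..., None] + residual
    return hdr_luma, hdr_chroma
```

Because the residuals absorb whatever the luminance ratio does not model, decoding reproduces the original HDR pixels exactly in this sketch.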
20160134872 | Adaptive Reshaping for Layered Coding of Enhanced Dynamic Range Signals - An encoder receives an input enhanced dynamic range (EDR) image to be coded in a layered representation. Input images may be gamma-coded or perceptually-coded using a bit-depth format not supported by one or more video encoders. The input image is remapped to one or more quantized layers to generate output code words suitable for compression using the available video encoders. Algorithms to determine optimum function parameters for linear and non-linear mapping functions are presented. Given a mapping function, the reverse mapping function may be transmitted to a decoder as a look-up table or it may be approximated using a piecewise polynomial approximation. A polynomial approximation technique for representing reverse-mapping functions and chromaticity translation schemes to reduce color shifts are also presented. | 05-12-2016 |
20160134870 | Rate Control Adaptation for High-Dynamic Range Images - A high dynamic range input video signal characterized by either a gamma-based or a perceptually-quantized (PQ) source electro-optical transfer function (EOTF) is to be compressed. Given a luminance range for an image region in the input: for a gamma-coded input signal, a rate-control adaptation method in the encoder adjusts a region-based quantization parameter (QP) so that it increases in highlight regions and decreases in dark regions; for a PQ-coded input, by contrast, the region-based QP increases in the dark areas and decreases in the highlight areas. | 05-12-2016 |
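The abstract's two-way QP rule can be sketched directly. The thresholds, step size, and QP range below are assumed values for illustration, not parameters from the application:

```python
def adapt_region_qp(base_qp, region_luminance, pq_coded,
                    dark_thresh=0.2, highlight_thresh=0.8, qp_step=2):
    """Region-based QP adaptation (sketch; thresholds and step are assumed).

    region_luminance: normalized [0, 1] luminance of the image region.
    Gamma-coded input: raise QP in highlights, lower it in darks.
    PQ-coded input: the opposite, raise QP in darks, lower it in highlights.
    """
    if region_luminance >= highlight_thresh:       # highlight region
        delta = -qp_step if pq_coded else qp_step
    elif region_luminance <= dark_thresh:          # dark region
        delta = qp_step if pq_coded else -qp_step
    else:                                          # mid-tones left unchanged
        delta = 0
    return max(0, min(51, base_qp + delta))        # clamp to H.264/HEVC QP range
```

The sign flip between the two branches reflects where each coding curve spends its code words: gamma coding is dense in darks, PQ is perceptually uniform, so the quantizer can be coarser where the curve already hides error.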
20160134853 | EXTENDING IMAGE DYNAMIC RANGE - Enhancing image dynamic range is described. An input video signal that is represented in a first color space with a first color gamut, which is related to a first dynamic range, is converted to a video signal that is represented in a second color space with a second color gamut. The second color space is associated with a second dynamic range. At least two (e.g., three) color-related components of the converted video signal are mapped over the second dynamic range. | 05-12-2016 |
20160133266 | Multi-Stage Quantization of Parameter Vectors from Disparate Signal Dimensions - A first vector quantization process may be applied to two or more parameter values along a first dimension of the N-dimensional parameter set to produce a first set of quantized values. Two or more parameter prediction values may be calculated for a second dimension of the N-dimensional parameter set based, at least in part, on one or more values of the first set of quantized values. Prediction residual values may be calculated based, at least in part, on the parameter prediction values. A second vector quantization process may be applied to the prediction residual values to produce a second set of quantized values. These processes may be extended to any number of dimensions. Corresponding inverse vector quantization processes may be performed. | 05-12-2016 |
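The two-dimensional case of this multi-stage scheme can be sketched as follows. The codebooks and the identity-style prediction model are illustrative assumptions; the application covers arbitrary dimensions and predictors:

```python
import numpy as np

def nearest_code(values, codebook):
    """Vector quantization: index and entry of the L2-nearest codeword."""
    codebook = np.asarray(codebook, dtype=float)
    idx = int(np.argmin(np.linalg.norm(codebook - values, axis=1)))
    return idx, codebook[idx]

def two_stage_vq(dim1_params, dim2_params, cb1, cb2, predict):
    """Quantize the dim-1 parameters, predict dim-2 from the quantized
    values, then vector-quantize the dim-2 prediction residual."""
    i1, q1 = nearest_code(np.asarray(dim1_params, dtype=float), cb1)
    prediction = predict(q1)                       # e.g. identity or linear model
    residual = np.asarray(dim2_params, dtype=float) - prediction
    i2, q_res = nearest_code(residual, cb2)
    return (i1, i2), prediction + q_res            # indices + reconstructed dim-2
```

Extending to N dimensions repeats the predict-then-quantize-residual step along each further dimension, and the inverse process replays the predictions from the decoded indices.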
20160132824 | 3D Glasses and Related Systems - 3D glasses having an RFID tag (embedded in one or more temples) are rented to theater or other venue operators. The glasses are shipped to a venue for distribution to patrons and collected from patrons in trays. Inventory and other measures are implemented by RFID scanning while the glasses are in the trays (e.g., upon delivery to a theater, on collection from the theater, upon inspection at the 3D rental company, etc.). Data gathered from RFID scanning and inspections allows the rental company to properly allocate rental costs to various venues based on shrinkage, which includes, for example, extraordinary wear of the glasses, breakage or theft, which is attributable and traceable to the specific venues. The theater or venue may also independently scan the trays upon delivery and pick-up to maintain their own records. The invention includes 3D glasses with RFID, a washing rack, and rental systems. | 05-12-2016 |
20160125818 | COLOR DISPLAY BASED ON SPATIAL CLUSTERING - A color display has a monochrome modulator. An active area of the modulator is illuminated by an array of light sources. The light sources include light sources of three or more colors. The intensities of the light sources may be adjusted to project desired luminance patterns on an active area of the modulator. In a fast field sequential method different colors are projected sequentially. The modulator is set to modulate the projected luminance patterns to display a desired image. In a slow field sequential method, colors are projected simultaneously and the modulator is set to modulate most important colors in the image. | 05-05-2016 |
20160125581 | LOCAL MULTISCALE TONE-MAPPING OPERATOR - In a method to generate a tone-mapped image from a high-dynamic range image (HDR), an input HDR image is converted into a logarithmic domain and a global tone-mapping operator generates a high-resolution gray scale ratio image from the input HDR image. Based at least in part on the high-resolution gray scale ratio image, at least two different gray scale ratio images are generated and are merged together to generate a local multiscale gray scale ratio image that represents a weighted combination of the at least two different gray scale ratio images, each being of a different spatial resolution level. An output tone-mapped image is generated based on the high-resolution gray scale image and the local multiscale gray scale ratio image. | 05-05-2016 |
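A minimal version of the pipeline in this abstract follows. It is a sketch under stated assumptions, not the patented operator: the global operator is a simple log-domain range compression, and a 3x3 box blur stands in for one coarser level of a multiscale decomposition; the `key` and `w_fine` parameters are invented for illustration.

```python
import numpy as np

def local_multiscale_tonemap(hdr_luma, key=0.5, w_fine=0.6):
    """Merge a fine and a coarse gray-scale ratio image in the log domain."""
    log_l = np.log2(np.maximum(hdr_luma, 1e-6))
    target = key * (log_l - log_l.mean())          # global operator in log domain
    ratio_fine = target - log_l                    # high-resolution log-ratio image
    pad = np.pad(ratio_fine, 1, mode='edge')       # 3x3 box blur = coarser scale
    h, w = ratio_fine.shape
    ratio_coarse = sum(pad[i:i + h, j:j + w]
                       for i in range(3) for j in range(3)) / 9.0
    ratio = w_fine * ratio_fine + (1 - w_fine) * ratio_coarse  # weighted merge
    return np.exp2(log_l + ratio)                  # tone-mapped luminance
```

The weighted merge is the key step: mixing ratio images of different spatial resolutions lets the operator compress the global range while preserving local contrast.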
20160125579 | Systems and Methods for Rectifying Image Artifacts - A sparse FIR filter can be used to process an image in order to rectify imaging artifacts. In a first example application, the sparse FIR filter is applied as a selective sparse FIR filter that examines a set of selected neighboring pixels of an original pixel in order to identify smooth areas of the image and to selectively apply filtering to only the smooth areas of the image. The parameters of selective filtering are selected based on the characteristics of an inter-layer predictor. In a second example application, the sparse FIR filter is applied as an edge aware selective sparse FIR filter that examines additional neighboring pixels to the set of selected pixels in order to identify edges and carry out selective filtering of smooth areas of the image. Examples for detecting and removing banding artifacts during the coding of high-dynamic range images are provided. | 05-05-2016 |
20160119599 | Multi-Half-Tone Imaging and Dual Modulation Projection/Dual Modulation Laser Projection - Smaller halftone tiles are implemented on a first modulator of a dual modulation projection system. This technique uses multiple halftones per frame in the pre-modulator, synchronized with a modified bit sequence in the primary modulator, to effectively increase the number of levels provided by a given tile size in the halftone modulator. It addresses the issue of reduced contrast ratio at low light levels for small tile sizes and allows the use of smaller PSFs, which reduce halo artifacts in the projected image and may be utilized in 3D projecting and viewing. | 04-28-2016 |
20160119459 | LIGHTING FOR AUDIO DEVICES - An apparatus may include an inner module, an outer module that substantially surrounds a perimeter of the inner module, a plurality of light emitters, and a light distribution medium. The plurality of light emitters may be positioned under the inner module and project light radially outward. The light distribution medium may transport the light projected from the plurality of light emitters to an edge of the light distribution medium. The edge may include a diffusive surface and traverse a substantial portion of a boundary between the inner module and the outer module. | 04-28-2016 |
20160111049 | Rapid Estimation Of Effective Illuminance Patterns For Projected Light Fields - Apparatus and methods are provided that employ one or more of a variety of techniques for reducing the time required to display high resolution images on a high dynamic range display having a light source layer and a display layer. In one technique, the image resolution is reduced, an effective luminance pattern is determined for the reduced resolution image, and the resolution of the effective luminance pattern is then increased to the resolution of the display layer. In another technique, the light source layer's point spread function is decomposed into a plurality of components, and an effective luminance pattern is determined for each component. The effective luminance patterns are then combined to produce a total effective luminance pattern. Additional image display time reduction techniques are provided. | 04-21-2016 |
20160105695 | Transmitting Display Management Metadata Over HDMI - In an embodiment, a media source combines reference code values and mapping function parameters for mapping functions into video frames originally designated to carry pixel values. The video frames are delivered to a downstream device such as a media sink in an encoded video signal. The media sink extracts the mapping function parameters for the mapping functions from the encoded video signal and applies the mapping functions as a part of display management operations to map the reference code values to the mapped pixel values appropriate for the media sink. The mapped pixel values can be used to render images as represented by the reference code values. | 04-14-2016 |
20160105666 | Stereoscopic Dual Modulator Display Device Using Full Color Anaglyph - Systems and methods are provided for displaying three-dimensional image data on a display having a light source modulation layer and a display modulation layer. The light source modulation layer has light sources for providing spatially modulated, spectrally separated light for displaying time-multiplexed left and right eye images. The display modulation layer has pixels for spatially modulating the amount of light received from the light source modulation layer. | 04-14-2016 |
20160100108 | Blending Images Using Mismatched Source and Display Electro-Optical Transfer Functions - Input video signals characterized by a source electro-optical transfer function (EOTF) are to be blended and displayed on a target display with a target EOTF which is different than the source EOTF. Given an input set of blending parameters, an output set of blending parameters is generated as follows. The input blending parameters are scaled by video signal metrics computed in the target EOTF to generate scaled blending parameters. The scaled blending parameters are mapped back to the source EOTF space to generate mapped blending parameters. Finally, the mapped blending parameters are normalized to generate the output blending parameters. An output blended image is generated by blending the input video signals using the output blending parameters. Examples of generating the video signal metrics are also provided. | 04-07-2016 |
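The scale-map-normalize flow can be sketched, with the caveat that this is one hypothetical reading: the choice of the per-signal mean light level as the "metric" and of the source OETF as the back-mapping are assumptions, not details from the application.

```python
import numpy as np

def remap_blend_weights(alphas, signals, tgt_eotf, src_oetf):
    """Remap blending weights across mismatched transfer functions (sketch).

    Each input weight is scaled by its signal's metric computed under the
    target EOTF; scaled weights are mapped back to the source signal space
    via the source OETF, then normalized to sum to one.
    """
    metrics = [float(np.mean(tgt_eotf(np.asarray(s, dtype=float))))
               for s in signals]
    scaled = [a * m for a, m in zip(alphas, metrics)]    # scale by metrics
    mapped = [src_oetf(v) for v in scaled]               # back to source space
    total = sum(mapped)
    return [m / total for m in mapped]                   # normalize
```

With identity transfer functions and identical signals, the output weights reduce to the input weights, which is the expected degenerate case.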
20160094859 | Directed Interpolation and Data Post-Processing - An encoding device evaluates a plurality of processing and/or post-processing algorithms and/or methods to be applied to a video stream, and signals a selected method, algorithm, class or category of methods/algorithms either in an encoded bitstream or as side information related to the encoded bitstream. A decoding device or post-processor utilizes the signaled algorithm or selects an algorithm/method based on the signaled method or algorithm. The selection is based, for example, on availability of the algorithm/method at the decoder/post-processor and/or cost of implementation. The video stream may comprise, for example, downsampled multiplexed stereoscopic images and the selected algorithm may include any of upconversion and/or error correction techniques that contribute to a restoration of the downsampled images. | 03-31-2016 |
20160086555 | Projection Display Providing Additional Modulation and Related Methods - A projection display system includes a spatial modulator that is controlled to compensate for flare in a lens of the projector. The spatial modulator increases achievable intra-frame contrast and facilitates increased peak luminance without unacceptable black levels. Some embodiments provide 3D projection systems in which the spatial modulator is combined with a polarization control panel. | 03-24-2016 |
20160080772 | Encoding and Decoding Architecture of Checkerboard Multiplexed Image Data - A device includes a coder or a codec configured for interleaved image data utilizing diamond shaped blocks for motion estimation and/or motion compensation and utilizing square or orthogonal transforms of residual data. In various embodiments, the decoder may be configured, among others, to perform de-blocking on edges of the diamond shaped blocks and/or data padding at boundaries of the image data. | 03-17-2016 |
20160080716 | Graphics Blending for High Dynamic Range Video - A method and system for merging graphics and high dynamic range video data is disclosed. In a video receiver, if needed, video and graphics are translated first into the IPT-PQ color space. A display management process uses metadata to map the input video data and the graphics from their own color volume space into a target blending color volume space by taking into consideration the color volume space of the target display. | 03-17-2016 |
20160080712 | Dithering for Chromatically Subsampled Image Formats - Dithering techniques for images are described herein. An input image of a first bit depth is separated into a luma and one or more chroma components. A model of the optical transfer function (OTF) of the human visual system (HVS) is used to generate dither noise which is added to the chroma components of the input image. The model of the OTF is adapted in response to viewing distances determined based on the spatial resolution of the chroma components. An image based on the original input luma component and the noise-modified chroma components is quantized to a second bit depth, which is lower than the first bit depth, to generate an output dithered image. | 03-17-2016 |
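The dither-then-requantize step reads concretely enough to sketch. Note the simplification: the application shapes the noise with an optical-transfer-function model of the human visual system adapted to the chroma plane's resolution, whereas plain uniform noise stands in for that here; bit depths and the seed are illustrative.

```python
import numpy as np

def dither_and_requantize(chroma, out_bits=10, in_bits=16, seed=0):
    """Add dither to a chroma plane, then requantize to a lower bit depth."""
    rng = np.random.default_rng(seed)
    step = 2 ** (in_bits - out_bits)                    # requantization step
    noise = (rng.random(chroma.shape) - 0.5) * step     # +/- half a step, unshaped
    codes = np.round((chroma + noise) / step)
    return np.clip(codes, 0, 2 ** out_bits - 1).astype(np.int64)
```

Dithering before requantization trades banding for fine-grain noise: the quantization error is decorrelated from the signal, so smooth chroma gradients survive the bit-depth reduction on average.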
20160080685 | SYSTEM AND METHOD OF OUTPUTTING MULTI-LINGUAL AUDIO AND ASSOCIATED AUDIO FROM A SINGLE CONTAINER - A method of storing and outputting associated audio. The associated audio and the main audio are stored in, and output from, a single multimedia container. In this manner, when the main language is changed, the associated audio automatically changes with minimal audio artifacts. | 03-17-2016 |
20160078882 | MEASURING CONTENT COHERENCE AND MEASURING SIMILARITY - Embodiments for measuring content coherence and embodiments for measuring content similarity are described. Content coherence between a first audio section and a second audio section is measured. For each audio segment in the first audio section, a predetermined number of audio segments in the second audio section are determined. Content similarity between the audio segment in the first audio section and the determined audio segments is higher than that between the audio segment and all the other audio segments in the second audio section. An average of the content similarity between the audio segment in the first audio section and the determined audio segments is calculated. The content coherence is calculated as an average, the maximum or the minimum of the averages calculated for the audio segments in the first audio section. The content similarity may be calculated based on Dirichlet distribution. | 03-17-2016 |
20160078879 | Apparatuses and Methods for Audio Classifying and Processing - Apparatus and methods for audio classifying and processing are disclosed. In one embodiment, an audio processing apparatus includes an audio classifier for classifying an audio signal into at least one audio type in real time; an audio improving device for improving the experience of the audience; and an adjusting unit for adjusting at least one parameter of the audio improving device in a continuous manner based on the confidence value of the at least one audio type. | 03-17-2016 |
20160073084 | Compatible Stereoscopic Video Delivery - Stereoscopic images are subsampled and placed in a “checkerboard” pattern in an image. The image is encoded in a monoscopic video format. The monoscopic video is transmitted to a device where the “checkerboard” is decoded. Portions of the checkerboard (e.g., “black” portions) are used to reconstruct one of the stereoscopic images and the other portion of the checkerboard (e.g., “white” portions) is used to reconstruct the other image. The subsamples are, for example, taken from the image in a location coincident to the checkerboard position in which the subsamples are encoded. | 03-10-2016 |
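The checkerboard multiplexing itself is simple to illustrate. This sketch uses a crude horizontal-neighbor copy where a real decoder would run a proper upconversion filter; constant test views make the round trip exact:

```python
import numpy as np

def checkerboard_mux(left, right):
    """Interleave two equal-size views into one checkerboard frame:
    'black' squares carry the left view, 'white' squares the right."""
    h, w = left.shape
    mask = (np.indices((h, w)).sum(axis=0) % 2) == 0   # True on 'black' squares
    return np.where(mask, left, right), mask

def checkerboard_demux(frame, mask):
    """Recover each view from its squares, filling the gaps with a
    horizontal-neighbor copy (a stand-in for real upconversion)."""
    neighbor = np.roll(frame, 1, axis=1)
    left = np.where(mask, frame, neighbor)
    right = np.where(~mask, frame, neighbor)
    return left, right
```

Because each view's samples sit at the same checkerboard positions in which they were encoded, the demultiplexer needs only the parity mask to separate them.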
20160071527 | Method and System for Scaling Ducking of Speech-Relevant Channels in Multi-Channel Audio - A method and system for filtering a multi-channel audio signal having a speech channel and at least one non-speech channel, to improve intelligibility of speech determined by the signal. In typical embodiments, the method includes steps of determining at least one attenuation control value indicative of a measure of similarity between speech-related content determined by the speech channel and speech-related content determined by the non-speech channel, and attenuating the non-speech channel in response to the at least one attenuation control value. Typically, the attenuating step includes scaling of a raw attenuation control signal (e.g., a ducking gain control signal) for the non-speech channel in response to the at least one attenuation control value. Some embodiments are a general or special purpose processor programmed with software or firmware and/or otherwise configured to perform filtering in accordance with the invention. | 03-10-2016 |
20160071484 | Method and Apparatus for Image Data Transformation - Image data is transformed for display on a target display. A sigmoidal transfer function provides a free parameter controlling min-tone contrast. The transfer function may be dynamically adjusted to accommodate changing ambient lighting conditions. The transformation may be selected so as to automatically adapt image data for display on a target display in a way that substantially preserves creative intent embodied in the image data. The image data may be video data. | 03-10-2016 |
20160066116 | USING SINGLE BITSTREAM TO PRODUCE TAILORED AUDIO DEVICE MIXES - Audio stems are generated to contain audio content to be mixed by recipient devices. Multiple sets of mixing instructions for multiple audio channel configurations are determined, for example, based on input of audio producers. Each set of mixing instructions is to be used for mixing the audio stems for rendering in a corresponding audio channel configuration. A bitstream is generated to carry both the audio stems and the sets of mixing instructions. A recipient device receives the bitstream as the input. The recipient device determines a specific audio channel configuration to be used for rendering the plurality of audio stems. Based on that determination, a specific set of mixing instructions is retrieved from the bitstream and used to mix the audio stems. | 03-03-2016 |
20160066032 | ACQUISITION, RECOVERY, AND MATCHING OF UNIQUE INFORMATION FROM FILE-BASED MEDIA FOR AUTOMATED FILE DETECTION - A media fingerprint archive system generates and archives media fingerprints from secondary media content portions such as commercials. A downstream media measurement system can extract/derive query fingerprints from an incoming signal, and query the media fingerprint archive system whether any of the query fingerprints matches any archived fingerprints. If so, the media measurement system can perform media measurements on a specific secondary media content portion from which the matched query fingerprint is derived. If not, the media measurement system can analyze media characteristics of a media content portion to determine whether it is a secondary media content portion and perform media measurement if needed. The media measurement system may send fingerprints from an identified secondary media content portion to the media fingerprint archive system for storage. | 03-03-2016 |
20160065975 | Layered Decomposition of Chroma Components in EDR Video Coding - An encoder receives one or more input pictures of enhanced dynamic range (EDR) to be encoded in a coded bit stream comprising a base layer and one or more enhancement layers. To encode the chroma pixels, the encoder generates a luma mask and a corresponding chroma mask. Based on generated high-clipping and low-clipping thresholds, the encoder determines the appropriate parameters to encode the chroma values in the base and enhancement layers. | 03-03-2016 |
20160065957 | Overlapped Rate Control for Video Splicing Applications - Rate control techniques are provided for encoding an input video sequence into a compressed coded bitstream with multiple coding passes. The final coding pass may comprise final splices with non-overlapping frames that do not extend into neighboring final splices. A final splice in the final coding pass may correspond to at least one non-final splice in a non-final coding pass. A non-final splice may have overlapping frames that extend into neighboring final splices in the final coding pass. The overlapping frames in the non-final splice may be used to derive complexity information about the neighboring final splices. The complexity information about the neighboring final splices, as derived from the overlapping frames, may be used to allocate or improve rate control related budgets in encoding the final splice into the compressed coded bitstream in the final coding pass. | 03-03-2016 |
20160065949 | Guided 3D Display Adaptation - A 3D display is characterized by a quality of viewing experience (QVE) mapping which represents a display-specific input-output relationship between input depth values and output QVE values. Examples of QVE mappings based on a metric of “viewing blur” are presented. Given reference depth data generated for a reference display and a representation of an artist's mapping function, which represents an input-output relationship between original input depth data and QVE data generated using a QVE mapping for a reference display, a decoder may reconstruct the reference depth data and apply an inverse QVE mapping for a target display to generate output depth data optimized for the target display. | 03-03-2016 |
20160065792 | Scene-Change Detection Using Video Stream Pairs - A scene change is determined using a first and a second video signal, each representing the same scene or content, but at a different color grade (such as dynamic range). A set of prediction coefficients is generated to generate prediction signals approximating the first signal based on the second signal and a prediction model. A set of prediction error signals is generated based on the prediction signals and the first signal. Then, a scene change is detected based on the characteristics of the prediction error signals. Alternatively, a set of entropy values of the difference signals between the first and second video signals are computed, and a scene change is detected based on the characteristics of the entropy values. | 03-03-2016 |
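The prediction-error variant of this detector can be sketched as follows. The first-order predictor, the three-frame baseline window, and the ratio threshold are assumptions chosen for illustration; the application covers other prediction models and the entropy-based alternative as well.

```python
import numpy as np

def detect_scene_changes(frames_a, frames_b, threshold=3.0):
    """Detect scene changes from a pair of differently graded video signals.

    Fit a 1st-order predictor (b ~ a1*a + a0) on frame t-1 and apply it to
    frame t; when the color grade changes at a scene cut, the prediction
    error jumps relative to the recent baseline.
    """
    errors = [0.0]
    for t in range(1, len(frames_a)):
        x = frames_a[t - 1].ravel().astype(float)
        y = frames_b[t - 1].ravel().astype(float)
        a1, a0 = np.polyfit(x, y, 1)                  # predictor from frame t-1
        xt = frames_a[t].ravel().astype(float)
        yt = frames_b[t].ravel().astype(float)
        errors.append(float(np.mean((a1 * xt + a0 - yt) ** 2)))
    cuts = []
    for t in range(1, len(errors)):
        baseline = np.mean(errors[max(0, t - 3):t]) + 1e-9
        if errors[t] / baseline > threshold:          # sharp error jump
            cuts.append(t)
    return cuts
```

The detector exploits that a cross-grade predictor fitted within a scene transfers poorly across a cut whenever each scene carries its own grading, which is exactly the setting the abstract describes.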
20160064007 | AUDIO ENCODER AND DECODER - The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to a transform-based audio codec system which is particularly well suited for voice encoding/decoding. A transform-based speech encoder is described. | 03-03-2016 |
20160057334 | 3D Cameras for HDR - High dynamic range 3D images are generated with relatively narrow dynamic range image sensors. Input frames of different views may be set to different exposure settings. Pixels in the input frames may be normalized to a common range of luminance levels. Disparity between normalized pixels in the input frames may be computed and interpolated. The pixels in the different input frames may be shifted to, or stay in, a common reference frame. The pre-normalized luminance levels of the pixels may be used to create high dynamic range pixels that make up one, two or more output frames of different views. Further, a modulated synopter with electronic mirrors is combined with a stereoscopic camera to capture monoscopic HDR, alternating monoscopic HDR and stereoscopic LDR images, or stereoscopic HDR images. | 02-25-2016 |
20160056787 | EQUALIZER CONTROLLER AND CONTROLLING METHOD - An equalizer controller and controlling method are disclosed. In one embodiment, an equalizer controller includes an audio classifier for identifying the audio type of an audio signal in real time; and an adjusting unit for adjusting an equalizer in a continuous manner based on the confidence value of the audio type as identified. | 02-25-2016 |
20160055864 | AUDIO ENCODER AND DECODER - An audio processing system is described. | 02-25-2016 |
20160055855 | AUDIO PROCESSING SYSTEM - An audio processing system is described. | 02-25-2016 |
20160055854 | Methods and Apparatuses for Generating and Using Low-Resolution Preview Tracks with High-Quality Encoded Object and Multichannel Audio Signals - A low-quality rendition of a complex soundtrack is created, synchronized and combined with the soundtrack. The low-quality rendition may be monitored in mastering operations, for example, to control the removal, replacement or addition of aural content in the soundtrack without the need for expensive equipment that would otherwise be required to render the soundtrack. | 02-25-2016 |
20160050404 | Depth Map Delivery Formats for Multi-View Auto-Stereoscopic Displays - Stereoscopic video data and corresponding depth map data for multi-view auto-stereoscopic displays are coded using a multiplexed asymmetric image frame that combines an image data partition and a depth map data partition, wherein the size of the image data partition is different than the size of the depth map data partition. The image data partition comprises one or more of the input views while the depth map partition comprises at least a portion of the depth map data rotated with respect to the orientation of the image data in the multiplexed output image frame. | 02-18-2016 |
20160049915 | VOLUME LEVELER CONTROLLER AND CONTROLLING METHOD - A volume leveler controller and controlling method are disclosed. In one embodiment, a volume leveler controller includes an audio content classifier for identifying the content type of an audio signal in real time; and an adjusting unit for adjusting a volume leveler in a continuous manner based on the content type as identified. The adjusting unit may be configured to positively correlate the dynamic gain of the volume leveler with informative content types of the audio signal, and negatively correlate the dynamic gain of the volume leveler with interfering content types of the audio signal. | 02-18-2016 |
20160044433 | RENDERING AUDIO USING SPEAKERS ORGANIZED AS A MESH OF ARBITRARY N-GONS - In some embodiments, a method for rendering an audio program indicative of at least one source, including by panning the source along a trajectory comprising source locations using speakers organized as a mesh whose faces are convex N-gons, where N can vary from face to face, and N is not equal to three for at least one face of the mesh, including steps of: for each source location, determining an intersecting face of the mesh (including the source location's projection on the mesh), thereby determining a subset of the speakers whose positions coincide with the intersecting face's vertices, and determining gains (which may be determined by generalized barycentric coordinates) for speaker feeds for driving each speaker subset to emit sound perceived as emitting from the source location corresponding to the subset. Other aspects include systems configured (e.g., programmed) to perform any embodiment of the method. | 02-11-2016 |
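The abstract mentions generalized barycentric coordinates as one way to obtain the per-speaker gains. Wachspress coordinates are one concrete instance for convex N-gons; this sketch assumes a counter-clockwise convex face and a source projection strictly inside it, and is not necessarily the formulation used in the application.

```python
import numpy as np

def signed_area(a, b, c):
    """Signed area of the 2-D triangle (a, b, c)."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def panning_gains(p, speakers):
    """Wachspress (generalized barycentric) coordinates of point p strictly
    inside a convex, counter-clockwise N-gon of speaker positions; the
    normalized coordinates serve as panning gains for the N speakers."""
    speakers = np.asarray(speakers, dtype=float)
    n = len(speakers)
    w = np.empty(n)
    for i in range(n):
        prev_v, v, next_v = speakers[i - 1], speakers[i], speakers[(i + 1) % n]
        w[i] = signed_area(prev_v, v, next_v) / (
            signed_area(p, prev_v, v) * signed_area(p, v, next_v))
    return w / w.sum()
```

These coordinates are non-negative inside the face, sum to one, and reproduce the source position as the gain-weighted sum of speaker positions, which is what makes them usable as panning gains for faces with N other than three.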
20160044430 | METHOD AND SYSTEM FOR HEAD-RELATED TRANSFER FUNCTION GENERATION BY LINEAR MIXING OF HEAD-RELATED TRANSFER FUNCTIONS - A method for performing linear mixing on coupled Head-related transfer functions (HRTFs) to determine an interpolated HRTF for any specified arrival direction in a range (e.g., a range spanning at least 60 degrees in a plane, or a full range of 360 degrees in a plane), where the coupled HRTFs have been predetermined to have properties such that linear mixing can be performed thereon (to generate interpolated HRTFs) without introducing significant comb filtering distortion. In some embodiments, the method includes steps of: in response to a signal indicative of a specified arrival direction, performing linear mixing on data indicative of coupled HRTFs of a coupled HRTF set to determine an HRTF for the specified arrival direction; and performing HRTF filtering on an audio input signal using the HRTF for the specified arrival direction. | 02-11-2016 |
20160037280 | System and Tools for Enhanced 3D Audio Authoring and Rendering - Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment. | 02-04-2016 |
20160036987 | Normalization of Soundfield Orientations Based on Auditory Scene Analysis - Embodiments are described for a soundfield system that receives a transmitting soundfield, wherein the transmitting soundfield includes a sound source at a location in the transmitting soundfield. The system determines a rotation angle for rotating the transmitting soundfield based on a desired location for the sound source. The transmitting soundfield is rotated by the determined angle and the system obtains a listener's soundfield based on the rotated transmitting soundfield. The listener's soundfield is transmitted for rendering to a listener. | 02-04-2016 |
20160035367 | SPEECH DEREVERBERATION METHODS, DEVICES AND SYSTEMS - Improved audio data processing method and systems are provided. Some implementations involve dividing frequency domain audio data into a plurality of subbands and determining amplitude modulation signal values for each of the plurality of subbands. A band-pass filter may be applied to the amplitude modulation signal values in each subband, to produce band-pass filtered amplitude modulation signal values for each subband. The band-pass filter may have a central frequency that exceeds an average cadence of human speech. A gain may be determined for each subband based, at least in part, on a function of the amplitude modulation signal values and the band-pass filtered amplitude modulation signal values. The determined gain may be applied to each subband. | 02-04-2016 |
20160029144 | Method of Rendering One or More Captured Audio Soundfields to a Listener - A computer implemented system for rendering captured audio soundfields to a listener. | 01-28-2016 |
20160029140 | METHODS AND SYSTEMS FOR GENERATING AND INTERACTIVELY RENDERING OBJECT BASED AUDIO - Methods for generating an object based audio program, renderable in a personalizable manner, and including a bed of speaker channels renderable in the absence of selection of other program content (e.g., to provide a default full range audio experience). Other embodiments include steps of delivering, decoding, and/or rendering such a program. Rendering of content of the bed, or of a selected mix of other content of the program, may provide an immersive experience. The program may include multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects), the bed of speaker channels, and other speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method. | 01-28-2016 |
20160029138 | Methods and Systems for Interactive Rendering of Object Based Audio - Methods for generating an object based audio program which is renderable in a personalizable manner, e.g., to provide an immersive perception of audio content of the program. Other embodiments include steps of delivering (e.g., broadcasting), decoding, and/or rendering such a program. Rendering of audio objects indicated by the program may provide an immersive experience. The audio content of the program may be indicative of multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects, and typically also a default set of objects which will be rendered in the absence of a selection by a user) and a bed of speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method. | 01-28-2016 |
20160029044 | 3D VISUAL DYNAMIC RANGE CODING - A sequence of 3D VDR images and 3D SDR images is encoded using a monoscopic SDR base layer and one or more enhancement layers. A first VDR view and a first SDR view are encoded with a DVDL encoder to output first and second coded signals. A predicted 3D VDR signal is generated, which has first and second predicted VDR views. First and second VDR residuals are generated based on their respective VDR views and predicted VDR views. A DVDL encoder encodes the first and second VDR residuals to output third and fourth coded signals. A 3D VDR decoder, which has two DVDL decoders and SDR-to-VDR predictors, uses the four coded input signals to generate single-view SDR, 3D SDR, single-view VDR, or 3D VDR signals. A corresponding decoder is also described, which is capable of decoding these encoded 3D VDR and SDR images. | 01-28-2016 |
20160027447 | SPATIAL COMFORT NOISE - A method, an apparatus, logic (e.g., executable instructions encoded in a non-transitory computer-readable medium to carry out a method), and a non-transitory computer-readable medium configured with such instructions. The method is to generate and spatially render spatial comfort noise at a receiving endpoint of a conference system, such that the comfort noise has target spectral characteristics typical of comfort noise, and at least one spatial property that at least substantially matches at least one target spatial property. One version includes receiving one or more audio signals from other endpoints, combining the received audio signals with the spatial comfort noise signals, and rendering the combination of the received audio signals and the spatial comfort noise signals to a set of output signals for loudspeakers, such that the spatial comfort noise signals are continually in the output signals in addition to output from the received audio signals. | 01-28-2016 |
20160021476 | System and Method for Adaptive Audio Signal Generation, Coding and Rendering - Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content. | 01-21-2016 |
20160021391 | Image Decontouring in High Dynamic Range Video Processing - A set of optimized operational parameter values is generated for performing decontouring operations on a predicted image. The predicted image is predicted from a first image mapped from a second image that has a higher dynamic range than the first image. Based on the set of optimized operational parameter values, smoothing operations and selection/masking based on a residual mask are performed on the predicted image. The set of optimized operational parameter values is encoded into a part of a multi-layer video signal that includes the first image, and can be used by a recipient decoder to generate a decontoured image based on the predicted image and reconstruct a version of the second image. | 01-21-2016 |
20160019909 | ACOUSTIC ECHO MITIGATION APPARATUS AND METHOD, AUDIO PROCESSING APPARATUS AND VOICE COMMUNICATION TERMINAL - The present application provides an acoustic echo mitigation apparatus and method, an audio processing apparatus and a voice communication terminal. According to an embodiment, an acoustic echo mitigation apparatus is provided, including: an acoustic echo canceller for cancelling estimated acoustic echo from a microphone signal and outputting an error signal; a residual echo estimator for estimating residual echo power; and an acoustic echo suppressor for further suppressing residual echo and noise in the error signal based on the residual echo power and noise power. Here, the residual echo estimator is configured to be continuously adaptive to power change in the error signal. According to the embodiments of the present application, the acoustic echo mitigation apparatus and method can, at least, be well adaptive to the change of power of the error signal after the AEC processing, such as that caused by changes in double-talk status, echo path properties, noise level, etc. | 01-21-2016 |
20160019908 | COMPANDING APPARATUS AND METHOD TO REDUCE QUANTIZATION NOISE USING ADVANCED SPECTRAL EXTENSION - Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces an original dynamic range of an initial audio signal by dividing the initial audio signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy based average of frequency domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range through an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency domain representation. | 01-21-2016 |
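The compression/expansion round trip in the entry above can be sketched with a non-energy (mean-magnitude) average per segment. This is a heavily simplified stand-in: it operates on generic segment samples rather than QMF-domain samples, and the exponent `ALPHA` is an assumed value, not one taken from the patent. The key property shown is that the decoder can recover the inverse gain from the compressed signal itself, so no gains need to be transmitted.

```python
import numpy as np

ALPHA = 0.65  # assumed compression exponent (illustrative, not from the patent)

def compress(segments):
    """Encoder: per-segment wideband gain g = m**(ALPHA - 1), where m is a
    non-energy (mean-magnitude) average; quiet segments (m < 1) are
    amplified and loud segments attenuated, reducing dynamic range."""
    m = np.maximum(np.abs(segments).mean(axis=1, keepdims=True), 1e-9)
    return segments * m ** (ALPHA - 1.0)

def expand(compressed):
    """Decoder: the compressed segment's mean magnitude is m**ALPHA, so
    raising it to (1 - ALPHA)/ALPHA yields exactly the inverse gain."""
    m = np.maximum(np.abs(compressed).mean(axis=1, keepdims=True), 1e-9)
    return compressed * m ** ((1.0 - ALPHA) / ALPHA)
```

Because the per-segment gain is a positive constant, the mean magnitude of a compressed segment is m * m**(ALPHA-1) = m**ALPHA, which makes the expansion step an exact inverse in the absence of coding noise.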
20160007133 | RENDERING OF AUDIO OBJECTS WITH APPARENT SIZE TO ARBITRARY LOUDSPEAKER LAYOUTS - Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during “run time,” during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment. | 01-07-2016 |
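The set-up/run-time split in the entry above can be sketched as follows. All specifics below are assumptions for illustration: the virtual sources sit on a regular grid, the pre-computed per-virtual-source gains use simple inverse-distance panning (standing in for whatever panner a real renderer uses), and the object extent is a square region.

```python
import numpy as np

# --- Set-up time: virtual source grid and pre-computed speaker gains ------
speakers = np.array([[0., 0.], [4., 0.], [4., 4.], [0., 4.]])  # 4 outputs
grid = np.stack(np.meshgrid(np.linspace(0, 4, 9),
                            np.linspace(0, 4, 9)), -1).reshape(-1, 2)
d = np.linalg.norm(grid[:, None, :] - speakers[None, :, :], axis=-1)
precomputed = 1.0 / np.maximum(d, 1e-3)          # inverse-distance panning
precomputed /= np.linalg.norm(precomputed, axis=1, keepdims=True)  # unit power

# --- Run time: accumulate contributions inside the object's extent -------
def object_gains(position, size):
    """Sum pre-computed gains of virtual sources inside the square region
    given by the object's position and size, then renormalise power."""
    inside = np.all(np.abs(grid - position) <= size / 2, axis=1)
    g = precomputed[inside].sum(axis=0)
    return g / np.linalg.norm(g)

print(object_gains(np.array([2.0, 2.0]), 2.0))  # centred object: even spread
```

Because the expensive per-virtual-source panning is done once at set-up, run time reduces to a masked sum over the grid, which is what makes large or moving objects cheap to render.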
20160006879 | Audio Capture and Render Device Having a Visual Display and User Interface for Audio Conferencing - A method in a soundfield-capturing endpoint and the capturing endpoint that comprises a microphone array capturing a soundfield, and an input processor pre-processing and performing auditory scene analysis to detect local sound objects and positions, de-clutter the sound objects, and integrate with auxiliary audio signals to form a de-cluttered local auditory scene that has a measure of plausibility and perceptual continuity. The input processor also codes the resulting de-cluttered auditory scene to form coded scene data comprising mono audio and additional scene data to send to others. The endpoint includes an output processor generating signals for a display unit that displays a summary of the de-cluttered local auditory scene and/or a summary of activity in the communication system from received data, the display including a shaped ribbon display element that has an extent with locations on the extent representing locations and other properties of different sound objects. | 01-07-2016 |
20160006561 | Systems and Methods for Detecting a Synchronization Code Word - Systems and methods for detecting a synchronization code word embedded in a plurality of frames of a signal are described. In one example embodiment, the synchronization code word contains “s” bits, embedded one bit per frame in “s” frames of an input signal. The method of detecting this synchronization code word includes: initiating a first segmentation procedure wherein “n” segments are defined in each signal frame of the input signal. A first correlation threshold value, which is based on the synchronization code word, is used to identify in the “n” segments, a first segment having the highest likelihood of containing at least a portion of the synchronization code word. The first segment is used to initiate a recursive detection procedure incorporating one or more additional segmentation procedures and one or more additional correlation threshold values, to detect the synchronization code word in a sub-divided portion of the first segment. | 01-07-2016 |
20160005413 | Audio Signal Enhancement Using Estimated Spatial Parameters - Received audio data may include a first set of frequency coefficients and a second set of frequency coefficients. Spatial parameters for at least part of the second set of frequency coefficients may be estimated, based at least in part on the first set of frequency coefficients. The estimated spatial parameters may be applied to the second set of frequency coefficients to generate a modified second set of frequency coefficients. The first set of frequency coefficients may correspond to a first frequency range (for example, an individual channel frequency range) and the second set of frequency coefficients may correspond to a second frequency range (for example, a coupled channel frequency range). Combined frequency coefficients of a composite coupling channel may be based on frequency coefficients of two or more channels. Cross-correlation coefficients, between frequency coefficients of a first channel and the combined frequency coefficients, may be computed. | 01-07-2016 |
20160005406 | Methods for Controlling the Inter-Channel Coherence of Upmixed Audio Signals - Audio characteristics of audio data corresponding to a plurality of audio channels may be determined. The audio characteristics may include spatial parameter data. Decorrelation filtering processes for the audio data may be based, at least in part, on the audio characteristics. The decorrelation filtering processes may cause a specific inter-decorrelation signal coherence (“IDC”) between channel-specific decorrelation signals for at least one pair of channels. The channel-specific decorrelation signals may be received and/or determined. Inter-channel coherence (“ICC”) between a plurality of audio channel pairs may be controlled. Controlling ICC may involve receiving an ICC value and/or determining an ICC value based, at least partially, on the spatial parameter data. A set of IDC values may be based, at least partially, on the set of ICC values. A set of channel-specific decorrelation signals, corresponding with the set of IDC values, may be synthesized by performing operations on the filtered audio data. | 01-07-2016 |
20160005405 | Methods for Audio Signal Transient Detection and Decorrelation Control - Some audio processing methods may involve receiving audio data corresponding to a plurality of audio channels and determining audio characteristics of the audio data, which may include transient information. An amount of decorrelation for the audio data may be based, at least in part, on the audio characteristics. If a definite transient event is determined, a decorrelation process may be temporarily halted or slowed. Determining transient information may involve evaluating the likelihood and/or the severity of a transient event. In some implementations, determining transient information may involve evaluating a temporal power variation in the audio data. Explicit transient information may or may not be received with the audio data, depending on the implementation. Explicit transient information may include a transient control value corresponding to a definite transient event, a definite non-transient event or an intermediate transient control value. | 01-07-2016 |
20160005349 | Display Management for High Dynamic Range Video - A display management processor receives an input image with enhanced dynamic range to be displayed on a target display which has a different dynamic range than a reference display. The input image is first transformed into a perceptually-corrected IPT color space. A non-linear mapping function generates a first tone-mapped signal by mapping the intensity of the input signal from the reference dynamic range into the target dynamic range. The intensity (I) component of the first tone-mapped signal is sharpened to preserve details, and the saturation of the color (P and T) components is adjusted to generate a second tone-mapped output image. A color gamut mapping function is applied to the second tone-mapped output image to generate an image suitable for display onto the target display. The display management pipeline may also be adapted to adjust the intensity and color components of the displayed image according to specially defined display modes. | 01-07-2016 |
20160005201 | SYSTEMS AND METHODS FOR APPEARANCE MAPPING FOR COMPOSITING OVERLAY GRAPHICS - Systems and methods for overlaying a second image/video data onto a first image/video data are described herein. The first image/video data may be intended to be rendered on a display with certain characteristics—e.g., HDR, EDR, VDR or UHD capabilities. The second image/video data may comprise graphics, closed captioning, text, advertisement—or any data that may be desired to be overlaid and/or composited onto the first image/video data. The second image/video data may be appearance mapped according to the image statistics and/or characteristics of the first image/video data. In addition, such appearance mapping may be made according to the characteristics of the display upon which the composite data is to be rendered. Such appearance mapping is desired to render composite data that is visually pleasing to a viewer, rendered upon a desired display. | 01-07-2016 |
20160005153 | Display Management for High Dynamic Range Video - A display management processor receives an input image with enhanced dynamic range to be displayed on a target display which has a different dynamic range than a reference display. The input image is first transformed into a perceptually-quantized (PQ) color space. A non-linear mapping function generates a tone-mapped intensity image in response to the characteristics of the source and target display and a measure of the intensity of the PQ image. After a detail-preservation step which may generate a filtered tone-mapped intensity image, an image-adaptive intensity and saturation adjustment step generates an intensity adjustment factor and a saturation adjustment factor as functions of the measure of intensity and saturation of the PQ image, which together with the filtered tone-mapped intensity image are used to generate the output image. Examples of the functions to compute the intensity and saturation adjustment factors are provided. | 01-07-2016 |
20150382127 | AUDIO SPATIAL RENDERING APPARATUS AND METHOD - An audio spatial rendering apparatus and method are disclosed. In one embodiment, the audio spatial rendering apparatus includes a rendering unit for spatially rendering an audio stream so that the reproduced far-end sound is perceived by a listener as originating from at least one virtual spatial position, a real position obtaining unit for obtaining a real spatial position of a real sound source, a comparator for comparing the real spatial position with the at least one virtual spatial position; and an adjusting unit for, where the real spatial position is within a predetermined range around at least one virtual spatial position, or vice versa, adjusting the parameters of the rendering unit so that the at least one virtual spatial position is changed. | 12-31-2015 |
20150380000 | Signal Decorrelation in an Audio Processing System - Audio processing methods may involve receiving audio data corresponding to a plurality of audio channels. The audio data may include a frequency domain representation corresponding to filterbank coefficients of an audio encoding or processing system. A decorrelation process may be performed with the same filterbank coefficients used by the audio encoding or processing system. The decorrelation process may be performed without converting coefficients of the frequency domain representation to another frequency domain or time domain representation. The decorrelation process may involve selective or signal-adaptive decorrelation of specific channels and/or specific frequency bands. The decorrelation process may involve applying a decorrelation filter to a portion of the received audio data to produce filtered audio data. The decorrelation process may involve using a non-hierarchal mixer to combine a direct portion of the received audio data with the filtered audio data according to spatial parameters. | 12-31-2015 |
20150378166 | Method and System for Shaped Glasses and Viewing 3D Images - Shaped glasses have curved surface lenses and spectrally complementary filters disposed on the curved surface lenses configured to compensate for wavelength shifts occurring due to viewing angles and other sources. The spectrally complementary filters include guard bands to prevent crosstalk between spectrally complementary portions of a 3D image viewed through the shaped glasses. In one embodiment, the spectrally complementary filters are disposed on the curved lenses with increasing layer thickness towards edges of the lenses. The projected complementary images may also be pre-shifted to compensate for subsequent wavelength shifts occurring while viewing the images. | 12-31-2015 |
20150372820 | METADATA TRANSCODING - The present document relates to transcoding of metadata, and in particular to a method and system for transcoding metadata with reduced computational complexity. A transcoder configured to transcode an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream comprising an outbound content frame and an associated outbound metadata frame is described. The inbound content frame is indicative of a signal encoded according to a first codec system and the outbound content frame is indicative of the signal encoded according to a second codec system. The transcoder is configured to identify an inbound block of metadata from the inbound metadata frame, the inbound block of metadata associated with an inbound descriptor indicative of one or more properties of metadata comprised within the inbound block of metadata, and to generate the outbound metadata frame from the inbound metadata frame based on the inbound descriptor. | 12-24-2015 |
20150371654 | ECHO CONTROL THROUGH HIDDEN AUDIO SIGNALS | 12-24-2015 |
20150371649 | Processing Audio Signals with Adaptive Time or Frequency Resolution - In one aspect, a method for processing an encoded audio signal is disclosed. The method includes decoding the encoded audio signal to obtain a time-domain audio signal and then analyzing the time-domain audio signal with an analysis filter bank to obtain a plurality of complex-valued subband samples in a first frequency region. The method further includes processing the audio signal by generating a plurality of subband samples in a second frequency region based at least in part on the complex-valued subband samples in the first frequency region, grouping at least some of the plurality of subband samples in the second frequency region with an adaptive time resolution and an adaptive frequency resolution to obtain an adaptive grouping, and determining a spectral profile of at least some of the subband samples in the second frequency region based at least in part on the adaptive grouping. | 12-24-2015 |
20150371646 | Time-Varying Filters for Generating Decorrelation Signals - Decorrelation filter parameters for audio data may be based, at least in part, on audio characteristics such as tonality information and/or transient information. Determining the audio characteristics may involve receiving explicit audio characteristics with the audio data and/or determining audio characteristics based on one or more attributes of the audio data. The decorrelation filter parameters may include dithering parameters and/or randomly selected pole locations for at least one pole of an all-pass filter. The dithering parameters and/or pole locations may involve a maximum stride value for pole movement. In some examples, the maximum stride value may be substantially zero for highly tonal signals of the audio data. The dithering parameters and/or pole locations may be bounded by constraint areas within which pole movements are constrained. The constraint areas may or may not be fixed. In some implementations, different channels of the audio data may share the same constraint areas. | 12-24-2015 |
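The pole-dithering idea in the entry above can be sketched with a first-order all-pass filter whose pole performs a bounded random walk. Everything concrete here is an assumption for illustration: the block length, the pole's constraint interval, and the stride value; a real implementation would derive the stride from tonality/transient analysis (e.g., stride near zero for highly tonal signals) and may use higher-order filters.

```python
import numpy as np

def decorrelate(x, block=256, max_stride=0.02, bounds=(0.2, 0.6), seed=0):
    """Time-varying first-order all-pass H(z) = (-a + z^-1) / (1 - a z^-1).

    Once per block the pole `a` is dithered by at most `max_stride` and
    clipped to the constraint area `bounds`; filter state is carried across
    block boundaries so the output stays continuous."""
    rng = np.random.default_rng(seed)
    a = 0.4                       # initial pole (assumed)
    y = np.zeros_like(x)
    x1 = y1 = 0.0                 # one-sample filter state
    for n in range(len(x)):
        if n % block == 0:        # dither the pole once per block
            a = float(np.clip(a + rng.uniform(-max_stride, max_stride),
                              *bounds))
        y[n] = -a * x[n] + x1 + a * y1
        x1, y1 = x[n], y[n]
    return y
```

Because the filter is all-pass, it alters phase without (for a fixed pole) altering the magnitude spectrum, which is why the decorrelated signal keeps the timbre of the input.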
20150365775 | AUTOMATIC LOUDSPEAKER POLARITY DETECTION - In some embodiments, a method for automatic detection of polarity of speakers, e.g., speakers installed in cinema environments. In some embodiments, the method determines relative polarities of a set of speakers (e.g., loudspeakers and/or drivers of a multi-driver loudspeaker) using a set of microphones, including by measuring impulse responses, including an impulse response for each speaker-microphone pair; clustering the speakers into a set of groups, each group including at least two of the speakers which are similar to each other in at least one respect; and for each group, determining and analyzing cross-correlations of pairs of impulse responses (e.g., pairs of processed versions of impulse responses) of speakers in the group to determine relative polarities of the speakers. Other aspects include systems configured (e.g., programmed) to perform any embodiment of the inventive method, and computer readable media (e.g., discs) which store code for implementing any embodiment of the inventive method. | 12-17-2015 |
20150365688 | Efficient Transcoding for Backward-Compatible Wide Dynamic Range Codec - An intermediate bitstream generated by a first-stage transcoding system from an initial transmission package is received. The intermediate bitstream comprises base layer (BL) and enhancement layer (EL) signals. The combination of the BL and EL signals of the intermediate bitstream represents compressed wide dynamic range images. The BL signal of the intermediate bitstream alone represents compressed standard dynamic range images. A targeted transmission package is generated based on the intermediate bitstream. The targeted transmission package comprises BL and EL signals. The BL signal of the targeted transmission package may be directly transcoded from the BL signal of the intermediate bitstream alone. | 12-17-2015 |
20150365580 | Global Display Management Based Light Modulation - A plurality of input images in an input video signal of a wide dynamic range is received. A specific setting of global light modulation is determined based on a specific input image in the plurality of input images. The specific setting of global light modulation produces a specific dynamic range window. A plurality of input code values in the specific input image is converted to a plurality of output code values in a specific output image corresponding to the specific input image. The plurality of output code values produces the same or substantially the same luminance levels as represented by the plurality of input code values. Any other pixels in the specific input image are converted to different luminance levels in the specific output image through display management. | 12-17-2015 |
20150363160 | System and Method for Optimizing Loudness and Dynamic Range Across Different Playback Devices - Embodiments are directed to a method and system for receiving, in a bitstream, metadata associated with the audio data, and analyzing the metadata to determine whether loudness parameters for a first group of audio playback devices are available in the bitstream. Responsive to determining that the parameters are present for the first group, the system uses the parameters and audio data to render audio. Responsive to determining that the loudness parameters are not present for the first group, the system analyzes one or more characteristics of the first group, and determines the parameters based on the one or more characteristics. | 12-17-2015 |
20150356978 | AUDIO CODING WITH GAIN PROFILE EXTRACTION AND TRANSMISSION FOR SPEECH ENHANCEMENT AT THE DECODER - The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application. | 12-10-2015 |
20150350804 | Reflected Sound Rendering for Object-Based Audio - Embodiments are described for rendering spatial audio content through a system that is configured to reflect audio off of one or more surfaces of a listening environment. The system includes an array of audio drivers distributed around a room, wherein at least one driver of the array of drivers is configured to project sound waves toward one or more surfaces of the listening environment for reflection to a listening area within the listening environment and a renderer configured to receive and process audio streams and one or more metadata sets that are associated with each of the audio streams and that specify a playback location in the listening environment. | 12-03-2015 |
20150350661 | HIGH PRECISION UP-SAMPLING IN SCALABLE CODING OF HIGH BIT-DEPTH VIDEO - The precision of up-sampling operations in a layered coding system is preserved when operating on video data with high bit-depth. In response to bit-depth requirements of the video coding or decoding system, scaling and rounding parameters are determined for a separable up-scaling filter. Input data are first filtered across a first spatial direction using a first rounding parameter to generate first up-sampled data. First intermediate data are generated by scaling the first up-sampled data using a first shift parameter. The intermediate data are then filtered across a second spatial direction using a second rounding parameter to generate second up-sampled data. Second intermediate data are generated by scaling the second up-sampled data using a second shift parameter. Final up-sampled data may be generated by clipping the second intermediate data. | 12-03-2015 |
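The separable filter-round-shift pipeline in the entry above can be sketched with integer arithmetic. The 2-tap averaging filter, the 2x scaling factor, and the 10-bit clipping range below are all stand-ins chosen for brevity; the point illustrated is only the structure: filter one direction with an explicit rounding offset and right shift, then the other direction with its own rounding and shift, then clip.

```python
import numpy as np

def upsample_1d(line, taps, rounding, shift):
    """2x up-sampling along one axis with an integer polyphase filter:
    `rounding` is added before the right `shift`, so the precision of the
    intermediate data is controlled explicitly."""
    padded = np.concatenate([line, line[-1:]])           # edge replication
    # odd phase: assumed 2-tap interpolation filter (stand-in)
    odd = (taps[0] * padded[:-1] + taps[1] * padded[1:] + rounding) >> shift
    out = np.empty(2 * len(line), dtype=np.int64)
    out[0::2] = line                                     # even phase: copy
    out[1::2] = odd
    return out

def upsample_2x(img, taps=(1, 1), r1=1, s1=1, r2=1, s2=1):
    """Separable 2x up-scaling: rows with (r1, s1), then columns with
    (r2, s2); a final clip keeps values in the valid signal range."""
    tmp = np.apply_along_axis(upsample_1d, 1, img, taps, r1, s1)
    out = np.apply_along_axis(upsample_1d, 0, tmp, taps, r2, s2)
    return np.clip(out, 0, 1023)                         # e.g. 10-bit range
```

Keeping the rounding and shift parameters separate per direction is what lets a codec tune them to the bit depth of the intermediate buffers, rather than losing precision to a fixed normalisation after each pass.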
20150350099 | Controlling A Jitter Buffer - Apparatus and methods for controlling a jitter buffer are described. In one embodiment, the apparatus for controlling a jitter buffer includes an inter-talkspurt delay jitter estimator for estimating an offset value of the delay of a first frame in the current talkspurt with respect to the delay of a latest anchor frame in a previous talkspurt, and a jitter buffer controller for adjusting a length of the jitter buffer based on a long term length of the jitter buffer for each frame and the offset value. | 12-03-2015 |
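The control rule in the abstract — long-term buffer length adjusted by the inter-talkspurt delay offset — can be sketched as below; the function names and the clamping range are illustrative assumptions:

```python
def estimate_offset(first_frame_delay_ms, anchor_frame_delay_ms):
    """Inter-talkspurt delay-jitter offset: delay of the first frame of the
    current talkspurt relative to the latest anchor frame of the previous
    talkspurt (positive = network delay increased)."""
    return first_frame_delay_ms - anchor_frame_delay_ms

def adjust_buffer_length(long_term_length_ms, offset_ms, min_ms=20, max_ms=500):
    """New jitter-buffer length: the long-term length plus the offset,
    clamped to an assumed sane operating range."""
    return max(min_ms, min(max_ms, long_term_length_ms + offset_ms))
```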
20150348546 | AUDIO PROCESSING APPARATUS AND AUDIO PROCESSING METHOD - An audio processing apparatus and an audio processing method are described. In one embodiment, the audio processing apparatus includes an audio masker separator for separating, from a first audio signal, audio material comprising a sound other than stationary noise and semantically meaningful utterances, as an audio masker candidate. The apparatus also includes a first context analyzer for obtaining statistics regarding contextual information of detected audio masker candidates, and a masker library builder for building a masker library or updating an existing masker library by adding, based on the statistics, at least one audio masker candidate as an audio masker into the masker library, wherein audio maskers in the masker library are to be inserted at a target position in a second audio signal to conceal defects in the second audio signal. | 12-03-2015 |
20150341675 | Backward-Compatible Coding for Ultra High Definition Video Signals with Enhanced Dynamic Range - Video data with both ultra-high definition (UHD) resolution and high or enhanced dynamic range (EDR) data are coded in a backward-compatible layered stream which allows legacy decoders to extract an HD standard dynamic range (SDR) signal. In response to a base layer HD SDR signal, a predicted signal is generated using separate luma and chroma prediction models. In the luma predictor, luma pixel values of the predicted signal are computed based only on luma pixel values of the base layer, while in the chroma predictor, chroma pixel values of the predicted signal are computed based on both the luma and the chroma pixel values of the base layer. A residual signal is computed based on the input UHD EDR signal and the predicted signal. The base layer and the residual signal are coded separately to form a coded bitstream. A compatible dual-layer decoder is also presented. | 11-26-2015 |
20150341498 | Audio Burst Collision Resolution - In a conferencing system in which a plurality of communication devices electronically connect respective participants to one another, a method for mitigating the effects of substantially concurrent audio bursts from two or more of the participants includes identifying a priority attribute associated with each of multiple substantially concurrent audio bursts, comparing the identified priority attributes, and electronically suppressing at least one audio burst as a function of the comparison. | 11-26-2015 |
20150339990 | Systems and Methods of Managing Metameric Effects in Narrowband Primary Display Systems - Several embodiments of display systems that use narrowband emitters are disclosed herein. In one embodiment, a display system comprises, for at least one primary color, a plurality of narrowband emitters distributed around the primary color point. The plurality of narrowband emitters provides a more regular power vs. spectral distribution in a desired band of frequencies. | 11-26-2015 |
20150332704 | Method for Controlling Acoustic Echo Cancellation and Audio Processing Apparatus - A method for controlling acoustic echo cancellation and an audio processing apparatus are described. In one embodiment, the audio processing apparatus includes an acoustic echo canceller for suppressing acoustic echo in a microphone signal, a jitter buffer for reducing delay jitter of a received signal, and a joint controller for controlling the acoustic echo canceller by referring to at least one future frame in the jitter buffer. | 11-19-2015 |
20150332680 | Object Clustering for Rendering Object-Based Audio Content Based on Perceptual Criteria - Embodiments are directed to a method of rendering object-based audio comprising determining an initial spatial position of objects having object audio data and associated metadata, determining a perceptual importance of the objects, and grouping the audio objects into a number of clusters based on the determined perceptual importance of the objects, such that the spatial error caused by moving an object from its initial spatial position to a second spatial position in a cluster is minimized for objects with a relatively high perceptual importance. The perceptual importance is based at least in part on a partial loudness of an object and content semantics of the object. | 11-19-2015 |
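One way to bias clustering so that high-importance objects incur minimal spatial error is to seed clusters with the most important objects and snap every object to its nearest seed. This is a minimal sketch of that idea, not the clustering algorithm claimed in the application; importance is taken as a given scalar (the abstract derives it from partial loudness and content semantics):

```python
def cluster_objects(positions, importance, num_clusters):
    """positions: list of (x, y, z) tuples; importance: parallel list of floats.
    Returns (centroids, assignment), where assignment[i] indexes centroids."""
    # Seed clusters with the most perceptually important objects, so those
    # objects are moved (essentially) zero distance.
    order = sorted(range(len(positions)), key=lambda i: -importance[i])
    centroids = [positions[i] for i in order[:num_clusters]]

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Every object is assigned to the nearest cluster centroid.
    assignment = [min(range(len(centroids)),
                      key=lambda c: dist2(positions[i], centroids[c]))
                  for i in range(len(positions))]
    return centroids, assignment
```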
20150325243 | AUDIO ENCODER AND DECODER WITH PROGRAM LOUDNESS AND BOUNDARY METADATA - Apparatus and methods for generating an encoded audio bitstream, including by including program loudness metadata and audio data in the bitstream, and optionally also program boundary metadata in at least one segment (e.g., frame) of the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, e.g., including by performing adaptive loudness processing of the audio data of an audio program indicated by the bitstream, or authentication and/or validation of metadata and/or audio data of such an audio program. Another aspect is an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method. | 11-12-2015 |
20150319450 | Guided Color Transient Improvement Filtering in Video Coding - An encoder receives a target image in a standard dynamic range and a guide image in a high dynamic range, wherein both the target image and the guide image represent the same scene. A color transient improvement (CTI) filter is selected to predict a chroma component of a decoded version of the target image based on both the luma and chroma components of the target image and the guide image. The filtering coefficients for the CTI filter are computed by minimizing an error measurement (e.g., MSE) between pixel values of the decoded image and the guide image. The computed set of filtering coefficients is signaled to a receiver (e.g., as metadata). A decoder receives the coded image and the metadata, and applies the same CTI filter to the decoded image to generate an output image. | 11-05-2015 |
20150310872 | Multistage IIR Filter and Parallelized Filtering of Data with Same - In some embodiments, a multistage filter whose biquad filter stages are combined with latency between the stages is described, along with a system (e.g., an audio encoder or decoder) including such a filter and methods for multistage biquad filtering. In typical embodiments, all biquad filter stages of the filter are operable independently to perform fully parallelized processing of data. In some embodiments, the inventive multistage filter includes a buffer memory, at least two biquad filter stages, and a controller coupled and configured to assert a single stream of instructions to the filter stages. Typically, the multistage filter is configured to perform multistage filtering of a block of input samples in a single processing loop with iteration over a sample index but without iteration over a biquadratic filter stage index. | 10-29-2015 |
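Inserting a one-sample pipeline register between stages makes all stage updates within one loop iteration independent, which is the property that allows parallel execution. A sketch of that pipelined cascade, with illustrative Direct Form II transposed biquads (not the application's specific filter structure):

```python
def biquad(x, state, coefs):
    """One Direct Form II transposed biquad update; state is [s1, s2]."""
    b0, b1, b2, a1, a2 = coefs
    s1, s2 = state
    y = b0 * x + s1
    state[0] = b1 * x - a1 * y + s2
    state[1] = b2 * x - a2 * y
    return y

def pipelined_cascade(samples, stage_coefs):
    """Each stage consumes the previous stage's output from the *previous*
    iteration, so all stage updates per iteration are mutually independent
    (parallelizable); total pipeline latency is len(stage_coefs) - 1 samples."""
    n_stages = len(stage_coefs)
    states = [[0.0, 0.0] for _ in range(n_stages)]
    regs = [0.0] * n_stages                   # pipeline registers between stages
    out = []
    for x in samples + [0.0] * (n_stages - 1):   # extra zeros flush the pipeline
        inputs = [x] + regs[:-1]                 # previous-iteration outputs
        regs = [biquad(inputs[k], states[k], stage_coefs[k])
                for k in range(n_stages)]
        out.append(regs[-1])
    return out[n_stages - 1:]                    # drop the initial latency
```

Note the single loop iterates over the sample index only; the per-stage work inside it has no sequential dependency, matching the abstract's "without iteration over a biquadratic filter stage index" in spirit.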
20150304791 | VIRTUAL HEIGHT FILTER FOR REFLECTED SOUND RENDERING USING UPWARD FIRING DRIVERS - Embodiments are directed to speakers and circuits that reflect sound off a ceiling to a listening location at a distance from a speaker. The reflected sound provides height cues to reproduce audio objects that have overhead audio components. The speaker comprises upward firing drivers to reflect sound off of the upper surface and represents a virtual height speaker. A virtual height filter based on a directional hearing model is applied to the upward-firing driver signal to improve the perception of height for audio signals transmitted by the virtual height speaker to provide optimum reproduction of the overhead reflected sound. The virtual height filter may be incorporated as part of a crossover circuit that separates the full band and sends high frequency sound to the upward-firing driver. | 10-22-2015 |
20150304658 | Quantization Control for Variable Bit Depth - The quantization parameter QP is well known in digital video compression as an indication of picture quality. Digital symbols representing a moving image are quantized with a quantizing step that is a function QSN of the quantization parameter QP, which function QSN has been normalized to the most significant bit of the bit depth of the digital symbols. As a result, the effect of a given QP is essentially independent of bit depth: a particular QP value has a standard effect on image quality, regardless of bit depth. The invention is useful, for example, in encoding and decoding at different bit depths, to generate compatible bitstreams having different bit depths, and to allow different bit depths for different components of a video signal by compressing each with the same fidelity (i.e., the same QP). | 10-22-2015 |
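The normalization idea can be illustrated with an H.264/HEVC-style step size that doubles every 6 QP, scaled by 2^(bit_depth − 8) so a given QP quantizes the same fraction of the signal range at any bit depth. The base-step constant is an illustrative assumption, not the application's formula:

```python
def qstep_normalized(qp, bit_depth):
    """Quantizing step normalized to the MSB of the sample bit depth."""
    base = 0.625 * 2.0 ** (qp / 6.0)        # illustrative 8-bit base step
    return base * 2.0 ** (bit_depth - 8)    # scale to the sample range

def quantize(value, qp, bit_depth):
    return round(value / qstep_normalized(qp, bit_depth))
```

With this normalization, a sample at half range produces the same quantized index whether the signal is 8-bit (value 128) or 10-bit (value 512), i.e., the same QP yields the same relative fidelity.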
20150304640 | Managing 3D Edge Effects On Autostereoscopic Displays - 3D images may be represented by a sequence of received pairs of LE and RE frames. It is determined whether a frame comprising a floating window exists in a pair of LE and RE frames in the sequence of pairs of LE and RE frames. If so, depth information for one or more pixels in a plurality of pixels in a portion of the frame covered by the floating window is determined. Such depth information may be generated based on depth information extracted from one or more frames in one or more pairs of LE and RE frames in the sequence of pairs of LE and RE frames that are either previous or subsequent to the pair of LE and RE frames. | 10-22-2015 |
20150296226 | Techniques For Client Device Dependent Filtering Of Metadata - Methods and apparatuses for media data communication for improved bandwidth utilization are provided. A client device communicates profile information to a server. The server maintains media content data and first metadata category information associated with the media content data. Based upon the profile information, a determination is made as to whether the client device is to utilize the first metadata category information. If to be utilized, media content data as well as the first metadata category information is provided to the client device. If a non-utilization determination is made, media content data is provided without the first metadata category information. In exemplary embodiments, first metadata category information can relate to any of the following: closed captioning, speaker virtualization, three dimensional rendering, global positioning, audio and/or video codecs, volume control, and the like. | 10-15-2015 |
20150296086 | PLACEMENT OF TALKERS IN 2D OR 3D CONFERENCE SCENE - The present document relates to setting up and managing two-dimensional or three-dimensional scenes for audio conferences. A conference controller ( | 10-15-2015 |
20150294630 | Locally Dimmed Nano-Crystal Based Display - Dual modulator displays are disclosed incorporating a phosphorescent plate interposed in the optical path between a light source modulation layer and a display modulation layer. Spatially modulated light output from the light source modulation layer impinges on the phosphorescent plate and excites corresponding regions of the phosphorescent plate which in turn emit light having different spectral characteristics than the light output from the light source modulation layer. Light emitted from the phosphorescent plate is received and further modulated by the display modulation layer to provide the ultimate display output. | 10-15-2015 |
20150288824 | TELECONFERENCING USING MONOPHONIC AUDIO MIXED WITH POSITIONAL METADATA - In some embodiments, a method for preparing monophonic audio for transmission to a node of a teleconferencing system, including steps of generating a monophonic mixed audio signal, including by mixing a metadata signal (e.g., a tone) with monophonic audio indicative of speech by a currently dominant participant in a teleconference, and encoding the mixed audio signal for transmission, where the metadata signal is indicative of an apparent source position for the currently dominant conference participant. Other embodiments include steps of decoding such a transmitted encoded signal to determine the monophonic mixed audio signal, identifying the metadata signal, and determining the apparent source position corresponding to the currently dominant participant from the metadata signal. Other aspects are systems configured to perform any embodiment of the method or steps thereof. | 10-08-2015 |
20150287368 | TECHNIQUES FOR DUAL MODULATION DISPLAY WITH LIGHT CONVERSION - Techniques for driving a dual modulation display include generating backlight drive signals to drive individually-controllable illumination sources. The illumination sources emit first light onto a light conversion layer. The light conversion layer converts the first light into second light. The light conversion layer can include quantum dots or phosphor materials. Modulation drive signals are generated to determine transmission of the second light through individual subpixels of the display. These modulation drive signals can be adjusted based on one or more light field simulations. The light field simulations can include: (i) a color shift for a pixel based on a point spread function of the illumination sources; (ii) binning difference of individual illumination sources; (iii) temperature dependence of display components on performance; or (iv) combinations thereof. | 10-08-2015 |
20150279383 | Processing Audio Signals with Adaptive Time or Frequency Resolution - In one aspect, an audio processing apparatus is disclosed. The apparatus includes an audio decoder, a filterbank, and a processor. The audio decoder decodes an encoded audio signal to obtain a time-domain audio signal, the encoded audio signal including a plurality of spectral components. The filterbank splits the time-domain audio signal to obtain a plurality of complex-valued subband samples in a first frequency region. The processor generates a plurality of subband samples in a second frequency region based at least in part on the complex-valued subband samples in the first frequency region, adaptively groups at least some of the plurality of subband samples in the second frequency region with an adaptive time resolution or an adaptive frequency resolution, and determines a spectral profile of at least some of the subband samples in the second frequency region based on the groups. | 10-01-2015 |
20150279379 | Reconstructing an Audio Signal with a Noise Parameter - A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and decoding the first part to obtain a decoded baseband audio signal. The method also includes extracting an estimated spectral envelope of the highband portion and a noise parameter from the second part and filtering the decoded baseband audio signal to obtain a plurality of subband signals. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and adjusting a spectral envelope of the high-frequency reconstructed signal based on the estimated spectral envelope of the highband portion to obtain an envelope adjusted high-frequency signal. | 10-01-2015 |
20150277840 | Maximizing Native Capability Across Multiple Monitors - In an embodiment, a first display monitor communicates with a second display monitor to determine a common set of display capabilities that are supported by both the first display monitor and the second display monitor. One or more first color grading instructions for one or more first images are received from a first user. In response to receiving the one or more first color grading instructions, the one or more first images are color graded with the one or more first color grading instructions to generate one or more first color graded images. The one or more first color graded images are rendered on the first display monitor. In addition, the one or more first color graded images are caused to be rendered on the second display monitor. | 10-01-2015 |
20150271620 | REFLECTED AND DIRECT RENDERING OF UPMIXED CONTENT TO INDIVIDUALLY ADDRESSABLE DRIVERS - Embodiments are described for a system of rendering spatial audio content in a listening environment. The system includes a rendering component configured to generate a plurality of audio channels including information specifying a playback location in a listening area, an upmixer component receiving the plurality of audio channels and generating, for each audio channel, at least one reflected sub-channel configured to cause a majority of driver energy to reflect off of one or more surfaces of the listening area, and at least one direct sub-channel configured to cause a majority of driver energy to propagate directly to the playback location. | 09-24-2015 |
20150271619 | Processing Audio or Video Signals Captured by Multiple Devices - Embodiments of the present disclosure relate to processing audio or video signals captured by multiple devices. An apparatus for processing video and audio signals includes an estimating unit and a processing unit. The estimating unit may estimate at least one aspect of an array at least based on at least one video or audio signal captured respectively by at least one of portable devices arranged in an array. The processing unit may apply the aspect at least based on video to a process of generating a surround sound signal via the array, or apply the aspect at least based on audio to a process of generating a combined video signal via the array. With cross-referencing visual or acoustic hints, an improvement can be achieved in generating an audio or video signal. | 09-24-2015 |
20150270819 | Techniques for Distortion Reducing Multi-Band Compressor with Timbre Preservation - A distortion-reducing multi-band compressor with timbre preservation is provided. Timbre preservation is achieved by determining a time-varying threshold in each of a plurality of frequency bands as a function of a respective fixed threshold for the frequency band and, at least in part, an audio signal level and a fixed threshold outside such frequency band. If a particular frequency band receives significant gain reduction due to being above or approaching its fixed threshold, then the time-varying thresholds of one or more other frequency bands are also decreased so that those bands receive some gain reduction. In a specific embodiment, time-varying thresholds can be computed from an average difference of the audio input signal in each frequency band and its respective fixed threshold. | 09-24-2015 |
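The coupled-threshold behavior can be sketched as follows: each band's time-varying threshold starts from its fixed threshold and is pulled down by the average overshoot of the *other* bands, so gain reduction is shared across the spectrum and the spectral balance (timbre) is preserved. The coupling weight and the hard-limit gain rule are illustrative assumptions:

```python
def time_varying_thresholds(levels_db, fixed_db, coupling=0.5):
    """Per-band time-varying thresholds, lowered when other bands overshoot."""
    n = len(levels_db)
    out = []
    for k in range(n):
        # Average positive overshoot of all bands other than band k.
        others = [max(0.0, levels_db[j] - fixed_db[j])
                  for j in range(n) if j != k]
        avg_over = sum(others) / len(others) if others else 0.0
        out.append(fixed_db[k] - coupling * avg_over)
    return out

def band_gains(levels_db, thresholds_db):
    """Hard-limit gains in dB: attenuate any band above its threshold."""
    return [min(0.0, t - l) for l, t in zip(levels_db, thresholds_db)]
```

In the test below, band 0 overshoots its fixed threshold by 10 dB, which lowers band 1's time-varying threshold by 5 dB even though band 1's own level never crossed its fixed threshold.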
20150264461 | TELECOMMUNICATIONS DEVICE - A telecommunications device ( | 09-17-2015 |
20150264314 | Systems and Methods for Initiating Conferences Using External Devices - A system and method for initiating conference calls with external devices are disclosed. Call participants are sent a conference invitation and conference information regarding the designated conference call. This conference information is stored on the participant's external device. When the participants arrive at a conference call location having a conferencing device, the conferencing device is capable of communicating with the external device, initiating communications, and exchanging conference information. If the participant is verified and/or authorized, the IP address of the conference device may be sent to the conference system to initiate the conference call. In one embodiment, the conference device uses an ultrasound acoustic communication band to initiate the call with the external device on a semi-automated basis. An acoustic signature comprising a pilot sequence for communications synchronization may be generated to facilitate the call. Audible and aesthetic acoustic protocols may also be employed. | 09-17-2015 |
20150256860 | Graphics Blending for High Dynamic Range Video - A method for merging graphics and high dynamic range video data is disclosed. In a video receiver, a display management process uses metadata to map input video data from a first dynamic range into the dynamic range of available graphics data. The remapped video signal is blended with the graphics data to generate a video composite signal. An inverse display management process uses the metadata to map the video composite signal to an output video signal with the first dynamic range. To alleviate perceptual tone-mapping jumps during video scene changes, a metadata transformer transforms the metadata so that, on a television (TV) receiver, metadata values transition smoothly between consecutive scenes. The TV receiver receives the output video signal and the transformed metadata to generate video data mapped to the dynamic range of the TV's display. | 09-10-2015 |
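The map / blend / inverse-map pipeline can be sketched as below. A simple linear (gain-based) mapping stands in for the metadata-driven display-management mapping, and the peak luminances are illustrative assumptions:

```python
def forward_map(video, peak_video, peak_graphics):
    """Display management: map video into the graphics dynamic range."""
    g = peak_graphics / peak_video
    return [v * g for v in video]

def inverse_map(video, peak_video, peak_graphics):
    """Inverse display management: map the composite back to video range."""
    g = peak_video / peak_graphics
    return [v * g for v in video]

def blend_graphics(hdr_video, graphics, alpha, peak_video=1000.0,
                   peak_graphics=100.0):
    """Per-pixel alpha blend of graphics over remapped video, then inverse-map."""
    sdr = forward_map(hdr_video, peak_video, peak_graphics)
    composite = [a * gfx + (1 - a) * v
                 for v, gfx, a in zip(sdr, graphics, alpha)]
    return inverse_map(composite, peak_video, peak_graphics)
```

Because the same metadata drives both the forward and inverse maps, pixels with zero graphics alpha round-trip back to (essentially) their original video values.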
20150256752 | Multi-Field CCD Capture for HDR Imaging - Techniques are described to combine image data from multiple images with different exposures into a relatively high dynamic range image. A first image of a scene may be generated with a first operational mode of an image processing system. A second image of the scene may be generated with a second different operational mode of the image processing system. The first image may be of a first spatial resolution, while the second image may be of a second spatial resolution. For example, the first spatial resolution may be higher than the second spatial resolution. The first image and the second image may be combined into an output image of the scene. The output image may be of a higher dynamic range than either of the first image and the second image and may be of a spatial resolution higher than the second spatial resolution. | 09-10-2015 |
20150255079 | Position-Dependent Hybrid Domain Packet Loss Concealment - The present document relates to audio signal processing in general, and to the concealment of artifacts that result from loss of audio packets during audio transmission over a packet-switched network, in particular. A method ( | 09-10-2015 |
20150254823 | Image Range Expansion Control Methods And Apparatus - Image data is adjusted for display on a target display. Maximum safe expansions for one or more attributes of the image data are compared to maximum available expansions for the attributes. An amount of expansion is selected that does not exceed either of the maximum safe expansion and the maximum available expansion. Artifacts caused by over expansion may be reduced or avoided. | 09-10-2015 |
20150254054 | Audio Signal Processing - A method for audio signal processing is provided. The method includes acquiring a first set of metadata associated with consumption of an audio signal by a target user, acquiring a second set of metadata associated with a set of reference users and generating, at least partially based on the first and second sets of metadata, a recommended configuration of at least one parameter for the target user, the at least one parameter being for use in the consumption of the audio signal. Corresponding apparatus and computer program product are also disclosed. | 09-10-2015 |
20150249832 | HDR IMAGES WITH MULTIPLE COLOR GAMUTS - Image encoding and decoding are described. An input HDR image that includes a base image and a ratio image may be stored using two or more color description profiles. One profile defines the encoding color space of the base image and the second profile defines the encoding color space of the HDR metadata which may be different than the color space of the base image. | 09-03-2015 |
20150248889 | LAYERED APPROACH TO SPATIAL AUDIO CODING - The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application. | 09-03-2015 |
20150248747 | DISPLAY MANAGEMENT FOR IMAGES WITH ENHANCED DYNAMIC RANGE - An image processor receives an input image with enhanced dynamic range to be displayed on a target display which has a different dynamic range than a reference display. After optional color transformation ( | 09-03-2015 |
20150245157 | Virtual Rendering of Object-Based Audio - Embodiments are described for a system for virtual rendering of object based audio through binaural rendering of each object followed by panning of the resulting stereo binaural signal between a plurality of cross-talk cancelation circuits feeding a corresponding plurality of speaker pairs. In comparison to prior art virtual rendering utilizing a single pair of speakers, the described embodiments improve the spatial impression for listeners both inside and outside of the cross-talk canceller sweet spot. Also described is an improved equalization technique for a crosstalk canceller that is computed from both the crosstalk canceller filters and the binaural filters and applied to a monophonic audio signal being virtualized. The described techniques improve timbre for listeners outside of the sweet spot and yield a smaller timbre shift when switching from standard rendering to virtual rendering. | 08-27-2015 |
20150244869 | Spatial Multiplexing in a Soundfield Teleconferencing System - The present document relates to audio conference systems. In particular, the present document relates to the mapping of soundfields within an audio conference system. A conference multiplexer ( | 08-27-2015 |
20150244868 | Method for Improving Perceptual Continuity in a Spatial Teleconferencing System - The present document relates to audio conference systems. In particular, the present document relates to improving the perceptual continuity within an audio conference system. According to an aspect, a method for multiplexing first and second continuous input audio signals is described, to yield a multiplexed output audio signal which is to be rendered to a listener. The first and second input audio signals ( | 08-27-2015 |
20150243300 | Voice Activity Detector for Audio Signals - According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein. | 08-27-2015 |
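The scoring steps can be sketched as follows. The moving-average smoother stands in for the lowest-subband filtering, the noise floors are taken as given (a minimum-statistics tracker would supply them in practice), and the combination rule and weights are illustrative assumptions:

```python
import math

def moving_average(values, window=3):
    """Causal moving average, as a stand-in for the lowest-subband smoothing."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        out.append(sum(values[lo:i + 1]) / (i + 1 - lo))
    return out

def speech_activity(subband_energies, noise_floors, weights):
    """Frame score from mean subband SNR plus energy-weighted level (dB).
    subband_energies: linear per-subband power for one frame."""
    snrs = [10.0 * math.log10(max(e, 1e-12) / max(n, 1e-12))
            for e, n in zip(subband_energies, noise_floors)]
    mean_snr = sum(snrs) / len(snrs)
    weighted_energy = (sum(w * e for w, e in zip(weights, subband_energies))
                       / sum(weights))
    # Higher score leans "speech"; the exact combination is an assumption.
    return mean_snr + 10.0 * math.log10(max(weighted_energy, 1e-12))
```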
20150243295 | Reconstructing an Audio Signal with a Noise Parameter - A method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes decoding an encoded audio signal to obtain a decoded baseband audio signal, filtering the decoded baseband audio signal to obtain subband signals, and generating a high-frequency reconstructed signal by copying a number of consecutive subband signals. The method also includes adjusting a spectral envelope of the high-frequency reconstructed signal based on an estimated spectral envelope of the highband portion extracted from the encoded audio signal to obtain an envelope adjusted high-frequency signal, generating a noise component based on a noise parameter extracted from the encoded audio signal, and adding the noise component to the envelope adjusted high-frequency signal to obtain a noise and envelope adjusted high-frequency signal. | 08-27-2015 |
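The copy / envelope-adjust / add-noise sequence can be sketched as below. Real systems operate on complex QMF subband samples; real-valued per-subband buffers and a uniform noise generator are used here purely for illustration:

```python
import math
import random

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def reconstruct_highband(low_subbands, target_envelope, noise_level, seed=0):
    """low_subbands: list of decoded baseband subband sample buffers.
    target_envelope: desired RMS level per high-band subband.
    Copies consecutive low subbands upward, scales each copy to the target
    envelope, then adds a noise component scaled by the noise parameter."""
    rng = random.Random(seed)
    high = []
    for k, target in enumerate(target_envelope):
        src = low_subbands[k % len(low_subbands)]      # copy a low subband up
        gain = target / max(rms(src), 1e-12)           # envelope adjustment
        band = [gain * s + noise_level * rng.uniform(-1.0, 1.0) for s in src]
        high.append(band)
    return high
```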
20150243289 | Multi-Channel Audio Content Analysis Based Upmix Detection - Forensic detection of audio upmixing is described. Feature sets are extracted from an audio signal that has two or more individual channels. Based on the extracted feature sets, it is determined whether the audio signal was upmixed from audio content that has fewer channels. | 08-27-2015 |
20150237301 | NEAR-END INDICATION THAT THE END OF SPEECH IS RECEIVED BY THE FAR END IN AN AUDIO OR VIDEO CONFERENCE - Embodiments of a client device and method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates a time when a user at the far end perceives the offset based on the voice latency. The output unit outputs a perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end. The perceivable signal helps to avoid collisions between parties. | 08-20-2015 |
20150237294 | Multiple Stage Modulation Projector Display Systems Having Efficient Light Utilization - Dual or multi-modulation display systems comprising a first modulator and a second modulator are disclosed. The first modulator may comprise a plurality of analog mirrors (e.g. MEMS array) and the second modulator may comprise a plurality of mirrors (e.g., DMD array). The display system may further comprise a controller that sends control signals to the first and second modulator. The display system may render highlight features within a projected image by effecting a time multiplexing scheme. In one embodiment, the first modulator may be switched on a sub-frame basis such that a desired proportion of the available light may be focused or directed onto the second modulator to form the highlight feature on a sub-frame rendering basis. | 08-20-2015 |
20150235645 | Encoding and Rendering of Object Based Audio Indicative of Game Audio Content - In some embodiments, a method (typically performed by a game console) for generating an object based audio program indicative of game audio content (audio content pertaining to play of or events in a game, and optionally also other information regarding the game), and including at least one audio object channel and at least one speaker channel. In other embodiments, a game console configured to generate such an object based audio program. Some embodiments implement object clustering in which audio content of input objects is mixed to generate at least one clustered audio object, or audio content of at least one input object is mixed with speaker channel audio. In response to the program, a spatial rendering system (e.g., external to the game console) may operate with knowledge of playback speaker configuration to generate speaker feeds indicative of a spatial mix of the program's speaker and object channel content. | 08-20-2015 |
20150228293 | Method and System for Object-Dependent Adjustment of Levels of Audio Objects - In some embodiments, a method for adaptive control of gain applied to an audio signal, including steps of analyzing segments of the signal to identify audio objects (e.g., voices of participants in a voice conference); storing information regarding each distinct identified object; using at least some of the information to determine at least one of a target gain, or a gain change rate for reaching a target gain, for each identified object; and applying gain to segments of the signal indicative of an identified object such that the gain changes (typically, at the gain change rate for the object) from an initial gain to the target gain for the object. The information stored may include a scene description. Aspects of the invention include a system configured (e.g., programmed) to perform any embodiment of the inventive method. | 08-13-2015 |
20150228286 | Processing Audio Objects in Principal and Supplementary Encoded Audio Signals - Methods and apparatuses are disclosed that can combine audio content from two encoded input signals into a new encoded output signal without requiring a decode or re-encode of audio content in either encoded input signal. Encoded data representing audio content and spatial location of audio objects in two different input encoded signals are combined to generate an encoded output signal that has encoded data representing audio objects from both of the input encoded signals. | 08-13-2015 |
20150228219 | Dual Modulator Synchronization in a High Dynamic Range Display System - A dual and/or multi modulator display system is disclosed comprising at least a first modulator and a second modulator, wherein one of the modulators has a faster response time than the other modulator. The response of the slower modulator may be characterized according to various image data inputs and this characterized data may then be used by the display system to derive control and/or data signals to the faster modulator. These control/data signals may represent a fitted set of data matched to one or more characteristics of the slower modulator in order to reduce light produced during frame or other transition times of the modulators. One or more characteristics may be employed to reduce such undesirable visual effects. | 08-13-2015 |
20150227003 | Light Directed Modulation Displays - A light source includes a light reflector and multi-pixel light modulators. The light reflector is surrounded with reflective surfaces. Light can be injected into the light reflector and diffused throughout the light reflector. The multi-pixel light modulators have individual transmittance states based on image data to modulate light that illuminates multi-pixel portions of a light receiving surface. | 08-13-2015 |
20150223002 | System for Rendering and Playback of Object Based Audio in Various Listening Environments - Embodiments are described for a system of rendering object-based audio content through a system that includes individually addressable drivers, including at least one driver that is configured to project sound waves toward one or more surfaces within a listening environment for reflection to a listening area within the listening environment; a renderer configured to receive and process audio streams and one or more metadata sets associated with each of the audio streams and specifying a playback location of a respective audio stream; and a playback system coupled to the renderer and configured to render the audio streams to a plurality of audio feeds corresponding to the array of audio drivers in accordance with the one or more metadata sets. | 08-06-2015 |
20150222916 | Piecewise Cross Color Channel Predictor - A sequence of visual dynamic range (VDR) images is encoded using a standard dynamic range (SDR) base layer and one or more enhancement layers. A predicted VDR image is generated from an SDR input by using a weighted, multi-band, cross-color channel prediction model. Exponential weights with an adaptable decay parameter for each band are also presented. | 08-06-2015 |
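The abstract above mentions exponential weights with an adaptable decay parameter for each band. As a rough, hypothetical sketch of how such a weighted multi-band predictor might blend per-band predictions — the weight form, band centers, and per-band predictors below are illustrative assumptions, not taken from the patent:

```python
import math

def band_weights(value, centers, decay):
    # Exponential weight per band: falls off with distance from the band
    # center; `decay` controls how quickly (hypothetical form).
    w = [math.exp(-decay * abs(value - c)) for c in centers]
    total = sum(w)
    return [x / total for x in w]  # normalized to sum to 1

def cross_color_predict(value, centers, decay, band_predictors):
    # Blend the outputs of per-band predictors using the soft weights.
    w = band_weights(value, centers, decay)
    return sum(wi * p(value) for wi, p in zip(w, band_predictors))

# Example: two bands, each with a simple linear per-band predictor.
centers = [0.25, 0.75]
preds = [lambda v: 2.0 * v, lambda v: 0.5 + v]
y = cross_color_predict(0.25, centers, decay=50.0, band_predictors=preds)
```

With a large decay, a sample sitting on a band center is predicted almost entirely by that band's predictor; a small decay blends bands smoothly, which is the point of the soft weighting.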
20150221319 | METHODS AND SYSTEMS FOR SELECTING LAYERS OF ENCODED AUDIO SIGNALS FOR TELECONFERENCING - In some embodiments, a method for selecting at least one layer of a spatially layered, encoded audio signal. Typical embodiments are teleconferencing methods in which at least one of a set of nodes (endpoints, each of which is a telephone system, and optionally also a server) is configured to perform audio coding in response to soundfield audio data to generate spatially layered encoded audio including any of a number of different subsets of a set of layers, the set of layers including at least one monophonic layer, at least one soundfield layer, and optionally also at least one metadata layer comprising metadata indicative of at least one processing operation to be performed on the encoded audio. Other aspects are systems configured (e.g., programmed) to perform any embodiment of the method, and computer readable media which store code for implementing any embodiment of the method or steps thereof. | 08-06-2015 |
20150221313 | CODING OF A SOUND FIELD SIGNAL - A method for encoding sound field signals includes allocating coding rate by application of a uniform criterion to all subbands of all signals in a joint process. An allocation criterion may be based on a comparison, in a given subband, between a spectral envelope of the signals to be encoded and a coding noise profile, wherein the noise profile may be a sum of a noise shape and a noise offset, which noise offset is computed on the basis of the coding bit budget. The rate allocation process may be combined with an energy-compacting orthogonal transform, for which there is proposed a parameterization susceptible of efficient coding and having adjustable directivity. In a further aspect, the invention provides a corresponding decoding method. | 08-06-2015 |
20150215700 | PERCENTILE FILTERING OF NOISE REDUCTION GAINS - A method of post-processing banded gains for applying to an audio signal, an apparatus to post-process banded gains, and a tangible computer-readable storage medium comprising instructions that when executed carry out the method. The banded gains are determined by input processing one or more input audio signals. The method includes post-processing the banded gains to generate post-processed gains; generating a particular post-processed gain for a particular frequency band includes percentile filtering using gain values from one or more previous frames of the one or more input audio signals and from gain values for frequency bands adjacent to the particular frequency band. | 07-30-2015 |
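The percentile filtering described in the abstract above can be sketched minimally: take the 50th percentile (median) over a time-frequency neighborhood of the raw banded gains. The neighborhood shape (one adjacent band on each side, all stored frames) is an assumption for illustration:

```python
import statistics

def percentile_filter_gains(gain_history, band):
    """Median-filter (50th percentile) the gain for one frequency band over
    a time-frequency neighborhood: the current and previous frames, plus the
    bands adjacent to `band`.

    gain_history: list of frames, each a list of per-band gains; the last
    entry is the current frame's raw gains.
    """
    n_bands = len(gain_history[0])
    lo = max(0, band - 1)        # adjacent band below (clipped at the edge)
    hi = min(n_bands, band + 2)  # one past the adjacent band above
    neighborhood = [g for frame in gain_history for g in frame[lo:hi]]
    return statistics.median(neighborhood)

# Example: 3 frames x 4 bands; the current frame has an outlier in band 1.
g = [[0.2, 0.9, 0.8, 0.1],
     [0.3, 0.8, 0.7, 0.2],
     [0.9, 0.1, 0.6, 0.3]]
smoothed = percentile_filter_gains(g, band=1)
```

The median suppresses isolated gain spikes that would otherwise produce audible musical-noise artifacts, while leaving a consistent neighborhood of gains essentially unchanged.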
20150215634 | Encoding, Decoding, and Representing High Dynamic Range Images - Techniques are provided to encode and decode image data comprising a tone mapped (TM) image with HDR reconstruction data in the form of luminance ratios and color residual values. In an example embodiment, luminance ratio values and residual values in color channels of a color space are generated on an individual pixel basis based on a high dynamic range (HDR) image and a derivative tone-mapped (TM) image that comprises one or more color alterations that would not be recoverable from the TM image with a luminance ratio image. The TM image with HDR reconstruction data derived from the luminance ratio values and the color-channel residual values may be outputted in an image file to a downstream device, for example, for decoding, rendering, and/or storing. The image file may be decoded to generate a restored HDR image free of the color alterations. | 07-30-2015 |
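The luminance-ratio side of the scheme in the abstract above can be sketched as follows; the color-channel residual values are omitted, and the variable names and epsilon guard are illustrative assumptions:

```python
def hdr_reconstruction_data(hdr_lum, tm_lum):
    # Per-pixel luminance ratios relating the HDR image to its tone-mapped
    # version; a small epsilon guards against division by zero.
    eps = 1e-6
    return [h / max(t, eps) for h, t in zip(hdr_lum, tm_lum)]

def restore_hdr(tm_lum, ratios):
    # Restore HDR luminance by multiplying the tone-mapped luminance by
    # the stored per-pixel ratio.
    return [t * r for t, r in zip(tm_lum, ratios)]

# Example round trip on three pixels of luminance data.
hdr = [100.0, 0.5, 2000.0]
tm = [0.9, 0.1, 1.0]
ratios = hdr_reconstruction_data(hdr, tm)
restored = restore_hdr(tm, ratios)
```

Because the ratios are carried alongside the tone-mapped image, a downstream device that ignores them still gets a viewable image, while a capable decoder recovers the HDR luminance exactly.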
20150215594 | Metadata for Use in Color Grading - Methods and systems for color grading video content are presented. A component (e.g. a frame, a shot and/or a scene) of the video content is designated to be a master component and one or more other components of the video content are designated to be slave components, each slave component associated with the master component. A master component color grading operation is performed to the master component. For each one of the slave components, the master component color grading operation is performed to the slave component and a slave color grading operation that is specific to the one of the slave components is also performed. Metadata, which form part of the video content, are created to provide indicators as to whether components of the video are designated as master or slave components. | 07-30-2015 |
20150215467 | LONG TERM MONITORING OF TRANSMISSION AND VOICE ACTIVITY PATTERNS FOR REGULATING GAIN CONTROL - The present document relates to audio communication systems. In particular, the present document relates to the control of the level of audio signals within audio communication systems. A method for leveling a near-end audio signal is described. | 07-30-2015 |
20150208190 | BI-DIRECTIONAL INTERCONNECT FOR COMMUNICATION BETWEEN A RENDERER AND AN ARRAY OF INDIVIDUALLY ADDRESSABLE DRIVERS - Embodiments are directed to an interconnect for coupling components in an object-based rendering system comprising: a first network channel coupling a renderer to an array of individually addressable drivers projecting sound in a listening environment and transmitting audio signals and control data from the renderer to the array, and a second network channel coupling a microphone placed in the listening environment to a calibration component of the renderer and transmitting calibration control signals for acoustic information generated by the microphone to the calibration component. The interconnect is suitable for use in a system for rendering spatial audio content comprising channel-based and object-based audio components. | 07-23-2015 |
20150208071 | Quantization Control for Variable Bit Depth - The quantization parameter QP is well-known in digital video compression as an indication of picture quality. Digital symbols representing a moving image are quantized with a quantizing step that is a function QSN of the quantization parameter QP, which function QSN has been normalized to the most significant bit of the bit depth of the digital symbols. As a result, the effect of a given QP is essentially independent of bit depth: a particular QP value has a standard effect on image quality, regardless of bit depth. The invention is useful, for example, in encoding and decoding at different bit depths, to generate compatible bitstreams having different bit depths, and to allow different bit depths for different components of a video signal by compressing each with the same fidelity (i.e., the same QP). | 07-23-2015 |
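The bit-depth normalization described in the abstract above can be sketched with an H.264-style step-size relationship (the step doubles every 6 QP units); both that relationship and the 8-bit base step below are assumptions for illustration, not the patent's function QSN:

```python
def quantizer_step(qp, bit_depth):
    """Quantizing step as a function of QP, normalized to the most
    significant bit of the bit depth: an H.264-style step that doubles
    every 6 QP units (assumed form), scaled by 2**(bit_depth - 8) so that
    a given QP has the same relative effect on quality at any bit depth.
    """
    base_step = 0.625 * 2.0 ** (qp / 6.0)   # step size at 8-bit depth
    return base_step * 2.0 ** (bit_depth - 8)
```

Under this normalization, moving from 8-bit to 10-bit symbols scales the step by 4 — exactly the growth of the symbol range — so QP 22 quantizes both signals with the same fidelity.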
20150207710 | Call Quality Estimation by Lost Packet Classification - Described are: a method, an apparatus, and a tangible computer-readable storage medium comprising instructions to instruct one or more processors to carry out a method. One set of methods is for the transmit side of a communication link and another set of methods is for the receive side. A transmit side method includes assigning one of a set of classifications to media (e.g., voice/audio packets transmitted in a sequence), where different classifications impact differently a measure of perceptual quality calculated at the receive side if packets of the respective classifications are lost. A present packet is sent to the receive side containing the classification of a previous packet. | 07-23-2015 |
20150206295 | IMAGE PROCESSING FOR HDR IMAGES - Image encoding is described. Log-luminances in an HDR input image are histogrammed to generate a tone-map, along with which a log global tone-mapped luminance image is computed. The log global tone-mapped luminance image is downscaled. The log-luminances and the log global tone-mapped luminance image generate a log ratio image. Multi-scale resolution filtering the log ratio image generates a log multi-scale ratio image. The log multi-scale ratio image and the log luminances generate a second log tone-mapped image, which is normalized to output a tone-mapped image based on the downscaled log global tone-mapped luminance image and the normalized image. The HDR input image and the output tone-mapped image generate a second ratio image, which is quantized. | 07-23-2015 |
20150201206 | MULTI-LAYER INTERLACE FRAME-COMPATIBLE ENHANCED RESOLUTION VIDEO DELIVERY - A video base layer can contain information pertaining to frame-compatible interlace representations of multiple data categories while video enhancement layers can contain interlace or progressive representations and/or frame-compatible representations of these data categories. Video data are encoded and decoded using layered approaches. | 07-16-2015 |
20150201178 | Frame Compatible Depth Map Delivery Formats for Stereoscopic and Auto-Stereoscopic Displays - Stereoscopic video data and corresponding depth map data for stereoscopic and auto-stereoscopic displays are coded using a coded base layer and one or more coded enhancement layers. Given a 3D input picture and corresponding input depth map data, a side-by-side and a top-and-bottom picture are generated based on the input picture. Using an encoder, the side-by-side picture is coded to generate a coded base layer. Using the encoder and a texture reference processing unit (RPU), the top-and-bottom picture is encoded to generate a first enhancement layer, wherein the first enhancement layer is coded based on the base layer stream, and using the encoder and a depth-map RPU, depth data for the side-by-side picture are encoded to generate a second enhancement layer, wherein the second enhancement layer is coded based on the base layer. Alternative single, dual, and multi-layer depth map delivery systems are also presented. | 07-16-2015 |
20150195643 | Loudspeaker Horn and Cabinet - According to various embodiments, a loudspeaker horn and cabinet are designed to achieve a sound coverage pattern characterized by narrow vertical dispersion and a wide horizontal dispersion. A loudspeaker horn may comprise at least two horn sections, each extending from an inlet to a mouth. A first plurality of outlet channels is disposed in an interleaved column with a second plurality of outlet channels. A loudspeaker cabinet may comprise a primary enclosure having a front wall, the front wall having an aperture in which a low frequency loudspeaker driver is mounted. The loudspeaker cabinet further comprises a top baffle section having a top end and a bottom baffle section having a bottom end, each extending vertically from the primary enclosure. The top baffle section has a first width that gradually increases towards the top end and the bottom section has a second width that gradually increases towards the bottom end. | 07-09-2015 |
20150195499 | Color Grading Apparatus and Methods - A method for color grading input video data for display on a target display comprises obtaining target display metadata indicative of a capability of the target display, obtaining input video data metadata indicative of image characteristics of the input video data, automatically determining initial values for parameters of a parameterized sigmoidal transfer function, at least one of the initial values based at least in part on at least one of the target display metadata and the input video data metadata, and mapping the input video data to color-graded video data according to the parameterized transfer function specified using the initial values. | 07-09-2015 |
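A parameterized sigmoidal transfer function of the kind named in the abstract above can be sketched as follows; the specific logistic form, the endpoint rescaling, and the parameter names (`mid`, `slope`) are illustrative assumptions:

```python
import math

def sigmoid_transfer(x, mid, slope):
    """Parameterized sigmoidal transfer function mapping normalized input
    luminance x in [0, 1] to [0, 1]. `mid` is the input level mapped to
    0.5 and `slope` controls contrast. The raw logistic curve is rescaled
    so the endpoints map exactly to 0 and 1.
    """
    f = lambda t: 1.0 / (1.0 + math.exp(-slope * (t - mid)))
    lo, hi = f(0.0), f(1.0)
    return (f(x) - lo) / (hi - lo)
```

Initial values for `mid` and `slope` could then be chosen from the metadata — e.g. placing `mid` near the input's average luminance and limiting `slope` to what the target display's dynamic range supports — which is the automation step the abstract describes.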
20150187362 | Multichannel Audio Coding - Multiple channels of audio are combined either to a monophonic composite signal or to multiple channels of audio along with related auxiliary information from which multiple channels of audio are reconstructed, including improved downmixing of multiple audio channels to a monophonic audio signal or to multiple audio channels and improved decorrelation of multiple audio channels derived from a monophonic audio channel or from multiple audio channels. Aspects of the disclosed invention are usable in audio encoders, decoders, encode/decode systems, downmixers, upmixers, and decorrelators. | 07-02-2015 |
20150184814 | Quantum Dot Modulation For Displays - Modulated light sources are described. A modulated light source may have first light sources that are configured to emit first light, which has first color components that occupy a range that is beyond one or more prescribed ranges of light wavelengths. The modulated light source may also have a light converter that is configured to be illuminated by the first light. The light converter converts the first light into second light. The second light has one or more second color components that are within the one or more prescribed ranges of light wavelengths. Strengths of the one or more second color components in the second light are monitored and regulated to produce a particular point within a specific color gamut. | 07-02-2015 |
20150179182 | Adaptive Quantization Noise Filtering of Decoded Audio Data - A method including steps of decoding an encoded audio signal indicative of encoded audio content (e.g., audio content captured during a teleconference) to generate a decoded signal indicative of a decoded version of the audio content, and performing adaptive quantization noise filtering on the decoded signal. The filtering is performed adaptively in the frequency domain in response to data indicative of signal to noise values in turn indicative of a post-quantization signal-to-quantization noise ratio for each frequency band of each of at least one segment of the encoded audio content. In some embodiments, each signal to noise value is a bit allocation value equal to the number of mantissa bits of an encoded audio sample of a frequency band of a segment of the encoded audio content. Other aspects are a decoder, or a post-filter coupled to receive a decoder's output, configured to perform an embodiment of the adaptive filtering. | 06-25-2015 |
20150178981 | METHODS AND APPARATUS FOR IMAGE ADJUSTMENT FOR DISPLAYS HAVING 2D AND 3D DISPLAY MODES - Embodiments of the invention relate to a display operable in 2D and 3D display modes. Methods and apparatus are provided for adjusting the colors and brightness of the image data and/or intensity of the display backlight based on the current display mode and/or color-grading of the image data. For example, when switching to a 3D display mode a color mapping may be performed on left and right eye image data to increase color saturation in particular regions, and/or the backlight intensity may be increased in particular regions to compensate for lower light levels in 3D display mode. | 06-25-2015 |
20150163614 | EMBEDDING DATA IN STEREO AUDIO USING SATURATION PARAMETER MODULATION - In some embodiments, a method for embedding data (e.g., metadata for use during post-processing) in a stereo audio signal comprising frames. Each of the frames has a saturation value, and data are embedded in the stereo audio signal by modifying the signal to generate a modulated stereo audio signal comprising a sequence of modulated frames having modulated saturation values indicative of the embedded data. Typically, one data bit is embedded in each frame of an input stereo audio signal by modifying the frame to produce a modulated frame whose modulated saturation value matches a target value indicative of the data bit. In other embodiments, a method for extracting data from a stereo audio signal in which the data have been embedded in accordance with an embodiment of the inventive embedding method. Other aspects are systems (e.g., programmed processors) configured to perform any embodiment of the inventive method. | 06-11-2015 |
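The embed/extract pair described in the abstract above can be sketched as follows. The patent does not define the saturation value here; for illustration it is taken to be the ratio of side-signal (L−R) energy to mid-signal (L+R) energy, and the two target values are arbitrary assumptions:

```python
def embed_bit(left, right, bit, targets=(0.2, 0.4)):
    # Embed one bit in a stereo frame by scaling the side signal so the
    # frame's saturation value (side/mid energy ratio, assumed definition)
    # matches the target value for that bit.
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    e_mid = sum(x * x for x in mid) or 1e-12
    e_side = sum(x * x for x in side)
    scale = (targets[bit] * e_mid / e_side) ** 0.5 if e_side > 0 else 0.0
    side = [x * scale for x in side]
    return ([m + s for m, s in zip(mid, side)],
            [m - s for m, s in zip(mid, side)])

def extract_bit(left, right, targets=(0.2, 0.4)):
    # Recover the bit by measuring the frame's saturation value and
    # picking the nearest target.
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    sat = sum(x * x for x in side) / (sum(x * x for x in mid) or 1e-12)
    return min(range(len(targets)), key=lambda i: abs(sat - targets[i]))
```

Because the mid signal is untouched, the audible content is largely preserved while the stereo width of each frame quietly carries one bit of metadata.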
20150163362 | METRIC FOR MEETING COMMENCEMENT IN A VOICE CONFERENCING SYSTEM | 06-11-2015 |
20150156469 | Decoding and Display of High Dynamic Range Video - Novel methods and systems for decoding and displaying enhanced dynamic range (EDR) video signals are disclosed. To accommodate legacy digital media players with constrained computational resources, compositing and display management (DM) operations are moved from a digital media player to its attached EDR display. On a video receiver, base and enhancement video layers are decoded and multiplexed together with overlay graphics into an interleaved stream. The video and graphics signals are all converted to a common format which allows metadata to be embedded in the interleaved signal as part of the least significant bits in the chroma channels. On the display, the video and the graphics are de-interleaved. After compositing and display management operations guided by the received metadata, the received graphics data are blended with the output of the DM process and the final video output is displayed on the display's panel. | 06-04-2015 |
20150154966 | Haptic Signal Synthesis and Transport in a Bit Stream - Techniques for synthesizing a parameterized haptic track from a multichannel audio signal and embedding the haptic track in a multichannel audio codec bit stream. The haptic track synthesis can occur while encoding the multichannel audio signal as the multichannel audio codec bit stream. The haptic track can be synthesized in a way that allows select parameters of the haptic track to be extracted from the audio signal. The parameters can be adjusted by a user during the encoding process. For example, the parameter adjustments can be made using an authoring/monitoring tool. The parameter adjustments can be recorded as metadata that, along with the haptic track, is included in the codec bit stream. In some aspects of the present technology, the adjustable parameters include center frequency, gain, decay rate, and other parameters that allow for modulation of the haptic track during decoding of the codec bit stream. By synthesizing a parameterized haptic track, A/V content creators can be provided greater control in authoring haptic content to accompany the A/V content they create. | 06-04-2015 |
20150146873 | Rendering and Playback of Spatial Audio Using Channel-Based Audio Systems - Embodiments are described for a method and system of rendering and playing back spatial audio content using a channel-based format. Spatial audio content that is played back through legacy channel-based equipment is transformed into the appropriate channel-based format resulting in the loss of certain positional information within the audio objects and positional metadata comprising the spatial audio content. To retain this information for use in spatial audio equipment even after the audio content is rendered as channel-based audio, certain metadata generated by the spatial audio processor is incorporated into the channel-based data. The channel-based audio can then be sent to a channel-based audio decoder or a spatial audio decoder. The spatial audio decoder processes the metadata to recover at least some positional information that was lost during the down-mix operation by upmixing the channel-based audio content back to the spatial audio content for optimal playback in a spatial audio environment. | 05-28-2015 |
20150142451 | ERROR CONCEALMENT STRATEGY IN A DECODING SYSTEM - A decoding system reconstructs an audio signal based on an input signal representing the audio signal by parametric coding or by n discretely coded channels. Parametric decoding proceeds on the basis of a core signal and mixing parameters controlling a spatial synthesis stage, which is supplied with a downmix signal. A controller is responsible for controlling the components of the decoding system, whether in steady-state parametric mode, in steady-state discrete decoding mode, or in transitions between these. In defective frames of the input signal, which do not allow the mixing parameters to be decoded, the controller is configured to perform various error handling procedures including: parametric decoding using previous values of the mixing parameters; continuing parametric decoding for a limited duration, and/or outputting the core signal without spatial synthesis. | 05-21-2015 |
20150142424 | Enhancement of Multichannel Audio - The invention relates to audio signal processing. More specifically, the invention relates to enhancing multichannel audio, such as television audio, by applying a gain to the audio that has been smoothed between portions of the audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods. | 05-21-2015 |
20150138250 | Systems and Methods for Controlling Dual Modulation Displays - In one embodiment, dual modulator display systems and methods for rendering target image data upon the dual modulator display system are disclosed, where the display system receives target image data, possibly HDR image data, and first calculates display control signals and then calculates backlight control signals from the display control signals. Calculating the display signals first and the backlight control signals afterwards as a function of them may tend to reduce clipping artifacts. In other embodiments, it is possible to split the input target HDR image data into a base layer and a detail layer, wherein the base layer is low spatial resolution image data that may be utilized as backlight illumination data. The detail layer is higher spatial resolution image data that may be utilized for display control data. | 05-21-2015 |
20150131800 | Efficient Encoding and Decoding of Multi-Channel Audio Signal with Multiple Substreams - The present document relates to audio encoding/decoding. In particular, the present document relates to a method and system for improving the quality of encoded multi-channel audio signals. An audio encoder configured to encode a multi-channel audio signal according to a total available data-rate is described. The multi-channel audio signal is representable as a basic group of channels. | 05-14-2015 |
20150124176 | Enhanced Global Dimming for Projector Display Systems - Projector display systems comprising a light dimmer and first modulator are disclosed. The light dimmer may comprise an adjustable iris, adjustable light sources and/or an LCD stack that is capable of lowering the luminance of the light source illuminating the first modulator. The first modulator may comprise a plurality of analog mirrors (e.g. MEMS array) and the second modulator may comprise a plurality of mirrors (e.g., DMD array). The display system may further comprise a controller that sends control signals to the light dimmer and first modulator. The display system may achieve a desired dynamic range for a projected image by a combination of such control signals. | 05-07-2015 |
20150117551 | Error Control in Multi-Stream EDR Video Codec - Error control in multi-stream visual dynamic range (VDR) codecs is described, including for a case of a layer-decomposed (non-backward compatible) video codecs. Error control can be provided by concealing lost and/or corrupted data in data frames of a decoded VDR bitstream prior to rendering a corresponding VDR image. Various algorithms and methods for concealing lost and/or corrupted data are provided. | 04-30-2015 |
20150109530 | LOW LATENCY AND LOW COMPLEXITY PHASE SHIFT NETWORK - A high performance, low complexity phase shift network may be created with one or more non-first-order all-pass recursive filters that are built on top of a plurality of first-order and/or second-order all-pass recursive filters and/or delay lines. A target time delay, whether large or small, may be specified as a constraint for a non-first-order all-pass recursive filter. A target phase response may be determined for the non-first-order all-pass recursive filter. Phase errors between the target phase response and a calculated phase response with filter coefficients of the non-first-order all-pass recursive filter may be minimized to yield a set of optimized values for the filter coefficients of the non-first-order all-pass recursive filter. | 04-23-2015 |
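A first-order all-pass recursive filter — the basic building block named in the abstract above — can be sketched directly from its difference equation. The coefficient value and sequence lengths in the example are arbitrary:

```python
def first_order_allpass(x, a):
    """First-order all-pass recursive filter:
        y[n] = a*x[n] + x[n-1] - a*y[n-1]
    i.e. H(z) = (a + z**-1) / (1 + a*z**-1).
    Unity magnitude response at all frequencies; only the phase is altered.
    """
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = a * xn + x_prev - a * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y
```

Cascading several such sections (with delay lines where large target delays are needed) and optimizing the coefficients against a target phase response is the construction the abstract outlines; the sketch here is just the single section.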
20150104022 | Audio Processing Method and Audio Processing Apparatus - An audio processing method and apparatus are described. In one embodiment, at least one first sub-band of a first audio signal is suppressed to obtain a reduced first audio signal with reserved sub-bands; at least one second sub-band of at least one second audio signal is suppressed to obtain at least one reduced second audio signal with reserved sub-bands; and the reduced first audio signal and the at least one reduced second audio signal are mixed. Alternatively, a first spatial auditory property is assigned to a first audio signal so that the first audio signal may be perceived as originating from a first position. Alternatively, rhythmic similarity between at least two audio signals is detected, and time scaling is applied to an audio signal in response to relatively high rhythmic similarity between the audio signal and the other audio signal(s); and then at least two audio signals are mixed. | 04-16-2015 |
20150104021 | SYSTEM FOR MAINTAINING REVERSIBLE DYNAMIC RANGE CONTROL INFORMATION ASSOCIATED WITH PARAMETRIC AUDIO CODERS - On the basis of a bitstream (P), an n-channel audio signal (X) is reconstructed by deriving an m-channel core signal (Y) and multichannel coding parameters (α) from the bitstream, where 1≦m<n. | 04-16-2015 |
20150103091 | Tone and Gamut Mapping Methods and Apparatus - Tone and/or gamut mapping apparatus and methods may be applied to map color values in image data for display on a particular display or other downstream device. A mapping algorithm may be selected based on location and/or color coordinates for pixel data being mapped. The apparatus and methods may be configured to map color coordinates differently depending on whether or not a pixel corresponds to a light source in an image and/or has special or reserved color values. | 04-16-2015 |
20150098650 | Systems and Methods to Optimize Conversions for Wide Gamut Opponent Color Spaces - Novel methods and systems for color space conversions are disclosed, relating to the optimization of a transformation matrix to convert between wide gamut opponent color spaces. The optimization may be based on whether the color values are in or out of gamut. | 04-09-2015 |
20150095512 | Network-Synchronized Media Playback - In a system that includes two or more computing systems connected to a computer network, a network control and synchronization (NetSync) application controls the in-sync playback of media files across different computing systems in the network, where each computing system is playing a local version of a media file using a local instance of a Media player. The NetSync application receives status messages from all Media players and controls the playback of all media files by sending them playback commands based on the received status messages, so that video playback among the players is in sync with a Master Media player. Alternatively, media playback across all Media players is based on user-entered playback commands, such as Play, Pause, Stop, and the like, entered using either the NetSync application interface or NetSync command scripts. | 04-02-2015 |
20150092950 | Matching Reverberation in Teleconferencing Environments - A system and method of matching reverberation in teleconferencing environments. When the two ends of a conversation are in environments with differing reverberations, the method filters the reverberation so that when both signals are output at the near end (e.g., the audio signal from the far end and the sidetone from the near end), the reverberations match. In this manner, the user does not perceive an annoying difference in reverberations, and the user experience is improved. | 04-02-2015 |
20150092850 | Weighted Multi-Band Cross Color Channel Predictor - A sequence of visual dynamic range (VDR) images is encoded using a standard dynamic range (SDR) base layer and one or more enhancement layers. A predicted VDR image is generated from an SDR input by using a weighted, multi-band, cross-color channel prediction model. Exponential weights with an adaptable decay parameter for each band are also presented. | 04-02-2015 |
20150092847 | Hardware Efficient Sparse FIR Filtering in Video Codec - In an embodiment, a control map of false contour filtering is generated for a predicted image. The predicted image is predicted from a low dynamic range image mapped from a wide dynamic range image. Based at least in part on the control map of false contour filtering and the predicted image, one or more filter parameters for a sparse finite-impulse-response (FIR) filter are determined. The sparse FIR filter is applied to filter pixel values in a portion of the predicted image based at least in part on the control map of false contour filtering. The control map of false contour filtering is encoded into a part of a multi-layer video signal that includes the low dynamic range image. | 04-02-2015 |
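A sparse FIR filter of the kind named here keeps only a handful of nonzero taps, so hardware can skip the zero coefficients entirely. Below is a minimal Python sketch, assuming circular boundary handling and a `filter_predicted_image` helper that applies the filter only where the control map flags false contours; the tap offsets and values are illustrative, not details from the application:

```python
import numpy as np

def sparse_fir(row, taps):
    """Apply a sparse FIR filter given as {offset: coefficient}.
    Only the few nonzero taps are touched (circular boundaries, for brevity)."""
    out = np.zeros(len(row))
    for offset, coeff in taps.items():
        out += coeff * np.roll(row, offset)
    return out

def filter_predicted_image(image, control_map, taps):
    """Filter only the pixels flagged by the false-contour control map."""
    filtered = np.array([sparse_fir(r, taps) for r in image])
    return np.where(control_map, filtered, image)
```

Because the taps are wide apart but few in number, the per-pixel cost grows with the tap count rather than the filter's span.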
20150085925 | Video Codecs with Integrated Gamut Management - Image decoders, encoders and transcoders incorporate gamut transformations. The gamut transformations alter tone, color or other characteristics of image data. The gamut transformations may comprise interpolation, extrapolation, direct mapping of pixel values and/or modification of an expansion function. Gamut transformations may be applied to generate image output (video or still) adapted for display on a target display. | 03-26-2015 |
20150081283 | HARMONICITY ESTIMATION, AUDIO CLASSIFICATION, PITCH DETERMINATION AND NOISE ESTIMATION - Embodiments are described for harmonicity estimation, audio classification, pitch determination and noise estimation. Measuring harmonicity of an audio signal includes calculating a log amplitude spectrum of the audio signal. A first spectrum is derived by calculating each component of the first spectrum as a sum of components of the log amplitude spectrum at frequencies that, in linear frequency scale, are odd multiples of the component's frequency in the first spectrum. A second spectrum is derived by calculating each component of the second spectrum as a sum of components of the log amplitude spectrum at frequencies that, in linear frequency scale, are even multiples of the component's frequency in the second spectrum. A difference spectrum is derived by subtracting the first spectrum from the second spectrum. A measure of harmonicity is generated as a monotonically increasing function of the maximum component of the difference spectrum within a predetermined frequency range. | 03-19-2015 |
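The odd/even summation described in the abstract can be sketched as follows; the search range, the `1e-12` log floor, and the `tanh` scaling used as the monotonically increasing function are illustrative assumptions, not values from the application:

```python
import numpy as np

def harmonicity(signal, search=(2, 64)):
    """Illustrative sketch: for each candidate bin k, sum the log amplitude
    spectrum at odd multiples (k, 3k, 5k, ...) and at even multiples
    (2k, 4k, ...), take the difference, and map its maximum to (0, 1)."""
    log_spec = np.log(np.abs(np.fft.rfft(signal)) + 1e-12)
    n = len(log_spec)
    best = -np.inf
    for k in range(search[0], min(search[1], n // 2)):
        odd = np.arange(k, n, 2 * k)        # odd multiples of bin k
        even = np.arange(2 * k, n, 2 * k)   # even multiples of bin k
        best = max(best, log_spec[even].sum() - log_spec[odd].sum())
    return float(np.tanh(best / 100.0))     # monotonically increasing map
```

For a strongly harmonic input, some candidate's even multiples line up with the harmonic peaks while the odd multiples fall between them, so the difference spectrum peaks sharply; for noise it stays near zero.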
20150078594 | System and Method of Speaker Cluster Design and Rendering - A method of outputting audio in a teleconferencing environment includes receiving audio streams, processing the audio streams according to information regarding effective spatial positions, and outputting, by at least three speakers arranged in more than one dimension, the audio streams having been processed. The information regarding the plurality of effective spatial positions corresponds to a perceived spatial scene that extends beyond the speakers in at least two dimensions. In this manner, participants in the teleconference perceive the audio from the remote participants as originating at different positions in the teleconference room. | 03-19-2015 |
20150078585 | SYSTEM AND METHOD FOR LEVELING LOUDNESS VARIATION IN AN AUDIO SIGNAL - Systems and methods for leveling loudness variation in an audio signal are described. Embodiments use both a perceptual leveling algorithm and a standards-based loudness measure together to minimize audio process artifacts and ensure that the measured loudness of the processed audio is close to a required measure, according to a particular standard measurement of loudness. These systems and methods can be used either offline or in real-time. | 03-19-2015 |
20150071615 | Video Display Control Using Embedded Metadata - Systems, apparatus and methods are provided for generating, delivering, processing and displaying video data to preserve the video creator's creative intent. Metadata which may be used to guide the processing and display of the video data is dynamically generated and embedded in the video data throughout a video delivery pipeline. The metadata may be embedded in the guard bits or least significant bits of chrominance channels of video data. Methods are provided for managing the encoding, delivery and decoding of the metadata to avoid unintentional communication of reserved words, minimize image artifacts which may result from overwriting least significant bits with metadata, prioritize delivery of metadata in the event of limited metadata bandwidth, and facilitate timely delivery of metadata to a downstream device. | 03-12-2015 |
20150071446 | Audio Processing Method and Audio Processing Apparatus - An audio processing method and an audio processing apparatus are described. A mono-channel audio signal is transformed into a plurality of first subband signals. Proportions of a desired component and a noise component are estimated in each of the subband signals. Second subband signals corresponding respectively to a plurality of channels are generated from each of the first subband signals. Each of the second subband signals comprises a first component and a second component obtained by assigning a spatial hearing property and a perceptual hearing property different from the spatial hearing property to the desired component and the noise component in the corresponding first subband signal respectively, based on a multi-dimensional auditory presentation method. The second subband signals are transformed into signals for rendering with the multi-dimensional auditory presentation method. By assigning different hearing properties to desired sound and noise, the intelligibility of the audio signal can be improved. | 03-12-2015 |
20150066923 | REFERENCE CARD FOR SCENE REFERRED METADATA CAPTURE - Scene-referred metadata comprising correspondence relationships between coded values used in generated images and reference values defined independent of any specific image may be provided as a part of image metadata for the generated images. Downstream image processing devices or image rendering devices may use the scene-referred metadata to perform image processing or rendering operations. When coded values of input images are altered in corresponding output images, the scene-referred metadata may be updated with new coded values used in the output images. Reference values refer to reference color values or reference gray levels. Coded values refer to color values or gray levels coded in pixels or sub-pixels of one or more images. | 03-05-2015 |
20150058010 | METHOD AND SYSTEM FOR BIAS CORRECTED SPEECH LEVEL DETERMINATION - Method for measuring level of speech determined by an audio signal in a manner which corrects for and reduces the effect of modification of the signal by the addition of noise thereto and/or amplitude compression thereof, and a system configured to perform any embodiment of the method. In some embodiments, the method includes steps of generating frequency banded, frequency-domain data indicative of an input speech signal, determining from the data a Gaussian parametric spectral model of the speech signal, and determining from the parametric spectral model an estimated mean speech level and a standard deviation value for each frequency band of the data; and generating speech level data indicative of a bias corrected mean speech level for each frequency band, including using at least one correction value to correct the estimated mean speech level for the frequency band, where each correction value has been predetermined using a reference speech model. | 02-26-2015 |
20150057779 | Live Engine - Non-media data relating to real-world objects or persons are collected from a scene while media data from the same scene are collected. The media data comprise audio data only or audiovisual data, whereas the non-media data comprise telemetry data and/or non-telemetry data. Based at least in part on the non-media data relating to the real-world objects or persons in the scene, emitter-listener relationships between a listener and some or all of the real-world objects or persons are determined. Audio objects comprising audio content portions and non-audio data portions are generated. At least one audio object is generated based at least in part on the emitter-listener relationships. | 02-26-2015 |
20150055770 | Placement of Sound Signals in a 2D or 3D Audio Conference - A conference controller … | 02-26-2015 |
20150054807 | Methods and Apparatus for Estimating Light Adaptation Levels of Persons Viewing Displays - Methods and apparatus for estimating adaptation of the human visual system take into account the distribution of light detectors (rods and cones) in the human eye to weight contributions to adaptation from displayed content and ambient lighting. The estimated adaptation may be applied to control factors such as contrast and saturation of displayed content. | 02-26-2015 |
20150052455 | Schemes for Emphasizing Talkers in a 2D or 3D Conference Scene - The present document relates to methods and systems for setting up and managing two-dimensional or three-dimensional scenes for audio conferences. A conference controller … | 02-19-2015 |
20150051906 | Hierarchical Active Voice Detection - One or more audio signals are processed using a multi-stage (hierarchical) voice and/or signal activity detector (VAD/SAD). A first stage is capable of reducing the workload bandwidth by employing an inexpensive VAD/SAD processor. One or more subsequent stages may further process the audio signals from the first stage. Other implementations may include a first stage that also performs continuity preservation between last blocks of audio signal and the first blocks of audio after it is detected that relevant audio signals are resumed. In yet other implementations, the first stage may extract features from audio signals when they are presented in their coded domain, and possibly with little or no decoding of the audio signal. | 02-19-2015 |
20150049868 | Clustering of Audio Streams in a 2D / 3D Conference Scene - The present document relates to methods and systems for setting up and managing two-dimensional or three-dimensional scenes for audio conferences. A conference controller … | 02-19-2015 |
20150049583 | Conferencing Device Self Test - A plurality of acoustic sensors in a non-anechoic environment are calibrated with the aim of removing manufacturing tolerances and degradation over time but preserving position-dependent differences between the sensors. The sensors are excited by an acoustic stimulus which has either time-dependent characteristics or finite duration. The calibration is to be based on diffuse-field excitation only, in which indirect propagation (including single or multiple reflections) dominates over any direct-path excitation. For this purpose, the calibration process considers only a non-initial portion of sensor outputs and/or of an impulse response derived therefrom. Based on these data, a frequency-dependent magnitude response function is estimated and compared with a target response function, from which a calibration function is derived. | 02-19-2015 |
20150049132 | Backlight Simulation at Reduced Resolutions to Determine Spatial Modulation of Light for High Dynamic Range Images - Embodiments of the invention relate generally to generating images with an enhanced range of brightness levels, and more particularly, to facilitating high dynamic range imaging by adjusting pixel data and/or using predicted values of luminance, for example, at different resolutions. In at least one embodiment, a method generates an image with an enhanced range of brightness levels. The method can include accessing a model of backlight that includes data representing values of luminance for a number of first samples. The method also can include inverting the values of luminance, as well as upsampling inverted values of luminance to determine upsampled values of luminance. Further, the method can include scaling pixel data for a number of second samples by the upsampled values of luminance to control a modulator to generate an image. | 02-19-2015 |
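The invert/upsample/scale sequence in this abstract can be sketched in a few lines, assuming nearest-neighbour upsampling via `numpy.kron` and an integer upscale factor (both illustrative choices, not details from the application):

```python
import numpy as np

def compensate(pixels, backlight_lowres, factor):
    """Invert a low-resolution backlight luminance model, upsample it to
    the panel resolution, and scale the full-resolution pixel data so that
    backlight * pixels approximates the target image."""
    inv = 1.0 / np.maximum(backlight_lowres, 1e-6)        # invert luminance
    inv_up = np.kron(inv, np.ones((factor, factor)))      # upsample inverted values
    return pixels * inv_up                                # scale pixel data
```

Simulating the backlight at reduced resolution keeps the inversion and upsampling cheap; only the final scaling runs at full pixel resolution.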
20150043754 | System and Method for Non-Destructively Normalizing Loudness of Audio Signals within Portable Devices - Many portable playback devices cannot decode and playback encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed. | 02-12-2015 |
20150042890 | METHOD AND SYSTEM FOR VIDEO EQUALIZATION - Video equalization includes performing equalization such that a sequence of images has a dynamic range (and optionally other characteristics) that is constant to a predetermined degree, where the input video includes high and standard dynamic range videos and images from both. Equalization is performed with a common anchor point (e.g., a 20% gray level, or the log mean of luminance) for the input video and the equalized video, and such that the images determined by the equalized video have at least substantially the same average luminance as images determined by the input video. Other aspects are systems (e.g., display systems and video delivery systems) configured to perform embodiments of the equalization method. | 02-12-2015 |
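As a rough illustration of anchoring to a common point, the sketch below rescales a frame so its log-mean (geometric-mean) luminance lands on the anchor value; it shows only that one normalization step under assumed constants, not the full equalization method:

```python
import numpy as np

def equalize_to_anchor(luminance, anchor=0.20):
    """Scale a frame's luminance so its geometric (log) mean equals the
    common anchor point shared by input and equalized video."""
    log_mean = float(np.exp(np.log(np.maximum(luminance, 1e-6)).mean()))
    return luminance * (anchor / log_mean)
```

Because a uniform scale factor multiplies the geometric mean directly, one pass is enough to hit the anchor exactly.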
20150036679 | METHODS AND APPARATUSES FOR TRANSMITTING AND RECEIVING AUDIO SIGNALS - Methods and corresponding apparatuses for transmitting and receiving audio signals are described. A transformation is performed on the audio signals in units of frames in order to obtain transformed audio data of each frame, said transformed audio data consisting of multiple signal components in the frequency domain. These signal components of each frame are distributed into multiple adjacent packets in order to generate packets in which signal components distributed from multiple frames are interleaved. Subsequently, the generated packets are transmitted. Accordingly, if packet loss occurs during transmission, the audio signals can be recovered based on the received signal components without consuming additional bandwidth. Therefore, robustness against packet loss can be achieved with little overhead. | 02-05-2015 |
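The interleaving idea can be sketched in a few lines, assuming each frame is a list of signal components and a fixed interleave depth (both illustrative simplifications):

```python
def interleave(frames, depth):
    """Spread each frame's spectral components across `depth` adjacent
    packets, so a single lost packet removes only ~1/depth of any frame."""
    packets = [[] for _ in range(len(frames) + depth - 1)]
    for i, frame in enumerate(frames):
        for j, comp in enumerate(frame):
            packets[i + (j % depth)].append((i, j, comp))
    return packets
```

The receiver can then reconstruct an approximation of a frame from whichever of its components arrived, which is why no extra bandwidth is needed for the robustness gain.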
20150036057 | Multiple Stage Modulation Projector Display Systems Having Efficient Light Utilization - Dual or multi-modulation display systems comprising a first modulator and a second modulator are disclosed. The first modulator may comprise a plurality of analog mirrors (e.g., MEMS array) and the second modulator may comprise a plurality of mirrors (e.g., DMD array). The display system may further comprise a controller that sends control signals to the first and second modulator. The display system may render highlight features within a projected image by effecting a time multiplexing scheme. In one embodiment, the first modulator may be switched on a sub-frame basis such that a desired proportion of the available light may be focused or directed onto the second modulator to form the highlight feature on a sub-frame rendering basis. | 02-05-2015 |
20150036023 | LIGHTING SYSTEM AND METHOD FOR IMAGE AND OBJECT ENHANCEMENT - A novel lighting system includes an image capture device, an image processor, and an image projector. In a particular embodiment, the image capture device captures and converts images of a subject into image data, the image processor generates illumination patterns based on the image data, and the image projector projects the illumination patterns onto the subject. Optionally, the lighting system includes a controlled feedback mechanism for periodically updating the illumination pattern. In a more particular embodiment, the lighting system continually updates the illumination pattern to illuminate the subject in real-time. | 02-05-2015 |
20150032447 | Determining a Harmonicity Measure for Voice Processing - A method, an apparatus, and a computer-readable medium configured with instructions that when executed carry out the method for determining a measure of harmonicity. In one embodiment the method includes selecting candidate fundamental frequencies within a range and, for each candidate, determining a mask or retrieving a pre-calculated mask that has a positive value for each frequency that contributes to harmonicity and a negative value for each frequency that contributes to inharmonicity. A candidate harmonicity measure is calculated for each candidate fundamental by summing the product of the mask and the magnitude measure spectrum. The harmonicity measure is selected as the maximum of the candidate harmonicity measures. | 01-29-2015 |
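A crude version of the mask-and-sum scoring might look like the following; the ±1 mask placement (harmonics positive, midpoints between harmonics negative) is an assumption for illustration, not the application's mask design:

```python
import numpy as np

def candidate_mask(f0, n_bins):
    """+1 at harmonics of the candidate fundamental bin, -1 midway between
    them (a simple stand-in for a pre-calculated mask)."""
    mask = np.zeros(n_bins)
    for h in range(f0, n_bins, f0):
        mask[h] = 1.0
        mid = h + f0 // 2
        if mid < n_bins:
            mask[mid] = -1.0
    return mask

def best_fundamental(magnitude, candidates):
    """Score each candidate as mask . magnitude; keep the maximum."""
    scores = {f0: float(candidate_mask(f0, len(magnitude)) @ magnitude)
              for f0 in candidates}
    f0 = max(scores, key=scores.get)
    return f0, scores[f0]
```

Pre-calculating the masks makes the per-frame cost a single dot product per candidate.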
20150032446 | METHOD AND SYSTEM FOR SIGNAL TRANSMISSION CONTROL - An audio signal with a temporal sequence of blocks or frames is received or accessed. Features are determined as characterizing aggregately the sequential audio blocks/frames that have been processed recently, relative to current time. The feature determination exceeds a specificity criterion and is delayed, relative to the recently processed audio blocks/frames. Voice activity indication is detected in the audio signal. VAD is based on a decision that exceeds a preset sensitivity threshold and is computed over a brief time period, relative to blocks/frames duration, and relates to current block/frame features. The VAD and the recent feature determination are combined with state related information, which is based on a history of previous feature determinations that are compiled from multiple features, determined over a time prior to the recent feature determination time period. Decisions to commence or terminate the audio signal, or related gains, are outputted based on the combination. | 01-29-2015 |
20150030180 | POST-PROCESSING GAINS FOR SIGNAL ENHANCEMENT - A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta. | 01-29-2015 |
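Delta gain smoothing, where the smoothing factor depends on the gain delta, can be sketched as a one-pole filter; the `slow`/`fast`/`knee` constants and the two-level switch are illustrative assumptions (the application allows the factor to vary continuously with the delta):

```python
def delta_gain_smooth(raw_gains, g_prev=1.0, slow=0.9, fast=0.2, knee=0.1):
    """One-pole smoothing whose factor depends on the gain delta:
    |raw gain - previous post-processed gain|. Small deltas get heavy
    smoothing; large deltas switch to fast tracking."""
    out = []
    for g in raw_gains:
        alpha = slow if abs(g - g_prev) < knee else fast
        g_prev = alpha * g_prev + (1.0 - alpha) * g
        out.append(g_prev)
    return out
```

Tying the smoothing factor to the delta suppresses musical-noise flutter in steady regions while still letting the gain follow genuine onsets quickly.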
20150030161 | Decoding of Multichannel Audio Encoded Bit Streams Using Adaptive Hybrid Transformation - The processing efficiency of a process used to decode frames of an enhanced AC-3 bit stream is improved by processing each audio block in a frame only once. Audio blocks of encoded data are decoded in block order rather than in channel order. Exemplary decoding processes for enhanced bit stream coding features such as adaptive hybrid transform processing and spectral extension are disclosed. | 01-29-2015 |
20150030017 | VOICE COMMUNICATION METHOD AND APPARATUS AND METHOD AND APPARATUS FOR OPERATING JITTER BUFFER - Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence immediately precedes the present audio block. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached. | 01-29-2015 |
20150029210 | Systems and Methods for ISO-Perceptible Power Reduction for Displays - Several embodiments of systems and methods are disclosed that create iso-perceptible image data from input image data. Such iso-perceptible image data may be created from Just-Noticeable-Difference (JND) modeling that leverages models from the Human Visual System (HVS). From the set of iso-perceptible image data, output image data may be selected, such that the chosen output image data has a lower power and/or energy requirement to render than the input image data. Further, the output image data may have a substantially lower power and/or energy requirement than the rest of the set of iso-perceptible image data. | 01-29-2015 |
20150025664 | Interactive Audio Content Generation, Delivery, Playback and Sharing - Control data templates are generated independent of a plurality of audio elements based on user input. The user input relates to parameter values and control inputs for operations. In response to receiving audio elements after the control data templates are generated, audio objects are generated to store audio sample data representing the audio elements. Control data is generated based on the parameter values and the control inputs for the operations in the control data templates. The control data specifies the operations to be performed while rendering the audio objects. The control data is then stored separately from the audio sample data in the audio objects. The audio objects can be communicated to downstream recipient devices for rendering and/or remixing. | 01-22-2015 |
20150023514 | Method and Apparatus for Acoustic Echo Control - Embodiments of method and apparatus for acoustic echo control are described. According to the method, an echo energy-based doubletalk detection is performed to determine whether there is doubletalk in a microphone signal with reference to a loudspeaker signal. A spectral similarity between spectra of the microphone signal and the loudspeaker signal is calculated. It is determined that there is no doubletalk in the microphone signal if the spectral similarity is higher than a threshold level. Adaptation of an adaptive filter for applying acoustic echo cancellation or acoustic echo suppression on the microphone signal is enabled if it is determined that there is no doubletalk in the microphone signal through the echo energy-based doubletalk detection, or there is no doubletalk through the spectral similarity-based doubletalk detection. | 01-22-2015 |
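The spectral-similarity test can be illustrated with a cosine similarity between magnitude spectra; the FFT framing and the use of cosine similarity as the metric are assumptions for illustration, not details from the application:

```python
import numpy as np

def spectral_similarity(mic_frame, spk_frame):
    """Cosine similarity between magnitude spectra of a microphone frame
    and a loudspeaker frame; a high value suggests the microphone mostly
    carries loudspeaker echo (i.e., no doubletalk)."""
    m = np.abs(np.fft.rfft(mic_frame))
    s = np.abs(np.fft.rfft(spk_frame))
    denom = np.linalg.norm(m) * np.linalg.norm(s) + 1e-12
    return float(m @ s / denom)
```

When near-end speech is mixed in, the microphone spectrum gains components absent from the loudspeaker spectrum, pulling the similarity below the threshold and freezing filter adaptation.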
20150022685 | Spectral Synthesis for Image Capture Device Processing - A substantially rectangular spectral representation is synthesized, which is adapted to produce image capture device sensor outputs if applied to an image capture device. The synthesized substantially rectangular spectral representation can be utilized in generating output color values of an output color space from image capture device sensor outputs, where the image capture device sensor outputs correspond to an image captured by an image capture device. The generated output color values correspond to colors perceived by the human visual system for the same image as that captured by the image capture device. Image capture device gamut is also determined. | 01-22-2015 |
20150012266 | Talker Collisions in an Auditory Scene - From a plurality of received voice signals, a signal interval in which there is a talker collision between at least a first and a second voice signal is detected. A processor receives a positive detection result and processes, in response to this, at least one of the voice signals with the aim of making it perceptually distinguishable. A mixer mixes the voice signals to supply an output signal, wherein the processed signal(s) replaces the corresponding received signals. In example embodiments, signal content is shifted away from the talker collision in frequency or in time. The invention may be useful in a conferencing system. | 01-08-2015 |
20150009302 | AUTOSTEREO TAPESTRY REPRESENTATION - Representation and coding of multi-view images using tapestry encoding are described. A tapestry comprises information on a tapestry image, a left-shift displacement map and a right-shift displacement map. Perspective images of a scene can be generated from the tapestry and the displacement maps. The tapestry image is generated from a leftmost view image, a rightmost view image, a disparity map and an occlusion map. | 01-08-2015 |