Patent application number | Description | Published |
20100281439 | Method to Control Perspective for a Camera-Controlled Computer - Systems, methods and computer readable media are disclosed for controlling perspective of a camera-controlled computer. A capture device captures user gestures and sends corresponding data to a recognizer engine. The recognizer engine analyzes the data with a plurality of filters, each filter corresponding to a gesture. Based on the output of those filters, a perspective control is determined, and a display device displays a new perspective corresponding to the perspective control. | 11-04-2010 |
20110080336 | Human Tracking System - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may be determined and a model may be adjusted based on the location or position of the one or more extremities. | 04-07-2011 |
20110080475 | Methods And Systems For Determining And Tracking Extremities Of A Target - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may then be determined. | 04-07-2011 |
20110081044 | Systems And Methods For Removing A Background Of An Image - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may then be discarded to isolate one or more voxels associated with a foreground object such as a human target and the isolated voxels associated with the foreground object may be processed. | 04-07-2011 |
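The voxel-grid downsampling and background removal these abstracts share can be illustrated with a small sketch. This is not the patented method itself, only a minimal analogue: the 2×2 block size, block averaging, and the treatment of far depths as background are illustrative assumptions.

```python
import numpy as np

def depth_to_voxel_grid(depth, block=2):
    """Downsample a depth image into a coarser grid by averaging each
    block x block patch (one coarse cell per patch)."""
    h, w = depth.shape
    return depth[:h - h % block, :w - w % block] \
        .reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def remove_background(grid, far_threshold):
    """Keep only foreground cells (closer than far_threshold);
    background cells are zeroed out."""
    return np.where(grid < far_threshold, grid, 0.0)

# Toy depth image (millimetres): a near object in the top-right corner.
depth = np.array([[2000, 2000,  800,  800],
                  [2000, 2000,  800,  800],
                  [2000, 2000, 2000, 2000],
                  [2000, 2000, 2000, 2000]], dtype=float)

grid = depth_to_voxel_grid(depth)                      # 2x2 averaged grid
foreground = remove_background(grid, far_threshold=1500.0)
```

After background removal, only the cell covering the near object keeps its depth; extremity estimation would then run on those isolated foreground cells.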
20110150271 | MOTION DETECTION USING DEPTH IMAGES - A sensor system creates a sequence of depth images that are used to detect and track motion of objects within range of the sensor system. A reference image is created and updated based on a moving average (or other function) of a set of depth images. A new depth image is compared to the reference image to create a motion image, which is an image file (or other data structure) with data representing motion. The new depth image is also used to update the reference image. The data in the motion image is grouped and associated with one or more objects being tracked. The tracking of the objects is updated by the grouped data in the motion image. The new positions of the objects are used to update an application. For example, a video game system will update the position of images displayed in the video based on the new positions of the objects. In one implementation, avatars can be moved based on movement of the user in front of a camera. | 06-23-2011 |
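The reference-image/motion-image pipeline in the abstract above can be sketched as follows. This is a rough illustration, not the claimed implementation: the blending weight `alpha` and the depth-change `threshold` are assumed values.

```python
import numpy as np

def update_reference(reference, new_depth, alpha=0.1):
    """Blend the new depth image into the reference (moving average)."""
    return (1.0 - alpha) * reference + alpha * new_depth

def motion_image(reference, new_depth, threshold=50.0):
    """Mark pixels whose depth changed by more than `threshold` as motion."""
    return (np.abs(new_depth - reference) > threshold).astype(np.uint8)

# Toy example: a static scene, then an object moves closer in one region.
reference = np.full((4, 4), 1000.0)   # depth in millimetres
frame = reference.copy()
frame[1:3, 1:3] = 600.0               # object enters this 2x2 region

motion = motion_image(reference, frame)     # 1s where motion occurred
reference = update_reference(reference, frame)
```

The nonzero entries of `motion` would then be grouped and matched against tracked objects; the updated `reference` absorbs the new frame so slow scene changes do not register as motion forever.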
20120128208 | Human Tracking System - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may be determined and a model may be adjusted based on the location or position of the one or more extremities. | 05-24-2012 |
20120177254 | MOTION DETECTION USING DEPTH IMAGES - A sensor system creates a sequence of depth images that are used to detect and track motion of objects within range of the sensor system. A reference image is created and updated based on a moving average (or other function) of a set of depth images. A new depth image is compared to the reference image to create a motion image, which is an image file (or other data structure) with data representing motion. The new depth image is also used to update the reference image. The data in the motion image is grouped and associated with one or more objects being tracked. The tracking of the objects is updated by the grouped data in the motion image. The new positions of the objects are used to update an application. | 07-12-2012 |
20130129155 | MOTION DETECTION USING DEPTH IMAGES - A sensor system creates a sequence of depth images that are used to detect and track motion of objects within range of the sensor system. A reference image is created and updated based on a moving average (or other function) of a set of depth images. A new depth image is compared to the reference image to create a motion image, which is an image file (or other data structure) with data representing motion. The new depth image is also used to update the reference image. The data in the motion image is grouped and associated with one or more objects being tracked. The tracking of the objects is updated by the grouped data in the motion image. The new positions of the objects are used to update an application. | 05-23-2013 |
20140022161 | HUMAN TRACKING SYSTEM - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may be determined and a model may be adjusted based on the location or position of the one or more extremities. | 01-23-2014 |
20140044309 | HUMAN TRACKING SYSTEM - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may be determined and a model may be adjusted based on the location or position of the one or more extremities. | 02-13-2014 |
20140112547 | SYSTEMS AND METHODS FOR REMOVING A BACKGROUND OF AN IMAGE - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may then be discarded to isolate one or more voxels associated with a foreground object such as a human target and the isolated voxels associated with the foreground object may be processed. | 04-24-2014 |
20140168075 | Method to Control Perspective for a Camera-Controlled Computer - Systems, methods and computer readable media are disclosed for controlling perspective of a camera-controlled computer. A capture device captures user gestures and sends corresponding data to a recognizer engine. The recognizer engine analyzes the data with a plurality of filters, each filter corresponding to a gesture. Based on the output of those filters, a perspective control is determined, and a display device displays a new perspective corresponding to the perspective control. | 06-19-2014 |
20150131862 | HUMAN TRACKING SYSTEM - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may be determined and a model may be adjusted based on the location or position of the one or more extremities. | 05-14-2015 |
20100195867 | VISUAL TARGET TRACKING USING MODEL FITTING AND EXEMPLAR - A method of tracking a target includes receiving an observed depth image of the target from a source and analyzing the observed depth image with a prior-trained collection of known poses to find an exemplar pose that represents an observed pose of the target. The method further includes rasterizing a model of the target into a synthesized depth image having a rasterized pose and adjusting the rasterized pose of the model into a model-fitting pose based, at least in part, on differences between the observed depth image and the synthesized depth image. Either the exemplar pose or the model-fitting pose is then selected to represent the target. | 08-05-2010 |
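The final selection step in the abstract above (exemplar pose versus model-fitting pose) amounts to keeping whichever candidate best explains the observed depth image. A minimal sketch, under the simplifying assumption that `render` produces a synthesized depth image from a pose and error is a sum of absolute differences:

```python
def select_pose(observed, exemplar_pose, model_fit_pose, render):
    """Pick whichever candidate pose, when rendered to a synthesized
    depth image, differs least from the observed depth image."""
    def error(pose):
        synth = render(pose)
        return sum(abs(o - s) for o, s in zip(observed, synth))
    return min((exemplar_pose, model_fit_pose), key=error)
```

In this toy form a "pose" can be any sequence that `render` maps to depth samples; the real system would rasterize a body model and compare per-pixel.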
20110058709 | VISUAL TARGET TRACKING USING MODEL FITTING AND EXEMPLAR - A method of tracking a target includes receiving an observed depth image of the target from a source and analyzing the observed depth image with a prior-trained collection of known poses to find an exemplar pose that represents an observed pose of the target. The method further includes rasterizing a model of the target into a synthesized depth image having a rasterized pose and adjusting the rasterized pose of the model into a model-fitting pose based, at least in part, on differences between the observed depth image and the synthesized depth image. Either the exemplar pose or the model-fitting pose is then selected to represent the target. | 03-10-2011 |
20110081045 | Systems And Methods For Tracking A Model - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A model may be adjusted based on a location or position of one or more extremities estimated or determined for a human target in the grid of voxels. The model may also be adjusted based on a default location or position of the model in a default pose such as a T-pose, a DaVinci pose, and/or a natural pose. | 04-07-2011 |
20110234589 | SYSTEMS AND METHODS FOR TRACKING A MODEL - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A model may be adjusted based on a location or position of one or more extremities estimated or determined for a human target in the grid of voxels. The model may also be adjusted based on a default location or position of the model in a default pose such as a T-pose, a DaVinci pose, and/or a natural pose. | 09-29-2011 |
20110317871 | SKELETAL JOINT RECOGNITION AND TRACKING SYSTEM - A system and method are disclosed for recognizing and tracking a user's skeletal joints with a NUI system and further, for recognizing and tracking only some skeletal joints, such as for example a user's upper body. The system may include a limb identification engine which may use various methods to evaluate, identify and track positions of body parts of one or more users in a scene. In examples, further processing efficiency may be achieved by segmenting the field of view into smaller zones, and focusing on one zone at a time. Moreover, each zone may have its own set of predefined gestures which are recognized. | 12-29-2011 |
20120057753 | SYSTEMS AND METHODS FOR TRACKING A MODEL - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A model may be adjusted based on a location or position of one or more extremities estimated or determined for a human target in the grid of voxels. The model may also be adjusted based on a default location or position of the model in a default pose such as a T-pose, a DaVinci pose, and/or a natural pose. | 03-08-2012 |
20120162065 | SKELETAL JOINT RECOGNITION AND TRACKING SYSTEM - A system and method are disclosed for recognizing and tracking a user's skeletal joints with a NUI system and further, for recognizing and tracking only some skeletal joints, such as for example a user's upper body. The system may include a limb identification engine which may use various methods to evaluate, identify and track positions of body parts of one or more users in a scene. In examples, further processing efficiency may be achieved by segmenting the field of view into smaller zones, and focusing on one zone at a time. Moreover, each zone may have its own set of predefined gestures which are recognized. | 06-28-2012 |
20130243257 | SYSTEMS AND METHODS FOR TRACKING A MODEL - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A model may be adjusted based on a location or position of one or more extremities estimated or determined for a human target in the grid of voxels. The model may also be adjusted based on a default location or position of the model in a default pose such as a T-pose, a DaVinci pose, and/or a natural pose. | 09-19-2013 |
20150098619 | METHODS AND SYSTEMS FOR DETERMINING AND TRACKING EXTREMITIES OF A TARGET - An image such as a depth image of a scene may be received, observed, or captured by a device. A grid of voxels may then be generated based on the depth image such that the depth image may be downsampled. A background included in the grid of voxels may also be removed to isolate one or more voxels associated with a foreground object such as a human target. A location or position of one or more extremities of the isolated human target may then be determined. | 04-09-2015 |
20080240250 | Regions of interest for quality adjustments - Quality settings established by an encoder are adjusted based on information associated with regions of interest (“ROIs”). For example, quantization step sizes can be reduced (to improve quality) or increased (to reduce bit rate). ROIs can be identified and quality settings can be adjusted based on input received from a user interface. An overlap setting can be determined for a portion of a picture that corresponds to an ROI overlap area. For example, an overlap setting is chosen from step sizes corresponding to a first overlapping ROI and a second overlapping ROI, or from relative reductions in step size corresponding to the first ROI and the second ROI. ROIs can be parameterized by information (e.g., using data structures) that indicates spatial dimensions of the ROIs and quality adjustment information (e.g., dead zone information, step size information, and quantization mode information). | 10-02-2008 |
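The overlap rule in this abstract (choosing a quantization step for a picture area covered by two ROIs) can be sketched as below. Taking the smallest step size in the overlap is the "best quality wins" reading of the abstract; the ROI layout and concrete step values are made up for illustration.

```python
def contains(roi, block):
    """True if the block coordinate falls inside the ROI rectangle."""
    x, y = block
    return roi["x0"] <= x < roi["x1"] and roi["y0"] <= y < roi["y1"]

def step_size_for_block(block, rois, default_step=8):
    """Return the quantization step for a block: the smallest
    (highest-quality) step among the ROIs that contain it."""
    steps = [roi["step"] for roi in rois if contains(roi, block)]
    return min(steps) if steps else default_step

rois = [
    {"x0": 0, "y0": 0, "x1": 4, "y1": 4, "step": 4},  # first ROI
    {"x0": 2, "y0": 2, "x1": 6, "y1": 6, "step": 2},  # second, overlapping ROI
]

# Block (3, 3) lies in the overlap area, so it gets the smaller step, 2.
```

The same structure extends to the abstract's other parameterization, e.g. choosing the larger *relative reduction* in step size for the overlap instead of the smaller absolute step.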
20090093164 | HIGH-DEFINITION CONNECTOR FOR TELEVISIONS - A connector is disclosed that includes one or more pins allowing power to be provided to a connector board attached externally to the television set. In one embodiment, the connector has one or more USB pins to allow serial communication between the television set and the connector board. In another embodiment, the connector has one or more pins to allow communication of television setup information to the connector board. In yet another embodiment, the connector board may use the setup information provided from the television to perform audio processing and deliver enhanced audio sound to the speaker system associated with the television. The connector board may also wirelessly communicate with a personal computer, thereby coupling the personal computer to the television set. | 04-09-2009 |
20090326962 | QUALITY IMPROVEMENT TECHNIQUES IN AN AUDIO ENCODER - An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels. In the band truncation technique, the audio encoder performs open-loop band truncation at a cut-off frequency based on a target perceptual quality measure. In multi-channel rematrixing technique, the audio encoder suppresses certain coefficients of a difference channel by scaling according to a scale factor, which is based on current average levels of perceptual quality, current rate control buffer fullness, coding mode, and the amount of channel separation in the source. In the header reduction technique, the audio encoder selectively modifies the quantization step size of zeroed quantization bands so as to encode in fewer frame header bits. | 12-31-2009 |
20100135412 | MEDIA CODING FOR LOSS RECOVERY WITH REMOTELY PREDICTED DATA UNITS - An improved loss recovery method for coding streaming media classifies each data unit in the media stream as an independent data unit (I unit), a remotely predicted unit (R unit) or a predicted data unit (P unit). Each of these units is organized into independent segments having an I unit, multiple P units and R units interspersed among the P units. The beginning of each segment is the start of a random access point, while each R unit provides a loss recovery point that can be placed independently of the I unit. This approach separates the random access point from the loss recovery points provided by the R units, and makes the stream more impervious to data losses without substantially impacting coding efficiency. The most important data units are transmitted with the most reliability to ensure that the majority of the data received by the client is usable. The I units are the least sensitive to transmission losses because they are coded using only their own data. While they provide the best coding efficiency, the P units are the most sensitive to data loss because the loss of one P unit renders useless all of the P units that depend on it. The remotely predicted units are dependent on the I unit, or in an alternative implementation, on another R unit. | 06-03-2010 |
20100149301 | Video Conferencing Subscription Using Multiple Bit Rate Streams - Subscriptions in a video conference may be provided using multiple bit rate streams. A video conference server may receive video streams from each client in a video conference and may receive subscription requests from each client. The subscription requests may include requests to see video streams from specific other clients at a given resolution and/or frame rate. The video conference server may match up the received video streams with the subscription requests in order to send the subscribing clients their desired video streams. The server may also be able to request different versions of video streams from participants (e.g. different resolutions) and/or alter the video streams in order to better comply with the subscription request. | 06-17-2010 |
20100153574 | Video Conference Rate Matching - Video conference rate matching may be provided. A video conference server may receive video source streams from clients on a video conference. The server may analyze each client's capabilities and choose a video stream to send to each client based on those capabilities. For example, a client capable of encoding and decoding a high definition video stream may provide three source video streams—a high definition stream, a medium resolution stream, and a low resolution stream. The server may send only the low resolution stream to a client with a low amount of available bandwidth. The server may send the medium resolution stream to another client with sufficient bandwidth for the high definition stream, but which lacks the ability to decode the high definition stream. | 06-17-2010 |
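The per-client rate matching described above reduces to filtering the available source streams by each client's capabilities. A rough sketch (the `bitrate`, `height`, and capability field names are assumptions, not from the patent):

```python
def choose_stream(streams, client):
    """Pick the highest-bitrate stream the client can both receive
    (enough bandwidth) and decode (resolution within its limit)."""
    usable = [s for s in streams
              if s["bitrate"] <= client["bandwidth"]
              and s["height"] <= client["max_decode_height"]]
    return max(usable, key=lambda s: s["bitrate"]) if usable else None

# Three source streams from one high-capability participant (kbps).
streams = [
    {"name": "hd",  "height": 1080, "bitrate": 4000},
    {"name": "med", "height": 480,  "bitrate": 1200},
    {"name": "low", "height": 240,  "bitrate": 300},
]

# Plenty of bandwidth, but this client cannot decode HD -> medium stream.
client = {"bandwidth": 5000, "max_decode_height": 480}
```

This mirrors the abstract's example: a bandwidth-limited client gets the low-resolution stream, while a client that has bandwidth for HD but cannot decode it gets the medium stream.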
20100296575 | OPTIMIZED ALLOCATION OF MULTI-CORE COMPUTATION FOR VIDEO ENCODING - Video encoding computations are optimized by dynamically adjusting slice patterns of video frames based on complexity of each frame and allocating multi-core threading based on the slices. The complexity may be based on predefined parameters such as color, motion, and comparable ones for each slice. Allocation is determined based on capacity and queue of each processing core such that overall computation performance for video encoding is improved. | 11-25-2010 |
20110166864 | QUANTIZATION MATRICES FOR DIGITAL AUDIO - Quantization matrices facilitate digital audio encoding and decoding. An audio encoder generates and compresses quantization matrices; an audio decoder decompresses and applies the quantization matrices. The invention includes several techniques and tools, which can be used in combination or separately. For example, the audio encoder can generate quantization matrices from critical band patterns for blocks of audio data. The encoder can compute the quantization matrices directly from the critical band patterns, which can be computed from the same audio data that is being compressed. The audio encoder/decoder can use different modes for generating/applying quantization matrices depending on the coding channel mode of multi-channel audio data. The audio encoder/decoder can use different compression/decompression modes for the quantization matrices, including a parametric compression/decompression mode. | 07-07-2011 |
20110310216 | COMBINING MULTIPLE BIT RATE AND SCALABLE VIDEO CODING - Video streams are generated using a combination of Multiple Bit Rate (MBR) encoding and Scalable Video Coding (SVC). Capabilities and requests of the clients are used in determining the video streams to generate as well as what video streams to deliver to the clients. The clients are placed into groups based on a resolution capability of the client. For each resolution grouping, MBR is used for generating spatial streams and SVC is used for generating temporal and quality streams. | 12-22-2011 |
20110310217 | REDUCING USE OF PERIODIC KEY FRAMES IN VIDEO CONFERENCING - The generation and delivery of key frames to clients of a video conference are performed in response to a need for a synchronization point. Instead of automatically sending a key frame periodically to each of the clients in the video conference, a key frame is sent to one or more clients upon the occurrence of an event in the video conference. For example, a key frame may be sent to a client when the client joins the video conference. A key frame may also be sent to a client that has packet loss, upon the request of a client, a speaker change within the video conference, when a new stream is added by the client, and the like. The clients in the video conference that are not affected by the event continue to receive predicted frames. | 12-22-2011 |
20120269266 | REGIONS OF INTEREST FOR QUALITY ADJUSTMENTS - Quality settings established by an encoder are adjusted based on information associated with regions of interest (“ROIs”). For example, quantization step sizes can be reduced (to improve quality) or increased (to reduce bit rate). ROIs can be identified and quality settings can be adjusted based on input received from a user interface. An overlap setting can be determined for a portion of a picture that corresponds to an ROI overlap area. For example, an overlap setting is chosen from step sizes corresponding to a first overlapping ROI and a second overlapping ROI, or from relative reductions in step size corresponding to the first ROI and the second ROI. ROIs can be parameterized by information (e.g., using data structures) that indicates spatial dimensions of the ROIs and quality adjustment information (e.g., dead zone information, step size information, and quantization mode information). | 10-25-2012 |
20120307890 | TECHNIQUES FOR ADAPTIVE ROUNDING OFFSET IN VIDEO ENCODING - Techniques for adaptive rounding offset in video encoding are described. An apparatus may comprise a rounding offset adaptation component operative to adjust a quantization parameter rounding factor for a current macroblock of a current frame of a video stream being compressed by a video encoding system. Other embodiments are described and claimed. | 12-06-2012 |
20130039414 | EFFICIENT MACROBLOCK HEADER CODING FOR VIDEO COMPRESSION - The coded block parameters used to code blocks of image samples into structures called macroblocks are compressed more efficiently by exploiting the correlation between chrominance and luminance blocks in each macroblock. In particular, the coded block pattern for chrominance and luminance are combined into a single parameter for the macroblock and jointly coded with a single variable length code. To further enhance coding efficiency, the spatial coherence of coded block patterns can be exploited by using spatial prediction to compute predicted values for coded block pattern parameters. | 02-14-2013 |
20130055326 | TECHNIQUES FOR DYNAMIC SWITCHING BETWEEN CODED BITSTREAMS - Techniques for dynamic switching in coded bitstreams are described. An apparatus may comprise a switching component operative to determine a timepoint to switch from broadcasting a first video stream to broadcasting a second video stream, the first video stream a first encoding of a video source at a first quality level and the second video stream a second encoding of the video source at a second quality level. Other embodiments are described and claimed. | 02-28-2013 |
20130070859 | MULTI-LAYER ENCODING AND DECODING - Innovations described herein provide a generic encoding and decoding framework that includes some features of simulcast and some features of scalable video coding. For example, a bitstream multiplexer multiplexes component bitstreams into a multi-layer encoding (MLE) bitstream that provides temporal scalability, spatial resolution scalability and/or signal to noise ratio scalability. Each of the component bitstreams provides an alternative version of input video, and a given component bitstream can be a non-scalable bitstream or scalable bitstream. The multiplexer follows composition rules for the MLE bitstream and may rewrite values of certain syntax elements of component bitstreams using an approach that avoids bit shifting operations. A corresponding demultiplexer receives an MLE bitstream that includes component bitstreams and demultiplexes at least part of at least one of the component bitstreams from the MLE bitstream, following decomposition rules for the demultiplexing. | 03-21-2013 |
20130114718 | ADDING TEMPORAL SCALABILITY TO A NON-SCALABLE BITSTREAM - Innovations described herein facilitate the addition of temporal scalability to non-scalable bitstreams. For example, a bitstream rewriter receives units of encoded video data for a non-scalable bitstream from components of a hardware-based encoder. The bitstream rewriter changes at least some of the units of encoded video data so as to produce a scalable bitstream with temporal scalability. In doing so, the bitstream rewriter can associate an original sequence parameter set (SPS) and original picture parameter set (PPS) with pictures for a temporal base layer, and associate a new SPS and new PPS with pictures for a temporal enhancement layer. The bitstream rewriter can also alter syntax elements in the units of encoded video data, for example, changing syntax elements in a slice header in ways that avoid bit shifting operations for following coded slice data for a unit of encoded video data for the temporal enhancement layer. | 05-09-2013 |
20130156101 | HARDWARE-ACCELERATED DECODING OF SCALABLE VIDEO BITSTREAMS - In various respects, hardware-accelerated decoding is adapted for decoding of video that has been encoded using scalable video coding. For example, for a given picture to be decoded, a host decoder determines whether a corresponding base picture will be stored for use as a reference picture. If so, the host decoder directs decoding with an accelerator such that the some of the same decoding operations can be used for the given picture and the reference base picture. Or, as another example, the host decoder groups encoded data associated with a given layer representation in buffers. The host decoder provides the encoded data for the layer to the accelerator. The host decoder repeats the process layer-after-layer in the order that layers appear in the bitstream, according to a defined call pattern for an acceleration interface, which helps the accelerator determine the layers with which buffers are associated. | 06-20-2013 |
20130177071 | CAPABILITY ADVERTISEMENT, CONFIGURATION AND CONTROL FOR VIDEO CODING AND DECODING - Innovations described herein provide a framework for advertising encoder capabilities, initializing encoder configuration, and signaling run-time control messages for video coding and decoding. For example, an encoding controller receives a request for encoder capability data from a decoding host controller, determines the capability data, and sends the capability data in reply. The capability data can include data that indicate a number of bitstreams, each providing an alternative version of input video, as well as data that indicate scalable video coding capabilities. The decoding host controller creates stream configuration request data based on the encoder capability data, and sends the configuration request data to the encoding controller. During decoding, the decoding host controller can create and send a control message for run-time control of encoding, where the control message includes a stream identifier for a bitstream and layer identifiers for a given layer of the bitstream. | 07-11-2013 |
20130208075 | ENCODING PROCESSING FOR CONFERENCING SYSTEMS - Optimization of conference call encoding processes is provided. A first client of a multi-party conference call may receive client capability data, including video scalability support, from each of the other clients to the conference call. Based on the client capability data and the transmission capabilities of the first client, including video scalability support, the first client may determine a total number of data streams and properties for each data stream, such that the total number of data streams and the plurality of properties for each data stream are optimized and supported by the respective client capability data and the transmission capabilities. Subsequently, the first client generates one or more data streams according to the total number of data streams and the properties that were determined for each data stream and transmits the one or more data streams to the other clients of the conference call. | 08-15-2013 |
20130208809 | MULTI-LAYER RATE CONTROL - Concepts and technologies are described herein for multi-layer rate control. In accordance with the concepts and technologies disclosed herein, a video server obtains video data and encodes the video data into a multi-layer video stream. Layers of the video stream can be output to buffers and the buffers can be monitored to determine bit usage. A rate controller can obtain bit usage feedback for each layer of the encoded video stream and determine, based upon the bit usage feedback, a quantization parameter associated with each layer of the encoded video stream. In determining the quantization parameters, the rate controller can consider not only bitrates of the entire encoded video stream, but also bitrates and bit usage feedback associated with each layer of the encoded video stream. Further encoding can be based upon the quantization parameters determined by the video server. | 08-15-2013 |
20130208901 | QUANTIZATION MATRICES FOR DIGITAL AUDIO - Quantization matrices facilitate digital audio encoding and decoding. An audio encoder generates and compresses quantization matrices; an audio decoder decompresses and applies the quantization matrices. The invention includes several techniques and tools, which can be used in combination or separately. For example, the audio encoder can generate quantization matrices from critical band patterns for blocks of audio data. The encoder can compute the quantization matrices directly from the critical band patterns, which can be computed from the same audio data that is being compressed. The audio encoder/decoder can use different modes for generating/applying quantization matrices depending on the coding channel mode of multi-channel audio data. The audio encoder/decoder can use different compression/decompression modes for the quantization matrices, including a parametric compression/decompression mode. | 08-15-2013 |
20130223524 | DYNAMIC INSERTION OF SYNCHRONIZATION PREDICTED VIDEO FRAMES - A video bitstream can be encoded and sent over a computer network to a decoding computer system. The bitstream can follow a regular prediction structure when an encoding computer system is not notified of lost data from the bitstream. A notification of lost data in the bitstream can be received. The lost data can include at least a portion of a reference frame of the bitstream. In response, a synchronization predicted frame can be dynamically encoded with a prediction that references one or more other previously-sent frames in the bitstream and that does not reference the lost data. The synchronization predicted frame can be inserted in the bitstream in a position where the regular prediction structure would have dictated inserting a different predicted frame with a prediction that would have referenced the lost data according to the regular prediction structure. | 08-29-2013 |
20130301704 | VIDEO CODING / DECODING WITH RE-ORIENTED TRANSFORMS AND SUB-BLOCK TRANSFORM SIZES - Techniques and tools for video coding/decoding with sub-block transform coding/decoding and re-oriented transforms are described. For example, a video encoder adaptively switches between 8×8, 8×4, and 4×8 DCTs when encoding 8×8 prediction residual blocks; a corresponding video decoder switches between 8×8, 8×4, and 4×8 inverse DCTs during decoding. The video encoder may determine the transform sizes as well as switching levels (e.g., frame, macroblock, or block) in a closed loop evaluation of the different transform sizes and switching levels. When a video encoder or decoder uses spatial extrapolation from pixel values in a causal neighborhood to predict pixel values of a block of pixels, the encoder/decoder can use a re-oriented transform to address non-stationarity of prediction residual values. | 11-14-2013 |
20130301732 | VIDEO CODING / DECODING WITH MOTION RESOLUTION SWITCHING AND SUB-BLOCK TRANSFORM SIZES - Techniques and tools for video coding/decoding with motion resolution switching and sub-block transform coding/decoding are described. For example, a video encoder adaptively switches the resolution of motion estimation and compensation between quarter-pixel and half-pixel resolutions; a corresponding video decoder adaptively switches the resolution of motion compensation between quarter-pixel and half-pixel resolutions. For sub-block transform sizes, for example, a video encoder adaptively switches between 8×8, 8×4, and 4×8 DCTs when encoding 8×8 prediction residual blocks; a corresponding video decoder switches between 8×8, 8×4, and 4×8 inverse DCTs during decoding. | 11-14-2013 |
20130329779 | MEDIA CODING FOR LOSS RECOVERY WITH REMOTELY PREDICTED DATA UNITS - An improved loss recovery method for coding streaming media classifies each data unit in the media stream as an independent data unit (I unit), a remotely predicted unit (R unit) or a predicted data unit (P unit). Each of these units is organized into independent segments having an I unit, multiple P units and R units interspersed among the P units. The beginning of each segment is the start of a random access point, while each R unit provides a loss recovery point that can be placed independently of the I unit. This approach separates the random access point from the loss recovery points provided by the R units, and makes the stream more impervious to data losses without substantially impacting coding efficiency. The most important data units are transmitted with the most reliability to ensure that the majority of the data received by the client is usable. The I units are the least sensitive to transmission losses because they are coded using only their own data. While they provide the best coding efficiency, the P units are the most sensitive to data loss because the loss of one P unit renders useless all of the P units that depend on it. The remotely predicted units are dependent on the I unit, or in an alternative implementation, on another R unit. | 12-12-2013 |
20140039884 | QUALITY IMPROVEMENT TECHNIQUES IN AN AUDIO ENCODER - An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels. In the band truncation technique, the audio encoder performs open-loop band truncation at a cut-off frequency based on a target perceptual quality measure. In the multi-channel rematrixing technique, the audio encoder suppresses certain coefficients of a difference channel by scaling according to a scale factor, which is based on current average levels of perceptual quality, current rate control buffer fullness, coding mode, and the amount of channel separation in the source. In the header reduction technique, the audio encoder selectively modifies the quantization step size of zeroed quantization bands so as to encode in fewer frame header bits. | 02-06-2014 |
20140307776 | VIDEO CODING / DECODING WITH RE-ORIENTED TRANSFORMS AND SUB-BLOCK TRANSFORM SIZES - Techniques and tools for video coding/decoding with sub-block transform coding/decoding and re-oriented transforms are described. For example, a video encoder adaptively switches between 8×8, 8×4, and 4×8 DCTs when encoding 8×8 prediction residual blocks; a corresponding video decoder switches between 8×8, 8×4, and 4×8 inverse DCTs during decoding. The video encoder may determine the transform sizes as well as switching levels (e.g., frame, macroblock, or block) in a closed loop evaluation of the different transform sizes and switching levels. When a video encoder or decoder uses spatial extrapolation from pixel values in a causal neighborhood to predict pixel values of a block of pixels, the encoder/decoder can use a re-oriented transform to address non-stationarity of prediction residual values. | 10-16-2014 |
20140316788 | QUALITY IMPROVEMENT TECHNIQUES IN AN AUDIO ENCODER - An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels. In the band truncation technique, the audio encoder performs open-loop band truncation at a cut-off frequency based on a target perceptual quality measure. In the multi-channel rematrixing technique, the audio encoder suppresses certain coefficients of a difference channel by scaling according to a scale factor, which is based on current average levels of perceptual quality, current rate control buffer fullness, coding mode, and the amount of channel separation in the source. In the header reduction technique, the audio encoder selectively modifies the quantization step size of zeroed quantization bands so as to encode in fewer frame header bits. | 10-23-2014 |
20140369405 | Multi-Layered Rate Control for Scalable Video Coding - Multi-layered rate control for scalable video coding is provided. A parameter value may be calculated based on a current layer target bit rate and a current layer buffer state for a frame in a video stream. The frame may include a lower layer and one or more higher layers. A determination may then be made as to whether the current layer is the lower layer. If the current layer is the lower layer, a determination may then be made as to whether a coupling request has been received from a higher layer in the frame. If the coupling request has been received from the higher layer in the frame, the parameter value for the current layer may be increased based on a buffer state threshold value of the higher layer in the frame. | 12-18-2014 |
20150063459 | VIDEO CODING / DECODING WITH MOTION RESOLUTION SWITCHING AND SUB-BLOCK TRANSFORM SIZES - Techniques and tools for video coding/decoding with motion resolution switching and sub-block transform coding/decoding are described. For example, a video encoder adaptively switches the resolution of motion estimation and compensation between quarter-pixel and half-pixel resolutions; a corresponding video decoder adaptively switches the resolution of motion compensation between quarter-pixel and half-pixel resolutions. For sub-block transform sizes, for example, a video encoder adaptively switches between 8×8, 8×4, and 4×8 DCTs when encoding 8×8 prediction residual blocks; a corresponding video decoder switches between 8×8, 8×4, and 4×8 inverse DCTs during decoding. | 03-05-2015 |
20150195525 | SELECTION OF MOTION VECTOR PRECISION - Approaches to selection of motion vector (“MV”) precision during video encoding are presented. These approaches can facilitate compression that is effective in terms of rate-distortion performance and/or computational efficiency. For example, a video encoder determines an MV precision for a unit of video from among multiple MV precisions, which include one or more fractional-sample MV precisions and integer-sample MV precision. The video encoder can identify a set of MV values having a fractional-sample MV precision, then select the MV precision for the unit based at least in part on prevalence of MV values (within the set) having a fractional part of zero. Or, the video encoder can perform rate-distortion analysis, where the rate-distortion analysis is biased towards the integer-sample MV precision. Or, the video encoder can collect information about the video and select the MV precision for the unit based at least in part on the collected information. | 07-09-2015 |
20150195527 | Representing Motion Vectors in an Encoded Bitstream - A format for use in encoding moving image data, comprising: a sequence of frames including a plurality of the frames in which at least a region is encoded using motion estimation; a respective set of motion vector values representing motion vectors of the motion estimation for each respective one of these frames or each respective one of one or more regions within each of such frames; and at least one respective indicator associated with each of the respective frames or regions, indicating whether the respective motion vector values of the respective frame or region are encoded at a first resolution or a second resolution. | 07-09-2015 |
20150195557 | Encoding Screen Capture Data - An input of an encoder receives moving image data comprising a sequence of frames to be encoded, each frame comprising a plurality of blocks in two dimensions with each block comprising a plurality of pixels in those two dimensions. A motion prediction module performs encoding by, for at least part of each of a plurality of said frames, coding each block relative to a respective reference portion of another frame of the sequence, with the respective reference portion being offset from the block by a respective motion vector. According to the present disclosure, the moving image data of this plurality of frames comprises a screen capture stream, and the motion prediction module is configured to restrict each of the motion vectors of the screen capture stream to an integer number of pixels in at least one of said dimensions. | 07-09-2015 |