Patent application number | Description | Published |
20110063294 | TESSELLATION SHADER INTER-THREAD COORDINATION - One embodiment of the present invention sets forth a technique for performing a computer-implemented method for tessellating patches. An input block is received that defines a plurality of input patch attributes for each patch as well as instructions for processing each input patch. A plurality of threads is launched to execute the instructions to generate each vertex of a corresponding output patch based on the input patch. Reads of values written during instruction execution are synchronized so threads can read and further process the values of other threads. An output patch is then assembled from the outputs of each of the threads; and emitting the output patch for further processing. | 03-17-2011 |
20110063296 | Global Stores and Atomic Operations - One embodiment of the present invention sets forth a method for storing processed data within buffer objects stored in buffer object memory from within shader engines executing on a GPU. The method comprises the steps of receiving a stream of one or more shading program commands via a graphics driver, executing, within a shader engine, at least one of the one or more shading program commands to generate processed data, determining from the stream of one or more shading program commands an address associated with a first data object stored within the buffer memory, and storing, from within the shader engine, the processed data in the first data object stored within the buffer memory. | 03-17-2011 |
20110063313 | MEMORY COHERENCY IN GRAPHICS COMMAND STREAMS AND SHADERS - One embodiment of the present invention sets forth a technique for performing a computer-implemented method that controls memory access operations. A stream of graphics commands includes at least one memory barrier command. Each memory barrier command in the stream of graphics command delays memory access operations scheduled for any command specified after the memory barrier command until all memory access operations scheduled for commands specified prior to the memory barrier command have completely executed. | 03-17-2011 |
20110063318 | Image Loads, Stores and Atomic Operations - One embodiment of the present invention sets forth a method for accessing texture objects stored within a texture memory. The method comprises the steps of receiving a texture bind request from an application program, wherein the texture bind request includes an object identifier that identifies a first texture object stored in the texture memory and an image identifier that identifies a first image unit, binding the first texture object to the first image unit based on the texture bind request, receiving, within a shader engine, a first shading program command from the application program for performing a first memory access operation on the first texture object, wherein the memory access operation is a store operation or atomic operation to an arbitrary location in the image, and performing, within the shader engine, the first memory access operation on the first texture object via the first image unit. | 03-17-2011 |
20110072211 | Hardware For Parallel Command List Generation - A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit. | 03-24-2011 |
20110072245 | HARDWARE FOR PARALLEL COMMAND LIST GENERATION - A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit. | 03-24-2011 |
20110080407 | PIXEL SHADER OUTPUT MAP - One embodiment of the present invention sets forth a technique for storing only the enabled components for each enabled vector and writing only enabled components to one or more specified render targets. A shader program header (SPH) file provides per-component mask bits for each render target. Each enabled mask bit indicates that the pixel shader generates the corresponding component as an output to the raster operations unit. In the hardware, the per-component mask bits are combined with the applications programming interface (API)-level per-component write masks to determine the components that are updated by the shader program. The combined mask is used as the write enable bits for components in one or more render targets. One advantage of the combined mask is that the components that are not updated are not forwarded from the pixel shader to the ROP, thereby saving bandwidth between those processing units. | 04-07-2011 |
20110080416 | Methods to Facilitate Primitive Batching - One embodiment of the present invention sets forth a technique for splitting a set of vertices into a plurality of batches for processing. The method includes receiving one or more primitives each containing an associated set of vertices. For each of the one or more primitives, one or more vertices are gathered from the set of vertices, the vertices are arranged into one or more batches, the batch is routed to a processing pipeline line to process each batch as a separate primitive, and the one or more batches are processed to produce results identical to those of processing the entire primitive as a single entity. | 04-07-2011 |
20110084976 | Shader Program Headers - One embodiment of the present invention sets forth a technique for configuring a graphics processing pipeline (GPP) to process data according to one or more shader programs. The method includes receiving a plurality of pointers, where each pointer references a different shader program header (SPH) included in a plurality of SPHs, and each SPH is associated with a different shader program that executes within the GPP. For each SPH included in the plurality of SPHs, one or more GPP configuration parameters included in the SPH are identified, and the GPP is adjusted based on the one or more GPP configuration parameters. | 04-14-2011 |
20110084977 | STATE SHADOWING TO SUPPORT A MULTI-THREADED DRIVER ENVIRONMENT - One embodiment of the present invention sets forth a technique for tracking and filtering state change methods provided to a graphics pipeline. State shadow circuitry at the start of the graphics pipeline may be configured in different modes. A track mode is used to capture the current state by storing state change methods that are transmitted to the graphics pipeline. A passthrough mode is used to provide different state data to the graphics pipeline without updating the current state stored in the state shadow circuitry. A replay mode is used to restore the current state to the graphics pipeline using the state shadow circuitry. Additionally, the state shadow circuitry may also be configured to filter the state change methods that are transmitted to graphics pipeline by removing redundant state change methods. | 04-14-2011 |
20110242117 | BINDLESS TEXTURE AND IMAGE API - One embodiment of the present invention sets for a method for accessing data objects stored in a memory that is accessible by a graphics processing unit (GPU). The method comprises the steps of creating a data object in the memory based on a command received from an application program, wherein the data object is organized non-linearly in the memory, transmitting a first handle associated with the data object to the application program such that data associated with different draw commands can be accessed by the GPU, wherein the first handle includes an address related to the location of the data object in the memory, receiving a first draw command as well as the first handle from the application program, and transmitting the first draw command and the first handle to the GPU for processing. | 10-06-2011 |
20110242119 | GPU Work Creation and Stateless Graphics in OPENGL - One embodiment of the present invention sets forth a method for generating work to be processed by a graphics pipeline residing within a graphics processor. The method includes the steps of receiving an indication that a first graphics workload is to be submitted to a command queue associated with the graphics processor, allocating a first portion of shader accessible memory for one or more units of state information that are necessary for processing the first graphics workload, populating the first portion of shader accessible memory with the one or more units of state information, and transmitting to the command queue of the graphics processor the one or more units of state information stored within the first portion of shader accessible memory, wherein the first graphics workload is processed within the graphics pipeline based on the one or more units of state information. | 10-06-2011 |
20110285742 | SYSTEM AND METHOD FOR PATH RENDERING WITH MULTIPLE STENCIL SAMPLES PER COLOR SAMPLE - One embodiment of the present invention sets forth a technique for improving path rendering on computer systems by efficiently representing and computing sub-pixel coverage for path objects. A stencil buffer is configured to store multiple stencil samples per pixel stored in an image buffer. The stencil samples undergo stencil testing to produce a set of Boolean values per pixel, which collectively define a geometric coverage percentage for the pixel. The coverage percentage is used to modulate a color value for the pixel. The modulated color value is then blended into the image buffer as an anti-aliased pixel. This technique advantageously enables efficient anti-aliasing for path rendering. | 11-24-2011 |
20120072701 | Method Macro Expander - One embodiment of the present invention sets forth a [TODO once claims are reviewed] | 03-22-2012 |
20140267315 | MULTI-SAMPLE SURFACE PROCESSING USING ONE SAMPLE - A system, method, and computer program product are provided for multi-sample processing. The multi-sample pixel data is received and an encoding state associated with the multi-sample pixel data is determined. Data for one sample of a multi-sample pixel and the encoding state are provided to a processing unit. The one sample of the multi-sample pixel is processed by the processing unit to generate processed data for the one sample that represents processed multi-sample pixel data for all samples of the multi-sample pixel or two or more samples of the multi-sample pixel. | 09-18-2014 |
20140267356 | MULTI-SAMPLE SURFACE PROCESSING USING SAMPLE SUBSETS - A system, method, and computer program product are provided for multi-sample processing. The multi-sample pixel data is received and is analyzed to identify subsets of samples of a multi-sample pixel that have equal data, such that data for one sample in a subset represents multi-sample pixel data for all samples in the subset. An encoding state is generated that indicates which samples of the multi-sample pixel are included in each one of the subsets. | 09-18-2014 |
20140267376 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR ACCESSING MULTI-SAMPLE SURFACES - A system, method, and computer program product are provided for accessing multi-sample surfaces. A multi-sample store instruction that specifies data for a single sample of a multi-sample pixel and a sample mask is received and the data for the single sample is stored to each sample of the multi-sample pixel that is enabled according to the sample mask. A multi-sample load instruction that specifies a multi-sample pixel is received, and, in response to executing the multi-sample load instruction, data for one sample of the multi-sample pixel is received. A determination is made that the data for the one sample of the multi-sample pixel represents multi-sample pixel data for at least one additional sample of the multi-sample pixel. | 09-18-2014 |
20150054836 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR REDISTRIBUTING A MULTI-SAMPLE PROCESSING WORKLOAD BETWEEN THREADS - A system, method, and computer program product are provided for redistributing multi-sample processing workloads between threads. A workload for a plurality of multi-sample pixels is received and each thread in a parallel thread group is associated with a corresponding multi-sample pixel of the plurality of pixels. The workload is redistributed between the threads in the parallel thread group based on a characteristic of the workload and the workload is processed by the parallel thread group. In one embodiment, the characteristic is rasterized coverage information for the plurality of multi-sample pixels. | 02-26-2015 |