Entries |
Document | Title | Date |
20080198168 | EFFICIENT 2-D AND 3-D GRAPHICS PROCESSING - Techniques for supporting both 2-D and 3-D graphics are described. A graphics processing unit (GPU) may perform 3-D graphics processing in accordance with a 3-D graphics pipeline to render 3-D images and may also perform 2-D graphics processing in accordance with a 2-D graphics pipeline to render 2-D images. Each stage of the 2-D graphics pipeline may be mapped to at least one stage of the 3-D graphics pipeline. For example, a clipping, masking and scissoring stage in 2-D graphics may be mapped to a depth test stage in 3-D graphics. Coverage values for pixels within paths in 2-D graphics may be determined using rasterization and depth test stages in 3-D graphics. A paint generation stage and an image interpolation stage in 2-D graphics may be mapped to a fragment shader stage in 3-D graphics. A blending stage in 2-D graphics may be mapped to a blending stage in 3-D graphics. | 08-21-2008 |
20080204461 | Auto Software Configurable Register Address Space For Low Power Programmable Processor - A configurable graphics pipeline has more than one possible process flow of pixel packets through elements of the graphics pipeline. In one embodiment, a data packet triggers an element of the graphics pipeline to discover an identifier. | 08-28-2008 |
20080266301 | DISPLAY CONTROLLER OPERATING MODE USING MULTIPLE DATA BUFFERS - A display controller unit for controlling a display on a display panel comprises a first set of registers to hold data to be displayed and a second set of registers loadable from the first set of registers. A set of multiplexers has first data inputs coupled to the first set of registers, second data inputs coupled to the second set of registers, and select inputs. Logic circuitry is coupled to the outputs of the set of multiplexers and to the select inputs of the multiplexers; the logic circuitry provides select information to the set of multiplexers and provides waveforms to the display panel to selectively display data from the first set of registers and the second set of registers in accordance with the select information. | 10-30-2008 |
20080291208 | METHOD AND SYSTEM FOR PROCESSING DATA VIA A 3D PIPELINE COUPLED TO A GENERIC VIDEO PROCESSING UNIT - Methods and systems for coupling a 3D pipeline to a generic video processing unit (VPU) are disclosed. Aspects of one method may include concurrently accessing different portions of stored graphics data by the generic VPU and the 3D pipeline within a chip. The graphics data may be processed by the VPU and the 3D pipeline. The VPU may be able to perform, for example, vector processing and scalar processing. The vector processing may be performed on the graphics data by a plurality of pixel processors. The graphics data may be stored and/or accessed in a vector register file (VRF), which may comprise a plurality of banks. Graphics data may be stored as a plurality of vectors in each of the banks in the VRF. The graphics data may be stored and/or read a vector at a time by the VPU and the 3D pipeline. Each vector may comprise, for example, 512 bits. | 11-27-2008 |
20090002378 | Pipeline Architecture for Video Encoder and Decoder - An image data-processing apparatus ( | 01-01-2009 |
20090027404 | IMAGE PROCESSING METHOD AND APPARATUS - An image processing apparatus comprises a plurality of processing blocks connected in series, and each respective processing block comprises a processor. In each respective processing block, the processor employs data input into that processing block to perform an image process upon the data. Also, each processing block performs a process upon the processor in response to a command input into the processing block. Each processing block holds the output corresponding to a command that is input after the data until the processor has finished processing the data that was input before that command, so that the output of the processor that processed the data and the output corresponding to the command are output from the processing block in the same order in which the data and the command were input. | 01-29-2009 |
20090046103 | Shared readable and writeable global values in a graphics processor unit pipeline - An arithmetic logic stage in a graphics processor unit includes arithmetic logic units (ALUs) and global registers. The registers contain global values for a group of pixels. Global values may be read from any of the registers, regardless of which of the pixels is being operated on by the ALUs. However, when writing results of the ALU operations, only some of the global registers are candidates to be written to, depending on the pixel number. Accordingly, overwriting of data is prevented. | 02-19-2009 |
20090079747 | Distributed Antialiasing In A Multiprocessor Graphics System - Multiprocessor graphics systems support distributed antialiasing. In one embodiment, two (or more) graphics processors each render a version of the same image, with a difference in the sampling location (or locations) used for each pixel. A display head combines corresponding pixels generated by different graphics processors to produce an antialiased image. This distributed antialiasing technique can be scaled to any number of graphics processors. | 03-26-2009 |
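The combining step this abstract describes can be sketched in a few lines: each GPU renders the same scene with a slightly different sub-pixel sample offset, and the display head averages corresponding pixels. The scene function, the specific offsets, and the 1-D pixel row below are illustrative assumptions, not details from the patent.

```python
def render(scene, width, height, offset):
    """Render by sampling the scene at one jittered location per pixel."""
    ox, oy = offset
    return [[scene(x + ox, y + oy) for x in range(width)]
            for y in range(height)]

def combine(images):
    """Display-head combine: average corresponding pixels across renders."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [[sum(img[y][x] for img in images) / n for x in range(width)]
            for y in range(height)]

# A hypothetical scene: a vertical edge at x = 1.5 (white left, black right).
scene = lambda x, y: 1.0 if x < 1.5 else 0.0

gpu_a = render(scene, 4, 1, (0.25, 0.25))   # GPU A samples left of pixel centre
gpu_b = render(scene, 4, 1, (0.75, 0.75))   # GPU B samples right of pixel centre
aa = combine([gpu_a, gpu_b])                # edge pixel resolves to 0.5 grey
```

The pixel straddling the edge gets one covered and one uncovered sample, so the combined value lands between the two extremes, which is exactly the antialiasing effect the abstract claims; adding more GPUs with further offsets scales the sample count.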
20090091577 | Compression of Multiple-Sample-Anti-Aliasing Tile Data in a Graphics Pipeline - Provided is a system for compressing multiple-sample-anti-aliasing (MSAA) tile data in a computer graphics pipeline. The system includes a plurality of pixels configured as a tile, where the tile has a plurality of samples of descriptor data for the pixels. Multiple graphics data processing units, configured to receive the plurality of samples, contain a plurality of coverage masks corresponding to covered subtiles, and compression logic encodes the tile descriptor data for receipt by a buffer. | 04-09-2009 |
20090096797 | DEMAND BASED POWER CONTROL IN A GRAPHICS PROCESSING UNIT - Disclosed herein is a power controller for use with a graphics processing unit. The power controller monitors, manages and controls power supplied to components of a pipeline of the graphics processing unit. The power controller determines whether and to what extent power is to be supplied to a pipeline component based on status information received by the power controller in connection with the pipeline component. The power controller is capable of identifying a trend using the received status information, and of determining whether and to what extent power is to be supplied to a pipeline component based on the identified trend. | 04-16-2009 |
20090096798 | Graphics Processing and Display System Employing Multiple Graphics Cores on a Silicon Chip of Monolithic Construction - A high performance graphics processing and display system architecture realized on a monolithic silicon chip, supporting a cluster of multiple cores of graphic processing units (GPUs) that cooperate to provide a powerful and highly scalable visualization solution supporting photo-realistic graphics capabilities for diverse applications. The present invention eliminates rendering bottlenecks along the graphics pipeline by dynamically managing various parallel rendering techniques and enabling adaptive handling of diverse graphics applications. | 04-16-2009 |
20090109230 | METHODS AND APPARATUSES FOR LOAD BALANCING BETWEEN MULTIPLE PROCESSING UNITS - Exemplary embodiments of methods and apparatuses to dynamically redistribute computational processes in a system that includes a plurality of processing units are described. The power consumption, the performance, and the power/performance value are determined for various computational processes between a plurality of subsystems where each of the subsystems is capable of performing the computational processes. The computational processes are exemplarily graphics rendering process, image processing process, signal processing process, Bayer decoding process, or video decoding process, which can be performed by a central processing unit, a graphics processing unit or a digital signal processing unit. In one embodiment, the distribution of computational processes between capable subsystems is based on a power setting, a performance setting, a dynamic setting or a value setting. | 04-30-2009 |
20090135190 | Multimode parallel graphics rendering systems and methods supporting task-object division - In a PC-level host computing system embodying a parallel graphics processing subsystem (PGPS) having a plurality of GPPLs and supporting at least a task-based object division mode of parallel operation, a method of operating the GPPLs in the task-based object division mode during the run-time of a graphics-based application executing on the CPU(s) of the host computing system is provided. Within each frame of the scene to be rendered, the stream of graphics commands and data generated by the graphics application is analyzed for graphics processing tasks associated with the frame. The graphics processing tasks are then distributed among the plurality of GPPLs, and each GPPL executes its received graphics processing tasks, by processing the graphics commands and data associated with its distributed tasks, and renders partial image components. The partial image components are ultimately recomposited to produce a complete image for the frame, and the complete image is displayed on one or more display screens. In a preferred embodiment, the partial image components are rendered in the GPPLs using a depth-less method of image rendering. | 05-28-2009 |
20090141033 | SYSTEM AND METHOD FOR USING A SECONDARY PROCESSOR IN A GRAPHICS SYSTEM - A system, method and apparatus are disclosed, in which a processing unit is configured to perform secondary processing on graphics pipeline data outside the graphics pipeline, with the output from the secondary processing being integrated into the graphics pipeline so that it is made available to the graphics pipeline. A determination is made whether to use secondary processing, and in a case that secondary processing is to be used, a command stream, which can comprise one or more commands, is provided to the secondary processing unit, so that the unit can locate and operate on buffered graphics pipeline data. Secondary processing is managed and monitored so as to synchronize data access by the secondary processing unit with the graphics pipeline processing modules. | 06-04-2009 |
20090153571 | Interrupt handling techniques in the rasterizer of a GPU - Techniques for handling an interrupt in the rasterizer, in accordance with an embodiment of the present technology, start with rasterizing one or more primitives of a first context. If an interrupt is received, state information of the rasterizer is saved in a backing store after coarse rasterizing a given tile. After storing the raster state information, the one or more primitives of a second context are rasterized. After the second context is served, the raster state information of the first context is restored and rasterization of the one or more primitives of the first context is restarted. | 06-18-2009 |
20090167772 | GRAPHIC SYSTEM COMPRISING A FRAGMENT GRAPHIC MODULE AND RELATIVE RENDERING METHOD - A graphic system having a central processing unit; a system memory coupled to the central processing unit; a display unit provided with a corresponding screen; a graphic module coupled to and controlled by the central processing unit to render an image on the screen of the display unit, the graphic module including a fragment graphic module having a depth test buffer for storing a current depth value; a depth test stage coupled to the depth test buffer for comparing the current depth value with a depth coordinate associated with an incoming fragment and defining a resulting fragment; a test stage for testing the resulting fragment and defining a retained fragment; a buffer writing stage operatively associated with the test stage for receiving the retained fragment, the buffer writing stage coupled to the depth test buffer for updating the current depth value with a depth value of the retained fragment. | 07-02-2009 |
20090189909 | Graphics Processor having Unified Cache System - Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device. | 07-30-2009 |
20090189910 | DELIVERING PIXELS RECEIVED AT A LOWER DATA TRANSFER RATE OVER AN INTERFACE THAT OPERATES AT A HIGHER DATA TRANSFER RATE - A number of pixels are received at a pixel rate that corresponds to a lower data transfer rate. The received pixels are delivered for display on a display device, over an interface that operates at a higher data transfer rate. These pixels are delivered as part of a stream that includes one or more codes that have been inserted between each adjacent pair of pixels so that the pixels in the stream are still delivered at the pixel rate. Other embodiments are also described and claimed. | 07-30-2009 |
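The rate-matching idea above, inserting codes between adjacent pixels so a slower pixel stream fills a faster link, can be sketched as slot arithmetic. The slot counts, tuple representation, and the assumption that the link rate is an integer multiple of the pixel rate are all illustrative, not from the patent.

```python
def pack_stream(pixels, pixel_rate, link_rate):
    """Insert filler codes between adjacent pixels so a stream produced at
    pixel_rate occupies a link running at link_rate, while pixels still
    arrive at the original pixel rate. Assumes link_rate is an integer
    multiple of pixel_rate (an illustrative simplification)."""
    slots_per_pixel = link_rate // pixel_rate   # link slots per pixel period
    fillers = slots_per_pixel - 1               # codes between adjacent pixels
    stream = []
    for i, p in enumerate(pixels):
        stream.append(("PIXEL", p))
        if i < len(pixels) - 1:                 # codes go between pairs only
            stream.extend([("CODE", None)] * fillers)
    return stream

# Three pixels at 100 MHz delivered over a 300 MHz link.
stream = pack_stream([1, 2, 3], pixel_rate=100, link_rate=300)
```

Each pixel still occupies one slot per pixel period, so a receiver that drops the codes recovers the pixels at the original rate.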
20090231348 | Image Processing with Highly Threaded Texture Fragment Generation - A circuit arrangement and method support a multithreaded rendering architecture capable of dynamically routing pixel fragments from a pixel fragment generator to any pixel shader from among a pool of pixel shaders. The pixel fragment generator is therefore not tied to a specific pixel shader, but is instead able to utilize multiple pixel shaders in a pool of pixel shaders to minimize bottlenecks and improve overall hardware utilization and performance during image processing. | 09-17-2009 |
20090231349 | Rolling Context Data Structure for Maintaining State Data in a Multithreaded Image Processing Pipeline - A multithreaded rendering software pipeline architecture utilizes a rolling context data structure to store multiple contexts that are associated with different image elements that are being processed in the software pipeline. Each context stores state data for a particular image element, and the association of each image element with a context is maintained as the image element is passed from stage to stage of the software pipeline, thus ensuring that the state used by the different stages of the software pipeline when processing the image element remains coherent irrespective of state changes made for other image elements being processed by the software pipeline. Multiple image elements may therefore be processed concurrently by the software pipeline, and often without regard for synchronization or serialization of state changes that affect only certain image elements. | 09-17-2009 |
20100026692 | HYBRID GRAPHIC DISPLAY - A method of displaying graphics data is described. The method involves accessing the graphics data in a memory subsystem associated with one graphics subsystem. The graphics data is transmitted to a second graphics subsystem, where it is displayed on a monitor coupled to the second graphics subsystem. | 02-04-2010 |
20100045683 | HARDWARE TYPE VECTOR GRAPHICS ACCELERATOR - Techniques, apparatus and system are described for providing a hardware-type vector graphics acceleration. In one aspect, a hardware-type vector graphics accelerator includes graphics processing modules to communicate with a controller unit. The graphics processing modules include at least one of a rasterizing setup module, a scissor module, a paint generation module, an alpha masking module, and a blending module connected together according to a pipeline architecture to perform two-dimensional (2D) vector graphics acceleration in response to one or more commands received from the controller unit. | 02-25-2010 |
20100060651 | Pipelined image processing engine - The present invention relates to processing image frames through a pipeline of effects by breaking the image frames into multiple blocks of image data. The example method includes generating a plurality of blocks from each frame, processing each block through a pipeline of effects in a predefined consecutive order, and aggregating the processed blocks to produce an output frame by combining the primary pixels from each processed block. The pipeline of effects may be distributed over a plurality of processing nodes, and each effect may process a block provided as input to the node. Each processing node may independently process a block using an effect. | 03-11-2010 |
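The split/process/aggregate flow described in this abstract can be sketched sequentially. The patent distributes the effects over processing nodes; this single-process sketch shows only the data flow, using an invented 1-D frame and invented brighten/clamp effects.

```python
def split_into_blocks(frame, block_w):
    """Split a 1-D row of pixels into fixed-width blocks (real frames
    would be tiled in 2-D; 1-D keeps the sketch short)."""
    return [frame[i:i + block_w] for i in range(0, len(frame), block_w)]

def run_pipeline(block, effects):
    """Apply each effect to the block in the predefined consecutive order."""
    for effect in effects:
        block = [effect(p) for p in block]
    return block

def aggregate(blocks):
    """Recombine processed blocks into the output frame."""
    return [p for block in blocks for p in block]

# Hypothetical effects: brighten by 40, then clamp to the 8-bit maximum.
brighten = lambda p: p + 40
clamp = lambda p: min(p, 255)

frame = [0, 100, 200, 240, 10, 250]
blocks = split_into_blocks(frame, 2)
out = aggregate(run_pipeline(b, [brighten, clamp]) for b in blocks)
```

Because each block moves through the whole effect chain independently, blocks could equally be handed to different nodes and recombined afterwards, which is the distribution the abstract describes.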
20100110083 | Metaprocessor for GPU Control and Synchronization in a Multiprocessor Environment - Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide access from a system processor to the metaprocessor, and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor. | 05-06-2010 |
20100110084 | PARALLEL PIPELINE GRAPHICS SYSTEM - The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementations, some embodiments use two or four parallel pipelines, though other configurations having 2^n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic. | 05-06-2010 |
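One plausible tile-ownership rule for 2^n parallel pipelines is a simple checkerboard interleave. The formula below is an assumption for illustration; the abstract says each pipeline owns a portion of the screen's tiles but does not fix the assignment scheme.

```python
def owner_pipeline(tile_x, tile_y, n_pipelines):
    """Assign each screen tile to one of 2^n pipelines by interleaving
    tile coordinates (one plausible scheme, not the patent's)."""
    assert n_pipelines & (n_pipelines - 1) == 0, "must be a power of two"
    return (tile_x + tile_y) % n_pipelines

# Distribute a 4x4 grid of tiles across 2 parallel pipelines:
# adjacent tiles alternate owners, balancing the load per pipeline.
assignment = {(x, y): owner_pipeline(x, y, 2)
              for y in range(4) for x in range(4)}
```

An interleave like this keeps neighbouring tiles on different pipelines, so a primitive covering several tiles is naturally split across the back-end pipelines.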
20100164965 | RENDERING MODULE FOR BIDIMENSIONAL GRAPHICS, PREFERABLY BASED ON PRIMITIVES OF ACTIVE EDGE TYPE - A graphics module for the rendering of a bidimensional scene on a displaying screen is described, comprising a sort-middle-type graphics pipeline, said graphics pipeline comprising: a first rasterizer module so configured as to convert an edge-type input primitive received by a path processing module into a primitive of active-edge-type; a first processing module so configured as to associate said primitive of active-edge-type to respective macro-blocks corresponding to portions of the screen and to store said primitive of active-edge-type into a scene buffer; a second processing module so configured as to read said scene buffer and to provide said primitive of active-edge-type to a second rasterizer module. | 07-01-2010 |
20100220103 | MEMORY SYSTEM AND METHOD FOR IMPROVED UTILIZATION OF READ AND WRITE BANDWIDTH OF A GRAPHICS PROCESSING SYSTEM - A system and method for processing graphics data which requires less read and write bandwidth. The graphics processing system includes an embedded memory array having at least three separate banks of single-ported memory in which graphics data are stored. A memory controller coupled to the banks of memory writes post-processed data to a first bank of memory while reading data from a second bank of memory. A synchronous graphics processing pipeline processes the data read from the second bank of memory and provides the post-processed graphics data to the memory controller to be written back to a bank of memory. The processing pipeline concurrently processes an amount of graphics data at least equal to that included in a page of memory. A third bank of memory is precharged concurrently with writing data to the first bank and reading data from the second bank in preparation for access when reading data from the second bank of memory is completed. | 09-02-2010 |
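The three-bank scheme above (write post-processed data to one bank, read from a second, precharge the third so it is ready when the read completes) can be sketched as a rotating schedule. The per-cycle granularity is an assumption; the patent rotates roles per page of graphics data.

```python
from itertools import islice

def bank_roles(n_banks=3):
    """Yield, per cycle, the (write, read, precharge) bank assignment for
    a rotating scheme over n_banks single-ported banks: every bank takes
    each role in turn, so reads and writes never contend for one port."""
    cycle = 0
    while True:
        write = cycle % n_banks
        read = (cycle + 1) % n_banks
        precharge = (cycle + 2) % n_banks
        yield write, read, precharge
        cycle += 1

# First four cycles of the rotation.
schedule = list(islice(bank_roles(), 4))
```

Because the three roles always land on three distinct banks, the single-ported memories behave like a multi-ported array from the pipeline's point of view, which is the bandwidth gain the abstract claims.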
20100265259 | Generating and resolving pixel values within a graphics processing pipeline - A graphics processing apparatus | 10-21-2010 |
20100277486 | DYNAMIC GRAPHICS PIPELINE AND IN-PLACE RASTERIZATION - A pluggable graphics system is described herein that leverages high-end graphical capabilities of various mobile devices while keeping overhead for handling the variations to a negligible level. The pluggable graphics system breaks a graphics pipeline into functional blocks and includes base templates for handling different device capabilities for each functional block. During execution, based on capabilities of the device, the system composes appropriate functional blocks together through just-in-time (JIT) compilation to reduce runtime overhead in performance-sensitive code paths. The functional blocks include code designed to perform well with a particular set of hardware capabilities. In addition, for hardware platforms with large registers, the system provides advanced in-place blending that avoids wasteful memory accesses to reduce blending time. Thus, the pluggable graphics system abstracts differences in hardware capabilities from software applications and utilizes routines designed to perform well on each type of hardware. | 11-04-2010 |
20100302261 | Fixed Function Pipeline Application Remoting Through A Shader Pipeline Conversion Layer - Systems, methods and computer readable media are disclosed for sending graphics data to a client across a remote session for an application, where the application makes fixed function pipeline API calls and the client and server support shader pipeline API calls for the remote session. Fixed function pipeline graphics calls sent from the application are intercepted, wrapped, converted into their shader pipeline equivalent graphics call or calls, and then sent across the communications network to the client according to a protocol of the remote session. | 12-02-2010 |
20110080415 | INTER-SHADER ATTRIBUTE BUFFER OPTIMIZATION - One embodiment of the present invention sets forth a technique for reducing the amount of memory required to store vertex data processed within a processing pipeline that includes a plurality of shading engines. The method includes determining a first active shading engine and a second active shading engine included within the processing pipeline, wherein the second active shading engine receives vertex data output by the first active shading engine. An output map is received and indicates one or more attributes that are included in the vertex data and output by the first active shading engine. An input map is received and indicates one or more attributes that are included in the vertex data and received by the second active shading engine from the first active shading engine. Then, a buffer map is generated based on the input map, the output map, and a pre-defined set of rules that includes rule data associated with both the first shading engine and the second shading engine, wherein the buffer map indicates one or more attributes that are included in the vertex data and stored in a memory that is accessible by both the first active shading engine and the second active shading engine. | 04-07-2011 |
20110080416 | Methods to Facilitate Primitive Batching - One embodiment of the present invention sets forth a technique for splitting a set of vertices into a plurality of batches for processing. The method includes receiving one or more primitives each containing an associated set of vertices. For each of the one or more primitives, one or more vertices are gathered from the set of vertices, the vertices are arranged into one or more batches, each batch is routed to a processing pipeline to be processed as a separate primitive, and the one or more batches are processed to produce results identical to those of processing the entire primitive as a single entity. | 04-07-2011 |
20110084973 | Saving, Transferring and Recreating GPU Context Information Across Heterogeneous GPUs During Hot Migration of a Virtual Machine - A system and method are disclosed for recreating graphics processing unit (GPU) state information associated with a migrated virtual machine (VM). A VM running on a first VM host coupled to a first graphics device, comprising a first GPU, is migrated to a second VM host coupled to a second graphics device, in turn comprising a second GPU. A context module coupled to the first GPU reads its GPU state information in its native GPU state representation format and then converts the GPU state information into an intermediary GPU state representation format. The GPU state information is conveyed in the intermediary GPU state representation format to the second VM host, where it is received by a context module coupled to the second GPU. The context module converts the GPU state information related to the first GPU from the intermediary GPU state representation format to the native GPU state representation format of the second GPU. Once converted, the GPU state information of the first GPU is restored to the second GPU in its native GPU state representation format. | 04-14-2011 |
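The native → intermediary → native conversion this abstract describes can be sketched with dictionaries. All field names and layouts below are invented for illustration; real GPU state representations are hardware-specific and opaque.

```python
# Sketch of hot-migration GPU state transfer through an intermediary
# representation. Each context module knows only its own GPU's native
# layout plus the shared intermediary format.

def native_a_to_intermediary(state):
    """Source context module: hypothetical GPU-A layout -> intermediary."""
    return {"program_counter": state["pc"],
            "registers": list(state["regs"])}

def intermediary_to_native_b(inter):
    """Destination context module: intermediary -> hypothetical GPU-B layout."""
    return {"ip": inter["program_counter"],
            "r": tuple(inter["registers"])}

# State captured on GPU A, conveyed in the intermediary format, and
# restored on the heterogeneous GPU B.
gpu_a_state = {"pc": 0x40, "regs": [1, 2, 3]}
gpu_b_state = intermediary_to_native_b(native_a_to_intermediary(gpu_a_state))
```

The point of the intermediary format is that with N heterogeneous GPUs only N converter pairs are needed, rather than N×N direct translations between native layouts.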
20110090232 | GRAPHICS PROCESSING SYSTEMS WITH MULTIPLE PROCESSORS CONNECTED IN A RING TOPOLOGY - Multiple graphics processors in a graphics processing system are interconnected in a unidirectional or bidirectional ring topology, allowing pixels to be transferred from any one graphics processor to any other graphics processor. The system can automatically identify one or more “master” graphics processors to which one or more monitors are connected and configures the links of the ring such that one or more other graphics processors can deliver pixels to the master graphics processor, facilitating distributed rendering operations. The system can also automatically detect the connections or lack thereof between the graphics processors. | 04-21-2011 |
20110109637 | PROCESSING DEVICE, PROCESSING METHOD AND COMPUTER READABLE MEDIUM - A processing device has plural processing modules executing a processing; and plural connectors each having a linking section, an associating section, and a controller. The linking section is able to link with at least one other connector at an input side or an output side. The associating section associates the connector with one of the processing modules. In accordance with a linked state, the controller controls the processing module associated by the associating section. | 05-12-2011 |
20110148889 | Method and System for Improving Display Underflow Using Variable Hblank - Methods and apparatus for improving the effects of display underflow using a variable horizontal blanking interval are disclosed. One embodiment of the present invention is a method of display that includes detecting a data ready signal that indicates availability of display data for transmission from a display pipeline, and generating a line-transmit signal based upon a clock signal and the data ready signal. The line-transmit signal is provided to the display pipeline. The line-transmit signal is substantially coincident with the clock signal if the data ready signal is set, and may be delayed if the data ready signal is not asserted. The display pipeline transmits the display data upon receiving the line-transmit signal. Another embodiment is an apparatus including a display pipeline configured to set a data ready signal when the display data is available for transmission, and a timing generator coupled to the display pipeline and configured to generate a line-transmit signal based on the status of the data ready signal. | 06-23-2011 |
20110169841 | SILICON CHIP OF A MONOLITHIC CONSTRUCTION FOR USE IN IMPLEMENTING MULTIPLE GRAPHIC CORES IN A GRAPHICS PROCESSING AND DISPLAY SUBSYSTEM - A silicon chip of a monolithic construction for use in implementing a multiple core graphics processing and display subsystem in a computing system having a CPU, a system memory, an operating system (OS), a CPU bus, and a display device with a display surface. The computing system supports (i) one or more software applications for issuing graphics commands, (ii) one or more graphics libraries for storing data used to implement said graphics commands. The silicon chip comprises multiple graphic pipeline cores, a partial frame buffer for buffering pixels corresponding to image fragments, a routing center, control unit, and a display interface, for displaying composited images on the display surface of the computing system. | 07-14-2011 |
20110249010 | UTILIZATION OF A GRAPHICS PROCESSING UNIT BASED ON PRODUCTION PIPELINE TASKS - A method includes performing a task in response to a request of a secondary user interface of a secondary device. The method also includes calculating a utilization of a graphics processing unit of a machine based on the task performed by the graphics processing unit. The method further includes determining the utilization, through a processor, based on a comparison of a consumption of a computing resource of the graphics processing unit and a sum of the computing resource available. The method furthermore includes performing another task in response to the request of another secondary user interface of another secondary device. The method furthermore includes calculating another utilization of another graphics processing unit based on the another task performed by the another graphics processing unit. The method furthermore includes determining the another utilization based on the comparison of a consumption of the computing resource of the another graphics processing unit. | 10-13-2011 |
20110298813 | Tile Rendering for Image Processing - The time needed for back-end work can be estimated without actually doing the back-end work. Front-end counters record information for a cost model and heuristics may be used for when to split a tile and ordering work dispatch for cores. A special rasterizer discards triangles and fragments outside a sub-tile. | 12-08-2011 |
20110316864 | MULTITHREADED SOFTWARE RENDERING PIPELINE WITH DYNAMIC PERFORMANCE-BASED REALLOCATION OF RASTER THREADS - A multithreaded rendering software pipeline architecture dynamically reallocates regions of an image space to raster threads based upon performance data collected by the raster threads. The reallocation of the regions typically includes resizing the regions assigned to particular raster threads and/or reassigning regions to different raster threads to better balance the relative workloads of the raster threads. | 12-29-2011 |
20120062574 | AUTOMATED RECOGNITION OF PROCESS MODELING SEMANTICS IN FLOW DIAGRAMS - An example embodiment disclosed is a system for automated model extraction of documents containing flow diagrams. An extractor is configured to extract from the flow diagrams flow graphs. The extractor further extracts nodes and edges, and relational, geometric and textual features for the extracted nodes and edges. A classifier is configured to recognize process semantics based on the extracted nodes and edges, and the relational, geometric and textual features of the extracted nodes and edges. A process modeling language code is generated based on the recognized process semantics. Rules to recognize patterns in process diagrams may be determined using supervised learning and/or unsupervised learning. During supervised learning, an expert labels example flow diagrams so that a classifier can derive the classification rules. During unsupervised learning flow diagrams are clustered based on relational, geometric and textual features of nodes and edges. | 03-15-2012 |
20120127183 | Distribution Processing Pipeline and Distributed Layered Application Processing - The present invention contemplates a variety of improved methods and systems for distributing different processing aspects of a layered application, and distributing a processing pipeline among a variety of different computer devices. The system uses multiple devices' resources to speed up or enhance applications. In one embodiment, application layers can be distributed among different devices for execution or rendering. The teaching further expands on this distribution of processing aspects by considering a processing pipeline such as that found in a graphics processing unit (GPU), where execution of parallelized operations and/or different stages of the processing pipeline can be distributed among different devices. There are many suitable ways of describing, characterizing and implementing the methods and systems contemplated herein. | 05-24-2012 |
20120147017 | MULTI-FUNCTION ENCODER AND DECODER DEVICES, AND METHODS THEREOF - A technique for encoding and decoding video information uses a plurality of video processing modules (VPMs), whereby each video processing module is dedicated to a particular video processing function, such as filtering, matrix arithmetic operations, and the like. Information is transferred between the video processing modules using a set of first-in first-out (FIFO) buffers. For example, to transfer pixel information from a first VPM to a second VPM, the first VPM stores the pixel information at the head of a FIFO buffer, while the second VPM retrieves information from the tail of the FIFO buffer. The FIFO buffer thus permits transfer of information between the VPMs without storage of the information to a cache or other techniques that can reduce video processing speed. | 06-14-2012 |
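The head/tail FIFO handoff between video processing modules (VPMs) described above can be modeled in a few lines. The class and method names here are illustrative assumptions, not the patent's; the point is that a bounded FIFO lets one module push and the next pop without intermediate cache storage.

```python
from collections import deque

# Illustrative model of the FIFO link between two VPMs: the first VPM stores
# pixel information at the head; the second retrieves it from the tail.

class FifoLink:
    def __init__(self, depth):
        self.buf = deque()
        self.depth = depth

    def push(self, pixels):
        if len(self.buf) >= self.depth:
            return False              # full: producer VPM must stall
        self.buf.append(pixels)       # store at the head
        return True

    def pop(self):
        return self.buf.popleft() if self.buf else None   # retrieve from tail

link = FifoLink(depth=4)
filtered = [p * 2 for p in (1, 2, 3)]   # stand-in for a filtering VPM's output
link.push(filtered)
assert link.pop() == [2, 4, 6]
```

The bounded depth provides natural back-pressure between modules, which is what avoids spilling intermediate pixels to a cache.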
20120154411 | MULTIPLE DISPLAY FRAME RENDERING METHOD AND APPARATUS - An apparatus includes a plurality of image processing circuits. Each image processing circuit generates an image frame corresponding to a single large surface. The first image processing circuit provides a portion of the generated image frame for a first display or plurality of displays and provides the remaining portion of the image frame to the remaining image processing circuits. The remaining image processing circuits provide that portion of the image frame for the remaining displays. | 06-21-2012 |
20120242670 | MEMORY SYSTEM AND METHOD FOR IMPROVED UTILIZATION OF READ AND WRITE BANDWIDTH OF A GRAPHICS PROCESSING SYSTEM - A system for processing graphics data. The graphics processing system includes an embedded memory array having at least three separate banks of single-ported memory in which graphics data are stored. A memory controller coupled to the banks of memory writes post-processed data to a first bank of memory while reading data from a second bank of memory. A synchronous graphics processing pipeline processes the data read from the second bank of memory and provides the post-processed graphics data to the memory controller to be written back to a bank of memory. The processing pipeline concurrently processes an amount of graphics data at least equal to that included in a page of memory. A third bank of memory is precharged concurrently with writing data to the first bank and reading data from the second bank in preparation for access when reading data from the second bank of memory is completed. | 09-27-2012 |
20130002689 | MAXIMIZING PARALLEL PROCESSING IN GRAPHICS PROCESSORS - Methods and systems may include a computing system having a graphics processor with a three-dimensional (3D) pipeline, one or more processing units, and compute kernel logic to process two-dimensional (2D) commands. A graphics processing unit (GPU) scheduler may dispatch a 2D command directly to the one or more processing units. In one example, the 2D command includes at least one of a render target clear command, a depth-stencil clear command, a resource resolving command and a resource copy command. | 01-03-2013 |
20130021350 | APPARATUS AND METHOD FOR DECODING USING COEFFICIENT COMPRESSION - Methods and apparatus for utilizing coefficient compression in graphics decoding are provided. In one example, a central processing unit (CPU) is interfaced with a graphics processing unit (GPU), where the CPU extracts coefficients and passes compressed coefficient data, preferably in uniformly sized data packets, to the GPU for decoding and coefficient processing. Preferably, the extracted coefficients are inverse transform (iT) coefficients, and the CPU includes an encoder control component configured to adaptively select a coefficient encoding process for compressing the iT coefficient data based on its content, such that each generated data packet includes data that identifies the selected coefficient encoding process used for the compressed iT coefficient data it contains. In that case, the GPU is configured to receive such data packets and decode the iT coefficient data within each packet using a coefficient decoding method complementary to the selected coefficient encoding process identified within the packet. The GPU preferably decodes such data packets using massively parallel coefficient decoding. | 01-24-2013 |
20130033505 | System and Method for Processing Data Using a Network - Systems and methods are disclosed for video processing modules. More specifically, a network for processing data is disclosed. The network comprises a register DMA controller adapted to support register access and at least one node adapted to process the data. At least one link communicates with the node and is adapted to transmit data, and at least one network module communicates with at least the link and is adapted to route data to at least the link. | 02-07-2013 |
20130033506 | MULTI-CONTEXT GRAPHICS PROCESSING - A method of managing multiple contexts for a single mode display includes receiving a plurality of tasks from one or more applications and determining respective contexts for each task, each context having a range of memory addresses. The method also includes selecting one context for output to the single mode display. | 02-07-2013 |
20130044118 | ERROR RECOVERY OPERATIONS FOR A HARDWARE ACCELERATOR - In at least some embodiments, an apparatus includes a hardware accelerator subsystem with a pipeline. The hardware accelerator subsystem is configured to perform error recovery operations in response to a bit stream error. The error recovery operations comprise a pipe-down process to completely decode a data block that is already in the pipeline, an overwrite process to overwrite commands in the hardware accelerator subsystem with null operations (NOPs) once the pipe-down process is complete, and a pipe-up process to restart decoding operations of the pipeline at a next synchronization point. | 02-21-2013 |
20130069961 | PROJECTOR, IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD - A projector of an embodiment is provided with: an input line memory which holds an input image signal corresponding to one line; an image processor which generates an intermediate image signal correction-processed according to distortion of a projection lens, using the input image signal transferred from the input line memory; an output line memory which holds the intermediate image signal corresponding to one line; and an LCOS which guides light radiated from a light source to the projection lens in accordance with the intermediate image signal. The image processor is provided with an input supplementation buffer which stores the input image signals of a plurality of lines, an input data buffer which stores input image signals required to generate the intermediate image signal corresponding to one line, and a number-of-supplementary-lines calculator which calculates the number of supplementary lines of the input image signals. | 03-21-2013 |
20130076763 | Tone and Gamut Mapping Methods and Apparatus - Tone and/or gamut mapping apparatus and methods may be applied to map color values in image data for display on a particular display or other downstream device. A mapping algorithm may be selected based on location and/or color coordinates for pixel data being mapped. The apparatus and methods may be configured to map color coordinates differently depending on whether or not a pixel corresponds to a light source in an image and/or has special or reserved color values. | 03-28-2013 |
20130083043 | Methods and Systems to Reduce Display Artifacts When Changing Display Clock Rate - Methods, systems, and computer-readable media for reducing or eliminating display artifacts caused by on-the-fly changing of the display clock are disclosed. According to an embodiment of the present invention, a method includes changing a rate of a display clock and adapting a display data processing pipeline clocked by the display clock to prevent a substantial change in the pixel output rate from the pipeline as a result of the clock change. | 04-04-2013 |
20130100146 | DYNAMICALLY RECONFIGURABLE PIPELINED PRE-PROCESSOR - A pipelined video pre-processor includes a plurality of configurable image-processing modules. The modules may be configured using direct processor control, DMA access, or both. A block-control list, accessible via DMA, facilitates configuration of the modules in a manner similar to direct processor control. Parameters in the modules may be updated on a frame-by-frame basis. | 04-25-2013 |
20130100147 | FRAME-BY-FRAME CONTROL OF A DYNAMICALLY RECONFIGURABLE PIPELINED PRE-PROCESSOR - A pipelined video pre-processor includes a plurality of configurable image-processing modules. The modules may be configured using direct processor control, DMA access, or both. A block-control list, accessible via DMA, facilitates configuration of the modules in a manner similar to direct processor control. Parameters in the modules may be updated on a frame-by-frame basis. | 04-25-2013 |
20130120412 | METHOD FOR HANDLING STATE TRANSITIONS IN A NETWORK OF VIRTUAL PROCESSING NODES - One embodiment of the present invention sets forth a technique for executing an operation once work associated with a version of a state object has been completed. The method includes receiving the version of the state object at a first stage in a processing pipeline, where the version of the state object is associated with a reference count object, determining that the version of the state object is relevant to the first stage, incrementing a counter included in the reference count object, transmitting the version of the state object to a second stage in the processing pipeline, processing work associated with the version of the state object, decrementing the counter, determining that the counter is equal to zero, and in response, executing an operation specified by the reference count object. | 05-16-2013 |
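The reference-counting scheme in this abstract — increment on receipt at a relevant stage, decrement when that stage's work completes, fire the attached operation at zero — can be sketched as follows. The class name and callback style are assumptions for illustration.

```python
# Minimal sketch of a reference-counted state-object version: each relevant
# pipeline stage acquires on receipt and releases when its work completes;
# the operation specified by the reference-count object runs at count zero.

class RefCountObject:
    def __init__(self, on_zero):
        self.count = 0
        self.on_zero = on_zero       # operation to execute when work is done

    def acquire(self):               # a stage finds this state version relevant
        self.count += 1

    def release(self):               # a stage finished its associated work
        self.count -= 1
        if self.count == 0:
            self.on_zero()

done = []
rc = RefCountObject(on_zero=lambda: done.append("state retired"))
rc.acquire(); rc.acquire()           # two stages reference this version
rc.release()                         # first stage finishes: count is 1
assert done == []
rc.release()                         # last stage finishes: operation executes
assert done == ["state retired"]
```

Deferring the operation until every referencing stage has released is what lets the pipeline retire a state version only after all associated work has drained.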
20130120413 | METHOD FOR HANDLING STATE TRANSITIONS IN A NETWORK OF VIRTUAL PROCESSING NODES - One embodiment of the present invention sets forth a technique for receiving versions of state objects at one or more stages in a processing pipeline. The method includes receiving a first version of a state object at a first stage in the processing pipeline, determining that the first version of the state object is relevant to the first stage, incrementing a first reference counter associated with the first version of the state object, assigning the first version of the state object to work requests that arrive at the first stage subsequent to the receipt of the first version of the state object, and transmitting the first version of the state object to a second stage in the processing pipeline. | 05-16-2013 |
20130141445 | METHODS OF AND APPARATUS FOR PROCESSING COMPUTER GRAPHICS - When carrying out a second, higher level of anti-aliasing such as 8× MSAA, in a graphics processing pipeline | 06-06-2013 |
20130147817 | Systems and Methods for Reducing Clock Domain Crossings - In an embodiment, a graphics processing device is provided. The graphics processing device includes a global clock generator configured to generate a global clock signal and a plurality of graphics pipelines each configured to transmit image frames to a respective display device. Each of the graphics pipelines comprises a timing generator. Each of the timing generators is configured to generate a respective virtual clock signal based on the global clock signal and wherein each virtual clock signal is used to advance logic of a respective one of the display devices. | 06-13-2013 |
20130155077 | Policies for Shader Resource Allocation in a Shader Core - A method of determining priority within an accelerated processing device is provided. The accelerated processing device includes compute pipeline queues that are processed in accordance with predetermined criteria. The queues are selected based on priority characteristics and the selected queue is processed until a time quantum lapses or a queue having a higher priority becomes available for processing. | 06-20-2013 |
20130169652 | IMAGE PROCESSING APPARATUS, UPGRADE APPARATUS, DISPLAY SYSTEM INCLUDING THE SAME, AND CONTROL METHOD THEREOF - An image processing apparatus, upgrade apparatus, display system and control method are provided. The image processing apparatus includes a signal input unit; a first image processing unit which processes an input signal input by the signal input unit to output a first output signal; an upgrade apparatus connection unit connected to an upgrade apparatus which includes a second image processing unit; and a first controller which controls at least one of the input signal processed by the first image processing unit and the first output signal to be transmitted to the upgrade apparatus and processed by the second image processing unit if the upgrade apparatus is connected to the upgrade apparatus connection unit. | 07-04-2013 |
20130181999 | PARA-VIRTUALIZED DOMAIN, HULL, AND GEOMETRY SHADERS - The present invention extends to methods, systems, and computer program products for providing domain, hull, and geometry shaders in a para-virtualized environment. As such, a guest application executing in a child partition is enabled use a programmable GPU pipeline of a physical GPU. A vGPU (executing in the child partition) is presented to the guest application. The vGPU exposes DDIs of a rendering framework. The DDIs enable the guest application to send graphics commands to the vGPU, including commands for utilizing a domain shader, a hull shader, and/or a geometric shader at a physical GPU. A render component (executing within the root partition) receives physical GPU-specific commands from the vGPU, including commands for using the domain shader, the hull shader, and/or the geometric shader. The render component schedules the physical GPU-specific command(s) for execution at the physical GPU. | 07-18-2013 |
20130222398 | GRAPHIC PROCESSING UNIT AND GRAPHIC DATA ACCESSING METHOD THEREOF - A graphics processing unit and a graphics data accessing method are provided. The graphics processing unit receives, from a server processing unit, a graphics processing request instruction which comprises first coordinate bits and second coordinate bits of a texel image under processing. The graphics processing unit retrieves at least one first bit of the first coordinate bits, retrieves at least one second bit of the second coordinate bits, and derives a cache index from the at least one first bit and the at least one second bit via an arithmetic logic operation. | 08-29-2013 |
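The abstract leaves the arithmetic-logic operation unspecified. A common choice in texel caches is to interleave low-order x/y coordinate bits (Morton order) so that 2-D-local texels map to nearby cache sets; the sketch below assumes that scheme purely for illustration.

```python
# Hypothetical cache-index derivation: interleave `bits` low-order bits of
# the two texel coordinates, x bits into even positions, y bits into odd.

def cache_index(x, y, bits=3):
    idx = 0
    for i in range(bits):
        idx |= ((x >> i) & 1) << (2 * i)        # x bit -> even position
        idx |= ((y >> i) & 1) << (2 * i + 1)    # y bit -> odd position
    return idx
```

With this interleave, texels adjacent in either x or y land in adjacent cache indices, which is the locality argument behind deriving the index from both coordinates' bits.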
20130222399 | EXECUTION MODEL FOR HETEROGENEOUS COMPUTING - The techniques are generally related to implementing a pipeline topology of a data processing algorithm on a graphics processing unit (GPU). A developer may define the pipeline topology in a platform-independent manner. A processor may receive an indication of the pipeline topology and generate instructions that define the platform-dependent manner in which the pipeline topology is to be implemented on the GPU. | 08-29-2013 |
20130222400 | IMAGE PROCESSING APPARATUS, UPGRADE APPARATUS, DISPLAY SYSTEM INCLUDING THE SAME, AND CONTROL METHOD THEREOF - An image processing apparatus, upgrade apparatus, display system and control method are provided. The image processing apparatus includes a signal input unit; a first image processing unit which processes an input signal input by the signal input unit to output a first output signal; an upgrade apparatus connection unit connected to an upgrade apparatus which includes a second image processing unit; and a first controller which controls at least one of the input signal processed by the first image processing unit and the first output signal to be transmitted to the upgrade apparatus and processed by the second image processing unit if the upgrade apparatus is connected to the upgrade apparatus connection unit. | 08-29-2013 |
20130257883 | IMAGE STREAM PIPELINE CONTROLLER FOR DEPLOYING IMAGE PRIMITIVES TO A COMPUTATION FABRIC - According to some embodiments, an image pipeline controller may determine an image stream having a plurality of image primitives to be executed. Each image primitive may be, for example, associated with an image algorithm and a set of primitive attributes. The image pipeline controller may then automatically deploy the set of image primitives to an image computation fabric based at least in part on primitive attributes. | 10-03-2013 |
20130335429 | Using Cost Estimation to Improve Performance of Tile Rendering for Image Processing - An analysis of the cost of processing tiles may be used to decide how to process the tiles. In one case two tiles may be merged. In another case a culling algorithm may be selected based on tile processing cost. | 12-19-2013 |
20140022263 | METHOD FOR URGENCY-BASED PREEMPTION OF A PROCESS - The desire to use an Accelerated Processing Device (APD) for general computation has increased due to the APD's exemplary performance characteristics. However, current systems incur high overhead when dispatching work to the APD because a process cannot be efficiently identified or preempted. The occupying of the APD by a rogue process for arbitrary amounts of time can prevent the effective utilization of the available system capacity and can reduce the processing progress of the system. Embodiments described herein can overcome this deficiency by enabling the system software to preempt a process executing on the APD for any reason. The APD provides an interface for initiating such a preemption. This interface exposes an urgency of the request which determines whether the process being preempted is allowed a grace period to complete its issued work before being forced off the hardware. | 01-23-2014 |
20140055465 | METHOD AND SYSTEM FOR COORDINATED DATA EXECUTION USING A PRIMARY GRAPHICS PROCESSOR AND A SECONDARY GRAPHICS PROCESSOR - A method and system for coordinated data execution in a computer system. The system includes a first graphics processor coupled to a first memory and a second graphics processor coupled to a second memory. A graphics bus is configured to couple the first graphics processor and the second graphics processor. The first graphics processor and the second graphics processor are configured for coordinated data execution via communication across the graphics bus. | 02-27-2014 |
20140063024 | THREE-DIMENSIONAL RANGE DATA COMPRESSION USING COMPUTER GRAPHICS RENDERING PIPELINE - A method includes obtaining three-dimensional range data, using a computer graphics rendering pipeline to encode the three-dimensional range data into two-dimensional images, retrieving depth information for each sampled pixel in the two-dimensional images, and encoding the depth information into red, green and blue color channels of the two-dimensional images. The two-dimensional images may be compressed using two-dimensional techniques including dithering. The step of obtaining the three-dimensional range data may be performed using a three-dimensional range scanning device. The method may further include storing the two-dimensional images on a computer readable storage medium. The method may further include setting up the viewing angle for the three-dimensional range data. The viewing angle for the three-dimensional range data is a viewing angle of a camera used in obtaining the three-dimensional range data. The computer graphics rendering pipeline may provide for geometry processing, projection, and rasterization. | 03-06-2014 |
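One straightforward reading of "encoding the depth information into red, green and blue color channels" is splitting a normalized depth value across three 8-bit channels so a 24-bit range survives a standard 2-D image format. The packing order below is an assumption; the patent may use a different scheme.

```python
# Illustrative sketch: pack depth in [0, 1) into an 8-bit (r, g, b) triple,
# preserving 24 bits of precision, and unpack it on the decode side.

def depth_to_rgb(depth):
    q = int(depth * (1 << 24))          # quantize to 24 bits
    return (q >> 16) & 0xFF, (q >> 8) & 0xFF, q & 0xFF

def rgb_to_depth(r, g, b):
    return ((r << 16) | (g << 8) | b) / (1 << 24)

r, g, b = depth_to_rgb(0.5)
assert abs(rgb_to_depth(r, g, b) - 0.5) < 1e-6   # round trip within 2^-24
```

Once depth lives in ordinary color channels, the two-dimensional compression and dithering techniques the abstract mentions can be applied with off-the-shelf image tooling.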
20140063025 | Pipelined Image Processing Sequencer - To provide optimal power and performance policy choices for imaging and analytic processing, in accordance with some embodiments, reusable, reconfigurable, dedicated-function process elements may be allocated to execution sequences made up of sequentially executed process elements. Any given process element may be reconfigured in any given execution sequence to meet a sequence performance metric. A plurality of sequences may then run in parallel. | 03-06-2014 |
20140071140 | DISPLAY PIPE REQUEST AGGREGATION - A system and method for efficiently scheduling memory access requests. A semiconductor chip includes a memory controller for controlling accesses to a shared memory and a display controller for processing frame data. In response to detecting an idle state for the system and the supported one or more displays, the display controller aggregates memory requests for a given display pipeline of one or more display pipelines prior to attempting to send any memory requests from the given display pipeline to the memory controller. Arbitration may be performed while the given display pipeline sends the aggregated memory requests. In response to not receiving memory access requests from the functional blocks or the display controller, the memory controller may transition to a low-power mode. | 03-13-2014 |
20140078158 | SYSTEM AND METHOD FOR CONFIGURING A DISPLAY PIPELINE - Systems and methods are disclosed for video processing modules. More specifically, a network for processing data is disclosed. The network comprises a register DMA controller adapted to support register access and at least one node adapted to process the data. At least one link communicates with the node and is adapted to transmit data, and at least one network module communicates with at least the link and is adapted to route data to at least the link. | 03-20-2014 |
20140118365 | DATA STRUCTURES FOR EFFICIENT TILED RENDERING - One embodiment of the present invention includes a method for tracking which cache tiles included in a plurality of cache tiles are intersected by a plurality of bounding boxes. The method includes receiving the plurality of bounding boxes, wherein each bounding box is associated with one or more graphics primitives being rendered to a render surface, and wherein the render surface is divided into the plurality of cache tiles. The method further includes, for each bounding box included in the plurality of bounding boxes, determining one or more cache tiles included in the plurality of cache tiles that are intersected by the bounding box, and storing a result in an array for each cache tile that is intersected by the bounding box. Finally, the method includes determining not to process a cache tile included in the plurality of cache tiles based on the results stored in the array. | 05-01-2014 |
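The bookkeeping this abstract describes — marking every cache tile intersected by a primitive's bounding box, then skipping unmarked tiles — can be sketched briefly. The tile size and function name are illustrative assumptions.

```python
# Hypothetical sketch: mark cache tiles intersected by pixel-space bounding
# boxes; a tile with no marks holds no primitives and need not be processed.

TILE = 64  # assumed cache-tile edge in pixels

def mark_tiles(bboxes, tiles_x, tiles_y):
    hit = [[False] * tiles_x for _ in range(tiles_y)]
    for x0, y0, x1, y1 in bboxes:
        for ty in range(y0 // TILE, min(y1 // TILE, tiles_y - 1) + 1):
            for tx in range(x0 // TILE, min(x1 // TILE, tiles_x - 1) + 1):
                hit[ty][tx] = True
    return hit

hit = mark_tiles([(10, 10, 70, 30)], tiles_x=4, tiles_y=4)
assert hit[0][0] and hit[0][1]     # box spans tiles (0,0) and (1,0)
assert not hit[1][0]               # unmarked tile can be skipped entirely
```

Only the bounding boxes are tested against tiles, so the per-primitive cost stays constant regardless of primitive complexity.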
20140118366 | SCHEDULING CACHE TRAFFIC IN A TILE-BASED ARCHITECTURE - A tile-based system for processing graphics data. The tile based system includes a first screen-space pipeline, a cache unit, and a first tiling unit. The first tiling unit is configured to transmit a first set of primitives that overlap a first cache tile and a first prefetch command to the first screen-space pipeline for processing, and transmit a second set of primitives that overlap a second cache tile to the first screen-space pipeline for processing. The first prefetch command is configured to cause the cache unit to fetch data associated with the second cache tile from an external memory unit. The first tiling unit may also be configured to transmit a first flush command to the screen-space pipeline for processing with the first set of primitives. The first flush command is configured to cause the cache unit to flush data associated with the first cache tile. | 05-01-2014 |
20140152675 | Load Balancing for Optimal Tessellation Performance - A system, method and a computer-readable medium for load balancing patch processing pre-tessellation are provided. The patches for drawing objects on a display screen are distributed to shader engines for parallel processing. Each shader engine generates tessellation factors for a patch, wherein a value of generated tessellation factors for the patch is unknown prior to distribution. The patches are redistributed to the shader engines pre-tessellation to load balance the shader engines for processing the patches based on the value of tessellation factors in each patch. | 06-05-2014 |
20140168231 | TRIGGERING PERFORMANCE EVENT CAPTURE VIA PIPELINED STATE BUNDLES - One embodiment of the present invention sets forth a method for analyzing the performance of a graphics processing pipeline. A first workload and a second workload are combined together in a pipeline to generate a combined workload. The first workload is associated with a first instance and the second workload is associated with a second instance. A first and second initial event are generated for the combined workload, indicating that the first and second workloads have begun processing at a first position in the graphics processing pipeline. A first and second final event are generated, indicating that the first and second workloads have finished processing at a second position in the graphics processing pipeline. | 06-19-2014 |
20140176577 | METHOD AND MECHANISM FOR PREEMPTING CONTROL OF A GRAPHICS PIPELINE - A method of operating a graphics pipeline, a graphics processing unit and a GPU computing system are provided by this disclosure. In one embodiment, the graphics processing unit includes: (1) a processor configured to assist in operating the graphics processing unit and (2) a graphics pipeline coupled to the processor and including a programmable shader stage, the programmable shader stage configured to determine occurrence of a pipeline exception during execution of the graphics pipeline, initiate preempting the execution in response to determining the occurrence and initiate resolving the pipeline exception before execution is restarted. | 06-26-2014 |
20140176578 | INPUT OUTPUT CONNECTOR FOR ACCESSING GRAPHICS FIXED FUNCTION UNITS IN A SOFTWARE-DEFINED PIPELINE AND A METHOD OF OPERATING A PIPELINE - An input output connector for a graphics processing unit having a graphics pipeline including fixed function units and programmable function units is disclosed. Additionally, a graphics processing unit and a method of operating a graphics pipeline are disclosed. In one embodiment, the input output connector includes: (1) a request arbiter configured to connect to each of the programmable function units, receive fixed function requests therefrom and arbitrate the requests and (2) fixed unit converters, wherein each of the fixed unit converters is dedicated to a single one of the fixed function units and is configured to convert the requests directed to the single one to an input format for the single one. | 06-26-2014 |
20140176579 | EFFICIENT SUPER-SAMPLING WITH PER-PIXEL SHADER THREADS - Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency. | 06-26-2014 |
20140184617 | MID-PRIMITIVE GRAPHICS EXECUTION PREEMPTION - One embodiment of the present invention sets forth a technique for mid-primitive execution preemption. When preemption is initiated, no new instructions are issued, in-flight instructions progress to an execution unit boundary, and the execution state is unloaded from the processing pipeline. The execution units within the processing pipeline, including the coarse rasterization unit, complete execution of in-flight instructions and become idle. However, rasterization of a triangle may be preempted at a coarse raster region boundary. The amount of context state to be stored is reduced because the execution units are idle. Preempting at the mid-primitive level during rasterization reduces the time from when preemption is initiated to when another process can execute because the entire triangle is not rasterized. | 07-03-2014 |
20140184618 | GENERATING CANONICAL IMAGING FUNCTIONS - A method for coalescing monolithic imaging functions includes providing a canonical imaging function template. A set of canonical imaging functions is formed from the monolithic imaging functions. The set of canonical imaging functions adhere to the canonical imaging function template. One or more of the canonical imaging functions of the set of canonical imaging functions are coalesced into a coalesced imaging function. | 07-03-2014 |
20140192066 | PARALLEL PROCESSOR WITH INTEGRATED CORRELATION AND CONVOLUTION ENGINE - A system and method for performing computer algorithms. The system includes a graphics pipeline operable to perform graphics processing and an engine operable to perform at least one of a correlation determination and a convolution determination for the graphics pipeline. The graphics pipeline is further operable to execute general computing tasks. The engine comprises a plurality of functional units operable to be configured to perform at least one of the correlation determination and the convolution determination. In one embodiment, the engine is coupled to the graphics pipeline. The system further includes a configuration module operable to configure the engine to perform at least one of the correlation determination and the convolution determination. | 07-10-2014 |
20140192067 | System for Non-Destructive Image Processing - An image processor comprises a plurality of processing modules coupled together in series. Each of at least two of the processing modules includes an image data input to receive at least one of i) an original image or ii) image data output by a previous processing module in the series. Each of the at least two of the processing modules also includes a processing unit configured to i) detect that image data is to be generated and ii) process image data received via the at least one image data input to generate image data. Each of the at least two of the processing modules also includes a memory to store image data generated by the processing unit. | 07-10-2014 |
20140232729 | POWER EFFICIENT ATTRIBUTE HANDLING FOR TESSELLATION AND GEOMETRY SHADERS - Attributes of graphics objects are processed in a plurality of graphics processing pipelines. A streaming multiprocessor (SM) retrieves a first set of parameters associated with a set of graphics objects from a first set of buffers. The SM performs a first set of operations on the first set of parameters according to a first phase of processing to produce a second set of parameters stored in a second set of buffers. The SM performs a second set of operations on the second set of parameters according to a second phase of processing to produce a third set of parameters stored in a third set of buffers. One advantage of the disclosed techniques is that work is redistributed from a first phase to a second phase of graphics processing without having to copy the attributes to and retrieve the attributes from the cache or system memory, resulting in reduced power consumption. | 08-21-2014 |
20140240328 | TECHNIQUES FOR LOW ENERGY COMPUTATION IN GRAPHICS PROCESSING - Techniques and architecture are disclosed for using a latency first-in/first-out (FIFO) to modally enable and disable a compute block in a graphics pipeline. In some example embodiments, the latency FIFO collects valid accesses for a downstream compute and integrates invalid inputs (e.g., bubbles), while the compute is in an off state (e.g., sleep). Once a sufficient number of valid accesses are stored in the latency FIFO, the compute is turned on, and the latency FIFO drains a burst of valid inputs thereto. In some embodiments, this burst helps to prevent or reduce any underutilization of the compute which otherwise might occur, thus providing power savings for a graphics pipeline or otherwise improving the energy efficiency of a given graphics system. In some instances, throughput demand at the latency FIFO input is maintained over a time window corresponding to the on and off time of the compute block. | 08-28-2014 |
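The gating behavior described above can be modeled in software. The following is a hedged sketch (the wake threshold, class name, and drain policy are assumptions for illustration): the FIFO absorbs bubbles while the compute sleeps, and only wakes the compute once a full burst of valid work is queued.

```python
# Illustrative model of a latency FIFO gating a compute block (parameters
# invented): valid accesses queue up while the compute is asleep; bubbles
# (invalid inputs) are integrated, i.e. dropped. Once enough valid work is
# queued, the compute wakes and drains a burst, so every powered-on cycle
# does useful work.

class LatencyFifo:
    def __init__(self, wake_threshold=4):
        self.wake_threshold = wake_threshold
        self.fifo = []          # queued valid accesses
        self.compute_on = False
        self.on_cycles = 0      # cycles the compute spent powered on

    def push(self, item, valid):
        if valid:
            self.fifo.append(item)
        if not self.compute_on and len(self.fifo) >= self.wake_threshold:
            self.compute_on = True  # enough work queued: wake the compute

    def drain(self, compute):
        """Burst-drain all queued accesses through the compute, then sleep."""
        results = []
        while self.compute_on and self.fifo:
            results.append(compute(self.fifo.pop(0)))
            self.on_cycles += 1
        self.compute_on = False
        return results
```

The power saving comes from `on_cycles` counting only productive cycles: the compute never sits powered on waiting for sparse valid inputs to trickle in.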
20140267318 | PIXEL SHADER BYPASS FOR LOW POWER GRAPHICS RENDERING - A computer-implemented method for drawing graphical objects within a graphics processing pipeline is disclosed. The method includes determining that a bypass mode for a first primitive is a no-bypass mode. The method further includes rasterizing the first primitive to generate a first set of rasterization results. The method further includes generating a first set of colors for the first set of rasterization results via a pixel shader unit. The method further includes rasterizing a second primitive to generate a second set of rasterization results. The method further includes generating a second set of colors for the second set of rasterization results without the pixel shader unit performing any processing operations on the second set of rasterization results. The method further includes transmitting the first set of pixel colors and the second set of pixel colors to a raster operations (ROP) unit for further processing. | 09-18-2014 |
20140267319 | TECHNIQUE FOR IMPROVING THE PERFORMANCE OF A TESSELLATION PIPELINE - A tessellation pipeline includes an alpha phase and a beta phase. The alpha phase includes pre-tessellation processing stages, while the beta phase includes post-tessellation processing stages. A processing unit configured to implement a processing stage in the alpha phase stores input graphics data within a buffer and then copies over that buffer with output graphics data, thereby conserving memory resources. The processing unit may also copy output graphics data directly to a level 2 (L2) cache for beta phase processing by other tessellation pipelines, thereby avoiding the need for fixed function copy-out hardware. | 09-18-2014 |
20140267320 | TECHNIQUE FOR IMPROVING THE PERFORMANCE OF A TESSELLATION PIPELINE - A tessellation pipeline includes an alpha phase and a beta phase. The alpha phase includes pre-tessellation processing stages, while the beta phase includes post-tessellation processing stages. A processing unit configured to implement a processing stage in the alpha phase stores input graphics data within a buffer and then copies over that buffer with output graphics data, thereby conserving memory resources. The processing unit may also copy output graphics data directly to a level 2 (L2) cache for beta phase processing by other tessellation pipelines, thereby avoiding the need for fixed function copy-out hardware. | 09-18-2014 |
20140285500 | PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS - A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer. | 09-25-2014 |
20140327684 | GRAPHICS PROCESSING SYSTEMS - A tile-based graphics processing system comprises a host processor | 11-06-2014 |
20140347374 | IMAGE PROCESSING CIRCUIT AND SEMICONDUCTOR INTEGRATED CIRCUIT - This image processing circuit performs, with reduced power consumption, pipeline processing of image data. This image processing circuit has an image processing unit which performs pipeline processing of image data having N-bit pixel data. The image processing unit has a pipeline register | 11-27-2014 |
20150035843 | GRAPHICS PROCESSING UNIT MANAGEMENT SYSTEM FOR COMPUTED TOMOGRAPHY - A method and apparatus for managing operation of graphics processing units in a computer system. A processing thread in the computer system controls processing of a set of sinogram data by a graphics processing unit to form a set of graphics data. An output thread in the computer system controls writing of the set of graphics data to a storage system. The graphics processing unit is available to perform processing of another set of sinogram data while the set of graphics data is written to the storage system, thereby increasing the availability of the plurality of graphics processing units. | 02-05-2015 |
20150049096 | Systems for Handling Virtual Machine Graphics Processing Requests - A system for handling graphics processing requests that includes a hypervisor having access to one or more graphics processing units (GPUs) and a network communication pipeline which transmits unprocessed graphics data and processed graphics data between virtual machines. The system further includes a first virtual machine (VM) having software installed thereon capable of obtaining graphics processing requests and associated unprocessed graphics data generated by the first VM, and transmitting the unprocessed graphics data and receiving processed graphics data via the network communication pipeline, and a second VM having access to the one or more graphics processing units (GPUs) via the hypervisor, and having software installed thereon capable of receiving transmitted unprocessed graphics data and transmitting processed graphics data via the network communication pipeline. | 02-19-2015 |
20150054837 | GPU PREDICATION - Techniques are disclosed relating to predication. In one embodiment, a graphics processing unit is disclosed that includes a first set of architecturally-defined registers configured to store predication information. The graphics processing unit further includes a second set of registers configured to mirror the first set of registers and an execution pipeline configured to discontinue execution of an instruction sequence based on predication information in the second set of registers. In one embodiment, the second set of registers includes one or more registers proximal to an output of the execution pipeline. In some embodiments, the execution pipeline writes back a predicate value determined for a predicate writer to the second set of registers. The first set of architecturally-defined registers is then updated with the predicate value written back to the second set of registers. In some embodiments, the execution pipeline discontinues execution of the instruction sequence without stalling. | 02-26-2015 |
20150062134 | PARAMETER FIFO FOR CONFIGURING VIDEO RELATED SETTINGS - A graphics system may include one or more processing units for processing a current display frame, each processing unit including a plurality of parameter registers for storing parameter settings used in processing the current display frame. A parameter buffer in the graphics system may store frame packets, with each frame packet containing information corresponding to parameter settings to be used for at least one display frame. A control circuit coupled to the buffer and to the one or more processing units may retrieve a top frame packet from the parameter buffer and determine if the frame packet is an internal type, i.e., intended for internal registers in a respective processing unit or if it is an external type, i.e., intended for an external register elsewhere in the graphics system. Based on the type of frame packet, the control circuit may update one or more register values accordingly. | 03-05-2015 |
20150070365 | ARBITRATION METHOD FOR MULTI-REQUEST DISPLAY PIPELINE - Embodiments of an apparatus and method are disclosed that may allow for arbitrating multiple read requests to fetch pixel data from a memory. The apparatus may include a first and a second processing pipeline, and a control unit. Each of the processing pipelines may be configured to generate a plurality of read requests to fetch a respective one of a plurality of portions of stored pixel data. The control unit may be configured to determine a priority for each read request dependent upon display coordinates of one or more pixels corresponding to each of the plurality of portions of stored pixel data, and determine an order for the plurality of read requests dependent upon the determined priority for each read request. | 03-12-2015 |
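The coordinate-based priority rule above amounts to ordering fetch requests by how soon their pixels will be scanned out. A minimal sketch, assuming a raster-scan display and invented request fields:

```python
# Hedged sketch of coordinate-based arbitration (request layout invented):
# read requests are ordered by the raster-scan position of the pixels they
# fetch, so the request whose pixels are needed soonest wins arbitration
# regardless of which pipeline issued it.

def arbitrate(requests, screen_width):
    """Order read requests by display coordinates of their target pixels.

    Each request is a dict like {"pipeline": 0, "x": 10, "y": 2}.
    Returns requests highest priority (needed soonest) first.
    """
    def urgency(req):
        return req["y"] * screen_width + req["x"]  # scanline-major order
    return sorted(requests, key=urgency)
```

Because the key is a single scalar derived from display coordinates, the comparison hardware reduces to a linearized-coordinate compare per pending request.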
20150084968 | NEIGHBOR CONTEXT CACHING IN BLOCK PROCESSING PIPELINES - Methods and apparatus for caching neighbor data in a block processing pipeline that processes blocks in knight's order with quadrow constraints. Stages of the pipeline may maintain two local buffers that contain data from neighbor blocks of a current block. A first buffer contains data from the last C blocks processed at the stage. A second buffer contains data from neighbor blocks on the last row of a previous quadrow. Data for blocks on the bottom row of a quadrow are stored to an external memory at the end of the pipeline. When a block on the top row of a quadrow is input to the pipeline, neighbor data from the bottom row of the previous quadrow is read from the external memory and passed down the pipeline, each stage storing the data in its second buffer and using the neighbor data in the second buffer when processing the block. | 03-26-2015 |
20150084969 | NEIGHBOR CONTEXT PROCESSING IN BLOCK PROCESSING PIPELINES - A block processing pipeline in which blocks are input to and processed according to row groups so that adjacent blocks on a row are not concurrently at adjacent stages of the pipeline. A stage of the pipeline may process a current block according to neighbor pixels from one or more neighbor blocks. Since adjacent blocks are not concurrently at adjacent stages, the left neighbor of the current block is at least two stages downstream from the stage. Thus, processed pixels from the left neighbor can be passed back to the stage for use in processing the current block without the need to wait for the left neighbor to complete processing at a next stage of the pipeline. In addition, the neighbor blocks may include blocks from the row above the current block. Information from these neighbor blocks may be passed to the stage from an upstream stage of the pipeline. | 03-26-2015 |
20150084970 | REFERENCE FRAME DATA PREFETCHING IN BLOCK PROCESSING PIPELINES - Block processing pipeline methods and apparatus in which pixel data from a reference frame is prefetched into a search window memory. The search window may include two or more overlapping regions of pixels from the reference frame corresponding to blocks from the rows in the input frame that are currently being processed in the pipeline. Thus, the pipeline may process blocks from multiple rows of an input frame using one set of pixel data from a reference frame that is stored in a shared search window memory. The search window may be advanced by one column of blocks by initiating a prefetch for a next column of reference data from a memory. The pipeline may also include a reference data cache that may be used to cache a portion of a reference frame and from which at least a portion of a prefetch for the search window may be satisfied. | 03-26-2015 |
20150091913 | TECHNIQUES AND ARCHITECTURE FOR IMPROVED VERTEX PROCESSING - An apparatus may include an index buffer to store an index stream having a multiplicity of index entries corresponding to vertices of a mesh and a vertex cache to store a multiplicity of processed vertices of the mesh. The apparatus may further include a processor circuit, and a vertex manager for execution on the processor circuit to read a reference bitstream comprising a multiplicity of bitstream entries, each bitstream entry corresponding to an index entry of the index stream, and to remove a processed vertex from the vertex cache when a value of the reference bitstream entry corresponding to the processed vertex is equal to a defined value. | 04-02-2015 |
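The eviction rule above — drop a cached vertex when its bitstream entry equals the defined value — can be sketched as a last-use walk over the index stream. Names, the flag value, and the return signature below are illustrative assumptions:

```python
# Illustrative sketch (data layout invented): walk the index stream with a
# parallel reference bitstream; an entry equal to the defined "last use"
# value means the vertex will never be referenced again, so its cached
# result is evicted immediately, freeing a vertex-cache slot.

LAST_USE = 1  # the "defined value" marking a vertex's final reference

def process_indices(index_stream, ref_bitstream, shade):
    cache = {}        # vertex index -> processed (shaded) vertex
    shaded_count = 0  # number of actual vertex-shader invocations
    output = []
    for idx, ref in zip(index_stream, ref_bitstream):
        if idx not in cache:
            cache[idx] = shade(idx)   # cache miss: run vertex shading
            shaded_count += 1
        output.append(cache[idx])
        if ref == LAST_USE:
            del cache[idx]            # final reference: evict eagerly
    return output, shaded_count, len(cache)
```

Eager eviction keeps the cache holding only vertices with future references, so a small cache behaves like a much larger one managed by LRU.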
20150091914 | PROCESSING ORDER IN BLOCK PROCESSING PIPELINES - A knight's order processing method for block processing pipelines in which the next block input to the pipeline is taken from the row below and one or more columns to the left in the frame. The knight's order method may provide spacing between adjacent blocks in the pipeline to facilitate feedback of data from a downstream stage to an upstream stage. The rows of blocks in the input frame may be divided into sets of rows that constrain the knight's order method to maintain locality of neighbor block data. Invalid blocks may be input to the pipeline at the left of the first set of rows and at the right of the last set of rows, and the sets of rows may be treated as if they are horizontally arranged rather than vertically arranged, to maintain continuity of the knight's order algorithm. | 04-02-2015 |
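The scan order described above can be sketched as a generator. The specific step ("down one, left two") and the wrap offset below are illustrative choices consistent with a 4-row group, not values taken from the patent text; positions falling outside the frame play the role of the invalid blocks padded in at the left and right edges.

```python
# Simplified model of a knight's-order scan (step and wrap parameters are
# illustrative): within a quadrow (a set of 4 block rows), the next block
# is one row down and two columns left; from the bottom row the scan wraps
# back to the top row seven columns to the right. Out-of-frame positions
# are the "invalid blocks" input at the frame edges.

def knights_order(width, rows_per_group=4):
    """Yield (row, col, valid) block positions for one quadrow."""
    r, c = 0, 0
    for _ in range(rows_per_group * (width + 6)):
        yield r, c, (0 <= c < width)
        if r < rows_per_group - 1:
            r, c = r + 1, c - 2       # knight step: down 1, left 2
        else:
            r, c = 0, c + 7           # wrap to top of the quadrow
```

Every valid block is visited exactly once, and two horizontally adjacent blocks are four slots apart in the stream — exactly the spacing that lets a downstream stage feed left-neighbor data back to an upstream stage in time.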
20150091915 | CURRENT CHANGE MITIGATION POLICY FOR LIMITING VOLTAGE DROOP IN GRAPHICS LOGIC - Methods and apparatus relating to a current change mitigation policy for limiting voltage droop in graphics logic are described. In an embodiment, logic inserts one or more bubbles in one or more Execution Unit (EU) logic pipelines or one or more sampler logic pipelines of a processor. The bubbles at least temporarily reduce execution of operations in one or more subsystems of the processor based at least partially on a comparison of a first value and one or more clamping threshold values. The first value is determined based at least partially on a summation of products of one or more event counts and dynamic capacitance weights for one or more subsystems of the processor. Other embodiments are also disclosed and claimed. | 04-02-2015 |
20150091916 | SYSTEM ON CHIP INCLUDING CONFIGURABLE IMAGE PROCESSING PIPELINE AND SYSTEM INCLUDING THE SAME - A system on chip (SoC) including a configurable image processing pipeline is provided. The SoC includes a bus; a first image processing module configured to be connected to the bus and to process image data; a first image processing stage configured to transmit either first image data or second image data received from the bus to at least one of the bus and the first image processing module through a first bypass path in response to first control signals; and a second image processing stage configured to transmit either third image data received from the first image processing module or fourth image data received from the bus to the bus through one of a second bypass path and a second scaler path in response to second control signals. | 04-02-2015 |
20150097846 | External Validation of Graphics Pipelines - Data may be streamed out of a graphics pipeline during run time without preprogramming the stream out. A command stream may be captured, draw commands monitored, and shader output definitions may be parsed to determine how to stream out shader data, for example for debugging. | 04-09-2015 |
20150103082 | PIPELINE SYSTEM INCLUDING FEEDBACK ROUTES AND METHOD OF OPERATING THE SAME - A pipeline system includes input buffers, a relay for controlling withdrawal of data stored in the input buffers, and functional blocks for performing one or more processing operations. A method of operating a pipeline system includes withdrawing data from one of the input buffers and performing one or more different processing operations. | 04-16-2015 |
20150138210 | METHOD AND SYSTEM FOR CONTROLLING DISPLAY PARAMETERS THROUGH MULTIPLE PIPELINES - A method and a system for controlling display parameters through multiple inter-integrated circuit (I2C) pipelines are provided. The method includes creating the multiple I2C pipelines to control the display parameters in one or more of the display devices. The method also includes sending control data to graphic cards associated with one or more display devices through the multiple I2C pipelines. Further, the method includes forwarding the control data from the graphic cards to the associated one or more display devices. Additionally, the method includes applying the display parameters automatically based on the control data. | 05-21-2015 |
20150145873 | Image Processing Techniques - Techniques are described that can delay or even prevent use of memory to store triangles associated with tiles as well as processing resources associated with vertex shading and binning triangles. The techniques can also provide better load balancing among a set of cores, and hence provide better performance. A bounding volume is generated to represent a geometry group. Culling takes place to determine whether a geometry group is to have triangles rendered. Vertex shading and association of triangles with tiles can be performed across multiple cores in parallel. Processing resources are allocated for rasterizing tiles that have been vertex shaded and binned triangles over tiles that have yet to be vertex shaded and binned triangles. Rasterization of triangles of different tiles can be performed by multiple cores in parallel. | 05-28-2015 |
20150145874 | IMPLEMENTATION DESIGN FOR HYBRID TRANSFORM CODING SCHEME - A method and system may identify a video data block using a video codec and apply a transform kernel of a butterfly asymmetric discrete sine transform (ADST) to the video data block in a pipeline. | 05-28-2015 |
20150302544 | COORDINATE BASED QOS ESCALATION - Systems and methods for determining priorities of pixel fetch requests of separate requestors in a display control unit. The distance between the oldest pixel in an output buffer and the output equivalent coordinate of the oldest outstanding source pixel read request for each requestor in the display control unit is calculated. Then, a priority is assigned to each requestor based on this calculated distance. If a given requestor lags behind the other requestors based on a comparison of the distance between the oldest pixel and the output equivalent coordinate of the oldest outstanding source pixel read, then source pixel fetch requests for this given requestor are given a higher priority than source pixel fetch requests for the other requestors. | 10-22-2015 |
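The escalation rule above boils down to: the requestor with the least headroom between what has been displayed and what has been fetched gets priority. A minimal sketch with invented field names, assuming raster-scan coordinates:

```python
# Hedged sketch of coordinate-based QoS escalation (structures invented):
# each requestor's headroom is the raster-scan distance between the oldest
# pixel waiting in its output buffer and the output-equivalent coordinate
# of its oldest outstanding fetch. The requestor with the least headroom
# is the one lagging behind, so it gets the highest fetch priority.

def rank_requestors(requestors, screen_width):
    """requestors: dicts with 'id', 'oldest_out', and 'oldest_fetch'
    (x, y) coordinates. Returns ids, highest priority first."""
    def headroom(r):
        ox, oy = r["oldest_out"]
        fx, fy = r["oldest_fetch"]
        return (fy * screen_width + fx) - (oy * screen_width + ox)
    return [r["id"] for r in sorted(requestors, key=headroom)]
```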
20150309940 | GPU SHARED VIRTUAL MEMORY WORKING SET MANAGEMENT - A method and apparatus of a device that manages virtual memory for a graphics processing unit is described. In an exemplary embodiment, the device manages a graphics processing unit working set of pages. In this embodiment, the device determines the set of pages of the device to be analyzed, where the device includes a central processing unit and the graphics processing unit. The device additionally classifies the set of pages based on a graphics processing unit activity associated with the set of pages and evicts a page of the set of pages based on the classifying. | 10-29-2015 |
20150317763 | GRAPHICS PROCESSING SYSTEMS - A tile based graphics processing pipeline comprises a plurality of processing stages, including at least a rasteriser that rasterises input primitives to generate graphics fragments to be processed, and a renderer that processes fragments generated by the rasteriser to generate rendered fragment data, and a processing stage | 11-05-2015 |
20150348224 | Graphics Pipeline State Object And Model - An innovative GPU framework and related APIs present more accurate representations of the target hardware so that the distinctions between the fixed-function and programmable features of the GPU are perceived by a developer. This permits a program and/or a graphics object generated or manipulated by the program to be understood as not just code, but machine states that are associated with the code. When such an object is defined, the definitional components requiring programmable GPU features can be compiled only once and reused repeatedly as needed. Similarly, when a state change is made, the state changes correspond to the state changes made on the hardware. Additionally, the creation of these immutable objects prevents a developer from inadvertently changing portions of the program or object that cause it to behave differently than intended. | 12-03-2015 |
20150379679 | Single Read Composer with Outputs - A processing unit for generating multiple output items for output to a display or encoder. The processing unit may include a memory that stores data that will be used by a composer to generate the multiple output items. The processing unit may include a composer that executes only a single memory read operation when obtaining the data and splits the data to generate the multiple output items. The composer also may perform a function on the data before the data is split if all of the multiple output items require the data to undergo this function. The processing unit may also include a number of output buffers that each receive an output item from the composer and deliver the output item to an output such as a display or encoder. | 12-31-2015 |
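The read-once-then-split behavior can be shown compactly. The API below is invented for illustration; the point is that memory is touched exactly once and any work common to all output items is done before the split:

```python
# Illustrative sketch of a single-read composer (API invented): one memory
# read, an optional shared function applied once when every output item
# needs it, then a fan-out into per-output items for the output buffers.

def compose(memory, address, outputs, shared_fn=None):
    """outputs: list of per-output transform functions.
    Returns one item per output from a single read of memory[address]."""
    data = memory[address]                 # the only memory read operation
    if shared_fn is not None:
        data = shared_fn(data)             # common work done once, pre-split
    return [fn(data) for fn in outputs]    # split into per-output items
```

Hoisting the shared function above the split is what saves both bandwidth (one read instead of N) and compute (one shared pass instead of N).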
20160005140 | GRAPHICS PROCESSING - A graphics processing pipeline | 01-07-2016 |
20160042488 | METHOD AND SYSTEM FOR FRAME PACING - A frame pacing method, computer program product, and computing system are provided for graphics processing. | 02-11-2016 |
20160055614 | Image Processing Techniques - Techniques are described that can delay or even prevent use of memory to store triangles associated with tiles as well as processing resources associated with vertex shading and binning triangles. The techniques can also provide better load balancing among a set of cores, and hence provide better performance. A bounding volume is generated to represent a geometry group. Culling takes place to determine whether a geometry group is to have triangles rendered. Vertex shading and association of triangles with tiles can be performed across multiple cores in parallel. Processing resources are allocated for rasterizing tiles whose triangles have been vertex shaded and binned over tiles whose triangles have yet to be vertex shaded and binned. Rasterization of triangles of different tiles can be performed by multiple cores in parallel. | 02-25-2016 |
20160063662 | PIPELINE DEPENDENCY RESOLUTION - Techniques are disclosed relating to dependency resolution among processor pipelines. In one embodiment, an apparatus includes a first special-purpose pipeline configured to execute, in parallel, a first type of graphics instruction for a group of graphics elements and a second special-purpose pipeline configured to execute, in parallel, a second type of graphics instruction for the group of graphics elements. In this embodiment, the apparatus is configured, in response to dispatch of an instruction of the second type, to mark a particular instruction of the first type with information indicative of the dispatched instruction. In this embodiment, the particular instruction and the dispatched instruction correspond to the same group of graphics elements. In this embodiment, the apparatus is configured to stall performance of the dispatched instruction until the first special-purpose pipeline has completed execution of the marked particular instruction. Exemplary instruction types include interpolate and sample instructions. | 03-03-2016 |
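The mark-and-stall scheme above can be modeled with a small tracker. Class and method names are assumptions for illustration; the mechanism shown is the one the abstract describes: dispatching a second-type instruction marks the latest first-type instruction for the same element group, and the dispatched instruction waits until the marked one completes.

```python
# Sketch of cross-pipeline dependency resolution (scheme names invented):
# a second-type instruction (e.g. sample) for a group of graphics elements
# marks the most recent first-type instruction (e.g. interpolate) for the
# same group, and stalls until that marked instruction has completed.

class DependencyTracker:
    def __init__(self):
        self.in_flight = {}   # group id -> latest first-type instruction id
        self.completed = set()
        self.waiting = {}     # stalled second-type id -> marked instr id

    def dispatch_first(self, instr_id, group):
        self.in_flight[group] = instr_id

    def dispatch_second(self, instr_id, group):
        marked = self.in_flight.get(group)
        if marked is not None and marked not in self.completed:
            self.waiting[instr_id] = marked   # stall on the marked instr

    def complete_first(self, instr_id):
        self.completed.add(instr_id)
        released = [i for i, m in self.waiting.items() if m == instr_id]
        for i in released:
            del self.waiting[i]               # dependency resolved
        return released
```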
20160086298 | DISPLAY PIPE LINE BUFFER SHARING - An apparatus for processing graphics data may include a plurality of processing pipelines, each pipeline configured to receive and process pixel data. A functional unit may combine the outputs of each processing pipeline. A buffer included in a given processing pipeline may be configured to store data from the functional unit in response to a determination that the given processing pipeline is inactive. The buffer may then send the stored data to a memory. | 03-24-2016 |
20160093087 | LOW LATENCY INK RENDERING PIPELINE - Systems and methods are provided for improving the latency for display of ink during user creation of ink content with a stylus, mouse, finger (or other touch input), or other drawing device for tracing a desired location for ink content in a display area. In order to reduce or minimize the time for display of ink content created by a user using a stylus/mouse/touch input/other device, a separate ink rendering process thread can be used that operates within the operating system and in parallel to other application threads. When it is desired to create ink content within an application, user interactions corresponding to creation of ink content can be handled by the separate ink rendering process thread. This can avoid potential delays in displaying ink content due to an application handling other events in a process flow. | 03-31-2016 |
20160104263 | Method And Apparatus Of Latency Profiling Mechanism - Techniques related to a latency profiling mechanism are described. A method may monitor at least one attribute associated with each of one or more frames of images by tracking a respective identifier of each of the one or more frames as each of the one or more frames is processed through a first pipeline of one or more processing stages of an image processing device. The method may also obtain one or more indications related to one or more performance indices in the first pipeline of one or more processing stages based at least in part on the monitoring of the at least one attribute. | 04-14-2016 |
20160117796 | Content Adaptive Decoder Quality Management - In one example, a quality management controller of a video processing system may optimize a video recovery action through the selective dropping of video frames. The video processing system may store a compressed video data set in memory. The video processing system may receive a recovery quality indication describing a recovery priority of a user. The video processing system may apply a quality management controller in a video pipeline to execute a video recovery action to retrieve an output data set from the compressed video data set using a video decoder. The quality management controller may select a recovery initiation frame from the compressed video data set to be an initial frame to decompress based upon the recovery quality indication. | 04-28-2016 |
20160140684 | SORT-FREE THREADING MODEL FOR A MULTI-THREADED GRAPHICS PIPELINE - Methods and apparatus relating to sort-free threading model for a multi-threaded graphics pipeline are described. In an embodiment, draw requests, corresponding to one or more primitives in an image, are stored in entries of a queue (e.g., in the order received). Each entry remains locked until both a front-end and a back-end of a graphics pipeline have completed one or more operations associated with the draw request. Other embodiments are also disclosed and claimed. | 05-19-2016 |
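The "locked until both ends finish, no sorting" idea can be sketched with a head-retired queue. Flag and method names below are invented; the key property is that entries retire strictly from the head, so submission order is preserved without any sort step:

```python
# Hedged model of a sort-free draw queue (flag names invented): requests
# occupy entries in submission order; an entry stays locked until both the
# front-end and back-end have finished with it, and entries retire only
# from the head, so ordering never requires sorting.

class DrawQueue:
    def __init__(self):
        self.entries = []  # each: {"draw": ..., "fe_done": ..., "be_done": ...}

    def submit(self, draw):
        self.entries.append({"draw": draw, "fe_done": False, "be_done": False})

    def complete(self, index, stage):
        self.entries[index][stage + "_done"] = True  # stage: "fe" or "be"

    def retire(self):
        """Pop finished entries from the head only, preserving draw order."""
        retired = []
        while self.entries and self.entries[0]["fe_done"] and self.entries[0]["be_done"]:
            retired.append(self.entries.pop(0)["draw"])
        return retired
```

A later draw finishing early (as in the test below) stays queued behind a locked head entry, which is exactly how ordering is kept without a sort.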
20160155399 | VARIABLE FRAME REFRESH RATE | 06-02-2016 |
20160171642 | Overlap Aware Reordering of Rendering Operations for Efficiency | 06-16-2016 |
20160379331 | APPARATUS AND METHOD FOR VERIFYING THE INTEGRITY OF TRANSFORMED VERTEX DATA IN GRAPHICS PIPELINE PROCESSING - The present application relates to an apparatus for verifying the integrity of transformed vertex data and a method of operating the same. The apparatus comprises a graphics processing pipeline with a vertex shader unit, a buffer and a comparator unit. The vertex shader unit receives a stream of vertex data according to a vertex specification, applies a transformation to each of the vertices in the received stream of vertex data, and outputs a stream of transformed vertex data. The buffer is coupled to the vertex shader unit to buffer the transformed vertex data. The comparator unit is configured to verify the integrity of at least a subset of the transformed vertex data on the basis of reference data and to issue a fault indication signal in case the verification fails. | 12-29-2016 |
20160379332 | APPARATUS AND METHOD FOR VERIFYING IMAGE DATA COMPRISING MAPPED TEXTURE IMAGE DATA - The present application relates to an apparatus for verifying the integrity of image data comprising mapped texture data, and a method of operating the same. A fragment shader unit is coupled to first and second frame buffers and at least one texture buffer. A first texture sampler unit is configured to output texture mapped fragments to the first frame buffer. A second texture sampler unit is configured to output texture mapped fragments to the second frame buffer. A comparator unit is configured to compare the image data stored in the first frame buffer and in the second frame buffer. A fault indication signal is issued in case the image data of the first and the second frame buffers mismatch. | 12-29-2016 |
20160379333 | APPARATUS AND METHOD FOR VERIFYING FRAGMENT PROCESSING RELATED DATA IN GRAPHICS PIPELINE PROCESSING - The present application relates to an apparatus for verifying fragment processing related data and a method of operating the same. A fragment shader unit of a graphics processing pipeline is coupled to at least one data buffer; the fragment shader unit receives fragment data and records fragment processing related data in the at least one data buffer on processing one or more fragments in accordance with the received fragment data. A comparator unit coupled to the at least one data buffer compares the recorded fragment processing related data in the at least one data buffer to reference data and issues a fault indication signal in case the recorded fragment processing related data and the reference data mismatch. | 12-29-2016 |
20160379335 | GRAPHICS PIPELINE METHOD AND APPARATUS - Provided are a graphics pipeline method and apparatus. For each of plural screen pixels, locations of one or more sampling points are determined based on a set pattern to modify an image to be rendered. A pixel corresponding to a set primitive is generated at the determined location of a sampling point of the one or more sampling points. The image is rendered using the generated pixel. | 12-29-2016 |
20160379336 | METHODS OF A GRAPHICS-PROCESSING UNIT FOR TILE-BASED RENDERING OF A DISPLAY AREA AND GRAPHICS-PROCESSING APPARATUS - A method of a graphics-processing unit (GPU) for tile-based rendering of a display area and a graphics-processing apparatus are provided. The method includes the steps of computing vertex positions of a plurality of vertexes including a first vertex and a second vertex, wherein the first vertex corresponds to a first thread and the second vertex corresponds to a second thread; determining whether a thread merge condition is satisfied; merging the first thread and the second thread into a thread group when determining that the thread merge condition is satisfied; and computing vertex varyings of the plurality of vertexes, wherein when the first thread and the second thread are merged into the thread group, a varying of the first vertex and a varying of the second vertex are computed with respect to a program counter. | 12-29-2016 |
20170236321 | CUSTOMIZABLE STATE MACHINE FOR VISUAL EFFECT INSERTION | 08-17-2017 |
20190147829 | DELIVERY OF DISPLAY SYMBOLS TO A DISPLAY SOURCE | 05-16-2019 |