Class 712219000 | Reducing an impact of a stall or pipeline bubble | 32 patent applications |
Patent application number | Description | Date published |
20080229078 | Dynamic Power Management in a Processor Design - Dynamic power management in a processor design is presented. A pipeline stage's stall detection logic detects a stall condition, and sends a signal to idle detection logic to gate off the pipeline's register clocks. The stall detection logic also monitors a downstream pipeline stage's stall condition, and instructs the idle detection logic to gate off the pipeline stage's registers when the downstream pipeline stage is in a stall condition as well. In addition, when the pipeline stage's stall detection logic detects a stall condition, either from the downstream pipeline stage or from its own pipeline units, the pipeline stage's stall detection logic informs an upstream pipeline stage to gate off its clocks and thus, conserve more power. | 09-18-2008 |
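A minimal behavioural sketch of the stall-driven clock gating described above (not the patented circuit); the three-stage pipeline and the helper name are assumptions for illustration:

```python
# Behavioural sketch: a stage gates its register clocks when it is stalled or
# when a downstream stage is stalled, and the stall indication propagates
# upstream so earlier stages gate their clocks as well.

def gated_clock_enables(stage_stalls):
    """stage_stalls[i] is True if stage i detects a stall locally.
    Returns a list of register-clock-enable bits, one per stage."""
    enables = []
    downstream_stalled = False
    for stalled in reversed(stage_stalls):       # walk from the last stage up
        downstream_stalled = downstream_stalled or stalled
        enables.append(not downstream_stalled)   # gate the clock if any stall below
    return list(reversed(enables))

# Example: only the execute stage stalls, but decode and fetch gate too.
print(gated_clock_enables([False, False, True]))    # -> [False, False, False]
print(gated_clock_enables([False, False, False]))   # -> [True, True, True]
```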
20080244234 | System and Method for Executing Instructions Prior to an Execution Stage in a Processor - A method of processing a plurality of instructions in multiple pipeline stages within a pipeline processor is disclosed. The method partially or wholly executes a stalled instruction in a pipeline stage that has a function other than instruction execution prior to the execution stage within the processor. Partially or wholly executing the instruction prior to the execution stage in the pipeline speeds up the execution of the instruction and allows the processor to more effectively utilize its resources, thus increasing the processor's efficiency. | 10-02-2008 |
20080244235 | CIRCUIT MARGINALITY VALIDATION TEST FOR AN INTEGRATED CIRCUIT - A high volume manufacturing (HVM) and circuit marginality validation (CMV) test for an integrated circuit (IC) is disclosed. The IC comprises a port binding and bubble logic in the front end to provide flexibility in binding a port to the uop and to create empty spaces (bubbles) in the uop flow. The out-of-order (OOO) cluster of the IC comprises reservation disable logic to control the flow sequence of the uops and stop schedule logic to temporarily stop dispatching the uops from the OOO cluster to the execution (EXE) cluster. The EXE cluster of the IC comprises signal event uops to generate fault information and fused uJump uops to specify a combination of branch prediction, direction, and resolution in any portion of the test. Such features provide a tester the flexibility to perform HVM and CMV testing of the OOO and EXE clusters of the IC. | 10-02-2008 |
20080250230 | Using a Modified Value GPR to Enhance Lookahead Prefetch - The present invention allows a microprocessor to identify and speculatively execute future instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition, which would otherwise cause the microprocessor or thread of execution to be idle. The execution of such future instructions can initiate a prefetch of data or instructions from a distant cache or main memory, or otherwise make forward progress through the instruction stream. In this manner, when the instructions are re-executed (non-speculatively executed) after the stall condition expires, they will execute with a reduced execution latency; e.g., by accessing data prefetched into the L1 cache or en route to the processor, or by executing the target instructions following a speculatively resolved mispredicted branch. In speculative mode, instruction operands may be invalid due to source loads that miss the L1 cache, facilities not available in speculative execution mode, or due to speculative instruction results that are not available. Dependency and dirty (i.e., invalid result) bits are tracked and used to determine which speculative instructions are valid for execution. A modified value register storage and bit vector are used to improve the availability of speculative results that would otherwise be discarded once they leave the execution pipeline because they cannot be written to the architected registers. The modified general purpose registers are used to store speculative results when the corresponding instruction reaches writeback and the modified bit vector tracks the results that have been stored there. Younger speculative instructions that do not bypass directly from older instructions will then use this modified data when the corresponding bit in the modified bit vector indicates the data has been modified. Otherwise, data from the architected registers will be used. | 10-09-2008 |
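The dirty-bit / modified-bit-vector selection described above can be sketched as follows; the data structures and function names are assumptions, intended only to illustrate how a speculative source operand might be chosen between modified and architected registers:

```python
# Illustrative sketch (not the patented design): a per-register "dirty" bit
# marks invalid speculative results, and a "modified" bit vector selects
# between the modified-value registers and the architected registers.

NUM_GPRS = 32
architected = [0] * NUM_GPRS      # committed register values
modified_vals = [0] * NUM_GPRS    # speculative results stored at writeback
modified = [False] * NUM_GPRS     # bit vector: modified_vals[r] is valid
dirty = [False] * NUM_GPRS        # result invalid (e.g. source load missed the L1)

def read_source(reg):
    """Return (value, valid) for a source operand in speculative mode."""
    if dirty[reg]:
        return None, False                 # dependent instruction is also invalid
    if modified[reg]:
        return modified_vals[reg], True    # younger speculative result available
    return architected[reg], True          # fall back to architected state

def speculative_writeback(reg, value, result_valid):
    """Record a speculative result without touching architected state."""
    dirty[reg] = not result_valid
    if result_valid:
        modified_vals[reg] = value
        modified[reg] = True

speculative_writeback(5, 42, result_valid=True)
speculative_writeback(6, 0, result_valid=False)   # e.g. a load that missed the L1
print(read_source(5))   # (42, True)   -- modified value used
print(read_source(6))   # (None, False) -- dependents marked invalid
print(read_source(7))   # (0, True)    -- architected value used
```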
20080313437 | METHOD AND APPARATUS FOR EMPLOYING MULTI-BIT REGISTER FILE CELLS AND SMT THREAD GROUPS - Methods and apparatus are provided for multi-bit register file cells and SMT thread groups. An apparatus for a register file includes a plurality of multi-bit storage cells for storing a plurality of bits respectively corresponding to a plurality of threads. The apparatus further includes a plurality of port groups, operatively coupled to the plurality of multi-bit storage cells, responsive to physical register identifiers. The plurality of port groups is responsive to respective ones of a plurality of thread identifiers. Each of the plurality of thread identifiers is for uniquely identifying a particular thread from among a plurality of threads. | 12-18-2008 |
20090006819 | Single Hot Forward Interconnect Scheme for Delayed Execution Pipelines - A method and apparatus for forwarding data in a processor. The method includes providing at least one cascaded delayed execution pipeline unit having a first pipeline and a second pipeline, wherein the second pipeline executes instructions in a common issue group in a delayed manner relative to the first pipeline. The method further includes determining if a first instruction being executed in the first pipeline modifies data in a data register which is accessed by a second instruction being executed in the second pipeline. If the first instruction being executed in the first pipeline modifies data in the data register which is accessed by the second instruction being executed in the second pipeline, the modified data is forwarded from the first pipeline to the second pipeline. | 01-01-2009 |
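A minimal sketch of the forwarding check described above, assuming a simple dictionary representation of instructions (destination register plus source-register set); this is illustrative, not the patented interconnect:

```python
# If the first pipeline's instruction writes a register that the delayed
# second pipeline's instruction of the same issue group reads, the result
# is forwarded from the first pipeline instead of read from the register file.

def needs_forward(first_insn, second_insn):
    """first_insn/second_insn are dicts with 'dest' register and 'srcs' set."""
    return first_insn["dest"] is not None and first_insn["dest"] in second_insn["srcs"]

add = {"dest": 3, "srcs": {1, 2}}    # r3 <- r1 + r2   (first pipeline)
sub = {"dest": 5, "srcs": {3, 4}}    # r5 <- r3 - r4   (second, delayed pipeline)

if needs_forward(add, sub):
    print("forward r%d from first pipeline to second pipeline" % add["dest"])
```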
20090006820 | Issue Unit for Placing a Processor into a Gradual Slow Mode of Operation - An issue unit for placing a processor into a gradual slow down mode of operation is provided. The gradual slow down mode of operation comprises a plurality of stages of slow down operation of an issue unit in a processor in which the issuance of instructions is slowed in accordance with a staging scheme. The gradual slow down of the processor allows the processor to break out of livelock conditions. Moreover, since the slow down is gradual, the processor may flexibly avoid various degrees of livelock conditions. The mechanisms of the illustrative embodiments impact the overall processor performance based on the severity of the livelock condition by taking a small performance impact on less severe livelock conditions and only increasing the processor performance impact when the livelock condition is more severe. | 01-01-2009 |
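A sketch of a staged issue-throttling scheme along these lines; the stage names and the specific issue periods are assumptions, since the abstract only specifies a plurality of progressively slower stages:

```python
# Assumed staging scheme: each slow-down stage permits instruction issue
# less frequently, so livelocks of increasing severity can be broken with
# a proportionally small performance impact.

STAGES = [
    ("normal",  1),    # issue every cycle
    ("slow-1",  4),    # issue once every 4 cycles
    ("slow-2", 16),    # issue once every 16 cycles
    ("single", 64),    # near single-step operation
]

def may_issue(cycle, stage_index):
    """Allow issue only on cycles permitted by the current slow-down stage."""
    _, period = STAGES[stage_index]
    return cycle % period == 0

stage = 2   # escalated after repeated livelock detection
issued = [c for c in range(32) if may_issue(c, stage)]
print("issue cycles in slow-2 mode:", issued)    # [0, 16]
```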
20090049280 | SOFTWARE CONTROLLED CPU PIPELINE PROTECTION - A processor in a digital system executes instructions in an instruction execution pipeline. The processor detects a pipeline protection directive while executing instructions and sets a pipeline protection mode in accordance with the directive. The processor then continues to fetch and execute instructions in an unprotected manner if the pipeline protection mode is off and continues to fetch and execute instructions in a protected manner if the pipeline protection mode is on. | 02-19-2009 |
20090070562 | METHOD AND APPARATUS FOR ASSIGNING THREAD PRIORITY IN A PROCESSOR OR THE LIKE - In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment the effectiveness of the thread precedence counters, starting counters are associated with each thread and serve as a multiplier for the value to be used in the thread precedence counter. The values in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor. | 03-12-2009 |
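A rough sketch of precedence counters with per-thread starting counters acting as multipliers; the reload policy and base value are assumptions for illustration:

```python
# Assumed model: each thread's precedence counter is loaded with
# starting_counter * base; the thread whose counter expires first gains
# priority, and its counter is reloaded so no thread monopolises resources.

class Thread:
    def __init__(self, name, start_count, base=8):
        self.name = name
        self.start_count = start_count            # multiplier
        self.precedence = start_count * base      # cycles until priority

def pick_priority_thread(threads):
    """Decrement every precedence counter; the first to expire wins."""
    for t in threads:
        t.precedence -= 1
    winner = min(threads, key=lambda t: t.precedence)
    if winner.precedence <= 0:
        winner.precedence = winner.start_count * 8   # reload (base=8 as above)
        return winner
    return None

threads = [Thread("T0", start_count=1), Thread("T1", start_count=2)]
for cycle in range(20):
    w = pick_priority_thread(threads)
    if w:
        print(f"cycle {cycle}: priority to {w.name}")
```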
20090132791 | System and Method for Recovering From A Hang Condition In A Data Processing System - A data processing system, method, and computer-usable medium for recovering from a hang condition in a data processing system. The data processing system includes a collection of coupled processing units. The processing units include a collection of processing unit components, such as two or more processing cores, a cache array, a processor core master, a cache snooper, and a local hang manager. The local hang manager determines whether at least one component out of the collection of processing unit components has entered into a hang condition. If the local hang manager determines that at least one component has entered into a hang condition, a throttling manager throttles the performance of the processing unit in an attempt to break the at least one component out of the hang condition. | 05-21-2009 |
20100017582 | STALL PREDICTION THREAD MANAGEMENT - Thread switching prevents pipeline stalls when executing multiple threads. An analysis of a first thread identifies instructions capable of causing pipeline stalls. If pipeline stalls from the identified instructions are likely, thread switching instructions are added to the first thread in place of the identified instructions. Thread switching instructions direct a microprocessor to suspend executing the thread and begin executing a second thread. Thread switching instructions can be added to the second thread to enable the resumption of the first thread at the location specified by the identified instruction. The thread switching instructions are configured to avoid pipeline stalls when switching threads. Thread switching instructions can store and retrieve thread-specific information upon the suspension and resumption of threads. Thread switching instructions can schedule the execution of two or more threads in accordance with load balancing schemes. Threads can be modified using static or dynamic code analysis and modification techniques. | 01-21-2010 |
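A sketch of static thread-switch insertion as described above; the mnemonics (LOAD_FAR, SWITCH_TO_B, SWITCH_TO_A) are invented for illustration and are not from the patent:

```python
# Instructions identified as likely to stall are replaced in the first thread
# by a switch to the second thread, and a switch-back is added to the second
# thread so the first thread can later resume at the replaced instruction.

LIKELY_TO_STALL = {"LOAD_FAR"}          # e.g. loads expected to miss the caches

def insert_thread_switches(thread_a, thread_b):
    new_a, new_b = [], list(thread_b)
    for pc, insn in enumerate(thread_a):
        if insn.split()[0] in LIKELY_TO_STALL:
            new_a.append("SWITCH_TO_B")               # suspend A, run B
            new_b.append(f"SWITCH_TO_A @A:{pc}")      # later resume A at pc
        else:
            new_a.append(insn)
    return new_a, new_b

a = ["ADD r1, r2", "LOAD_FAR r3, [r4]", "SUB r5, r3"]
b = ["MUL r6, r7"]
print(insert_thread_switches(a, b))
```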
20100042813 | Redundant Execution of Instructions in Multistage Execution Pipeline During Unused Execution Cycles - A pipelined execution unit uses the bubbles that occur during execution to selectively repeat operations performed in one or more stages of a multistage execution pipeline to verify the results of such operations during otherwise unused execution cycles for the execution pipeline. Whenever a bubble follows a particular instruction within an execution pipeline, the result of an operation that is performed for that instruction by a particular stage of the execution pipeline may be stored, and the operation may be repeated by the stage in a subsequent execution cycle in which no productive operation would otherwise be performed due to the presence of the bubble. The results of the operations may then be compared and used to either verify the original result or identify a potential error in the execution of the instruction. | 02-18-2010 |
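A behavioural sketch of re-executing an operation in a bubble cycle and comparing results; the list-of-operations model is an assumption used only to show the verify-on-bubble idea:

```python
# When a bubble follows an instruction, the stage repeats the previous
# operation in the otherwise idle cycle and compares the two results to
# either verify the original result or flag a potential error.

def run_stage(ops):
    """ops is a list of (name, fn, args) tuples, or None for a bubble."""
    last = None                      # (name, fn, args, result) of the last real op
    for op in ops:
        if op is None and last is not None:
            name, fn, args, first_result = last
            redo = fn(*args)                       # re-execute in the idle cycle
            status = "ok" if redo == first_result else "MISMATCH"
            print(f"bubble cycle: re-checked {name}: {status}")
        elif op is not None:
            name, fn, args = op
            last = (name, fn, args, fn(*args))
            print(f"executed {name} -> {last[3]}")

run_stage([("add", lambda a, b: a + b, (2, 3)),
           None,                                    # bubble: add is re-checked
           ("mul", lambda a, b: a * b, (4, 5))])
```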
20100077184 | METHOD AND APPARATUS FOR REMOVING A PIPELINE BUBBLE - One embodiment of the present invention provides a system for augmenting a pipeline with a bubble-removal circuit. During operation, the system generates a bubble-removal circuit which determines a clock-enable signal based at least on whether an upstream register has valid data and whether the pipeline is stalled. Next, the system gates the clock signal using the clock-enable signal. The augmented pipeline can determine whether a first register contains invalid data, which is associated with a bubble. Next, the augmented pipeline determines whether a second register contains valid data, wherein the second register is adjacent to and upstream from the first register. If the first register contains invalid data and the second register contains valid data, the augmented pipeline replaces the invalid data of the first register with valid data based on the valid data in the second register without propagating the invalid data to a downstream register. | 03-25-2010 |
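A sketch of the clock-enable computation for bubble removal; the signal names and the exact enable expression are assumptions consistent with the behaviour described above:

```python
# The register clock is enabled when the upstream register holds valid data
# and either the pipeline is not stalled or the register itself holds a
# bubble, so invalid data is overwritten rather than propagated downstream.

def clock_enable(upstream_valid, pipeline_stalled, self_valid):
    return upstream_valid and (not pipeline_stalled or not self_valid)

print(clock_enable(upstream_valid=True,  pipeline_stalled=True,  self_valid=False))  # True: bubble squashed
print(clock_enable(upstream_valid=True,  pipeline_stalled=False, self_valid=True))   # True: normal advance
print(clock_enable(upstream_valid=False, pipeline_stalled=False, self_valid=True))   # False: nothing to latch
```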
20110055525 | PROVIDING THREAD FAIRNESS IN A HYPER-THREADED MICROPROCESSOR - A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline. | 03-03-2011 |
20110296145 | PIPELINED DIGITAL SIGNAL PROCESSOR - Reducing pipeline stall between a compute unit and address unit in a processor can be accomplished by computing results in a compute unit in response to instructions of an algorithm; storing in a local random access memory array in a compute unit predetermined sets of functions, related to the computed results for predetermined sets of instructions of the algorithm; and providing within the compute unit direct mapping of computed results to related function. | 12-01-2011 |
20110307688 | SYNTHESIS SYSTEM FOR PIPELINED DIGITAL CIRCUITS - Computer-implemented methods and systems for synthesizing a hardware description for a pipelined datapath for a digital circuit. A transactional datapath specification framework and a transactional design automation system automatically synthesize pipeline implementations. The transactional datapath specification framework captures an abstract datapath, whose execution semantics is interpreted as a sequence of “transactions” where each transaction reads the state values left by the preceding transaction and computes a new set of state values to be seen by the next transaction. The transactional datapath specification framework exposes sufficient information about state accesses that can occur in a datapath, which is necessary for performing precise data hazards analysis, and eventually pipeline synthesis. | 12-15-2011 |
20110320777 | DIRECT MEMORY ACCESS ENGINE PHYSICAL MEMORY DESCRIPTORS FOR MULTI-MEDIA DEMULTIPLEXING OPERATIONS - The architecture and techniques described herein can improve system performance in the following respects: communication between two interdependent hardware engines that are part of a pipeline, such that the engines are synchronized to consume resources when they are done with the work; reduction of the role of software/firmware in feeding each stage of the hardware pipeline when the previous stage of the pipeline has completed; and reduction in the memory allocation for software-initialized hardware descriptors, which improves performance by reducing pipeline stalls due to software interaction. | 12-29-2011 |
20120036338 | FACILITATING PROCESSING IN A COMPUTING ENVIRONMENT USING AN EXTENDED DRAIN INSTRUCTION - An extended DRAIN instruction is used to stall processing within a computing environment. The instruction includes an indication of the one or more processing stages at which processing is to be stalled. It also includes a control that allows processing to be stalled for additional cycles, as desired. | 02-09-2012 |
20120124340 | Retirement serialisation of status register access operations - A processor | 05-17-2012 |
20120131313 | Error recovery following speculative execution with an instruction processing pipeline - An instruction processing pipeline | 05-24-2012 |
20120265968 | Locating Bottleneck Threads in Multi-Thread Applications - A method for identifying a consumer-producer pattern in a multi-threaded application includes obtaining synchronization event data of the multi-threaded application, and identifying the consumer-producer communication pattern from the synchronization event data. | 10-18-2012 |
20120278595 | DETERMINING EACH STALL REASON FOR EACH STALLED INSTRUCTION WITHIN A GROUP OF INSTRUCTIONS DURING A PIPELINE STALL - During a pipeline stall in an out of order processor, until a next to complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next to finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next to complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next to finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next to complete instruction group, the monitoring unit stores each determined stall reason aligned with each next to finish indicator in memory. | 11-01-2012 |
20130013898 | Managing Multiple Threads In A Single Pipeline - In one embodiment, the present invention includes a method for determining if an instruction of a first thread dispatched from a first queue associated with the first thread is stalled in a pipestage of a pipeline, and if so, dispatching an instruction of a second thread from a second queue associated with the second thread to the pipeline if the second thread is not stalled. Other embodiments are described and claimed. | 01-10-2013 |
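A sketch of the two-queue dispatch policy described above, assuming simple per-thread queues and stall flags:

```python
# If the instruction dispatched from the first thread's queue is stalled in
# a pipestage, dispatch from the second thread's queue instead, provided the
# second thread is not itself stalled.

from collections import deque

def dispatch(queues, stalled):
    """queues: {thread: deque of instructions}; stalled: {thread: bool}."""
    for thread in ("T0", "T1"):                      # T0 is the preferred thread
        if not stalled[thread] and queues[thread]:
            return thread, queues[thread].popleft()
    return None, None                                # nothing dispatchable

queues = {"T0": deque(["load r1"]), "T1": deque(["add r2"])}
print(dispatch(queues, {"T0": True, "T1": False}))   # ('T1', 'add r2')
print(dispatch(queues, {"T0": False, "T1": False}))  # ('T0', 'load r1')
```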
20130097409 | INSTRUCTION-ISSUANCE CONTROLLING DEVICE AND INSTRUCTION-ISSUANCE CONTROLLING METHOD - In a multithread processor capable of executing a plurality of threads, in order to select a thread and instruction for increasing a throughput of the multithread processor, an instruction-issuance controlling device included in the multithread processor includes a resource management unit configured to manage stall information indicating whether or not each of the threads in execution is in a stalled state; a thread selection unit configured to select a thread which is not in the stalled state among the threads in execution; and an instruction-issuance controlling unit configured to perform control so that simultaneously issuable instructions are issued from the selected thread. | 04-18-2013 |
20130246750 | CONFIGURABLE PIPELINE BASED ON ERROR DETECTION MODE IN A DATA PROCESSING SYSTEM - A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected. | 09-19-2013 |
20130297913 | METHOD AND DEVICE FOR ABORTING TRANSACTIONS, RELATED SYSTEM AND COMPUTER PROGRAM PRODUCT - Current tasks being executed in a set of modules of a signal processing system managed via an interface block are aborted so as to permit the execution of new tasks, by pipelining the elimination of transactions of said current tasks with the execution of transactions of the new tasks. Upon arrival of a signal to abort the current tasks, data and/or memory accesses present in said interface block are discarded. | 11-07-2013 |
20140101416 | DETERMINING EACH STALL REASON FOR EACH STALLED INSTRUCTION WITHIN A GROUP OF INSTRUCTIONS DURING A PIPELINE STALL - During a pipeline stall in an out of order processor, until a next to complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next to finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next to complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next to finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next to complete instruction group, the monitoring unit stores each determined stall reason aligned with each next to finish indicator in memory. | 04-10-2014 |
20140281416 | METHOD FOR IMPLEMENTING A REDUCED SIZE REGISTER VIEW DATA STRUCTURE IN A MICROPROCESSOR - A method for implementing a reduced size register view data structure in a microprocessor. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of multiplexers to access ports of a scheduling array to store the instruction blocks as a series of chunks. | 09-18-2014 |
20150012731 | DISTRIBUTION OF TASKS AMONG ASYMMETRIC PROCESSING ELEMENTS - Techniques to control power and processing among a plurality of asymmetric cores. In one embodiment, one or more asymmetric cores are power managed to migrate processes or threads among a plurality of cores according to the performance and power needs of the system. | 01-08-2015 |
20160092233 | DYNAMIC ISSUE MASKS FOR PROCESSOR HANG PREVENTION - Embodiments include issuing dynamic issue masks for processor hang prevention. Aspects include storing an instruction in an issue queue for execution by an execution unit, the instruction including a default issue mask. Aspects further include determining whether the instruction in the issue queue is likely to be rescinded by the execution unit. Based on determining that the instruction is not likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with the default issue mask. Based on determining that the instruction is likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with a likely to be rescinded issue mask. | 03-31-2016 |
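A minimal sketch of selecting between the default issue mask and the likely-to-be-rescinded issue mask; the mask encodings here are arbitrary placeholders:

```python
# An instruction predicted unlikely to be rescinded issues with its default
# mask, while one predicted likely to be rescinded issues with a more
# conservative mask.

DEFAULT_MASK = 0b1111          # placeholder: all issue slots/cycles allowed
RESCIND_MASK = 0b0001          # placeholder: restricted issue pattern

def choose_issue_mask(likely_to_be_rescinded):
    return RESCIND_MASK if likely_to_be_rescinded else DEFAULT_MASK

for predicted in (False, True):
    print(f"likely_rescinded={predicted} -> mask={choose_issue_mask(predicted):#06b}")
```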
20160170770 | PROVIDING EARLY INSTRUCTION EXECUTION IN AN OUT-OF-ORDER (OOO) PROCESSOR, AND RELATED APPARATUSES, METHODS, AND COMPUTER-READABLE MEDIA | 06-16-2016 |
20220137976 | Removal of Dependent Instructions from an Execution Pipeline - Techniques are disclosed relating to an apparatus, including a data storage circuit having a plurality of entries, and a load-store pipeline configured to allocate an entry in the data storage circuit in response to a determination that a first instruction includes an access to an external memory circuit. The apparatus further includes an execution pipeline configured to make a determination, while performing a second instruction and using the entry in the data storage circuit, that the second instruction uses a result of the first instruction, and cease performance of the second instruction in response to the determination. | 05-05-2022 |
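A sketch of the dependency check described above, assuming a simple map from destination registers of external-memory loads to allocated entries; a dependent instruction that finds a matching entry ceases execution rather than waiting in the pipeline:

```python
# When a load is found to access external memory, an entry is allocated in a
# data storage circuit; a later instruction that depends on that load (reads
# its destination register) ceases performance instead of stalling.

pending_external_loads = {}          # dest register -> allocated entry id

def on_load(dest_reg, goes_to_external_memory, entry_id):
    if goes_to_external_memory:
        pending_external_loads[dest_reg] = entry_id

def should_cease(src_regs):
    """Return the matching entry id if the instruction depends on a pending
    external-memory load, signalling that it should cease/replay later."""
    for r in src_regs:
        if r in pending_external_loads:
            return pending_external_loads[r]
    return None

on_load(dest_reg=3, goes_to_external_memory=True, entry_id=0)
print(should_cease({3, 4}))   # 0    -> cease performance, replay later
print(should_cease({5, 6}))   # None -> proceed normally
```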