Patent application number | Description | Published |
20090106539 | METHOD AND SYSTEM FOR ANALYZING A COMPLETION DELAY IN A PROCESSOR USING AN ADDITIVE STALL COUNTER - In a data processing system having a set of components for performing a set of operations, in which one or more of the set of operations has processing dependencies with respect to others of the set of operations, a method for using an additive stall counter to analyze a completion delay is disclosed. The method includes initiating execution of a group of instructions and a performance monitor unit resetting a value stored within the additive stall counter. The method further includes the performance monitor unit incrementing the value within the additive stall counter until all instructions within the group of instructions complete. In response to all instructions within the group of instructions completing, a cause of the completion delay is determined. In response to determining that the delay was caused by a first stall cause, the value stored within the additive stall counter is added to a first performance monitor counter designated for the first stall cause, and, in response to determining that the delay was caused by a second stall cause, the value stored within the additive stall counter is added to a second performance monitor counter designated for the second stall cause. | 04-23-2009 |
20090276190 | Method and Apparatus For Evaluating Integrated Circuit Design Performance Using Basic Block Vectors, Cycles Per Instruction (CPI) Information and Microarchitecture Dependent Information - A test system or simulator includes an integrated circuit (IC) benchmark software program that executes workload program software on a semiconductor die IC design model. The benchmark software program includes trace, simulation point, basic block vector (BBV) generation, cycles per instruction (CPI) error, clustering and other programs. The test system also includes CPI stack program software that generates CPI stack data that includes microarchitecture dependent information for each instruction interval of workload program software. The CPI stack data may also include an overall analysis of CPI data for the entire workload program. IC designers may utilize the benchmark software and CPI stack program to develop a reduced representative workload program that includes CPI data as well as microarchitecture dependent information. | 11-05-2009 |
20110302395 | HARDWARE ASSIST THREAD FOR DYNAMIC PERFORMANCE PROFILING - A method and data processing system for managing running of instructions in a program. A processor of the data processing system receives a monitoring instruction of a monitoring unit. The processor determines if at least one secondary thread of a set of secondary threads is available for use as an assist thread. The processor selects the at least one secondary thread from the set of secondary threads to become the assist thread in response to a determination that the at least one secondary thread of the set of secondary threads is available for use as an assist thread. The processor changes profiling of running of instructions in the program from the main thread to the assist thread. | 12-08-2011 |
20120278595 | DETERMINING EACH STALL REASON FOR EACH STALLED INSTRUCTION WITHIN A GROUP OF INSTRUCTIONS DURING A PIPELINE STALL - During a pipeline stall in an out of order processor, until a next to complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next to finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next to complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next to finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next to complete instruction group, the monitoring unit stores each determined stall reason aligned with each next to finish indicator in memory. | 11-01-2012 |
20130142301 | FLOATING-POINT EVENT COUNTERS WITH AUTOMATIC PRESCALING - Occurrences of a particular event in an electronic device are counted by incrementing an event counter each time a variable number of the particular events have occurred, and automatically increasing that variable number as the total count increases. The variable number (prescale value) can increase geometrically according to a programmable counter base each time the count mantissa overflows. The event counter thereby provides hardware-implemented automatic prescaling while significantly reducing the number of interface bits required to support very large count ranges, and retaining high accuracy at very large event counts. | 06-06-2013 |
20130151816 | DELAY IDENTIFICATION IN DATA PROCESSING SYSTEMS - Methods, systems, and computer program products may provide delay-identification in data processing systems. An apparatus may include a delay-identification unit having a delay counter, a threshold register, a delay register, and a delay detector. The delay detector may be configured to start the delay counter in response to detecting that one group of instructions is delayed, and stop the delay counter in response to detecting that the one group of instructions is no longer delayed. The delay detector may additionally be configured to compare the number of cycles counted by the delay counter with a threshold number of cycles in the threshold register, and store at least one effective address of one of the instructions of the one group of instructions when the number of cycles counted by the delay counter is greater than the threshold number of cycles stored in the threshold register. | 06-13-2013 |
20140075158 | IDENTIFYING LOAD-HIT-STORE CONFLICTS - A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter. | 03-13-2014 |
20140075164 | TEMPORAL LOCALITY AWARE INSTRUCTION SAMPLING - A method and system are disclosed for sampling instructions executing on a computer processor. A computer processor determines a number of times a specified event has occurred within a specified temporal window. The computer processor determines to mark an instruction to be executed for monitoring based on the number of times the specified event has occurred within the temporal window, and in response, the computer processor marks the instruction. | 03-13-2014 |
20140101416 | DETERMINING EACH STALL REASON FOR EACH STALLED INSTRUCTION WITHIN A GROUP OF INSTRUCTIONS DURING A PIPELINE STALL - During a pipeline stall in an out of order processor, until a next to complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next to finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next to complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next to finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next to complete instruction group, the monitoring unit stores each determined stall reason aligned with each next to finish indicator in memory. | 04-10-2014 |
20140108770 | IDENTIFYING LOAD-HIT-STORE CONFLICTS - A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter. | 04-17-2014 |
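
The additive stall counter described in application 20090106539 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `AdditiveStallCounter`, its method names, and the stall cause labels `"cache_miss"` and `"fpu_busy"` are hypothetical and do not appear in the abstract. The model resets the counter when a group of instructions starts, increments it each cycle until the group completes, and then adds the accumulated value to the performance monitor counter designated for the determined stall cause.

```python
from collections import defaultdict


class AdditiveStallCounter:
    """Toy model of a performance monitor unit with one additive stall counter.

    All names here are hypothetical; the abstract does not specify an interface.
    """

    def __init__(self):
        # Cycles accumulated for the instruction group currently in flight.
        self.stall_cycles = 0
        # One performance monitor counter per stall cause (assumed layout).
        self.pmc = defaultdict(int)

    def start_group(self):
        # Reset the additive stall counter when a group begins execution.
        self.stall_cycles = 0

    def tick(self):
        # Increment once per cycle while the group has not yet completed.
        self.stall_cycles += 1

    def complete_group(self, cause):
        # The cause is known only at completion, so the whole accumulated
        # value is added to the counter designated for that cause.
        self.pmc[cause] += self.stall_cycles


pmu = AdditiveStallCounter()

# Three instruction groups: (cycles until completion, determined stall cause).
for cycles, cause in [(5, "cache_miss"), (3, "fpu_busy"), (2, "cache_miss")]:
    pmu.start_group()
    for _ in range(cycles):
        pmu.tick()
    pmu.complete_group(cause)

print(dict(pmu.pmc))  # {'cache_miss': 7, 'fpu_busy': 3}
```

Deferring the attribution until completion means a single running counter is enough while the group is in flight, and the per-cause counters are updated only once the cause has been determined.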