Class / Patent application number | Description | Number of patent applications / Date published |
712217000 | Scoreboarding, reservation station, or aliasing | 68 |
20080256340 | Distributed File Fuzzing - Embodiments provide a distributed file fuzzing environment. In an embodiment, a number of computing devices can be used as part of a distributed fuzzing system. Fuzzing operations can be distributed to the number of computing devices and processed accordingly. A group or team can be defined to process particular fuzzing operations that may be best suited to the group. The time required to perform a fuzzing operation can be reduced by distributing the fuzzing work to the number of computing devices. A client can be associated with each computing device and used in conjunction with fuzzing operations. | 10-16-2008 |
20080263331 | Universal Register Rename Mechanism for Instructions with Multiple Targets in a Microprocessor - A universal register rename mechanism for instructions with multiple targets using a common destination tag. For each instruction that updates multiple destinations, a single rename entry is allocated to handle all destinations associated with it. A rename entry now consists of a DTAG and a vector to indicate the type of destination(s) that is/are being updated by such a particular instruction. For example, a common DTAG can be assigned to a fixed point unit instruction (FXU) that updates general purpose register (GPR), fixed point exception register (XER), and condition code register (CR) destinations. During flush time, the DTAGs in the recovery link may be used to restore the information indicating that the youngest instruction updates a particular architected register. By using a single, universal rename structure for all types of destinations, a large saving in silicon and power can be realized without the need to sacrifice performance. | 10-23-2008 |
20080276076 | METHOD AND APPARATUS FOR REGISTER RENAMING - A method and apparatus for register renaming are provided in the illustrative embodiments. A mapper receives a request for data in a logical register. The mapper searches an in-flight map table and a set of architected map tables for the data in the logical register. The mapper identifies an entry in one of the in-flight map table and an architected map table in the set of architected map tables that corresponds with the logical register in the request. The mapper returns a location of a physical register, which holds the requested data. | 11-06-2008 |
20080288754 | GENERATING STOP INDICATORS DURING VECTOR PROCESSING - A method for performing parallel operations in a computer system when one or more memory hazards may be present, which may be implemented by a processor, is described. During operation, the processor receives instructions for detecting conflict between memory addresses in vectors when operations are performed in parallel using at least a portion of the vectors, and generating one or more stop indicators corresponding to any detected conflict between the memory addresses, where a given stop indicator indicates a memory hazard. Next, the processor executes the instructions for detecting the conflict between the memory addresses and generating the one or more stop indicators. | 11-20-2008 |
20080301412 | High speed multiplexer - According to one embodiment, a high speed multiplexer includes a number of data inputs, a number of hot code select inputs, and a final data output. In one embodiment, the high speed multiplexer utilizes a number of intermediate multiplexers, each receiving respective hot code select inputs and providing an intermediate data output. In one embodiment, each intermediate multiplexer has a critical delay path comprising a first NAND gate and a second NAND gate. In one implementation a four-to-one intermediate multiplexer comprises a first two-input NAND gate and a second four-input NAND gate. In one embodiment, a 16-to-1 high speed multiplexer comprises four four-to-one intermediate multiplexers. According to one implementation of this embodiment, the 16-to-1 multiplexer has a critical delay path from any of the data inputs to the final data output comprising a first NAND gate, a second NAND gate, a NOR gate, and a third NAND gate. | 12-04-2008 |
20080313435 | Data processing apparatus and method for executing complex instructions - A data processing apparatus and method are provided for executing complex instructions. The data processing apparatus executes instructions defining operations to be performed by the data processing apparatus, those instructions including at least one complex instruction defining a sequence of operations to be performed. The data processing apparatus comprises a plurality of execution pipelines, each execution pipeline having a plurality of pipeline stages and arranged to perform at least one associated operation. Issue circuitry interfaces with the plurality of execution pipelines and is used to schedule performance of the operations defined by the instructions. For the at least one complex instruction, the issue circuitry is arranged to schedule a first operation in the sequence, and to issue control signals to one of the execution pipelines with which that first operation is associated, those control signals including an indication of each additional operation in the sequence. Then, when performance of the first operation reaches a predetermined pipeline stage in that execution pipeline, that predetermined pipeline stage is arranged to schedule a next operation in the sequence, and to issue additional control signals to a further one of the execution pipelines with which that next operation is associated in order to cause that next operation to be performed. This has been found to provide a particularly efficient mechanism for handling the execution of complex instructions without the need to provide dedicated execution pipelines for those complex instructions, and without an increase in complexity of the issue circuitry. | 12-18-2008 |
20090037698 | ADAPTIVE ALLOCATION OF RESERVATION STATION ENTRIES TO AN INSTRUCTION SET WITH VARIABLE OPERANDS IN A MICROPROCESSOR - A method and device for adaptively allocating reservation station entries to an instruction set with variable operands in a microprocessor. The device includes logic for determining free reservation station queue positions in a reservation station. The device allocates an issue queue to an instruction and writes the instruction into the issue queue as an issue queue entry. The device reads an operand corresponding to the instruction from a general purpose register and writes the operand into a reservation station using one of the free reservation station queue positions as a write address. The device writes each reservation station queue position corresponding to said instruction into said issue queue entry. When the instruction is ready for issue to an execution unit, the device reads out the instruction from the issue queue entry, along with the reservation station queue positions, to the execution unit. | 02-05-2009 |
20090150653 | Mechanism for soft error detection and recovery in issue queues - In one embodiment, the present invention includes logic to detect a soft error occurring in certain stages of a core and recover from such error if detected. One embodiment may include logic to determine if a lapsed time from a last instruction to issue from an issue stage of a pipeline exceeds a threshold and if so to reset a dispatch table, as well as to determine if a parity error is detected in an entry of the dispatch table associated with an enqueued instruction and if so to prevent the enqueued instruction from issuance. Other embodiments are described and claimed. | 06-11-2009 |
20090327661 | MECHANISMS TO HANDLE FREE PHYSICAL REGISTER IDENTIFIERS FOR SMT OUT-OF-ORDER PROCESSORS - Methods and apparatus relating to mechanisms to handle free physical register identifiers for SMT (Simultaneous Multi-Threading) out-of-order processors are described. In some embodiments, a physical register file stores both speculative data and architectural data corresponding to a plurality of registers. A free list logic may maintain free physical register identifiers corresponding to the plurality of registers. An instruction may read the architectural data from the physical register file at dispatch. Other embodiments are also described and claimed. | 12-31-2009 |
20090327662 | Managing active thread dependencies in graphics processing - A scoreboard for a video processor may keep track of only dispatched threads which have not yet completed execution. A first thread may itself snoop for execution of a second thread that must be executed before the first thread's execution. Thread execution may be freely reordered, subject only to the rule that a second thread, whose execution is dependent on execution of a first thread, can only be executed after the first thread. | 12-31-2009 |
20100058035 | System and Method for Double-Issue Instructions Using a Dependency Matrix - A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and second portions. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue. | 03-04-2010 |
20100153690 | USING REGISTER RENAME MAPS TO FACILITATE PRECISE EXCEPTION SEMANTICS - One embodiment of the present invention provides a system that facilitates precise exception semantics. The system includes a processor that uses register rename maps to support out-of-order execution, where the register rename maps track mappings between native architectural registers and physical registers for a program executing on the processor. These register rename maps include: 1) a working rename map that maps architectural registers associated with a decoded instruction to corresponding physical registers; 2) a retire rename map that tracks and preserves a set of physical registers that are associated with retired instructions; and 3) a checkpoint rename map that stores a mapping between a set of architectural registers and a set of physical registers for a preceding checkpoint in the program. When the program signals an exception, the processor uses the checkpoint rename map to roll back program execution to the preceding checkpoint. | 06-17-2010 |
20100205409 | NOVEL REGISTER RENAMING SYSTEM USING MULTI-BANK PHYSICAL REGISTER MAPPING TABLE AND METHOD THEREOF - Embodiments of a processor architecture utilizing a multi-bank implementation of a physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and renaming logic. The physical register mapping table has a plurality of entries, each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections, each of which has respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of the physical register mapping table are also provided. | 08-12-2010 |
20100306509 | OUT-OF-ORDER EXECUTION MICROPROCESSOR WITH REDUCED STORE COLLISION LOAD REPLAY REDUCTION - An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. A register alias table (RAT) encounters instructions in program order and generates dependencies used to determine when the instructions may execute out of program order. The RAT encounters the load instruction on a second instance, determines that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and causes the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry. | 12-02-2010 |
20110010528 | INFORMATION PROCESSING DEVICE AND VECTOR INFORMATION PROCESSING DEVICE - An information processing device implements a register renaming scheme for managing physical registers (e.g. hardware registers HR) coordinated with logical registers (e.g. software usable registers SUR) in conjunction with a renaming table. A first dedicated instruction is incorporated into an instruction set so that a free physical register is coordinated with a logical register designated by the first dedicated instruction. Alternatively, a second dedicated instruction is incorporated into the instruction set so that a physical register coordinated with a logical register designated by the second dedicated instruction is released to be free. In addition, the optimization is performed to change the number of software usable registers (SUR) and the number of renaming registers (RR) within the physical registers in conformity with the software executing the instruction set. Thus, it is possible to prevent the occurrence of an unwanted memory access instruction and dead time needed for releasing registers. | 01-13-2011 |
20110055524 | PROVIDING THREAD FAIRNESS IN A HYPER-THREADED MICROPROCESSOR - A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline. | 03-03-2011 |
20110099355 | MULTI-THREADING PROCESSORS, INTEGRATED CIRCUIT DEVICES, SYSTEMS, AND PROCESSES OF OPERATION AND MANUFACTURE - A multi-threaded microprocessor ( | 04-28-2011 |
20120260072 | REGISTER ALLOCATION IN ROTATION BASED ALIAS PROTECTION REGISTER - A system may comprise an optimizer/scheduler to schedule a set of instructions, compute a data dependence, a checking constraint and/or an anti-checking constraint for the set of scheduled instructions, and allocate alias registers for the set of scheduled instructions based on the data dependence, the checking constraint and/or the anti-checking constraint. In one embodiment, the optimizer is to release unused registers to reduce the alias registers used to protect the scheduled instructions. The optimizer is further to insert a dummy instruction after a fused instruction to break cycles in the checking and anti-checking constraints. | 10-11-2012 |
20130145130 | DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING REGISTER RENAMING WITHOUT ADDITIONAL REGISTERS - The data processing apparatus (and method) has processing circuitry for performing data processing operations in response to data processing instructions, the data processing instructions referencing logical registers. A set of physical registers are provided for storing data values for access by the processing circuitry when performing the data processing operations. Register renaming storage stores a one-to-one mapping between the logical registers and the physical registers, with the register renaming storage being accessed by the processing circuitry when performing the data processing operations in order to map the referenced logical registers to corresponding physical registers. Update circuitry is arranged to identify the physical registers corresponding to those multiple logical registers in the register renaming storage. Altered one-to-one mapping between multiple logical registers and identified physical registers is employed when performing the current data processing operation. | 06-06-2013 |
20130145131 | Flexible Microprocessor Register File - Architectures and methods for viewing data in multiple formats within a register file. Various disclosed embodiments allow a plurality of consecutive registers within one register file to appear to be temporarily transposed by one instruction, such that each transposed register contains one byte or word from multiple consecutive registers. A program can arbitrarily reorganize the bytes within a register by swapping the value stored in any byte within the register with the value stored in any other byte within the same register. Indirect register access is also provided, without additional scoreboarding hardware, as an apparent move from one register to another. The functionality of a hardware data FIFO at the I/O is also provided, without the power consumption of register-to-register transfers. However, the size of the FIFO can be changed under program control. | 06-06-2013 |
20130151819 | RECOVERING FROM EXCEPTIONS AND TIMING ERRORS - A data processing apparatus with a processing pipeline, the pipeline including exception control circuitry and error detection circuitry. An exception storage unit is configured to maintain an age-ordered list of entries corresponding to instructions issued to the processing pipeline for execution. The unit is configured to store, in association with each entry, an exception indicator indicating whether the instruction is an exception instruction and whether it has generated an exception and an error indicator indicating whether the instruction has generated an error. The apparatus is configured to indicate to the exception storage unit that an instruction is resolved when processing of the instruction has reached a stage such that it is known whether the instruction will generate an error and whether the instruction will generate an exception; and the exception control circuitry is configured to sequentially retire oldest resolved entries from the list in the exception storage unit. | 06-13-2013 |
20130283014 | EXPEDITING EXECUTION TIME MEMORY ALIASING CHECKING - Embodiments of apparatus, computer-implemented methods, systems, and computer-readable media are described herein for expediting execution time memory alias checking. A sequence of instructions targeted for execution on an execution processor may be received or retrieved. The execution processor may include a plurality of alias registers and circuitry configured to check entries in the alias registers for memory aliasing. One or more optimizations may be performed on the received or retrieved sequence of instructions to optimize execution performance of the received or retrieved sequence of instructions. This may include a reorder of a plurality of memory instructions in the received or retrieved sequence of instructions. After the optimization, one or more move instructions may be inserted in the optimized sequence of instructions to move one or more entries among the alias registers during execution, to expedite alias checking at execution time. Other embodiments may be described and/or claimed. | 10-24-2013 |
20130339671 | ZERO CYCLE LOAD - A system and method for reducing the latency of load operations. A register rename unit within a processor determines whether a decoded load instruction is eligible for conversion to a zero-cycle load operation. If so, control logic assigns a physical register identifier associated with a source operand of an older dependent store instruction to the destination operand of the load instruction. Additionally, the register rename unit marks the load instruction to prevent it from reading data associated with the source operand of the store instruction from memory. Due to the duplicate renaming, this data may be forwarded from a physical register file to instructions that are younger and dependent on the load instruction. | 12-19-2013 |
20140013085 | LOW POWER AND HIGH PERFORMANCE PHYSICAL REGISTER FREE LIST IMPLEMENTATION FOR MICROPROCESSORS - A system and method for reducing latency and power of register renaming. A free list in a processor includes multiple banks for indicating availability of register identifiers used for register renaming. A register rename unit receives one or more destination architectural registers to rename with physical register identifiers. Responsive to determining the multiple banks within the free list are unbalanced with available physical register identifiers, one or more returning physical register identifiers are assigned to the destination architectural registers before assigning any physical register identifiers from any bank of the multiple banks with a lowest number of available physical register identifiers. A returning physical register identifier is a physical register identifier that is available again for assignment to a destination architectural register but not yet indicated in the free list as available. Each of the banks includes a single bit width decoded vector for indicating availability of given physical register identifiers. | 01-09-2014 |
20140047218 | MULTI-STAGE REGISTER RENAMING USING DEPENDENCY REMOVAL - Multi-stage register renaming using dependency removal is described. In an embodiment, the registers are renamed in two stages. The first stage involves removing all the dependencies within a set of instructions which are being renamed together. The final stage then renames all registers in parallel using a renaming map. In various embodiments, the dependencies are removed in the first stage using a fixed mapping to rename destination registers in each instruction and in some embodiments the fixed mapping is based on the position of a destination register within the set of instructions. Dependent registers, which are those registers which are read in an instruction but have been written in a previous instruction in the set, are also renamed in the first stage. In addition to performing the renaming in the final stage, the renaming map is updated. | 02-13-2014 |
20140047219 | Managing A Register Cache Based on an Architected Computer Instruction Set having Operand Last-User Information - A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register. | 02-13-2014 |
20140101415 | REDUCING DATA HAZARDS IN PIPELINED PROCESSORS TO PROVIDE HIGH PROCESSOR UTILIZATION - A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data. | 04-10-2014 |
20140122837 | NOVEL REGISTER RENAMING SYSTEM USING MULTI-BANK PHYSICAL REGISTER MAPPING TABLE AND METHOD THEREOF - Embodiments of a processor architecture utilizing a multi-bank implementation of a physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and renaming logic. The physical register mapping table has a plurality of entries, each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections, each of which has respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of the physical register mapping table are also provided. | 05-01-2014 |
20140164742 | APPARATUS AND METHOD FOR MAPPING ARCHITECTURAL REGISTERS TO PHYSICAL REGISTERS - An apparatus and method are provided for performing register renaming. Available register identifying circuitry is provided to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration data whose value is modified during operation of the processing circuitry is stored such that, when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry. The register identifying circuitry is arranged to reference the modified data value, such that when the configuration data has the first value, the number of physical registers in the pool is increased due to the reduction in the number of architectural registers which require mapping to physical registers. | 06-12-2014 |
20140244978 | CHECKPOINTING REGISTERS FOR TRANSACTIONAL MEMORY - The present invention provides a method and apparatus for checkpointing registers for transactional memory. Some embodiments of the apparatus include first rename logic configured to map up to a predetermined number of architectural registers to corresponding first physical registers that hold first values associated with the architectural registers. The mapping is responsive to a transaction modifying one or more of the first values associated with the architectural registers. Some embodiments of the apparatus also include microcode configured to write contents of the first physical registers to a memory in response to the transaction modifying first values associated with a number of the architectural registers that is larger than the predetermined number. | 08-28-2014 |
20140258687 | MICRO-OPS INCLUDING PACKED SOURCE AND DESTINATION FIELDS - A method and apparatus for register packing prior to register renaming in a microprocessor are provided. The method includes: receiving a plurality of micro operations (micro-ops) decoded from one or more instructions; packing a plurality of registers which are included in the micro-ops into a packed register structure including a plurality of packed registers based on a preset number of rename ports of a renamer through which the packed registers are read or written for register renaming; and sending the packed registers for register renaming. | 09-11-2014 |
20140281413 | Superforwarding Processor - Methods and systems that allow the processor to effectively and efficiently reduce or eliminate the latency associated with instructions that copy the value of one register to another register. A processor includes a superforwarding table, a superforwarding logic block, and a computation engine. The superforwarding table stores an entry, wherein the entry has a valid bit, a key field, and a forward field. The superforwarding logic block determines which register contains the information needed for an instruction. The computation engine executes instructions. | 09-18-2014 |
20140281414 | REORDER-BUFFER-BASED DYNAMIC CHECKPOINTING FOR RENAME TABLE REBUILDING - Out-of-order CPUs, devices and methods diminish the time penalty from stalling the pipe to rebuild a rename table, such as due to a misprediction. A microprocessor can include a pipe that has a decoder, a dispatcher, and at least one execution unit. A rename table stores rename data, and a check-point table (“CPT”) stores rename data received from the dispatcher. A Re-Order Buffer (“ROB”) stores ROB data, and has a dynamic mapping relationship with the CPT. If the rename table is flushed, such as due to a misprediction, the rename table is rebuilt at least in part by concurrent copying of rename data stored in the CPT, in coordination with walking the ROB. | 09-18-2014 |
20140281415 | DYNAMIC RENAME BASED REGISTER RECONFIGURATION OF A VECTOR REGISTER FILE - Reconfiguring a register file using a rename table having a plurality of fields that indicate fracture information about a source register of an instruction for instructions which have narrow to wide dependencies. | 09-18-2014 |
20140289501 | TECHNIQUE FOR FREEING RENAMED REGISTERS - Register renaming circuitry for a processing apparatus configured to process a stream of instructions from an instruction set specifying registers from an architectural set of registers. The apparatus including a physical set of registers configured to store data values being processed by the processing apparatus. Register renaming circuitry is configured to receive a stream of operations from an instruction decoder and to map registers that are to be written to by the stream of operations to physical registers within the physical set of registers that are currently available. The register renaming circuitry comprises register release circuitry configured to release the physical registers that have been mapped to the registers when a first set of conditions have been met, and to release the physical registers that have been mapped to the additional registers when a second set of conditions have been met. | 09-25-2014 |
20140304492 | METHOD AND APPARATUS TO INCREASE THE SPEED OF THE LOAD ACCESS AND DATA RETURN SPEED PATH USING EARLY LOWER ADDRESS BITS - A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ) is disclosed. The method comprises initiating a computation of a virtual address corresponding to the load instruction in a first clock cycle. It also comprises transmitting early calculated lower address bits of the virtual address to a load store queue (LSQ) in the same cycle as the initiating. Finally, it comprises performing a partial match in the LSQ responsive to and using the lower address bits to find a prior aliasing store, wherein the prior aliasing store stores to a same address as the load instruction. | 10-09-2014 |
20140325188 | SIMULTANEOUS FINISH OF STORES AND DEPENDENT LOADS - A method for reducing a pipeline stall in a multi-pipelined processor includes finding a store instruction having a same target address as a load instruction and having a store value of the store instruction not yet written according to the store instruction, when the store instruction is being concurrently processed in a different pipeline than the load instruction and the store instruction occurs before the load instruction in a program order. The method also includes associating a target rename register of the load instruction as well as the load instruction with the store instruction, responsive to the finding step. The method further includes writing the store value of the store instruction to the target rename register of the load instruction and finishing the load instruction without reissuing the load instruction, responsive to writing the store value of the store instruction according to the store instruction to finish the store instruction. | 10-30-2014 |
20140344554 | MICROPROCESSOR ACCELERATED CODE OPTIMIZER AND DEPENDENCY REORDERING METHOD - A dependency reordering method. The method includes accessing an input sequence of instructions, initializing three registers, and loading instruction numbers into a first register. The method further includes loading destination register numbers into a second register, broadcasting values from the first register to a position in a third register in accordance with a position number in the second register, overwriting positions in the third register in accordance with position numbers in the second register, and using information in the third register to populate a dependency matrix for grouping dependent instructions from the sequence of instructions. | 11-20-2014 |
20140380024 | DEPENDENT INSTRUCTION SUPPRESSION - A method includes suppressing execution of at least one dependent instruction of a load instruction by a processor using stored dependency information responsive to an invalid status of the load instruction. A processor includes an execution unit to execute instructions and a scheduler. The scheduler is to select for execution in the execution unit a load instruction having at least one dependent instruction and suppress execution of the at least one dependent instruction using stored dependency information responsive to an invalid status of the load instruction. | 12-25-2014 |
20150019843 | METHOD AND APPARATUS FOR SELECTIVE RENAMING IN A MICROPROCESSOR - A method and apparatus for allowing an out-of-order processor to reuse an in-use physical register is disclosed herein. The method and apparatus uses identifiers, such as tokens and/or other identifiers in a rename map table (RMT) and a physical register file (PRF), to indicate whether an instruction result is allowed or disallowed to be written into a physical register. | 01-15-2015 |
20150089199 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150089200 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150089201 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150121040 | PROCESSOR AND METHODS FOR FLOATING POINT REGISTER ALIASING - Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers. | 04-30-2015 |
20150121041 | PROCESSOR AND METHODS FOR IMMEDIATE HANDLING AND FLAG HANDLING - Described herein are methods and processors for flag renaming in groups to eliminate dependencies of instructions. Decoder and execution units in the processor may be configured to rename flags into groups that allow each group to be treated separately as appropriate. This flag renaming eliminates flag dependencies with respect to instructions. This allows an instruction to write exactly the flags that the instruction wants without having to create merge dependencies. Methods and processors are provided for handling immediate values embedded in instructions. A 16 bit immediate bus and a 4 bit encoding/control bus are added at the interface between decode and execution units. For an 8 or 12 bit immediate, the upper 4 bits of the immediate bus contain the encoding bits. For a 16 bit immediate, the encoding/control bus contains the encoding bits. The encoding/control bus indicates when to look at the top four bits of the immediate bus. | 04-30-2015 |
20150127926 | INSTRUCTION SCHEDULING APPROACH TO IMPROVE PROCESSOR PERFORMANCE - A processor instruction scheduler comprising an optimization engine which uses an optimization model for a processor architecture with: means to generate an optimization model for the optimization engine from a design of a processor and data representing optimization goals and constraints and a code stream, wherein the processor has at least two execution pipes and at least two registers, and wherein the code stream comprises processor instructions with corresponding register selections; and reordering means to generate an optimized code stream from the code stream with the optimal solution provided by the optimization engine for the optimization model by reordering the code stream, such that optimum values for the optimization goals under the given constraints are achieved without affecting the operation results of the code stream. | 05-07-2015 |
20150309796 | RENAMING WITH GENERATION NUMBERS - A processor including a register file having a plurality of registers, and configured for out-of-order instruction execution, further includes a renamer unit that produces generation numbers that are associated with register file addresses to provide a renamed version of a register that is temporally offset from an existing version of that register rather than assigning a non-programmer-visible physical register as the renamed register. | 10-29-2015 |
20150309797 | Computer Processor With Generation Renaming - A processor including a register file having a plurality of registers, and configured for out-of-order instruction execution, further includes a renamer unit that produces generation numbers that are associated with register file addresses to provide a renamed version of a register that is temporally offset from an existing version of that register rather than assigning a non-programmer-visible physical register as the renamed register. The processor includes a small reset DHL Gshare branch prediction unit coupled to an instruction cache and configured to provide speculative addresses to the instruction cache. | 10-29-2015 |
20150324202 | DETECTING DATA DEPENDENCIES OF INSTRUCTIONS ASSOCIATED WITH THREADS IN A SIMULTANEOUS MULTITHREADING SCHEME - Detecting data dependencies of instructions associated with threads in a simultaneous multithreading (SMT) scheme is disclosed, including: dividing a plurality of comparators of an SMT-enabled device into groups of comparators corresponding to respective ones of threads associated with the SMT-enabled device; simultaneously distributing a first set of instructions associated with a first thread of the plurality of threads to a corresponding first group of comparators from the plurality of groups of comparators and distributing a second set of instructions associated with a second thread of the plurality of threads to a corresponding second group of comparators from the plurality of groups of comparators; and simultaneously performing data dependency detection on the first set of instructions associated with the first thread using the corresponding first group of comparators and performing data dependency detection on the second set of instructions associated with the second thread using the corresponding second group of comparators. | 11-12-2015 |
20150339123 | Restoring a Register Renaming Map - A technique for restoring a register renaming map is described. In one example, a restore table having a number of storage locations saves a copy of the register renaming map whenever a flow-risk instruction is passed to a re-order buffer. When all storage locations are full, further instructions still pass to the re-order buffer, but a copy of the map is not saved. A storage location subsequently becomes available when its associated flow-risk instruction is executed. A register renaming map state for an unrecorded flow-risk instruction passed to the re-order buffer whilst the storage locations were full is generated and stored in the available location. This is generated using the restore table entry for a previous flow-risk instruction and re-order buffer values for intervening instructions between the previous and unrecorded flow-risk instructions. The restore table can be used to restore the map if an unexpected change in instruction flow occurs. | 11-26-2015 |
20160055000 | REGISTER RENAMER THAT HANDLES MULTIPLE REGISTER SIZES ALIASED TO THE SAME STORAGE LOCATIONS - A processor may include a physical register file and a register renamer. The register renamer may be organized into even and odd banks of entries, where each entry stores an identifier of a physical register. The register renamer may be indexed by a register number of an architected register, such that the renamer maps a particular architected register to a corresponding physical register. Individual entries of the renamer may correspond to architected register aliases of a given size. Renaming aliases that are larger than the given size may involve accessing multiple entries of the renamer, while renaming aliases that are smaller than the given size may involve accessing a single renamer entry. | 02-25-2016 |
20160098276 | OPERAND CONFLICT RESOLUTION FOR REDUCED PORT GENERAL PURPOSE REGISTER - Techniques are described for determining whether execution of an instruction would require reading more values from a memory cell of a general purpose register (GPR) than a read port of the memory cell would allow. In such a case, the techniques may store, prior to execution of the instruction, one or more values from the memory cell in a separate conflict queue. During execution of the instruction to implement an operation defined by the instruction, one value that is an operand of the operation would be read from the memory cell and another value that is an operand of the operation would be read from the conflict queue. | 04-07-2016 |
20160170751 | MECHANISM TO PRECLUDE LOAD REPLAYS DEPENDENT ON FUSE ARRAY ACCESS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170752 | MECHANISM TO PRECLUDE I/O-DEPENDENT LOAD REPLAYS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170753 | MECHANISM TO PRECLUDE UNCACHEABLE-DEPENDENT LOAD REPLAYS IN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170754 | LOAD REPLAY PRECLUDING MECHANISM | 06-16-2016 |
20160170755 | MECHANISM TO PRECLUDE LOAD REPLAYS DEPENDENT ON PAGE WALKS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170756 | MECHANISM TO PRECLUDE LOAD REPLAYS DEPENDENT ON LONG LOAD CYCLES IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170757 | PROGRAMMABLE LOAD REPLAY PRECLUDING MECHANISM | 06-16-2016 |
20160170758 | POWER SAVING MECHANISM TO REDUCE LOAD REPLAYS IN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170759 | MECHANISM TO PRECLUDE SHARED RAM-DEPENDENT LOAD REPLAYS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170760 | APPARATUS AND METHOD TO PRECLUDE NON-CORE CACHE-DEPENDENT LOAD REPLAYS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170761 | MECHANISM TO PRECLUDE LOAD REPLAYS DEPENDENT ON OFF-DIE CONTROL ELEMENT ACCESS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170762 | APPARATUS AND METHOD TO PRECLUDE X86 SPECIAL BUS CYCLE LOAD REPLAYS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170763 | APPARATUS AND METHOD TO PRECLUDE LOAD REPLAYS DEPENDENT ON WRITE COMBINING MEMORY SPACE ACCESS IN AN OUT-OF-ORDER PROCESSOR | 06-16-2016 |
20160170764 | APPARATUS AND METHOD FOR PROGRAMMABLE LOAD REPLAY PRECLUSION | 06-16-2016 |
20160253175 | Register File Having a Plurality of Sub-Register Files | 09-01-2016 |
20160378497 | Systems, Methods, and Apparatuses for Thread Selection and Reservation Station Binding - Embodiments of systems, methods, and apparatuses for thread selection and reservation station binding are disclosed. In an embodiment, an apparatus includes allocation hardware including reservation station binding logic to bind an operation to one of a plurality of reservation stations. In an embodiment, an apparatus includes thread selection logic to select a thread to be processed by a pipeline stage, wherein the thread selection logic is to evaluate a plurality of conditions to select a thread, wherein the conditions include whether a thread is active, whether a thread has operations in an instruction queue, whether a thread has available resources, and whether a thread has no known stall. | 12-29-2016 |
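
The four selection conditions named in the last entry above (20160378497) can be read as a simple eligibility filter over per-thread state. The following is a minimal Python sketch of that reading, not the mechanism claimed in the application: the `ThreadState` fields, the `is_eligible` and `select_thread` helpers, and the round-robin tie-break among eligible threads are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ThreadState:
    """Hypothetical per-thread status bits; field names are assumptions."""
    active: bool          # thread is active
    queued_ops: int       # operations waiting in the instruction queue
    has_resources: bool   # downstream resources are available
    known_stall: bool     # a stall is already known for this thread


def is_eligible(t: ThreadState) -> bool:
    """True only when all four conditions listed in the abstract hold."""
    return t.active and t.queued_ops > 0 and t.has_resources and not t.known_stall


def select_thread(threads: Sequence[ThreadState], last_selected: int = -1) -> Optional[int]:
    """Return the index of the next eligible thread, or None if there is none.

    The round-robin scan starting after `last_selected` is an assumed
    fairness policy; the abstract does not say how ties are broken.
    """
    n = len(threads)
    for offset in range(1, n + 1):
        idx = (last_selected + offset) % n
        if is_eligible(threads[idx]):
            return idx
    return None


threads = [
    ThreadState(active=True, queued_ops=0, has_resources=True, known_stall=False),
    ThreadState(active=True, queued_ops=3, has_resources=True, known_stall=False),
    ThreadState(active=True, queued_ops=1, has_resources=True, known_stall=True),
    ThreadState(active=True, queued_ops=2, has_resources=True, known_stall=False),
]
first = select_thread(threads)          # thread 0 has no queued operations
second = select_thread(threads, first)  # thread 2 has a known stall
```

Passing the previous result back in as `last_selected` keeps a single always-ready thread from being picked on every cycle; a real design might instead weight threads by priority or queue age.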