Class / Patent application number | Description | Number of patent applications / Date published |
712213000 | Predecoding of instruction component | 28 |
20080256338 | Techniques for Storing Instructions and Related Information in a Memory Hierarchy - A memory subsystem includes a first memory, a second memory, a first compressor, and a first decompressor. The first memory is configured to store instruction bytes of a fetch window and to store first predecode information and first branch information that characterizes the instruction bytes of the fetch window. The second memory is configured to store the instruction bytes of the fetch window upon eviction of the instruction bytes from the first memory and to store combined predecode/branch information that also characterizes the instruction bytes of the fetch window. The first compressor is configured to compress the first predecode information and the first branch information into the combined predecode/branch information. The first decompressor is configured to decode at least some of the instruction bytes stored in the second memory to convert the combined predecode/branch information into second predecode information, which corresponds to an uncompressed version of the first predecode information, for storage in the third memory. | 10-16-2008 |
20080263329 | Parallel-Prefix Broadcast for a Parallel-Prefix Operation on a Parallel Computer - A parallel-prefix broadcast for a parallel-prefix operation on a parallel computer includes: configuring, on each node, a parallel-prefix contribution buffer for storing the node's parallel-prefix contribution; configuring, on each node, a parallel-prefix results buffer for storing results of a operation, the results buffer having a position for each node that corresponds to node's rank; and repeatedly for each position in the results buffer: processing in parallel by each node, including: determining, by the node, whether the current position in the results buffer is to include the node's contribution, if the current position is not to include the contribution, contributing the identity element, and if the current position is to include the contribution, contributing the contribution, performing, by each node, the operation using the contributed identity elements and the contributed contributions, yielding a result from the operation, and storing, by each node, the result in the position in the results buffer. | 10-23-2008 |
20080282066 | Compact instruction set encoding - The invention provides a decode unit for decoding instructions in a processor. The decode unit comprises opcode decoding logic, operand decoding logic, and a sixteen-bit input. The opcode decoding logic is operable to determine an opcode using five bits of the input and the operand decoding logic is operable to determine three four-bit operand elements from the remaining eleven bits of the input, the three operand elements each having one of twelve possible binary values. The operand decoding logic is operable to decode an encoded group of the eleven bits to determine a first part of each of the three operand elements, and to read verbatim a verbatim group of the eleven bits to determine a second part of each of the three operand elements. | 11-13-2008 |
20080288753 | Methods and Apparatus for Emulating the Branch Prediction Behavior of an Explicit Subroutine Call - An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction. | 11-20-2008 |
20090070557 | PARALLEL PROGRAM EXECUTION OF COMMAND BLOCKS USING FIXED BACKJUMP ADDRESSES - The invention relates to a method for executing instructions in a processor, according to which an instruction to be executed of a program memory is addressed by a program control unit by means of a program counter reading of a program counter that operates in said unit. The addressed instruction is then read out, decoded and executed by the program control unit. The program control unit additionally stores the current program counter reading and the number of successive instructions when a jump instruction occurs in the form of a block instruction, according to which a specific number of instructions are to be executed successively, thus defining the return address after execution. After the last instruction of the instruction block to be executed, the program counter resumes the counting operation from the stored program counter reading. | 03-12-2009 |
20090164757 | Method and Apparatus for Performing Out of Order Instruction Folding and Retirement - The illustrative embodiments described herein provide a computer implemented method, apparatus, and computer program product for increasing a number of instructions per clock cycle associated with a processor. The illustrative embodiments fold a plurality of non-sequential instructions within the set of sequential order instructions to form a folded instruction. The folded instruction is executed to form an executed instruction. The executed instruction is placed in a reorder buffer. The instructions within the reorder buffer are written to a register based on the sequential order of execution within the set of sequential order instructions. | 06-25-2009 |
20090187741 | Data processing apparatus and method for handling instructions to be executed by processing circuitry - A data processing apparatus and method are provided for handling instructions to be executed by processing circuitry. The processing circuitry has a plurality of processor states, each processor state having a different instruction set associated therewith. Pre-decoding circuitry receives the instructions fetched from the memory and performs a pre-decoding operation to generate corresponding pre-decoded instructions, with those pre-decoded instructions then being stored in a cache for access by the processing circuitry. The pre-decoding circuitry performs the pre-decoding operation assuming a speculative processor state, and the cache is arranged to store an indication of the speculative processor state in association with the pre-decoded instructions. The processing circuitry is then arranged only to execute an instruction in the sequence using the corresponding pre-decoded instruction from the cache if a current processor state of the processing circuitry matches the indication of the speculative processor state stored in the cache for that instruction. This provides a simple and effective mechanism for detecting instructions that have been corrupted by the pre-decoding operation due to an incorrect assumption of processor state. | 07-23-2009 |
20090187742 | Instruction pre-decoding of multiple instruction sets - A data processing apparatus is provided with pre-decoding circuitry | 07-23-2009 |
20090187743 | Data processing apparatus and method for instruction pre-decoding - The present invention provides a data processing apparatus comprising processing circuitry for executing a sequence of instructions and pre-decoding circuitry for receiving the instructions fetched from memory. The pre-decoding circuitry performs a pre-decoding operation to generate corresponding pre-decoded instructions and stores them in a cache for access by the processing circuitry. For each instruction fetched from the memory, the pre-decoding circuitry detects whether the instruction is an abnormal instruction and upon such detection provides in association with a corresponding pre-decoded instruction an identifier identifying that instruction as abnormal. | 07-23-2009 |
20090187744 | Data processing apparatus and method for pre-decoding instructions - A data processing apparatus and method are provided for pre-decoding instructions. The data processing apparatus has pre-decoding circuitry for receiving instructions fetched from memory and for performing a pre-decoding operation to generate corresponding pre-decoded instructions, which are then stored in a cache for access by processing circuitry. For a first set of instructions, each instruction comprises a plurality of instruction portions, and the pre-decoding circuitry generates a corresponding pre-decoded instruction comprising a plurality of pre-decoded instruction portions. If when applying the pre-decoding operation to an instruction in the first set, the pre-decoding circuitry does not have access to all of the plurality of instruction portions of that instruction, the pre-decoding circuitry is arranged to provide in association with at least one pre-decoded instruction portion that it does generate, an indication that the pre-decoded instruction portion relates to an incomplete pre-decoding operation. This provides a simple and effective mechanism for detecting situations where a pre-decoded instruction as later read from the cache may have become corrupted by the pre-decoding operation, and accordingly should not be relied upon. | 07-23-2009 |
20090300331 | IMPLEMENTING INSTRUCTION SET ARCHITECTURES WITH NON-CONTIGUOUS REGISTER FILE SPECIFIERS - There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing a fixed-width instruction of a fixed-width instruction set using a non-contiguous register specifier of a non-contiguous register specification. The fixed-width instruction includes the non-contiguous register specifier. | 12-03-2009 |
20100049949 | PARALLEL PROGRAM EXECUTION OF COMMAND BLOCKS USING FIXED BACKJUMP ADDRESSES - The invention relates to a method for executing instructions in a processor, according to which an instruction to be executed of a program memory is addressed by a program control unit by means of a program counter reading of a program counter that operates in said unit. The addressed instruction is then read out, decoded and executed by the program control unit. The program control unit additionally stores the current program counter reading and the number of successive instructions when a jump instruction occurs in the form of a block instruction, according to which a specific number of instructions are to be executed successively, thus defining the return address after execution. After the last instruction of the instruction block to be executed, the program counter resumes the counting operation from the stored program counter reading. | 02-25-2010 |
20100106945 | Instruction processing apparatus - The present invention includes a decode section for simultaneously holding a plurality of instructions in one thread at a time and for decoding the held instructions; an execution pipeline capable of simultaneously executing each processing represented by the respective instructions belonging to different threads and decoded by the decode section; a reservation station for receiving the instructions decoded by the decode section and holding the instructions, if the decoded instructions are of sync attribute, until executable conditions are ready and thereafter dispatching the decoded instructions to the execution pipeline; a pre-decode section for confirming by a simple decoding, prior to decoding by the decode section, whether or not the instructions are of sync attribute; and an instruction buffer for suspending issuance to the decode section and holding the instructions subsequent to an instruction of sync attribute. | 04-29-2010 |
20100169615 | Preloading Instructions from an Instruction Set Other than a Currently Executing Instruction Set - A preload instruction in a first instruction set is executed at a processor. The preload instruction causes the processor to preload one or more instructions into an instruction cache. The pre-loaded instructions are pre-decoded according to a second instruction set that is different from the first instruction set. The preloaded instructions are pre-decoded according to the second instruction set in response to an instruction set preload indicator (ISPI). | 07-01-2010 |
20100332803 | PROCESSOR AND CONTROL METHOD FOR PROCESSOR - A processor includes a storage unit storing an instruction, an instruction extension information register that includes a first area and a second area, an instruction decoding unit that decodes a first prefix instruction including first extension information extending an immediately following instruction written to the first area when the first prefix instruction is executed, and that decodes a second prefix instruction including the first extension information and a second extension information extending an instruction after two instructions written to the second area respectively, an instruction packing unit that generates a packed instruction including at least one of the first prefix instruction or the second prefix instruction, and the instruction immediately following the first prefix instruction or the second prefix instruction when the instruction decoding unit decodes the first prefix instruction or the second prefix instruction, an instruction execution unit that executes the packed instruction generated by the instruction packing unit. | 12-30-2010 |
20110107064 | DATA PROCESSOR - The present invention realizes an efficient superscalar instruction issue and low power consumption at an instruction set including instructions with prefixes. An instruction fetch unit is adopted which determines whether an instruction code is of a prefix code or an instruction code other than it, and outputs the result of determination and the 16-bit instruction code. Along with it, decoders each of which decodes the instruction code, based on the result of determination, and decoders each of which decodes the prefix code, are disposed separately. Further, a prefix is supplied to each decoder prior to a fixed-length instruction code like 16 bits modified with it. A fixed-length instruction code following the prefix code is supplied to each decoder of the same pipeline as the decoder for the prefix code. | 05-05-2011 |
20120124338 | MULTI-THREADED DATA PROCESSING SYSTEM - A method and apparatus are provided for executing instructions of a multi-threaded processor having multiple hardware threads ( | 05-17-2012 |
20130262829 | DECODE TIME INSTRUCTION OPTIMIZATION FOR LOAD RESERVE AND STORE CONDITIONAL SEQUENCES - A technique is provided for replacing an atomic sequence. A processing circuit receives the atomic sequence. The processing circuit detects the atomic sequence. The processing circuit generates an internal atomic operation to replace the atomic sequence. | 10-03-2013 |
20130262830 | DECODE TIME INSTRUCTION OPTIMIZATION FOR LOAD RESERVE AND STORE CONDITIONAL SEQUENCES - A technique is provided for replacing an atomic sequence. A processing circuit receives the atomic sequence. The processing circuit detects the atomic sequence. The processing circuit generates an internal atomic operation to replace the atomic sequence. | 10-03-2013 |
20140095835 | PERFORMING PREDECODE-TIME OPTIMIZED INSTRUCTIONS IN CONJUNCTION WITH PREDECODE TIME OPTIMIZED INSTRUCTION SEQUENCE CACHING - A method for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to the determining that the first instruction and second instruction can be optimized, the method includes, preforming a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction can not be optimized, the method includes, storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache. | 04-03-2014 |
20140244976 | IT INSTRUCTION PRE-DECODE - Various techniques for processing and pre-decoding branches within an IT instruction block. Instructions are fetched and cached in an instruction cache, and pre-decode bits are generated to indicate the presence of an IT instruction and the likely boundaries of the IT instruction block. If an unconditional branch is detected within the likely boundaries of an IT instruction block, the unconditional branch is treated as if it were a conditional branch. The unconditional branch is sent to the branch direction predictor and the predictor generates a branch direction prediction for the unconditional branch. | 08-28-2014 |
20140317384 | DATA PROCESSING APPARATUS AND METHOD FOR PRE-DECODING INSTRUCTIONS TO BE EXECUTED BY PROCESSING CIRCUITRY - A hierarchical cache with at least a unified cache is used to store both instructions and data values, and a further cache coupled between processing circuitry and a unified cache. The unified cache has a plurality of cache lines identified as an instruction cache line or a data cache line. Each data cache line stores at least one data value and the associated information. Pre-decode circuitry is associated with the unified cache and performs a first pre-decode operation on a received instruction for that instruction cache line in order to generate a corresponding partially pre-decoded instruction for storing in the instruction cache line. Further pre-decode circuitry is associated with the further cache, and, when a partially pre-decoded instruction is routed to the further cache, performs a further pre-decode operation on the partially pre-decoded instruction to generate a corresponding pre-decoded instruction for storage in the further cache. | 10-23-2014 |
20150082009 | METHOD AND APPARATUS FOR THE DYNAMIC CREATION OF INSTRUCTIONS UTILIZING A WIDE DATAPATH - A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a hardware wide datapath. | 03-19-2015 |
20160004536 | Systems And Methods For Processing Inline Constants - Disclosed is a digital processor comprising an instruction memory having a first input, a second input, a first output, and a second output. A program counter register is in communication with the first input of the instruction memory. The program counter register is configured to store an address of an instruction to be fetched. A data pointer register is in communication with the second input of the instruction memory. The data pointer register is configured to store an address of a data value in the instruction memory. An instruction buffer is in communication with the first output of the instruction memory. The instruction buffer is arranged to receive an instruction according to a value at the program counter register. A data buffer is in communication with the second output of the instruction memory. The data buffer is arranged to receive a data value according to a value at the data pointer register. | 01-07-2016 |
20160092216 | OPTIMIZING GROUPING OF INSTRUCTIONS - Embodiments include optimizing the grouping of instructions in a microprocessor. Aspects include receiving a first clump of instructions from a streaming buffer, pre-decoding each of instructions for select information and sending the instructions to an instruction queue. Aspects further include storing initial grouping information for the instructions in a local register, wherein the initial grouping information is based on the select information. Aspects further include updating the initial group information stored in the local register when additional pre-decode information becomes available and grouping the instructions that are ready to be dispatched into a dispatch group based on the grouping information stored in the local register. Aspects further include dispatching the dispatch group to an issue unit. | 03-31-2016 |
20160139927 | IDENTIFYING INSTRUCTIONS FOR DECODE-TIME INSTRUCTION OPTIMIZATION GROUPING IN VIEW OF CACHE BOUNDARIES - A technique for processing instructions includes examining instructions in an instruction stream of a processor to determine properties of the instructions. The properties indicate whether the instructions may belong in an instruction sequence subject to decode-time instruction optimization (DTIO). Whether the properties of multiple ones of the instructions are compatible for inclusion within an instruction sequence of a same group is determined. The instructions with compatible ones of the properties are grouped into a first instruction group. The instructions of the first instruction group are decoded subsequent to formation of the first instruction group. Whether the first instruction group actually includes a DTIO sequence is verified based on the decoding. Based on the verifying, DTIO is performed on the instructions of the first instruction group or is not performed on the instructions of the first instruction group. | 05-19-2016 |
20160179540 | INSTRUCTION AND LOGIC FOR HARDWARE SUPPORT FOR EXECUTION OF CALCULATIONS | 06-23-2016 |
20160202980 | MICROPROCESSOR WITH ARM AND X86 INSTRUCTION LENGTH DECODERS | 07-14-2016 |