Class / Patent application number | Description | Number of patent applications / Date published |
712210000 | Decoding instruction to accommodate variable length instruction or operand | 52 |
20080229075 | MICROCONTROLLER WITH LOW-COST DIGITAL SIGNAL PROCESSING EXTENSIONS - A set of low-cost microcontroller extensions facilitates Digital Signal Processing (DSP) applications by incorporating a Multiply-Accumulate (MAC) unit in a Central Processing Unit (CPU) of the microcontroller which is responsive to the extensions. | 09-18-2008 |
20080250229 | System, Method and Software to Preload Instructions from a Variable-Length Instruction Set with Proper Pre-Decoding - In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways. | 10-09-2008 |
20080307203 | Scaling Instruction Intervals to Identify Collection Points for Representative Instruction Traces - A method, system, and computer program product are provided for identifying instructions to obtain representative traces. A phase instruction budget is calculated for each phase in a set of phases. The phase instruction budget is based on a weight associated with each phase and a global instruction budget. A starting index and an ending index are identified for instructions within a set of intervals in each phase in order to meet the phase instruction budget for that phase, thereby forming a set of interval indices. A determination is made as to whether the instructions within the set of interval indices meet the global instruction budget. Responsive to the global instruction budget being met, the set of interval indices are output as collection points for the representative traces. | 12-11-2008 |
20080307204 | Fast Static Rotator/Shifter with Non Two's Complemented Decode and Fast Mask Generation - In one embodiment, a rotator, a mask generator, and circuitry configured to mask the rotated operand output by the rotator with the output mask generated by the mask generator perform a shift operation. Coupled to receive the input operand and the shift count, the rotator is configured to rotate the input operand by the shift count. Coupled to receive the shift count and the shift direction, the mask generator is configured to generate an output mask by decoding a most significant bit (MSB) field of the shift count to generate a first mask, decoding a least significant bit (LSB) field of the shift count to generate a second mask, logically ANDing the bits of the second mask with the corresponding bit of the first mask and logically ORing the result with an adjacent bit of the first mask that is selected responsive to the shift direction. Additionally, in one embodiment, the rotator may be configured to perform a right rotate/shift operation using a left rotate and without performing a two's complement operation on the rotate/shift count. | 12-11-2008 |
20090019263 | Method and Apparatus for Length Decoding Variable Length Instructions - A mechanism for superscalar decode of variable length instructions. The decode mechanism may be included within a processing unit, and may comprise a length decode unit. The length decode unit may obtain a plurality of instruction bytes. The instruction bytes may be associated with a plurality of variable length instructions, which are to be executed by the processing unit. The length decode unit may perform a length decode operation for each of the plurality of instruction bytes. For each instruction byte, the length decode unit may estimate the instruction length of a current variable length instruction associated with a current instruction byte. Furthermore, during the length decode operation, for each instruction byte, the length decode unit may estimate the start of a next variable length instruction based on the estimated instruction length of the current variable length instruction, and store a first pointer to the estimated start of the next variable length instruction. | 01-15-2009 |
20090043989 | Null value checking instruction - A processor | 02-12-2009 |
20090043990 | Implementation of variable length instruction encoding using alias addressing - A digital processor and method of operation utilize an alias address space to implement variable length instruction encoding on a legacy processor. The method includes storing instructions of a code sequence in memory; generating instruction addresses of the code sequence; automatically switching between a first operating mode and a second operating mode in response to a transition in instruction addresses between a first address space and a second address space, wherein addresses in the first and second address spaces access a common memory space; in the first operating mode, accessing instructions in the first address space; in the second operating mode, accessing instructions in the second address space; and executing the accessed instructions of the code sequence. Instructions of different instruction lengths may be utilized in the first and second operating modes. | 02-12-2009 |
20090055629 | Instruction length determination device and method using concatenate bits to determine an instruction length in a multi-mode processor - An instruction length determination device includes an instruction input unit having a memory space to store a plurality of N-bit data; an instruction fetch unit which fetches the plurality of N-bit data from the instruction input unit; an instruction length determination logic which compares concatenate bits of a first N-bit data with a predetermined value for determination of an instruction length; and an instruction concatenate unit which selectively concatenates a number of successive N-bit data based on the determination. The instruction length determination logic determines that the first N-bit data is a complete instruction when the concatenate bit of the first N-bit data is not equal to the predetermined value. Otherwise, the instruction length determination logic determines that a complete instruction is formed of last N-bit data finally fetched and all N-bit previously reserved. | 02-26-2009 |
20100115240 | Optimizing performance of instructions based on sequence detection or information associated with the instructions - In one embodiment, the present invention includes an instruction decoder that can receive an incoming instruction and a path select signal and decode the incoming instruction into a first instruction code or a second instruction code responsive to the path select signal. The two different instruction codes, both representing the same incoming instruction may be used by an execution unit to perform an operation optimized for different data lengths. Other embodiments are described and claimed. | 05-06-2010 |
20100146244 | METHOD FOR INSTRUCTING A DATA PROCESSOR TO PROCESS DATA - A data processor which executes instructions described in first and second instruction formats. The first instruction format defines a register-addressing field of a predetermined size, while the second instruction format defines a register-addressing field of a size larger than that of the register-addressing field defined by the first instruction format. The data processor includes: instruction-type identifier, responsive to an instruction, for identifying the received instruction as being described in the first or second instruction format by the instruction itself; a first register file including a plurality of registers; and a second register file also including a plurality of registers, the number of the registers included in the second register file being larger than that of the registers included in the first register file. | 06-10-2010 |
20100199073 | MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD - A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines. | 08-05-2010 |
20100268919 | METHOD AND STRUCTURE FOR SOLVING THE EVIL-TWIN PROBLEM - A register file, in a processor, includes a first plurality of registers of a first size, n-bits. A decoder uses a mapping that divides the register file into a second plurality M of registers having a second size. Each of the registers having the second size is assigned a different name in a continuous name space. Each register of the second size includes a plurality N of registers of the first size, n-bits. Each register in the plurality N of registers is assigned the same name as the register of the second size that includes that plurality. State information is maintained in the register file for each n-bit register. The dependence of an instruction on other instructions is detected through the continuous name space. The state information allows the processor to determine when the information in any portion, or all, of a register is valid. | 10-21-2010 |
20100287359 | VARIABLE REGISTER AND IMMEDIATE FIELD ENCODING IN AN INSTRUCTION SET ARCHITECTURE - A method and apparatus provide means for compressing instruction code size. An Instruction Set Architecture (ISA) encodes instructions compact, usual or extended bit lengths. Commonly used instructions are encoded having both compact and usual bit lengths, with compact or usual bit length instructions chosen based on power, performance or code size requirements. Instructions of the ISA can be used in both privileged and non-privileged operating modes of a microprocessor. The instruction encodings can be used interchangeably in software applications. Instructions from the ISA may be executed on any programmable device enabled for the ISA, including a single instruction set architecture processor or a multi-instruction set architecture processor. | 11-11-2010 |
20100299503 | APPARATUS AND METHOD FOR MARKING START AND END BYTES OF INSTRUCTIONS IN A STREAM OF INSTRUCTION BYTES IN A MICROPROCESSOR HAVING AN INSTRUCTION SET ARCHITECTURE IN WHICH INSTRUCTIONS MAY INCLUDE A LENGTH-MODIFYING PREFIX - An apparatus in a microprocessor that has an instruction set architecture in which instructions may include a length-modifying prefix used to select an address/operand size other than a default address/operand size, wherein the apparatus marks the start byte and the end byte of each instruction in a stream of instruction bytes. Decode logic decodes each instruction byte of a predetermined number of instruction bytes to determine whether the instruction byte specifies a length-modifying prefix and generates a start mark and an end mark for each of the instruction bytes based on an address/operand size. Operand/address size logic provides the default operand/address size to the decode logic to use to generate the start and end marks during a first clock cycle during which the decode logic decodes the predetermined number of instruction bytes. If during the first clock cycle and any of N subsequent clock cycles the decode logic indicates that one of the predetermined number of instruction bytes specifies a length-modifying prefix, the operand/address size logic provides to the decode logic on the next clock cycle the address/operand size specified by the length-modifying prefix to use to generate the start and end marks. | 11-25-2010 |
20110083001 | METHODS AND APPARATUS FOR AUTOMATED GENERATION OF ABBREVIATED INSTRUCTION SET AND CONFIGURABLE PROCESSOR ARCHITECTURE - A systematic approach to architecture and design of the instruction fetch mechanisms and instruction set architectures in embedded processors is described. This systematic approach allows a relaxing of certain restrictions normally imposed by a fixed-size instruction set architecture (ISA) on design and development of an embedded system. The approach also guarantees highly efficient usage of the available instruction storage which is only bounded by the actual information contents of an application or its entropy. The result of this efficiency increase is a general reduction of the storage requirements, or a compression, of the instruction segment of the original application. An additional feature of this system is the full decoupling of the ISA from the core architecture. This decoupling allows usage of a variable length encoding for any size of the ISA without impacting the physical instruction memory organization or layout and branching mechanism as well as tuning of the execution core to the application. A hardware embodiment described herein allows application of the above mentioned high-entropy encoding technique in actual embedded processor using today's technology without posing significant strain on timing requirements. | 04-07-2011 |
20110173418 | INSTRUCTION SET EXTENSION USING 3-BYTE ESCAPE OPCODE - A method, apparatus and system are disclosed for decoding an instruction in a variable-length instruction set. The instruction is one of a set of new types of instructions that uses a new escape code value, which is two bytes in length, to indicate that a third opcode byte includes the instruction-specific opcode for a new instruction. The new instructions are defined such the length of each instruction in the opcode map for one of the new escape opcode values may be determined using the same set of inputs, where each of the inputs is relevant to determining the length of each instruction in the new opcode map. For at least one embodiment, the length of one of the new instructions is determined without evaluating the instruction-specific opcode. | 07-14-2011 |
20110219214 | Microprocessor having novel operations - A processor. The processor includes a first register for storing a first packed data, a decoder, and a functional unit. The decoder has a control signal input. The control signal input is for receiving a first control signal and a second control signal. The first control signal is for indicating a pack operation. The second control signal is for indicating an unpack operation. The functional unit is coupled to the decoder and the register. The functional unit is for performing the pack operation and the unpack operation using the first packed data. The processor also supports a move operation. | 09-08-2011 |
20120079247 | DUAL REGISTER DATA PATH ARCHITECTURE - A processor includes a first and second execution unit each of which is arranged to execute multiply instructions of a first type upon fixed point operands and to execute multiply instructions of a second type upon floating point operands. A register file of the processor stores operands in registers that are each addressable by instructions for performing the first and second types of operations. An instruction decode unit is responsive to the at least one multiply instruction of the first type and the at least one multiply instruction of the second type to at the same time enable a first data path between the first set of registers and the first execution unit and to enable a second data path between a second set of registers and the second execution unit. | 03-29-2012 |
20120110307 | COMPRESSED INSTRUCTION PROCESSING DEVICE AND COMPRESSED INSTRUCTION GENERATION DEVICE - A compressed instruction processing device has: a compressed instruction expanding circuit which expands a compressed instruction code that include a difference code between an instruction code being a compression object and a reference instruction code and which outputs an expanded instruction code; an instruction buffer storing the instruction code expanded by the compressed instruction expanding circuit; and an execution section executing the instruction code expanded by the compressed instruction expanding circuit, wherein the compressed instruction expanding circuit outputs the expanded instruction code by inputting the instruction code in the instruction buffer as the reference instruction code and adding the reference instruction code and the difference code in the compressed instruction code. | 05-03-2012 |
20120159128 | Handling Media Streams In A Programmable Bit Processor - In one embodiment, the present invention is directed to a bit processor that includes an execution unit to, responsive to an instruction for access of data of a first bit width, access data of a second bit width, the second bit width having a different number of bits than the first bit width when some of the data accessed includes non-stream data. Other embodiments are described and claimed. | 06-21-2012 |
20120159129 | PROGRAMMABLE LOGIC ARRAY AND READ-ONLY MEMORY AREA REDUCTION USING CONTEXT-SENSITIVE LOGIC FOR DATA SPACE MANIPULATION - A computer, circuit, and computer-readable medium are disclosed. In one embodiment, the processor includes an instruction decoder unit that can decode a macro instruction into at least one micro-operation with a set of data fields. The resulting micro-operation has at least one data field that is in a compressed form. The instruction decoder unit has storage that can store the micro-operation with the compressed-form data field. The instruction decoder unit also has extraction logic that is capable of extracting the compressed-form data field into an uncompressed-form data field. After extraction, the instruction decoder unit also can send the micro-operation with the extracted uncompressed-form data field to an execution unit. The computer also includes an execution unit capable of executing the sent micro-operation. | 06-21-2012 |
20120173852 | INSTRUCTION SET EXTENSION USING 3-BYTE ESCAPE OPCODE - A method, apparatus and system are disclosed for decoding an instruction in a variable-length instruction set. The instruction is one of a set of new types of instructions that uses a new escape code value, which is two bytes in length, to indicate that a third opcode byte includes the instruction-specific opcode for a new instruction. The new instructions are defined such the length of each instruction in the opcode map for one of the new escape opcode values may be determined using the same set of inputs, where each of the inputs is relevant to determining the length of each instruction in the new opcode map. For at least one embodiment, the length of one of the new instructions is determined without evaluating the instruction-specific opcode. | 07-05-2012 |
20120210100 | SEGMENTAL ALLOCATION METHOD OF EXPANDING RISC PROCESSOR REGISTER - A segmental allocation method of expanding RISC processor register includes the steps of a) setting an instruction format of the RISC processor, the destination register field being set having 6 bits to correspond to 64 registers and at least one source register field having at least 4 bits to correspond to at least 16 registers; b) providing two solutions to the problem resulting from that the instruction format in the step a) goes beyond range under some circumstances; and c) setting a register segment allocation algorithm having the steps of c1) providing and grouping a plurality of pseudo registers; c2) prioritizing the pseudo registers in each of the groups; c3) combining the groups pursuant to the priorities thereof; and c4) locating the physical register of lowest computational cost. | 08-16-2012 |
20120233444 | MIXED SIZE DATA PROCESSING OPERATION - A data processing system | 09-13-2012 |
20120265967 | IMPLEMENTING INSTRUCTION SET ARCHITECTURES WITH NON-CONTIGUOUS REGISTER FILE SPECIFIERS - There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing an instruction of an instruction set using a non-contiguous register specifier of a non-contiguous register specification. The instruction includes the non-contiguous register specifier. | 10-18-2012 |
20120331272 | SIMD SIGN OPERATION - Method, apparatus, and program means for nonlinear filtering and deblocking applications utilizing SIMD sign and absolute value operations. The method of one embodiment comprises receiving first data for a first block and second data for a second block. The first data and said second data are comprised of a plurality of rows and columns of pixel data. A block boundary between the first block and the second block is characterized. A correction factor for a deblocking algorithm is calculated with a first instruction for a sign operation that multiplies and with a second instruction for an absolute value operation. Data for pixels located along said block boundary between the first and second block are corrected. | 12-27-2012 |
20130061024 | Bitstream Buffer Manipulation With A SIMD Merge Instruction - Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block. | 03-07-2013 |
20130061025 | Bitstream Buffer Manipulation With A SIMD Merge Instruction - Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block is made. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block. | 03-07-2013 |
20130117539 | Method and Apparatus for Packing Packed Data - An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element. | 05-09-2013 |
20130117540 | METHOD AND APPARATUS FOR UNPACKING PACKED DATA - An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element. | 05-09-2013 |
20130212360 | In-Lane Vector Shuffle Instructions - In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand. | 08-15-2013 |
20130219152 | INSTRUCTION SET EXTENSION USING 3-BYTE ESCAPE OPCODE - A method, apparatus and system are disclosed for decoding an instruction in a variable-length instruction set. The instruction is one of a set of new types of instructions that uses a new escape code value, which is two bytes in length, to indicate that a third opcode byte includes the instruction-specific opcode for a new instruction. The new instructions are defined such the length of each instruction in the opcode map for one of the new escape opcode values may be determined using the same set of inputs, where each of the inputs is relevant to determining the length of each instruction in the new opcode map. For at least one embodiment, the length of one of the new instructions is determined without evaluating the instruction-specific opcode. | 08-22-2013 |
20130246749 | DATA PROCESSOR TO PROCESS DATA - A data processor includes a first register file including registers, a second register file including resisters, a number of which is larger than that of the registers of the first register file, an instruction decoder and an operation unit. The instruction decoder decodes an instruction described in first and second instruction formats. The first instruction format includes a first register-addressing field for designating the first register file. The second instruction format includes a second register-addressing field for designating the second register file, a size of which is larger than that of the first register-addressing field. The operation unit executes an instruction described in the first and second instruction formats using operand data stored in the first and second register files, respectively, based on the instruction decoder, and executes operations in parallel, a number of which is determined by a certain field included in the second instruction format. | 09-19-2013 |
20130290681 | REGISTER FILE POWER SAVINGS - A system and method for efficiently reducing the power consumption of register file accesses. A processor is operable to execute instructions with two or more data types, each with an associated size and alignment. Data operands for a first data type use operand sizes equal to an entire width of a physical register within a physical register file. Data operands for a second data type use operand sizes less than an entire width of a physical register. Accesses of the physical register file for operands associated with a non-full-width data type do not access a full width of the physical registers. A given numerical value may be bypassed for the portion of the physical register that is not accessed. | 10-31-2013 |
20130290682 | COMPRESSED INSTRUCTION FORMAT - A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size. | 10-31-2013 |
20140189311 | SYSTEM AND METHOD FOR PERFORMING A SHUFFLE INSTRUCTION - An apparatus and method for performing a shuffle operation on packed data using computer-implemented steps is described. In one embodiment, a first packed data operand having at least two data elements is accessed. A second packed data operand having at least two data elements is accessed. One of the data elements in the first packed data operand is shuffled into a lower destination field of a destination register, and one of the data elements in the second packed data operand is shuffled into an upper destination field of the destination register. | 07-03-2014 |
20140281400 | Systems, Apparatuses,and Methods for Zeroing of Bits in a Data Element - Embodiments of systems, methods and apparatuses for execution a NAME instruction are described. The execution of a VPBZHI causes, on a per data element basis of a second source, a zeroing of bits higher (more significant) than a starting point in the data element. The starting point is defined by the contents of a data element in a first source. The resultant data elements are stored in a corresponding data element position of a destination. | 09-18-2014 |
20140281401 | Systems, Apparatuses, and Methods for Determining a Trailing Least Significant Masking Bit of a Writemask Register - The execution of a KZBTZ finds a trailing least significant zero bit position in an first input mask and sets an output mask to have the values of the first input mask, but with all bit positions closer to the most significant bit position than the trailing least significant zero bit position in an first input mask set to zero. In some embodiments, a second input mask is used as a writemask such that bit positions of the first input mask are not considered in the trailing least significant zero bit position calculation depending upon a corresponding bit position in the second input mask. | 09-18-2014 |
20140297994 | PROCESSORS, METHODS, AND SYSTEMS TO IMPLEMENT PARTIAL REGISTER ACCESSES WITH MASKED FULL REGISTER ACCESSES - A method includes receiving a packed data instruction indicating a first narrower source packed data operand and a narrower destination operand. The instruction is mapped to a masked packed data operation indicating a first wider source packed data operand that is wider than and includes the first narrower source operand, and indicating a wider destination operand that is wider than and includes the narrower destination operand. A packed data operation mask is generated that includes a mask element for each corresponding result data element of a packed data result to be stored by the masked packed data operation. All mask elements that correspond to result data elements to be stored by the masked operation that would not be stored by the packed data instruction are masking out. The masked operation is performed using the packed data operation mask. The packed data result is stored in the wider destination operand. | 10-02-2014 |
20140310505 | COMPRESSED INSTRUCTION FORMAT - A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size. | 10-16-2014 |
20150012728 | SYSTEM FOR PROVIDING TRACE DATA IN A DATA PROCESSOR HAVING A PIPELINED ARCHITECTURE - The invention is a method and system for providing trace data in a pipelined data processor. Aspects of the invention include providing a trace pipeline in parallel to the execution pipeline, providing trace information on whether conditional instructions complete or not, providing trace information on the interrupt status of the processor, replacing instructions in the processor with functionally equivalent instructions that also produce trace information and modifying the scheduling of instructions in the processor based on the occupancy of a trace output buffer. | 01-08-2015 |
20150032999 | INSTRUCTION SET ARCHITECTURE WITH OPCODE LOOKUP USING MEMORY ATTRIBUTE - A method and circuit arrangement decode instructions based in part on one or more decode-related attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a decode of an instruction stored in a page of memory, such that one or more attributes associated with the page in the data structure may be used to control how that instruction is decoded. | 01-29-2015 |
20150067302 | INSTRUCTIONS AND LOGIC TO PROVIDE GENERAL PURPOSE GF(256) SIMD CRYPTOGRAPHIC ARITHMETIC FUNCTIONALITY - Instructions and logic provide general purpose GF(2 | 03-05-2015 |
20150100763 | DECODING A COMPLEX PROGRAM INSTRUCTION CORRESPONDING TO MULTIPLE MICRO-OPERATIONS - A data processing apparatus | 04-09-2015 |
20150347142 | Computer Processor Employing Double-Ended Instruction Decoding - A computer processor including an instruction buffer configured to store at least one variable-length instruction having a bit bundle bounded by a head end and a tail end with a plurality of slots each defining a corresponding operation, wherein the plurality of slots and corresponding operations are logically partitioned into a plurality of distinct blocks with a first group of blocks extending from the head end of the bit bundle toward the tail end of the bit bundle and a second group of blocks extending from the tail end of the bit bundle toward the head end of the bit bundle, wherein the second group of blocks includes a tail end block disposed adjacent the tail end of the bit bundle. A decode stage is operably coupled to the instruction buffer and configured to process a given variable-length instruction stored by the instruction buffer by decoding at least one operation of a particular block belonging to the first group of blocks in parallel with decoding at least one operation of the tail end block. Additional aspects are described and claimed. | 12-03-2015 |
20150347143 | COMPUTER PROCESSOR EMPLOYING INSTRUCTIONS WITH ELIDED NOP OPERATIONS - A computer processor that operates on distinct first and second instruction streams that have a predefined timed semantic relationship. At least one of the first and second instruction streams includes variable-length instructions having a header and associated bundle bounded by a head end and a tail end. An alignment hole within the bundle encodes information representing at least one nop operation. The computer processor includes first and second multi-stage instruction processing components configured to process in parallel the first and second instruction streams. At least one of the first and second multi-stage instruction processing components includes an instruction buffer operably coupled to a decode stage. The decode stage is configured to process a variable-length instruction by isolating and interpreting the alignment hole of the variable length instruction in order to initiate zero or more nop operations that follow the timed semantic relationship between the first and second instruction streams. | 12-03-2015 |
20150370566 | INSTRUCTION SET ARCHITECTURE WITH OPCODE LOOKUP USING MEMORY ATTRIBUTE - A method decodes instructions based in part on one or more decode-related attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a decode of an instruction stored in a page of memory, such that one or more attributes associated with the page in the data structure may be used to control how that instruction is decoded. | 12-24-2015 |
20160026466 | INSTRUCTION SET FOR SUPPORTING WIDE SCALAR PATTERN MATCHES - A processor includes an instruction decoder to receive an instruction having a first operand, a second operand, and a third operand, and an execution unit coupled to the instruction decoder to execute the instruction, the execution unit to individually perform a shift operation by at least one bit for each of a plurality of data elements stored in a storage location indicated by the second operand, for each of the data elements that has an overflow in response to the shift-left operation, to carry over the overflow into an adjacent data element based on a first bitmask obtained from the third operand, generating a final result, and to store the final result in a storage location indicated by the first operand. | 01-28-2016 |
20160026467 | INSTRUCTION AND LOGIC FOR EXECUTING INSTRUCTIONS OF MULTIPLE-WIDTHS - A processor an execution unit, a decoder, an operation width tracker, and an allocator. The decoder includes logic to decode a received instruction. The operation width tracker includes logic to track a state indicating a currently used width of one or more registers of the processor. The allocator includes logic to selectively blend the instruction with a higher number of bits based upon a width of the instruction and the state. The execution unit may include logic to execute the selectively blended instructions. | 01-28-2016 |
20160048393 | PROCESSOR FOR EXECUTING WIDE OPERAND OPERATIONS USING A CONTROL REGISTER AND A RESULTS REGISTER - A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path with of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers. | 02-18-2016 |
20160147535 | VARIABLE REGISTER AND IMMEDIATE FIELD ENCODING IN AN INSTRUCTION SET ARCHITECTURE - A method and apparatus provide means for compressing instruction code size. An Instruction Set Architecture (ISA) encodes instructions compact, usual or extended bit lengths. Commonly used instructions are encoded having both compact and usual bit lengths, with compact or usual bit length instructions chosen based on power, performance or code size requirements. Instructions of the ISA can be used in both privileged and non-privileged operating modes of a microprocessor. The instruction encodings can be used interchangeably in software applications. Instructions from the ISA may be executed on any programmable device enabled for the ISA, including a single instruction set architecture processor or a multi-instruction set architecture processor. | 05-26-2016 |
20160179534 | INSTRUCTION LENGTH DECODING | 06-23-2016 |