Patent application number | Description | Published |
20150106545 | Computer Processor Employing Cache Memory Storing Backless Cache Lines - A computer processing system with a hierarchical memory system having at least one cache and physical memory, and a processor having execution logic that generates memory requests that are supplied to the hierarchical memory system. The at least one cache stores a plurality of cache lines including at least one backless cache line. | 04-16-2015 |
20150106566 | Computer Processor Employing Dedicated Hardware Mechanism Controlling The Initialization And Invalidation Of Cache Lines - A computer processing system includes execution logic that generates memory requests that are supplied to a hierarchical memory system. The computer processing system includes a hardware map storing a number of entries associated with corresponding cache lines, where each given entry of the hardware map indicates whether a corresponding cache line i) currently stores valid data in the hierarchical memory system, or ii) does not currently store valid data in hierarchical memory system and should be interpreted as being implicitly zero throughout. | 04-16-2015 |
20150106567 | Computer Processor Employing Cache Memory With Per-Byte Valid Bits - A computer processing system with a hierarchical memory system that associates a number of valid bits for each cache line of the hierarchical memory system. The valid bits are provided for each cache line stored in a respective cache and make explicit which bytes are semantically defined and which are not for the associated given cache line. Memory requests to the cache(s) of the hierarchical memory system can include an address specifying a requested cache line as well as a mask that includes a number of bits each corresponding to a different byte of the requested cache line. The values of the bits of the byte mask indicate which bytes of the requested cache line are to be returned from the hierarchical memory system. The memory request is processed by the top level cache of the hierarchical memory system, looking for one or more valid bytes of the requested cache line corresponding to the target address of the memory request. The valid bytes of the cache line corresponding to the byte mask as stored in cache can be identified by reading out the valid bit(s) and data byte(s) stored by the cache for putative matching cache lines for those data bytes that are specified by the byte mask of the memory request, while ignoring the valid bit(s) and data byte(s) stored by the cache for putative matching cache lines for those data bytes that are not specified by the byte mask of the memory request. Extensions to shared multiprocessor systems is also described and claimed. | 04-16-2015 |
20150106588 | Computer Processor Employing Hardware-Based Pointer Processing - A computer processor is provided with execution logic that performs operations that utilize pointers stored in memory. In one aspect, each pointer is associated with a predefined number of event bits. The execution logic processes the event bits of a given pointer in conjunction with processing a predefined pointer-related operation involving the given pointer in order to selectively output an event-of-interest signal. | 04-16-2015 |
20150106597 | Computer Processor With Deferred Operations - A computer processor and corresponding method of operation employs execution logic that includes at least one functional unit and operand storage that stores data that is produced and consumed by the at least one functional unit. The at least one functional unit is configured to execute a deferred operation whose execution produces result data. The execution logic further includes a retire station that is configured to store and retire the result data of the deferred operation in order to store such result data in the operand storage, wherein the retire of such result data occurs at a machine cycle following issue of the deferred operation as controlled by statically-assigned parameter data included in the encoding of the deferred operation. | 04-16-2015 |
20150205609 | Computer Processor Employing Operand Data With Associated Meta-Data - A computer processor is provided that employs a plurality of operand storage elements that store operand data values and associated meta-data as unitary operand data elements as well as at least one functional unit that performs operations that produce and access the unitary operand data elements stored in the plurality of operand storage elements. The meta-data associated with a given operand data value as part of a unitary operand data element can specify type of the unitary operand data element (e.g., vector or scalar), elemental width and floating-point error flags. The meta-data can also be used to define special operand data values (e.g., Not-a-Result and None). The meta-data is useful in optimizing execution, such as in speculation and vectorized SIMD operations. The computer processor can also support a number of particular vector operations that are useful in optimizing execution of vectorized SIMD operations. | 07-23-2015 |
20150220343 | Computer Processor Employing Phases of Operations Contained in Wide Instructions - A computer processor employs an instruction processing pipeline that processes a sequence of wide instructions each having an encoding that represents a plurality of different operations. The plurality of different operations of the given wide instruction are logically organized into a number of phases having a predefined ordering such that at least one operation of the given wide instruction produces data that is consumed by at least one other operation of the given wide instruction. In certain circumstances where stalling is absent, the plurality of different operations of the phases of the given wide instruction can be issued for execution by the instruction processing pipeline over a plurality of consecutive machine cycles. | 08-06-2015 |
20150347130 | COMPUTER PROCESSOR EMPLOYING SPLIT-STREAM ENCODING - A computer processor is operably coupled to a memory system. The memory system is configured to store instruction blocks, wherein each instruction block is associated with an entry address and multiple distinct instruction streams within the instruction block. The multiple distinct instruction streams include at least a first instruction stream and a second instruction stream. The first instruction stream has an instruction order that logically extends in a direction of increasing memory space relative to the entry address of the instruction block. The second instruction stream has an instruction order that logically extends in a direction of decreasing memory space relative to the entry address of the instruction block. The computer processor includes a number of multi-stage instruction processing components corresponding to the multiple distinct instruction streams within each instruction block. The number of multi-stage instruction processing components are configured to access and process in parallel instructions belonging to multiple distinct instruction streams of a particular instruction block stored in the memory system. | 12-03-2015 |
20150347142 | Computer Processor Employing Double-Ended Instruction Decoding - A computer processor including an instruction buffer configured to store at least one variable-length instruction having a bit bundle bounded by a head end and a tail end with a plurality of slots each defining a corresponding operation, wherein the plurality of slots and corresponding operations are logically partitioned into a plurality of distinct blocks with a first group of blocks extending from the head end of the bit bundle toward the tail end of the bit bundle and a second group of blocks extending from the tail end of the bit bundle toward the head end of the bit bundle, wherein the second group of blocks includes a tail end block disposed adjacent the tail end of the bit bundle. A decode stage is operably coupled to the instruction buffer and configured to process a given variable-length instruction stored by the instruction buffer by decoding at least one operation of a particular block belonging to the first group of blocks in parallel with decoding at least one operation of the tail end block. Additional aspects are described and claimed. | 12-03-2015 |
20150347143 | COMPUTER PROCESSOR EMPLOYING INSTRUCTIONS WITH ELIDED NOP OPERATIONS - A computer processor that operates on distinct first and second instruction streams that have a predefined timed semantic relationship. At least one of the first and second instruction streams includes variable-length instructions having a header and associated bundle bounded by a head end and a tail end. An alignment hole within the bundle encodes information representing at least one nop operation. The computer processor includes first and second multi-stage instruction processing components configured to process in parallel the first and second instruction streams. At least one of the first and second multi-stage instruction processing components includes an instruction buffer operably coupled to a decode stage. The decode stage is configured to process a variable-length instruction by isolating and interpreting the alignment hole of the variable length instruction in order to initiate zero or more nop operations that follow the timed semantic relationship between the first and second instruction streams. | 12-03-2015 |