Patent application number | Description | Published |
20090172355 | INSTRUCTIONS WITH FLOATING POINT CONTROL OVERRIDE - Methods and apparatus relating to instructions with floating point control override are described. In an embodiment, floating point operation settings indicated by a floating point control register may be overridden on a per instruction basis. Other embodiments are also described. | 07-02-2009 |
20090327657 | GENERATING AND PERFORMING DEPENDENCY CONTROLLED FLOW COMPRISING MULTIPLE MICRO-OPERATIONS (uops) - A processor to perform an out-of-order (OOO) processing in which a reservation station (RS) may generate and process a dependency controlled flow comprising multiple micro-operations (uops) with specific clock based dispatch scheme. The RS may either combine two or more uops into a single RS entry or make a direct connection between two or more RS entries. The RS may allow more than two source values to be associated with a single RS by combining sources from the two or more uops. One or more execution units may be provisioned to perform the function defined by the uops. The execution units may receive more than two sources at a given time point and produce two or more results on different ports. | 12-31-2009 |
20110153707 | MULTIPLYING AND ADDING MATRICES - An apparatus and method are described for multiplying and adding matrices. For example, one embodiment of a method comprises decoding by a decoder in a processor device, a single instruction specifying an m-by-m matrix operation for a set of vectors, wherein each vector represents an m-by-m matrix of data elements and m is greater than one; issuing the single instruction for execution by an execution unit in the processor device; and responsive to the execution of the single instruction, generating a resultant vector, wherein the resultant vector represents an m-by-m matrix of data elements. | 06-23-2011 |
20120079251 | MULTIPLY ADD FUNCTIONAL UNIT CAPABLE OF EXECUTING SCALE, ROUND, GETEXP, ROUND, GETMANT, REDUCE, RANGE AND CLASS INSTRUCTIONS - A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction. | 03-29-2012 |
20120166509 | Performing Reciprocal Instructions With High Accuracy - In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed. | 06-28-2012 |
20130067204 | Instructions With Floating Point Control Override - Methods and apparatus relating to instructions with floating point control override are described. In an embodiment, floating point operation settings indicated by a floating point control register may be overridden on a per instruction basis. Other embodiments are also described. | 03-14-2013 |
20130290685 | FLOATING POINT ROUNDING PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS - A method of an aspect includes receiving a floating point rounding instruction. The floating point rounding instruction indicates a source of one or more floating point data elements, indicates a number of fraction bits after a radix point that each of the one or more floating point data elements are to be rounded to, and indicates a destination storage location. A result is stored in the destination storage location in response to the floating point rounding instruction. The result includes one or more rounded result floating point data elements. Each of the one or more rounded result floating point data elements includes one of the floating point data elements of the source, in a corresponding position, which has been rounded to the indicated number of fraction bits. Other methods, apparatus, systems, and instructions are disclosed. | 10-31-2013 |
20140006755 | VECTOR MULTIPLICATION WITH ACCUMULATION IN LARGE REGISTER SPACE | 01-02-2014 |
20140013075 | SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A HORIZONTAL ADD OR SUBTRACT IN RESPONSE TO A SINGLE INSTRUCTION - Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed horizontal add or subtract of packed data elements in response to a single vector packed horizontal add or subtract instruction that includes a destination vector register operand, a source vector register operand, and an opcode are describes. | 01-09-2014 |
20140019713 | SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A DOUBLE BLOCKED SUM OF ABSOLUTE DIFFERENCES - Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described. | 01-16-2014 |
20140082333 | SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING AN ABSOLUTE DIFFERENCE CALCULATION BETWEEN CORRESPONDING PACKED DATA ELEMENTS OF TWO VECTOR REGISTERS - Embodiments of systems, apparatuses, and methods for performing in a computer processor absolute difference calculation in response to a single vector packed absolute difference instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode are described. | 03-20-2014 |
20140188967 | Leading Change Anticipator Logic - In one embodiment, a processor includes at least one floating point unit. The at least one floating point unit may include an adder, leading change anticipator (LCA) logic, and a shifter. The adder may be to add a first operand X and a second operand Y to obtain an output operand having a bit length n. The LCA logic may be to: for each bit position i from n−1 to 1, obtain a set of propagation values and a set of bit values based on the first operand X and the second operand Y; and generate a LCA mask based on the set of propagation values and the set of bit values. The shifter may be to normalize the output operand based on the LCA mask. Other embodiments are described and claimed. | 07-03-2014 |
20140195580 | FLOATING POINT ROUND-OFF AMOUNT DETERMINATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS - A method of an aspect includes receiving a floating point round-off amount determination instruction. The instruction indicates a source of one or more floating point data elements, indicates a number of fraction bits after a radix point, and indicates a destination storage location. A result including one or more result floating point data elements is stored in the destination storage location in response to the floating point round-off amount determination instruction. Each of the one or more result floating point data elements includes a difference between a corresponding floating point data element of the source in a corresponding position, and a rounded version of the corresponding floating point data element of the source that has been rounded to the indicated number of the fraction bits. Other methods, apparatus, systems, and instructions are disclosed. | 07-10-2014 |
20140201502 | SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A BUTTERFLY HORIZONTAL AND CROSS ADD OR SUBSTRACT IN RESPONSE TO A SINGLE INSTRUCTION - Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed butterfly horizontal cross add or subtract of packed data elements in response to a single vector packed butterfly horizontal cross add or subtract instruction that includes a destination vector register operand, a source vector register operand, an immediate, and an opcode are described. | 07-17-2014 |
20140222883 | MATH CIRCUIT FOR ESTIMATING A TRANSCENDENTAL FUNCTION - A math circuit for computing an estimate of a transcendental function is described. A lookup table storage circuit has stored therein several groups of binary values, where each group of values represents a respective coefficient of a first polynomial that estimates the function to a high precision. A computing circuit uses a portion of a binary value, that is also taken from one of the groups of values, to evaluate a second polynomial that estimates the function to a low precision. Other embodiments are also described and claimed. | 08-07-2014 |
20140379773 | FUSED MULTIPLY ADD OPERATIONS USING BIT MASKS - Systems and methods of performing a fused multiply add (FMA) operations are provided. In one embodiment, the length of the adder used by the FMA operation is less than 3*N, where N is the number of bits in the mantissa term of a floating point number. A mask may be used to perform the addition portion of the FMA operation using the adder. A second mask may be used to denormalize the result of the addition portion of the FMA operation if an underflow occurs. | 12-25-2014 |
20150088946 | FLOATING POINT SCALING PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS - A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed. | 03-26-2015 |
20150088947 | MULTIPLY ADD FUNCTIONAL UNIT CAPABLE OF EXECUTING SCALE, ROUND, GETEXP, ROUND, GETMANT, REDUCE, RANGE AND CLASS INSTRUCTIONS - A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction. | 03-26-2015 |