Patent application number | Description | Published |
20130262838 | Memory Disambiguation Hardware To Support Software Binary Translation - A method of memory disambiguation hardware to support software binary translation is provided. This method includes unrolling a set of instructions to be executed within a processor, the set of instructions having a number of memory operations. An original relative order of memory operations is determined. Then, possible reordering problems are detected and identified in software. The reordering problem being when a first memory operation has been reordered prior to and aliases to a second memory operation with respect to the original order of memory operations. The reordering problem is addressed and a relative order of memory operations to the processor is communicated. | 10-03-2013 |
20130275715 | METHOD, APPARATUS, AND SYSTEM FOR EFFICIENTLY HANDLING MULTIPLE VIRTUAL ADDRESS MAPPINGS DURING TRANSACTIONAL EXECUTION - An apparatus and method is described herein for providing structures to support software memory re-ordering within atomic sections of code. Upon a start or end of a critical section, speculative bits of a translation buffer are reset. When a speculative memory access causes an address translation of a virtual address to a physical address, the translation buffer is searched to determine if another entry (a different virtual address) includes the same physical address. And if another entry does include the same physical address, the speculative execution is failed to provide protection from invalid execution resulting from the memory re-ordering. | 10-17-2013 |
20130283249 | INSTRUCTION AND LOGIC TO PERFORM DYNAMIC BINARY TRANSLATION - A micro-architecture may provide a hardware and software co-designed dynamic binary translation. The micro-architecture may invoke a method to perform a dynamic binary translation. The method may comprise executing original software code compiled targeting a first instruction set, using processor hardware to detect a hot spot in the software code and passing control to a binary translation translator, determining a hot spot region for translation, generating the translated code using a second instruction set, placing the translated code in a translation cache, executing the translated code from the translated cache, and transitioning back to the original software code after the translated code finishes execution. | 10-24-2013 |
20130305019 | Instruction and Logic to Control Transfer in a Partial Binary Translation System - A dynamic optimization of code for a processor-specific dynamic binary translation of hot code pages (e.g., frequently executed code pages) may be provided by a run-time translation layer. A method may be provided to use an instruction look-aside buffer (iTLB) to map original code pages and translated code pages. The method may comprise fetching an instruction from an original code page, determining whether the fetched instruction is a first instruction of a new code page and whether the original code page is deprecated. If both determinations return yes, the method may further comprise fetching a next instruction from a translated code page. If either determinations returns no, the method may further comprise decoding the instruction and fetching the next instruction from the original code page. | 11-14-2013 |
20130311758 | HARDWARE PROFILING MECHANISM TO ENABLE PAGE LEVEL AUTOMATIC BINARY TRANSLATION - A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation. | 11-21-2013 |
20140007066 | STATE RECOVERY METHODS AND APPARTUS FOR COMPUTING PLATFORMS | 01-02-2014 |
20140089271 | MEMORY ADDRESS ALIASING DETECTION - Method and apparatus to efficiently detect violations of data dependency relationships. A memory address associated with a computer instruction may be obtained. A current state of the memory address may be identified. The current state may include whether the memory address is associated with a read or a store instruction, and whether the memory address is associated with a set or a check. A previously accumulated state associated with the memory address may be retrieved from a data structure. The previously accumulated state may include whether the memory address was previously associated with a read or a store instruction, and whether the memory address was previously associated with a set or a check. If a transition from the previously accumulated state to the current state is invalid, a failure condition may be signaled. | 03-27-2014 |
20140095842 | ACCELERATED INTERLANE VECTOR REDUCTION INSTRUCTIONS - A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane. | 04-03-2014 |
20140156933 | System, Method, and Apparatus for Improving Throughput of Consecutive Transactional Memory Regions - Systems, apparatuses, and methods for improving TM throughput using a TM region indicator (or color) are described. Through the use of TM region indicators younger TM regions can have their instructions retired while waiting for older TM regions to commit. | 06-05-2014 |
20140156978 | Detecting and Filtering Biased Branches in Global Branch History - A processor includes an instruction pipeline for executing instructions including a branching instruction, a counter for counting times that the branching instruction is taken, a register for storing a global branch history as a function of a value of the counter, and a branch prediction unit for predicting branching based on the global branch history. | 06-05-2014 |