Patent application number | Description | Published |
20080222466 | Meeting point thread characterization - An apparatus associated with identifying a critical thread based on information gathered during meeting point processing is provided. One embodiment of the apparatus may include logic to selectively update meeting point counts for threads upon determining that they have arrived at a meeting point. The embodiment may also include logic to periodically identify which thread in a set of threads is a critical thread. The critical thread may be the slowest thread and criticality may be determined by examining meeting point counts. The embodiment may also include logic to selectively manipulate a configurable attribute of the critical thread and/or core upon which the critical thread will run. | 09-11-2008 |
20080244182 | Memory content inverting to minimize NTBI effects - In general, in one aspect, the disclosure describes an apparatus that includes a memory device having a plurality of memory cells. An inverter is used to invert data and tag information destined for the memory device. A register is used to capture the inverted data and tag information. A write inverted value logic is used to determine when to enable writing the inverted data and tag information from the register to the memory device. When inverted data and tag information is written to a memory cell the memory cell is invalidated. | 10-02-2008 |
20080244223 | BRANCH PRUNING IN ARCHITECTURES WITH SPECULATION SUPPORT - According to one example embodiment of the inventive subject matter, the method and apparatus described herein is used to generate an optimized speculative version of a static piece of code. The portion of code is optimized in the sense that the number of instructions executed will be smaller. However, since the applied optimization is speculative, the optimized version can be incorrect and some mechanism to recover from that situation is required. Thus, the quality of the produced code will be measured by taking into account both the final length of the code as well as the frequency of misspeculation. | 10-02-2008 |
20080244278 | Leakage Power Estimation - Methods and apparatus to provide leakage power estimation are described. In one embodiment, one or more sensed temperature values ( | 10-02-2008 |
20080263376 | Frequency and voltage scaling architecture - A method and apparatus for scaling frequency and operating voltage of at least one clock domain of a microprocessor. More particularly, embodiments of the invention relate to techniques to divide a microprocessor into clock domains and control the frequency and operating voltage of each clock domain independently of the others. | 10-23-2008 |
20080277613 | Valve - A valve for use as a fluid flow regulator having an inner part ( | 11-13-2008 |
20090019219 | Compressing address communications between processors - In one embodiment, the present invention includes a method for determining if data of a memory request by a first agent is in a memory region represented by a region indicator of a region table of the first agent, and transmitting a compressed address for the memory request to other agents of a system if the memory region is represented by the region indicator, otherwise transmitting a full address. Other embodiments are described and claimed. | 01-15-2009 |
20090037783 | PROTECTING DATA STORAGE STRUCTURES FROM INTERMITTENT ERRORS - Embodiments of apparatuses and methods for protecting data storage structures from intermittent errors are disclosed. In one embodiment, an apparatus includes a plurality of data storage locations, execution logic, error detection logic, and control logic. The execution logic is to execute an instruction to generate a data value to store in one of the data storage locations. The error detection logic is to detect an error in the data value stored in the data storage location. The control logic is to respond to the detection of the error by causing the execution logic to re-execute the instruction to regenerate the data value to store in the data storage location, causing the error detection logic to check the data value read from the data storage location, and deactivating the data storage location if another error is detected. | 02-05-2009 |
20090083488 | Enabling Speculative State Information in a Cache Coherency Protocol - In one embodiment, the present invention includes a method for receiving a bus message in a first cache corresponding to a speculative access to a portion of a second cache by a second thread, and dynamically determining in the first cache if an inter-thread dependency exists between the second thread and a first thread associated with the first cache with respect to the portion. Other embodiments are described and claimed. | 03-26-2009 |
20090094481 | Enhancing Reliability of a Many-Core Processor - In one embodiment, the present invention includes a method for identifying available cores of a many-core processor, allocating a first subset of the cores to an enabled state and a second subset of the cores to a spare state, and storing information regarding the allocation in a storage. The allocation of cores to the enables state may be based on a temperature-aware algorithm, in certain embodiments. Other embodiments are described and claimed. | 04-09-2009 |
20090113240 | Detecting Soft Errors Via Selective Re-Execution - In one embodiment, the present invention includes a method for determining a vulnerability level for an instruction executed in a processor, and re-executing the instruction if the vulnerability level is above a threshold. The vulnerability level may correspond to a soft error likelihood for the instruction while the instruction is in the processor. Other embodiments are described and claimed. | 04-30-2009 |
20090119457 | MULTITHREADED CLUSTERED MICROARCHITECTURE WITH DYNAMIC BACK-END ASSIGNMENT - A multithreaded clustered microarchitecture with dynamic back-end assignment is presented. A processing system may include a plurality of instruction caches and front-end units each to process an individual thread from a corresponding one of the instruction caches, a plurality of back-end units, and an interconnect network to couple the front-end and back-end units. A method may include measuring a performance metric of a back-end unit, comparing the measurement to a first value, and reassigning, or not, the back-end unit according to the comparison. Computer systems according to embodiments of the invention may include: a random access memory; a system bus; and a processor having a plurality of instruction caches, a plurality of front-end units each to process an individual thread from a corresponding one of the instruction caches; a plurality of back-end units; and an interconnect network coupled to the plurality of front-end units and the plurality of back-end units. | 05-07-2009 |
20090150335 | ACHIEVING COHERENCE BETWEEN DYNAMICALLY OPTIMIZED CODE AND ORIGINAL CODE - An apparatus comprising a first search logic to search for a first entry for a first page containing a first code region in a first data structure to determine whether a first indicator in the first entry is set to a first value; an adder logic to add the first entry to the first data structure, in response to failing to find the first entry in the first data structure; a second search logic to search for a second entry for the first code region in a second data structure, in response to determining that the first indicator is set to the first value, wherein one or more optimized code regions corresponding to the first page from a code cache are to be removed in response to determining that the first page may have been modified, and wherein the first indicator is to be set to a second value. | 06-11-2009 |
20090172424 | THREAD MIGRATION TO IMPROVE POWER EFFICIENCY IN A PARALLEL PROCESSING ENVIRONMENT - A method and system to selectively move one or more of a plurality threads which are executing in parallel by a plurality of processing cores. In one embodiment, a thread may be moved from executing in one of the plurality of processing cores to executing in another of the plurality of processing cores, the moving based on a performance characteristic associated with the plurality of threads. In another embodiment of the invention, a power state of the plurality of processing cores may be changed to improve a power efficiency associated with the executing of the multiple threads. | 07-02-2009 |
20090287909 | Dynamically Estimating Lifetime of a Semiconductor Device - In one embodiment, the present invention includes a method for obtaining dynamic operating parameter information of a semiconductor device such as a processor, determining dynamic usage of the device, either as a whole or for one or more portions thereof, based on the dynamic operating parameter information, and dynamically estimating a remaining lifetime of the device based on the dynamic usage. Depending on the estimated remaining lifetime, the device may be controlled in a desired manner. Other embodiments are described and claimed. | 11-19-2009 |
20100005277 | Communicating Between Multiple Threads In A Processor - In one embodiment, the present invention includes a method for accessing registers associated with a first thread while executing a second thread. In one such embodiment a method may include preventing an instruction of a first thread that is to access a source operand from a register file of a second thread from executing if a synchronization indicator associated with the source operand indicates incompletion of a producer operation of the second thread, and executing the instruction if the synchronization indicator indicates completion of the producer operation of the second thread. Other embodiments are described and claimed. | 01-07-2010 |
20100082905 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed. | 04-01-2010 |
20100115224 | MEMORY APPARATUSES WITH LOW SUPPLY VOLTAGES - Low supply voltage memory apparatuses are presented. In one embodiment, a memory apparatus comprises a memory and a memory controller. The memory controller includes a read controller. The read controller prevents a read operation to a memory location from being completed, for at least N clock cycles after a write operation to the memory location, where N is the number of clock cycles for the memory location to stabilize after the write operation. | 05-06-2010 |
20100115247 | REPLACEMENT POLICY FOR HOT CODE DETECTION - Methods and apparatus relating to a replacement policy for hot code detection are described. In some embodiments, it may be determined which entry amongst a plurality of entries stored in storage unit is to be replaced next. The entries may correspond to hot code and may store age and execution frequency information corresponding to the hot code. Other embodiments are also described and claimed. | 05-06-2010 |
20100262812 | REGISTER CHECKPOINTING MECHANISM FOR MULTITHREADING - Methods and apparatus are disclosed for using a register checkpointing mechanism to resolve multithreading mis-speculations. Valid architectural state is recovered and execution is rolled back. Some embodiments include memory to store checkpoint data. Multiple thread units concurrently execute threads. They execute a checkpoint mask instruction to initialize memory to store active checkpoint data including register contents and a checkpoint mask indicating the validity of stored register contents. As register contents change, threads execute checkpoint write instructions to store register contents and update the checkpoint mask. Threads also execute a recovery function instruction to store a pointer to a checkpoint recovery function, and in response to mis-speculation among the threads, branch to the checkpoint recovery function. Threads then execute one or more checkpoint read instructions to copy data from a valid checkpoint storage area into the registers necessary to recover a valid architectural state, from which execution may resume. | 10-14-2010 |
20100269102 | SYSTEMS, METHODS, AND APPARATUSES TO DECOMPOSE A SEQUENTIAL PROGRAM INTO MULTIPLE THREADS, EXECUTE SAID THREADS, AND RECONSTRUCT THE SEQUENTIAL EXECUTION - Systems, methods, and apparatuses for decomposing a sequential program into multiple threads, executing these threads, and reconstructing the sequential execution of the threads are described. A plurality of data cache units (DCUs) store locally retired instructions of speculatively executed threads. A merging level cache (MLC) merges data from the lines of the DCUs. An inter-core memory coherency module (ICMC) globally retire instructions of the speculatively executed threads in the MLC. | 10-21-2010 |
20100332811 | SPECULATIVE MULTI-THREADING FOR INSTRUCTION PREFETCH AND/OR TRACE PRE-BUILD - The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread. | 12-30-2010 |
20110271056 | MULTITHREADED CLUSTERED MICROARCHITECTURE WITH DYNAMIC BACK-END ASSIGNMENT - A multithreaded clustered microarchitecture with dynamic back-end assignment is presented. A processing system may include a plurality of instruction caches and front-end units each to process an individual thread from a corresponding one of the instruction caches, a plurality of back-end units, and an interconnect network to couple the front-end and back-end units. A method may include measuring a performance metric of a back-end unit, comparing the measurement to a first value, and reassigning, or not, the back-end unit according to the comparison. Computer systems according to embodiments of the invention may include: a random access memory; a system bus; and a processor having a plurality of instruction caches, a plurality of front-end units each to process an individual thread from a corresponding one of the instruction caches; a plurality of back-end units; and an interconnect network coupled to the plurality of front-end units and the plurality of back-end units. | 11-03-2011 |
20120110266 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed. | 05-03-2012 |
20130173948 | Frequency And Voltage Scaling Architecture - A method and apparatus for scaling frequency and operating voltage of at least one clock domain of a microprocessor. More particularly, embodiments of the invention relate to techniques to divide a microprocessor into clock domains and control the frequency and operating voltage of each clock domain independently of the others. | 07-04-2013 |
20130268735 | SUPPORT FOR SPECULATIVE OWNERSHIP WITHOUT DATA - Techniques are described for providing an enhanced cache coherency protocol for a multi-core processor that includes a Speculative Request For Ownership Without Data (SRFOWD) for a portion of cache memory. With a SRFOWD, only an acknowledgement message may be provided as an answer to a requesting core. The contents of the affected cache line are not required to be a part of the answer. The enhanced cache coherency protocol may assure that a valid copy of the current cache line exists in case of misspeculation by the requesting core. Thus, an owner of the current copy of the cache line may maintain a copy of the old contents of the cache line. The old contents of the cache line may be discarded if speculation by the requesting core turns out to be correct. Otherwise, in case of misspeculation by the requesting core, the old contents of the cache line may be set back to a valid state. | 10-10-2013 |
20130283277 | THREAD MIGRATION TO IMPROVE POWER EFFICIENCY IN A PARALLEL PROCESSING ENVIRONMENT - A method and system to selectively move one or more of a plurality threads which are executing in parallel by a plurality of processing cores. In one embodiment, a thread may be moved from executing in one of the plurality of processing cores to executing in another of the plurality of processing cores, the moving based on a performance characteristic associated with the plurality of threads. In another embodiment of the invention, a power state of the plurality of processing cores may be changed to improve a power efficiency associated with the executing of the multiple threads. | 10-24-2013 |
20130326199 | METHOD AND APPARATUS FOR CONTROLLING A MXCSR - Disclosed is an apparatus and method generally related to controlling a multimedia extension control and status register (MXCSR). A processor core may include a floating point unit (FPU) to perform arithmetic functions; and a multimedia extension control register (MXCR) to provide control bits to the FPU. Further an optimizer may be used to select a speculative multimedia extension status register (SPEC_MXSR) from a plurality of SPEC_MXSRs to update a multimedia extension status register (MXSR) based upon an instruction. | 12-05-2013 |
20130332705 | PROFILING ASYNCHRONOUS EVENTS RESULTING FROM THE EXECUTION OF SOFTWARE AT CODE REGION GRANULARITY - A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region. | 12-12-2013 |
20140019721 | MANAGED INSTRUCTION CACHE PREFETCHING - Disclosed is an apparatus and method to manage instruction cache prefetching from an instruction cache. A processor may comprise: a prefetch engine; a branch prediction engine to predict the outcome of a branch; and dynamic optimizer. The dynamic optimizer may be used to control: indentifying common instruction cache misses and inserting a prefetch instruction from the prefetch engine to the instruction cache. | 01-16-2014 |
20140095849 | INSTRUCTION AND LOGIC FOR OPTIMIZATION LEVEL AWARE BRANCH PREDICTION - A computer-readable storage medium, method and system for optimization-level aware branch prediction is described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level. | 04-03-2014 |
20140108733 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed. | 04-17-2014 |
20140156976 | METHOD, APPARATUS AND SYSTEM FOR SELECTIVE EXECUTION OF A COMMIT INSTRUCTION - Techniques and mechanisms for a processor to determine whether a commit action is to be performed. In an embodiment, a processor performs operations to determine whether a commit instruction is for contingent performance of a commit action. In another embodiment, one or more conditions of processor state are evaluated in response to determining that the commit instruction is for contingent performance of the commit action, where the evaluation is performed to determine whether the commit action indicated by the commit instruction is to be performed. | 06-05-2014 |
20140164802 | Frequency And Voltage Scaling Architecture - A method and apparatus for scaling frequency and operating voltage of at least one clock domain of a microprocessor. More particularly, embodiments of the invention relate to techniques to divide a microprocessor into clock domains and control the frequency and operating voltage of each clock domain independently of the others. | 06-12-2014 |