Patent application number | Description | Published |
20080215822 | PCI Express Enhancements and Extensions - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 09-04-2008 |
20080301398 | LINEAR TO PHYSICAL ADDRESS TRANSLATION WITH SUPPORT FOR PAGE ATTRIBUTES - Embodiments of the invention are generally directed to systems, methods, and apparatuses for linear to physical address translation with support for page attributes. In some embodiments, a system receives an instruction to translate a memory pointer to a physical memory address for a memory location. The system may return the physical memory address and one or more page attributes. Other embodiments are described and claimed. | 12-04-2008 |
20100162023 | METHOD AND APPARATUS OF POWER MANAGEMENT OF PROCESSOR - A processing platform and a method of controlling power consumption of a central processing unit of the processing platform are presented. By operating the method the processing platform is able to set an upper performance state limit and a lower performance state limit. The upper performance state limit is based on a central processing unit activity rate value and the lower performance state limit is based on a minimum require of the operating system to perform operating system tasks. The performance state values are varying within a range of the lower and upper limits according to a power management policy. | 06-24-2010 |
20110072164 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 03-24-2011 |
20110161703 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 06-30-2011 |
20110173367 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 07-14-2011 |
20110208925 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 08-25-2011 |
20110238882 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 09-29-2011 |
20120036293 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 02-09-2012 |
20120079304 | PACKAGE LEVEL POWER STATE OPTIMIZATION - Methods and apparatus to optimize package level power state usage are described. In one embodiment, a processor control logic receives a request to enter a lower power consumption state (such as a package level deeper sleep state). The control logic determines the time difference or delta between a last entry into the lower power consumption state and the current time. The control logic then causes the flushing of a last level cache based on a comparison of the time difference and a threshold value corresponding to the lower power consumption state. Other embodiments are also claimed and disclosed. | 03-29-2012 |
20120089750 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 04-12-2012 |
20120166854 | Controlling Current Transients In A Processor - In one embodiment, a processor includes a core with a front end unit, at least one execution unit, and a back end unit. Multiple voltage drop detectors can be located within the core each to output a voltage drop signal when a detected voltage falls below a threshold voltage. In turn, a current transient logic coupled to receive the voltage drop signals can control a micro-architectural parameter of at least one of the front end unit, execution unit and back end unit responsive to receipt of a voltage drop signal. Other embodiments are described and claimed. | 06-28-2012 |
20120185709 | METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING THREAD CONSOLIDATION - An apparatus, method and system is described herein for thread consolidation. Current processor utilization is determined. And consolidation opportunities are identified from the processor utilization and other exaction parameters, such as estimating a new utilization after consolidation, determining if power savings would occur based on the new utilization, and performing migration/consolidation of threads to a subset of active processing elements. Once the consolidation is performed, the non-subset processing elements that are now idle are powered down to save energy and provide an energy efficient execution environment. | 07-19-2012 |
20120191995 | METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING OPTIMIZING C-STATE SELECTION UNDER VARIABLE WAKEUP RATES - A processor may include power management techniques to, dynamically, chose an optimal C-state for the processing core. The measurement of real workloads on the OSes exhibit two important observations (1) the bursts of high interrupt rate are interspersed between the low interrupt rate periods and long periods of high activity levels; and (2) the interrupt rate may, suddenly, fall below an interrupt rate (of 1 milli-second, for example) that is typical of the current operating systems (OS). Instead of determining the C-state based on the stale data stored in the counters, the power control logic may determine an optimal C-state by overriding the C-state determined by the OS or any other power monitoring logic. The power control logic may, dynamically, determine an optimal C-state based on the CPU idle residency times and variable rate wakeup events to match the expected wakeup event rate. | 07-26-2012 |
20120204042 | User Level Control Of Power Management Policies - In one embodiment, the present invention includes a processor having a core and a power controller to control power management features of the processor. The power controller can receive an energy performance bias (EPB) value from the core and access a power-performance tuning table based on the value. Using information from the table, at least one setting of a power management feature can be updated. Other embodiments are described and claimed. | 08-09-2012 |
20120233439 | Implementing TLB Synchronization for Systems with Shared Virtual Memory Between Processing Devices - Page faults arising in a graphics processing unit may be handled by an operating system running on the central processing unit. In some embodiments, this means that unpinned memory can be used for the graphics processing unit. Using unpinned memory in the graphics processing unit may expand the capabilities of the graphics processing unit in some cases. | 09-13-2012 |
20120236010 | Page Fault Handling Mechanism - Page faults arising in a graphics processing unit may be handled by an operating system running on the central processing unit. In some embodiments, this means that unpinned memory can be used for the graphics processing unit. Using unpinned memory in the graphics processing unit may expand the capabilities of the graphics processing unit in some cases. | 09-20-2012 |
20120254563 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 10-04-2012 |
20130007406 | DYNAMIC PINNING OF VIRTUAL PAGES SHARED BETWEEN DIFFERENT TYPE PROCESSORS OF A HETEROGENEOUS COMPUTING PLATFORM - A computer system may support one or more techniques to allow dynamic pinning of the memory pages accessed by a non-CPU device (e.g., a graphics processing unit, GPU). The non-CPU may support virtual to physical address mapping and may thus be aware of the memory pages, which may not be pinned but may be accessed by the non-CPU. The non-CPU may notify or send such information to a run-time component such as a device driver associated with the CPU. In one embodiment, the device driver may, dynamically, perform pinning of such memory pages, which may be accessed by the non-CPU. The device driver may even unpin the memory pages, which may be no longer accessed by the non-CPU. Such an approach may allow the memory pages, which may be no longer accessed by the non-CPU to be available for allocation to the other CPUs and/or non-CPUs. | 01-03-2013 |
20130036317 | Device, System And Method Of Generating An Execution Instruction Based On A Memory-Access Instruction - Embodiments of the present invention provide an apparatus, system, and method of generating an execution instruction. Some demonstrative embodiments my include generating an execution instruction of a predetermined executable format based on memory address data of a memory-access instruction representing a memory address. Other embodiments are described and claimed. | 02-07-2013 |
20130061064 | Dynamically Allocating A Power Budget Over Multiple Domains Of A Processor - In one embodiment, the present invention includes a method for determining a power budget for a multi-domain processor for a current time interval, determining a portion of the power budget to be allocated to first and second domains of the processor, and controlling a frequency of the domains based on the allocated portions. Such determinations and allocations can be dynamically performed during runtime of the processor. Other embodiments are described and claimed. | 03-07-2013 |
20130091317 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 04-11-2013 |
20130097353 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 04-18-2013 |
20130097437 | METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING OPTIMIZING C-STATE SELECTION UNDER VARIABLE WAKEUP RATES - A processor may include power management techniques to, dynamically, chose an optimal C-state for the processing core. The measurement of real workloads on the OSes exhibit two important observations (1) the bursts of high interrupt rate are interspersed between the low interrupt rate periods and long periods of high activity levels; and (2) the interrupt rate may, suddenly, fall below an interrupt rate (of 1 milli-second, for example) that is typical of the current operating systems (OS). Instead of determining the C-state based on the stale data stored in the counters, the power control logic may determine an optimal C-state by overriding the C-state determined by the OS or any other power monitoring logic. The power control logic may, dynamically, determine an optimal C-state based on the CPU idle residency times and variable rate wakeup events to match the expected wakeup event rate. | 04-18-2013 |
20130111086 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS | 05-02-2013 |
20130132622 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus forenhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 05-23-2013 |
20130132636 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 05-23-2013 |
20130132683 | PCI EXPRESS ENHANCEMENTS AND EXTENSIONS - A method and apparatus forenhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses. | 05-23-2013 |
20130151569 | COMPUTING PLATFORM INTERFACE WITH MEMORY MANAGEMENT - In some embodiments, a PPM interface may be provided with functionality to facilitate to an OS memory power state management for one or more memory nodes, regardless of a particular platform hardware configuration, as long as the platform hardware is in conformance with the PPM interface. | 06-13-2013 |
20130173946 | CONTROLLING POWER CONSUMPTION THROUGH MULTIPLE POWER LIMITS OVER MULTIPLE TIME INTERVALS - Methods and apparatus relating to controlling power consumption through multiple power limits over multiple time intervals are described. In one embodiment, the level of power consumption by a computing device component (e.g., a processor or one of its processor cores) is modified based on a determined power limit value. The power limit value may be determined based on rolling power consumption averages over multiple time intervals and their comparison against multiple corresponding power limits. Other embodiments are also disclosed and claimed. | 07-04-2013 |
20130179704 | Dynamically Allocating A Power Budget Over Multiple Domains Of A Processor - In one embodiment, the present invention includes a method for determining a power budget for a multi-domain processor for a current time interval, determining a portion of the power budget to be allocated to first and second domains of the processor, and controlling a frequency of the domains based on the allocated portions. Such determinations and allocations can be dynamically performed during runtime of the processor. Other embodiments are described and claimed. | 07-11-2013 |
20130179706 | User Level Control Of Power Management Policies - In one embodiment, the present invention includes a processor having a core and a power controller to control power management features of the processor. The power controller can receive an energy performance bias (EPB) value from the core and access a power-performance tuning table based on the value. Using information from the table, at least one setting of a power management feature can be updated. Other embodiments are described and claimed. | 07-11-2013 |
20130262816 | TRANSLATION LOOKASIDE BUFFER FOR MULTIPLE CONTEXT COMPUTE ENGINE - Some implementations disclosed herein provide techniques and arrangements for an specialized logic engine that includes translation lookaside buffer to support multiple threads executing on multiple cores. The translation lookaside buffer enables the specialized logic engine to directly access a virtual address of a thread executing on one of the plurality of processing cores. For example, an acceleration compute engine may receive one or more instructions from a thread executed by a processing core. The acceleration compute engine may retrieve, based on an address space identifier associated with the one or more instructions, a physical address associated with the one or more instructions from the translation lookaside buffer to execute the one or more instructions using the physical address. | 10-03-2013 |
20130268742 | CORE SWITCHING ACCELERATION IN ASYMMETRIC MULTIPROCESSOR SYSTEM - An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code executing on the ASMP is analyzed by a binary analysis unit to determine what functions are called by the program code and select which of the cores are to execute the program code, or a code segment thereof. Selection may be made to provide for native execution of the program code, to minimize power consumption, and so forth. Control operations based on this selection may then be inserted into the program code, forming instrumented program code. The instrumented program code is then executed by the ASMP. | 10-10-2013 |
20130268804 | SYNCHRONOUS SOFTWARE INTERFACE FOR AN ACCELERATED COMPUTE ENGINE - Some implementations disclosed herein provide techniques and arrangements for a synchronous software interface for a specialized logic engine. The synchronous software interface may receive, from a first core of a plurality of cores, a control block including a transaction for execution by the specialized logic engine. The synchronous software interface may send the control block to the specialized logic engine and wait to receive a confirmation from the specialized logic engine that the transaction was successfully executed. | 10-10-2013 |
20130275737 | COLLABORATIVE PROCESSOR AND SYSTEM PERFORMANCE AND POWER MANAGEMENT - The present invention relates to a platform power management scheme. In some embodiments, a platform provides a relative performance scale using one or more parameters to be requested by an OSPM system. | 10-17-2013 |
20130275796 | COLLABORATIVE PROCESSOR AND SYSTEM PERFORMANCE AND POWER MANAGEMENT - The present invention relates to a platform power management scheme. In some embodiments, a platform provides a relative performance scale using one or more parameters to be requested by an OSPM system. | 10-17-2013 |
20130318323 | APPARATUS AND METHOD FOR ACCELERATING OPERATIONS IN A PROCESSOR WHICH USES SHARED VIRTUAL MEMORY - An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory. | 11-28-2013 |
20140006758 | Extension of CPU Context-State Management for Micro-Architecture State | 01-02-2014 |
20140013333 | CONTEXT-STATE MANAGEMENT - Extended features such as registers and functions within processors are made available to operating systems (OS) using an extended-state driver and by modifying instruction set extensions, such as XSAVE. A map-table designates a correspondence between memory locations for storing data relating to extended features not supported by the OS and called by an application. As a result, applications may utilize processor resources which are unsupported by the OS. | 01-09-2014 |
20140019723 | BINARY TRANSLATION IN ASYMMETRIC MULTIPROCESSOR SYSTEM - An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code for execution on the ASMP is analyzed and a determination is made as to whether to allow the program code, or a code segment thereof to execute on a first core natively or to use binary translation on the code and execute the translated code on a second core which consumes less power than the first core during execution. | 01-16-2014 |
20140040643 | METHOD AND APPARATUS OF POWER MANAGMENT OF PROCESSOR - A processing platform and a method of controlling power consumption of a central processing unit of the processing platform are presented. By operating the method the processing platform is able to set an upper performance state limit and a lower performance state limit. The upper performance state limit is based on a central processing unit activity rate value and the lower performance state limit is based on a minimum require of the operating system to perform operating system tasks. The performance state values are varying within a range of the lower and upper limits according to a power management policy. | 02-06-2014 |
20140068302 | MECHANISM FOR FACILITATING FASTER SUSPEND/RESUME OPERATIONS IN COMPUTING SYSTEMS - A mechanism is described for facilitating faster suspend/resume operations in computing systems according to one embodiment of the invention. A method of embodiments of the invention includes initiating an entrance process into a first sleep state in response to a sleep call at a computing system, transforming from the first sleep state to a second sleep state. The transforming may include preserving at least a portion of processor context at a local memory associated with one or more processor cores of a processor at the computing system. The method may further include entering the second sleep state. | 03-06-2014 |
20140082630 | PROVIDING AN ASYMMETRIC MULTICORE PROCESSOR SYSTEM TRANSPARENTLY TO AN OPERATING SYSTEM - In one embodiment, the present invention includes a multicore processor with first and second groups of cores. The second group can be of a different instruction set architecture (ISA) than the first group or of the same ISA set but having different power and performance support level, and is transparent to an operating system (OS). The processor further includes a migration unit that handles migration requests for a number of different scenarios and causes a context switch to dynamically migrate a process from the second core to a first core of the first group. This dynamic hardware-based context switch can be transparent to the OS. Other embodiments are described and claimed. | 03-20-2014 |
20140115351 | DYNAMICALLY ALLOCATING A POWER BUDGET OVER MULTIPLE DOMAINS OF A PROCESSOR - In one embodiment, the present invention includes a method for determining a power budget for a multi-domain processor for a current time interval, determining a portion of the power budget to be allocated to first and second domains of the processor, and controlling a frequency of the domains based on the allocated portions. Such determinations and allocations can be dynamically performed during runtime of the processor. Other embodiments are described and claimed. | 04-24-2014 |
20140129808 | MIGRATING TASKS BETWEEN ASYMMETRIC COMPUTING ELEMENTS OF A MULTI-CORE PROCESSOR - In one embodiment, the present invention includes a multicore processor having first and second cores to independently execute instructions, the first core visible to an operating system (OS) and the second core transparent to the OS and heterogeneous from the first core. A task controller, which may be included in or coupled to the multicore processor, can cause dynamic migration of a first process scheduled by the OS to the first core to the second core transparently to the OS. Other embodiments are described and claimed. | 05-08-2014 |
20140181830 | THREAD MIGRATION SUPPORT FOR ARCHITECTUALLY DIFFERENT CORES - According to one embodiment, a processor includes a plurality of processor cores for executing a plurality of threads, a shared storage communicatively coupled to the plurality of processor cores, a power control unit (PCU) communicatively coupled to the plurality of processors to determine, without any software (SW) intervention, if a thread being performed by a first processor core should be migrated to a second processor core, and a migration unit, in response to receiving an instruction from the PCU to migrate the thread, to store at least a portion of architectural state of the first processor core in the shared storage and to migrate the thread to the second processor core, without any SW intervention, such that the second processor core can continue executing the thread based on the architectural state from the shared storage without knowledge of the SW. | 06-26-2014 |
20140189297 | HETERGENEOUS PROCESSOR APPARATUS AND METHOD - A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a set of two or more small physical processor cores; at least one large physical processor core having relatively higher performance processing capabilities and relatively higher power usage relative to the small physical processor cores; virtual-to-physical (V-P) mapping logic to expose the set of two or more small physical processor cores to software through a corresponding set of virtual cores and to hide the at least one large physical processor core from the software. | 07-03-2014 |
20140189299 | HETERGENEOUS PROCESSOR APPARATUS AND METHOD - A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a set of large physical processor cores; a set of small physical processor cores having relatively lower performance processing capabilities and relatively lower power usage relative to the large physical processor cores; virtual-to-physical (V-P) mapping logic to expose the set of large physical processor cores to software through a corresponding set of virtual cores and to hide the set of small physical processor core from the software. | 07-03-2014 |
20140189301 | HIGH DYNAMIC RANGE SOFTWARE-TRANSPARENT HETEROGENEOUS COMPUTING ELEMENT PROCESSORS, METHODS, AND SYSTEMS - A processor of an aspect includes at least one lower processing capability and lower power consumption physical compute element and at least one higher processing capability and higher power consumption physical compute element. Migration performance benefit evaluation logic is to evaluate a performance benefit of a migration of a workload from the at least one lower processing capability compute element to the at least one higher processing capability compute element, and to determine whether or not to allow the migration based on the evaluated performance benefit. Available energy and thermal budget evaluation logic is to evaluate available energy and thermal budgets and to determine to allow the migration if the migration fits within the available energy and thermal budgets. Workload migration logic is to perform the migration when allowed by both the migration performance benefit evaluation logic and the available energy and thermal budget evaluation logic. | 07-03-2014 |
20140189302 | OPTIMAL LOGICAL PROCESSOR COUNT AND TYPE SELECTION FOR A GIVEN WORKLOAD BASED ON PLATFORM THERMALS AND POWER BUDGETING CONSTRAINTS - A processor includes multiple physical cores that support multiple logical cores of different core types, where the core types include a big core type and a small core type. A multi-threaded application includes multiple software threads are concurrently executed by a first subset of logical cores in a first time slot. Based on data gathered from monitoring the execution in the first time slot, the processor selects a second subset of logical cores for concurrent execution of the software threads in a second time slot. Each logical core in the second subset has one of the core types that matches the characteristics of one of the software threads. | 07-03-2014 |
20140189332 | APPARATUS AND METHOD FOR LOW-LATENCY INVOCATION OF ACCELERATORS - An apparatus and method are described for providing low-latency invocation of accelerators. For example, a processor according to one embodiment comprises: a command register for storing command data identifying a command to be executed; a result register to store a result of the command or data indicating a reason why the command could not be executed; execution logic to execute a plurality of instructions including an accelerator invocation instruction to invoke one or more accelerator commands; and one or more accelerators to read the command data from the command register and responsively attempt to execute the command identified by the command data. | 07-03-2014 |
20140189333 | APPARATUS AND METHOD FOR TASK-SWITCHABLE SYNCHRONOUS HARDWARE ACCELERATORS - A processor comprising: execution logic to execute a first thread including an accelerator invocation instruction to invoke an accelerator command; an accelerator to execute an accelerator thread in response to the accelerator command, the accelerator to store state data associated with the accelerator thread in a application memory area in memory, wherein prior to executing the accelerator thread, the accelerator is to lock entries in a translation lookaside buffer (TLB) associated with the accelerator thread to prevent an exception which might otherwise result. | 07-03-2014 |
20140189694 | MANAGING PERFORMANCE POLICIES BASED ON WORKLOAD SCALABILITY - Methods and systems may provide for identifying a workload associated with a platform and determining a scalability of the workload. Additionally, a performance policy of the platform may be managed based at least in part on the scalability of the workload. In one example, determining the scalability includes determining a ratio of productive cycles to actual cycles. | 07-03-2014 |
20140189704 | HETERGENEOUS PROCESSOR APPARATUS AND METHOD - A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a first set of one or more physical processor cores having first processing characteristics; a second set of one or more physical processor cores having second processing characteristics different from the first processing characteristics; virtual-to-physical (V-P) mapping logic to expose a plurality of virtual processors to software, the plurality of virtual processors to appear to the software as a plurality of homogeneous processor cores, the software to allocate threads to the virtual processors as if the virtual processors were homogeneous processor cores; wherein the V-P mapping logic is to map each virtual processor to a physical processor within the first set of physical processor cores or the second set of physical processor cores such that a thread allocated to a first virtual processor by software is executed by a physical processor mapped to the first virtual processor from the first set or the second set of physical processors. | 07-03-2014 |
20140189713 | APPARATUS AND METHOD FOR INVOCATION OF A MULTI THREADED ACCELERATOR - A processor is described having logic circuitry of a general purpose CPU core to save multiple copies of context of a thread of the general purpose CPU core to prepare multiple micro-threads of a multi-threaded accelerator for execution to accelerate operations for the thread through parallel execution of the micro-threads. | 07-03-2014 |
20140245034 | MULTI-LEVEL CPU HIGH CURRENT PROTECTION - Methods and apparatus relating to multi-level CPU (Central Processing Unit) high current protection are described. In one embodiment, different workloads may be assigned different license types and/or weights based on micro-architectural events (such as uop (micro-operation) types and sizes) and/or data types. Other embodiments are also disclosed and claimed. | 08-28-2014 |
20140304488 | LINEAR TO PHYSICAL ADDRESS TRANSLATION WITH SUPPORT FOR PAGE ATTRIBUTES - Embodiments of the invention are generally directed to systems, methods, and apparatuses for linear to physical address translation with support for page attributes. In some embodiments, a system receives an instruction to translate a memory pointer to a physical memory address for a memory location. The system may return the physical memory address and one or more page attributes. Other embodiments are described and claimed. | 10-09-2014 |
20140317430 | METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING OPTIMIZING C-STATE SELECTION UNDER VARIABLE WAKEUP RATES - A processor may include power management techniques to, dynamically, chose an optimal C-state for the processing core. The measurement of real workloads on the OSes exhibit two important observations (1) the bursts of high interrupt rate are interspersed between the low interrupt rate periods and long periods of high activity levels; and (2) the interrupt rate may, suddenly, fall below an interrupt rate (of 1 milli-second, for example) that is typical of the current operating systems (OS). Instead of determining the C-state based on the stale data stored in the counters, the power control logic may determine an optimal C-state by overriding the C-state determined by the OS or any other power monitoring logic. The power control logic may, dynamically, determine an optimal C-state based on the CPU idle residency times and variable rate wakeup events to match the expected wakeup event rate. | 10-23-2014 |
20140344598 | Enabling A Non-Core Domain To Control Memory Bandwidth - In one embodiment, the present invention includes a processor having multiple domains including at least a core domain and a non-core domain that is transparent to an operating system (OS). The non-core domain can be controlled by a driver. In turn, the processor further includes a memory interconnect to interconnect the core domain and the non-core domain to a memory coupled to the processor. Still further, a power controller, which may be within the processor, can control a frequency of the memory interconnect based on memory boundedness of a workload being executed on the non-core domain. Other embodiments are described and claimed. | 11-20-2014 |
20140344815 | CONTEXT SWITCHING MECHANISM FOR A PROCESSING CORE HAVING A GENERAL PURPOSE CPU CORE AND A TIGHTLY COUPLED ACCELERATOR - An apparatus is described having multiple cores, each core having: a) an accelerator; and, b) a general purpose CPU coupled to the accelerator. The general purpose CPU has functional unit logic circuitry to execute an instruction that returns an amount of storage space to store context information of the accelerator. | 11-20-2014 |
20140351553 | LINEAR TO PHYSICAL ADDRESS TRANSLATION WITH SUPPORT FOR PAGE ATTRIBUTES - Embodiments of the invention are generally directed to systems, methods, and apparatuses for linear to physical address translation with support for page attributes. In some embodiments, a system receives an instruction to translate a memory pointer to a physical memory address for a memory location. The system may return the physical memory address and one or more page attributes. Other embodiments are described and claimed. | 11-27-2014 |
20140351554 | LINEAR TO PHYSICAL ADDRESS TRANSLATION WITH SUPPORT FOR PAGE ATTRIBUTES - Embodiments of the invention are generally directed to systems, methods, and apparatuses for linear to physical address translation with support for page attributes. In some embodiments, a system receives an instruction to translate a memory pointer to a physical memory address for a memory location. The system may return the physical memory address and one or more page attributes. Other embodiments are described and claimed. | 11-27-2014 |
20140359629 | MECHANISM FOR ISSUING REQUESTS TO AN ACCELERATOR FROM MULTIPLE THREADS - An apparatus is described having multiple cores, each core having: a) a CPU; b) an accelerator; and, c) a controller and a plurality of order buffers coupled between the CPU and the accelerator. Each of the order buffers is dedicated to a different one of the CPU's threads. Each one of the order buffers is to hold one or more requests issued to the accelerator from its corresponding thread. The controller is to control issuance of the order buffers' respective requests to the accelerator. | 12-04-2014 |
20140380076 | Mapping A Performance Request To An Operating Frequency In A Processor - In an embodiment, a processor includes multiple cores each to independently execute instructions and a power control unit (PCU) coupled to the plurality of cores to control power consumption of the processor. The PCU may include a mapping logic to receive a performance scale value from an operating system (OS) and to calculate a dynamic performance-frequency mapping based at least in part on the performance scale value. Other embodiments are described and claimed. | 12-25-2014 |