Entries |
Document | Title | Date |
20080209126 | METHOD FOR ACHIEVING VERY HIGH BANDWIDTH BETWEEN THE LEVELS OF A CACHE HIERARCHY IN 3-DIMENSIONAL STRUCTURES, AND A 3-DIMENSIONAL STRUCTURE RESULTING THEREFROM - A method of electronic computing, and more specifically, a method of design of cache hierarchies in 3-dimensional chips, and a cache hierarchy resulting therefrom, including a physical arrangement of bits in cache hierarchies implemented in 3 dimensions such that the planar wiring required in the busses connecting the levels of the hierarchy is minimized. In this way, the data paths between the levels are primarily the vias themselves, which leads to very short, hence fast and low power busses. | 08-28-2008 |
20080215815 | SYSTEM AND METHOD OF IMPROVING TASK SWITCHING AND PAGE TRANSLATION PERFORMANCE UTILIZING A MULTILEVEL TRANSLATION LOOKASIDE BUFFER - A system and method of improved task switching in a data processing system. First, a first-level cache memory casts out an invalidated page table entry and an associated first page directory base address to a second-level cache memory. Then, the second-level cache memory determines if a task switch has occurred. If a task switch has not occurred, first-level cache memory sends the invalidated page table entry to a current running task directory. If a task switch has occurred, first-level cache memory loads from the second-level cache directory a collection of page table entries related to a new task to enable improved task switching without requiring access to a page table stored in main memory to retrieve the collection of page table entries. | 09-04-2008 |
20080215816 | APPARATUS AND METHOD FOR FILTERING UNUSED SUB-BLOCKS IN CACHE MEMORIES - A memory system and method includes a cache having a filtered portion and an unfiltered portion. The unfiltered portion is divided into block sized components, and the filtered portion is divided into sub-block sized components. Blocks evicted from the unfiltered portion have selected sub-blocks thereof cached in the filtered portion for servicing requests. | 09-04-2008 |
20080222358 | METHOD AND SYSTEM FOR PROVIDING AN IMPROVED STORE-IN CACHE - A system and method of providing a cache system having a store-in policy and affording the advantages of store-in cache operation, while simultaneously providing protection against soft-errors in locally modified data, which would normally preclude the use of a store-in cache when reliability is paramount. The improved store-in cache mechanism includes a store-in L1 cache, at least one higher-level storage hierarchy; an ancillary store-only cache (ASOC) that holds most recently stored-to lines of the store-in L1 cache, and a cache controller that controls storing of data to the ancillary store-only cache (ASOC) and recovering of data from the ancillary store-only cache (ASOC) such that the data from the ancillary store-only cache (ASOC) is used only if parity errors are encountered in the store-in L1 cache. | 09-11-2008 |
20080222359 | STORAGE SYSTEM AND DATA MANAGEMENT METHOD - The present invention comprises a CHA | 09-11-2008 |
20080229020 | Systems and Methods of Providing A Multi-Tier Cache - The present solution provides a variety of techniques for accelerating and optimizing network traffic, such as HTTP based network traffic. The solution described herein provides techniques in the areas of proxy caching, protocol acceleration, domain name resolution acceleration as well as compression improvements. In some cases, the present solution provides various prefetching and/or prefreshening techniques to improve intermediary or proxy caching, such as HTTP proxy caching. In other cases, the present solution provides techniques for accelerating a protocol by improving the efficiency of obtaining and servicing data from an originating server to clients. In still other cases, the present solution accelerates domain name resolution. As every HTTP access starts with a URL that includes a hostname that must be resolved via domain name resolution into an IP address, the present solution helps accelerate HTTP access. In some cases, the present solution improves compression techniques by prefetching non-cacheable and cacheable content to use for compressing network traffic, such as HTTP. The acceleration and optimization techniques described herein may be deployed on the client as a client agent or as part of a browser, as well as on any type and form of intermediary device, such as an appliance, proxying device or any type of interception caching and/or proxying device. | 09-18-2008 |
20080235453 | SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR EXECUTING A CACHE REPLACEMENT ALGORITHM - A system, method and computer program product for executing a cache replacement algorithm. A system includes a computer processor having an instruction processor, a cache and one or more useful indicators. The instruction processor processes instructions in a running program. The cache includes two or more cache levels including a level one (L1) cache level and one or more higher cache levels. Each cache level includes one or more cache lines and has an associated directory having one or more directory entries. A useful indicator is located within one or more of the directory entries and is associated with a particular cache line. The useful indicator is set to provide an indication that the associated cache line contains one or more instructions that are required by the running program and cleared to provide lack of such an indication. | 09-25-2008 |
20080256297 | Multi-Port High-Level Cache Unit and a Method For Retrieving Information From a Multi-Port High-Level Cache Unit - A device that includes multiple processors that are connected to multiple level-one cache units. The device also includes a multi-port high-level cache unit that includes a first modular interconnect, a second modular interconnect, multiple high-level cache paths; whereas the multiple high-level cache paths comprise multiple concurrently accessible interleaved high-level cache units. Conveniently, the device also includes at least one non-cacheable path. A method for retrieving information from a cache that includes: concurrently receiving, by a first modular interconnect of a multiple-port high-level cache unit, requests to retrieve information. The method is characterized by providing information from at least two paths out of multiple high-level cache paths if at least two high-level cache hits occur, and providing information via a second modular interconnect if a high-level cache miss occurs. | 10-16-2008 |
20080263281 | CACHE MEMORY SYSTEM USING TEMPORAL LOCALITY INFORMATION AND A DATA STORAGE METHOD - A cache memory system using temporal locality information and a data storage method are provided. The cache memory system including: a main cache which stores data accessed by a central processing unit; an extended cache which stores the data if the data is evicted from the main cache; and a separation cache which stores the data of the extended cache when the data of the extended cache is evicted from the extended cache and temporal locality information corresponding to the data of the extended cache satisfies a predetermined condition. | 10-23-2008 |
20080307163 | METHOD FOR ACCESSING MEMORY - A method for accessing memory is provided. The memory includes many multi-level cells each having at least a storage capable of storing 2 | 12-11-2008 |
20080313404 | MULTIPROCESSOR SYSTEM AND OPERATING METHOD OF MULTIPROCESSOR SYSTEM - According to one aspect of embodiments, a multiprocessor system includes a cache memory corresponding to each of the processors, a hierarchy setting register in which the hierarchical level of each cache memory is set, and an access control unit that controls access between the cache memories. The hierarchical level of the cache memory for each processor is stored in a rewritable hierarchy setting register. Each processor handles a cache memory corresponding to another processor as the cache memory having a deeper hierarchy than the cache memory corresponding to that processor. As a result, each processor can access all the cache memories, and therefore the efficiency of cache memory utilization can be improved and the hierarchical level can be set so that the latency becomes optimal for each application. | 12-18-2008 |
20080313405 | Coherency maintaining device and coherency maintaining method - A second-level cache device stores part of registration information of data for a first-level cache device in a second-level cache-tag unit in association with registration information in a second-level-cache data unit, and stores the registration information of data for the first-level cache device in a first-level cache-tag copying unit. A coherency maintaining processor maintains coherency between the first-level cache device and the second-level cache device based on the information stored in the second-level cache-tag unit and the first-level cache-tag copying unit. | 12-18-2008 |
20080320224 | Multiprocessor system, processor, and cache control method - A multiprocessor system includes processors each having a primary cache and a secondary cache shared by the processors. The processors each include a read unit that reads data from the primary cache, a request unit that makes a write request when the data to be read is not stored in the primary cache, a measuring unit that measures an elapsed time since the write request is made, a receiving unit that receives a read command from an external device, a comparing unit that compares specific information for specifying data, for which the read command has been received, with specific information for specifying data, for which the write request has been made, and a controller that suspends reading of the data according to the read command, when pieces of specific information are the same, and the elapsed time measured is less than a predetermined time. | 12-25-2008 |
20090006752 | High Capacity Memory Subsystem Architecture Employing Hierarchical Tree Configuration of Memory Modules - A high-capacity memory subsystem architecture utilizes multiple memory modules arranged in a hierarchical tree configuration, in which at least some communications from an external source traverse successive levels of the tree to reach memory modules at the lowest level. Preferably, the memory system employs buffered memory chips having dual-mode operation, one of which supports a tree configuration in which data is interleaved and the communications buses operate at reduced bus width and/or reduced bus frequency to match the level of interleaving | 01-01-2009 |
20090006753 | DESIGN STRUCTURE FOR ACCESSING A CACHE WITH AN EFFECTIVE ADDRESS - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for accessing a processor cache is provided. The design structure comprises a processor having a processor core, a level one cache, and circuitry. The circuitry is configured to execute an access instruction in the processor's core, wherein the access instruction provides an untranslated effective address of data to be accessed by the access instruction, determine whether the processor core's level one cache includes the data corresponding to the effective address of the access instruction, wherein the effective address of the access instruction is used without address translation to determine whether the processor core's level one cache includes the data corresponding to the effective address, and provide the data for the access instruction from the level one cache if the level one cache includes the data corresponding to the effective address. | 01-01-2009 |
20090006754 | DESIGN STRUCTURE FOR L2 CACHE/NEST ADDRESS TRANSLATION - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for accessing a processor's cache memory is provided. The design structure comprises a processor having one or more level one caches, a lookaside buffer configured to include a corresponding entry for each cache line placed in each of the processor's one or more level one caches. The corresponding entry indicates a translation from the effective addresses to the real addresses for the cache line. The processor also comprises circuitry configured to access requested data in the processor's one or more level one caches using requested effective addresses of the requested data, translate the requested effective addresses to real addresses if the processor's one or more level one caches do not contain requested data corresponding to the requested effective addresses, and use the translated real addresses to access the level two cache. | 01-01-2009 |
20090024796 | High Performance Multilevel Cache Hierarchy - A digital system is provided with a hierarchical memory system having at least a first and second level cache and a higher level memory. If a requested data item misses in both the first cache level and in the second cache level, a line of data containing the requested data is obtained from a higher level of the hierarchical memory system. The line of data is allocated to both the first cache level and to the second cache level simultaneously. | 01-22-2009 |
20090043966 | Adaptive Mechanisms and Methods for Supplying Volatile Data Copies in Multiprocessor Systems - In a computer system with a memory hierarchy, when a high-level cache supplies a data copy to a low-level cache, the shared copy can be either volatile or non-volatile. When the data copy is later replaced from the low-level cache, if the data copy is non-volatile, it needs to be written back to the high-level cache; otherwise it can be simply flushed from the low-level cache. The high-level cache can employ a volatile-prediction mechanism that adaptively determines whether a volatile copy or a non-volatile copy should be supplied when the high-level cache needs to send data to the low-level cache. An exemplary volatile-prediction mechanism suggests use of a non-volatile copy if the cache line has been accessed consecutively by the low-level cache. Further, the low-level cache can employ a volatile-promotion mechanism that adaptively changes a data copy from volatile to non-volatile according to some promotion policy, or changes a data copy from non-volatile to volatile according to some demotion policy. | 02-12-2009 |
20090055591 | Hierarchical cache memory system - A hierarchical cache memory system having first and second cache memories includes: a controller which outputs dirty data stored in the first cache memory to write back to a main memory; and a controller which processes the write-back to the main memory of the dirty data outputted from the first cache memory in parallel with the write-back to the main memory of dirty data stored in the second cache memory. | 02-26-2009 |
20090063772 | Methods and apparatus for controlling hierarchical cache memory - Methods and apparatus for controlling hierarchical cache memories permit controlling a first level cache memory including a plurality of cache lines and controlling a next lower level cache memory including a plurality of cache lines. An additional memory may be associated with the next lower level cache memory and include a plurality of memory lines, the number of memory lines corresponding to the number of cache lines in a way set of the first level cache memory. Alternatively, the memory lines may include L-flags for multiple cache lines of each way set of the next lower level cache memory. L-flags associated with a given index plus any index offset from the first level cache memory may be contained in a single memory line of the additional memory. | 03-05-2009 |
20090157966 | CACHE INJECTION USING SPECULATION - A method, system, and computer program product for cache injection using speculation are provided. The method includes creating a cache line indirection table at an input/output (I/O) hub, which includes fields and entries for addresses, processor ID, and cache type, and includes cache level line limit (CLL) fields. The method also includes setting cache line limits to the CLL fields and receiving a stream of contiguous addresses at the table. For each address in the stream, the method includes: looking up the address in the table; if the address is present in the table, inject the cache line corresponding to the address in the processor complex; if the address is not present in the table, search limit values from the lowest level cache to the highest level cache; and inject addresses not present in the table to the cache hierarchy of the processor last injected from the contiguous address stream. | 06-18-2009 |
20090198897 | CACHE MANAGEMENT DURING ASYNCHRONOUS MEMORY MOVE OPERATIONS - A data processing system includes a mechanism for completing an asynchronous memory move (AMM) operation in which the processor receives an AMM ST instruction and processes a processor-level move of data in virtual address space and an asynchronous memory mover then completes a physical move of the data within the real address space (memory). A status/control field of the AMM ST instruction includes an indication of a requested treatment of the lower level cache(s) on completion of the AMM operation. When the status/control field indicates an update to at least one cache should be performed, the asynchronous memory mover automatically forwards a copy of the data from the data move to the lower level cache, and triggers an update of a coherency state for a cache line in which the copy of the data is placed. | 08-06-2009 |
20090210625 | Self Prefetching L3/L4 Cache Mechanism - Embodiments of the invention provide a look-aside-look-aside buffer (LLB) configured to retain a portion of the real addresses in a translation look-aside buffer (TLB) to allow prefetching of data from a cache. A subset of real address bits associated with an effective address may be retrieved relatively quickly from the LLB, thereby allowing access to the cache before the complete address translation is available and reducing cache access latency. | 08-20-2009 |
20090216949 | METHOD AND SYSTEM FOR A MULTI-LEVEL VIRTUAL/REAL CACHE SYSTEM WITH SYNONYM RESOLUTION - Method and system for a multi-level virtual/real cache system with synonym resolution. An exemplary embodiment includes a multi-level cache hierarchy, including a set of L1 caches associated with one or more processor cores and a set of L2 caches, wherein the set of L1 caches are a subset of the set of L2 caches, wherein the set of L1 caches underneath a given L2 cache are associated with one or more of the processor cores. | 08-27-2009 |
20090216950 | Push for Sharing Instruction - In one embodiment, a system comprises a first processor, a main memory system, and a cache hierarchy coupled between the first processor and the main memory system. The cache hierarchy comprises at least a first cache. The first processor is configured to execute a first instruction, including forming an address responsive to one or more operands of the first instruction. The system is configured to push a first cache block that is hit by the first address in the first cache to a target location within the cache hierarchy or the main memory system, wherein the target location is unspecified in a definition of the first instruction within an instruction set architecture implemented by the first processor, and wherein the target location is implementation-dependent. | 08-27-2009 |
20090235027 | CACHE MEMORY SYSTEM, DATA PROCESSING APPARATUS, AND STORAGE APPARATUS - A cache memory system includes a plurality of first storage hierarchical units provided individually to a plurality of processors. A second storage hierarchical unit is provided commonly to the plurality of processors. A control unit controls data transfer between the plurality of first storage hierarchical units and the second storage hierarchical unit. Each of the plurality of processors is capable of executing a no-data transfer store command as a store command that does not require data transfer from the second storage hierarchical unit to the corresponding first storage hierarchical unit, and each of the plurality of first storage hierarchical units outputs a transfer-control signal in response to occurrence of a cache miss hit when executing the no-data transfer store command by the corresponding processor. | 09-17-2009 |
20090248983 | TECHNIQUE TO SHARE INFORMATION AMONG DIFFERENT CACHE COHERENCY DOMAINS - A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device. | 10-01-2009 |
20090248984 | METHOD AND DEVICE FOR PERFORMING COPY-ON-WRITE IN A PROCESSOR - There are disclosed a method and device for performing Copy-on-Write in a processor. The processor comprises: processor cores, L | 10-01-2009 |
20090259813 | MULTI-PROCESSOR SYSTEM AND METHOD OF CONTROLLING THE MULTI-PROCESSOR SYSTEM - A multi-processor system has a plurality of processor cores, a plurality of level-one caches, and a level-two cache. The level-two cache has a level-two cache memory which stores data, a level-two cache tag memory which stores a line bit indicative of whether an instruction code included in data stored in the level-two cache memory is stored in the plurality of level-one cache memories or not line by line, and a level-two cache controller which refers to the line bit stored in the level-two cache tag memory and releases a line in which data including the same instruction code as that stored in the level-one cache memory is stored, in lines in the level-two cache memory. | 10-15-2009 |
20100005241 | DETECTION OF STREAMING DATA IN CACHE - An apparatus to detect streaming data in memory is presented. In one embodiment the apparatus uses reuse bits and S-bit status for cache lines, wherein an S-bit status indicates the data in the cache line are potentially streaming data. To enhance the efficiency of a cache, different measures can be applied to make the streaming data become the next victim during a replacement. | 01-07-2010 |
20100005242 | Efficient Processing of Data Requests With The Aid Of A Region Cache - A method and system for configuring a cache memory system in order to efficiently process processor requests. A group of cache elements, which include a Region Cache, a Region Coherence Array, and a lowest level cache, is configured based on a tradeoff of latency and power consumption requirements. A selected cache configuration differs from other feasible configurations in the order in which cache elements are accessed relative to each other. The Region Cache is employed in a number of configurations to reduce the power consumption, latency, and bandwidth requirements of the Region Coherence Array. The Region Cache is accessed by processor requests before (or in parallel with) the larger Region Coherence Array, providing the region coherence state and power efficiently to requests that hit in the Region Cache. | 01-07-2010 |
20100023695 | Victim Cache Replacement - A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core, the lower level victim cache determines whether the memory access request hits or misses in the directory of the lower level victim cache, and the upper level cache determines whether a castout from the upper level cache is to be performed and selects a victim coherency granule for eviction from the upper level cache. In response to determining that a castout from the upper level cache is to be performed, the upper level cache evicts the selected victim coherency granule. In the eviction, the upper level cache reads out the victim coherency granule from the data array of the upper level cache only in response to an indication that the memory access request misses in the directory of the lower level victim cache. | 01-28-2010 |
20100030965 | DISOWNING CACHE ENTRIES ON AGING OUT OF THE ENTRY - Caching where portions of data are stored in slower main memory and are transferred to faster memory between one or more processors and the main memory. The cache is such that an individual cache system must communicate with other associated cache systems, or check with such cache systems, to determine if they contain a copy of a given cached location prior to or upon modification or appropriation of data at a given cached location. The cache further includes provisions for determining when the data stored in a particular memory location may be replaced. | 02-04-2010 |
20100030966 | Cache memory and cache memory control apparatus - Disclosed herein is a cache memory including: a tag storage section including entries each including a tag address and a pending indication portion, at least one of the entries being to be referred to by a first address portion of an access address; a data storage section; a tag control section configured to compare a second address portion of the access address with the tag address included in each of the entries referred to to detect an entry whose tag address matches the second address portion, and, when the pending indication portion included in the detected entry indicates pending, cause an access related to the access address to be suspended; and a data control section configured to select data corresponding to the detected entry from among the data storage section, when the pending indication portion included in the detected entry does not indicate pending. | 02-04-2010 |
20100070709 | CACHE FILTERING METHOD AND APPARATUS - A method and apparatus used within memory and data processing that reduces the number of references allowed in processor cache by using active rows to reject references that are less frequently used from the cache. Comparators within a memory controller are used to generate a signal indicative of a row hit or miss, which signal is then applied to one or more demultiplexers to enable or disable transfer of a memory reference to processor cache locations. The cache may be level one (L1) or level two (L2) caches including data and/or instructions or some combination of L1, L2, data, and instructions. | 03-18-2010 |
20100070710 | Techniques for Cache Injection in a Processor System - A technique for performing cache injection includes monitoring addresses on a bus. Ownership of input/output data on the bus is acquired by a cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache. | 03-18-2010 |
20100070711 | Techniques for Cache Injection in a Processor System Using a Cache Injection Instruction - A technique for performing cache injection includes monitoring addresses on a bus in response to a cache injection instruction. Ownership of input/output data on the bus is acquired by a cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block associated with the cache injection instruction. | 03-18-2010 |
20100070712 | Techniques for Cache Injection in a Processor System with Replacement Policy Position Modification - A technique for performing cache injection includes monitoring, at a cache, addresses on a bus. Ownership of input/output data on the bus is then acquired by the cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache. A replacement policy position of the data block is then modified (to increase a probability that the data block is consumed prior to ejection from the cache). | 03-18-2010 |
20100082905 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable of operating at or below Vccmin levels. Other embodiments are also described and claimed. | 04-01-2010 |
20100095065 | FIELD DEVICE COMMUNICATIONS - The present invention is a system and method for caching data in mobile devices to improve field assessment capabilities in a relief management system. The system analyzes captured geolocation information associated with one or more mobile field devices to identify a set of data to be cached by a mobile field device. The system then communicates the set of data to be cached to the mobile field device. The data cached is predicated upon the predicted likely location of field assessment operations and the data includes server-side image tiles. | 04-15-2010 |
20100100682 | Victim Cache Replacement - A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core that specifies a non-modifying access to a target coherency granule, a determination is made whether the memory access request hits or misses in a directory of the lower level victim cache. In response to determining that the memory access request hits in the lower level victim cache in a data-valid coherence state, the lower level victim cache provides the target coherency granule of the memory access request to the upper level cache. The lower level victim cache preserves the target coherency granule in the lower level victim cache in a shared coherence state if the memory access request is of a first type and invalidates the target coherency granule if the memory access request is of a second type. | 04-22-2010 |
20100100683 | Victim Cache Prefetching - A processing unit for a multiprocessor data processing system includes a processor core and a cache hierarchy coupled to the processor core to provide low latency data access. The cache hierarchy includes an upper level cache coupled to the processor core and a lower level victim cache coupled to the upper level cache. In response to a prefetch request of the processor core that misses in the upper level cache, the lower level victim cache determines whether the prefetch request misses in the directory of the lower level victim cache and, if so, allocates a state machine in the lower level victim cache that services the prefetch request by issuing the prefetch request to at least one other processing unit of the multiprocessor data processing system. | 04-22-2010 |
20100122031 | SPIRAL CACHE POWER MANAGEMENT, ADAPTIVE SIZING AND INTERFACE OPERATIONS - A spiral cache memory provides low access latency for frequently-accessed values by self-organizing to always move a requested value to a front-most storage tile of the spiral. If the spiral cache needs to eject a value to make space for a value moved to the front-most tile, space is made by ejecting a value from the cache to a backing store. A buffer along with flow control logic is used to prevent overflow of writes of ejected values to the generally slow backing store. The tiles in the spiral cache may be single storage locations or be organized as some form of cache memory such as direct-mapped or set-associative caches. Power consumption of the spiral cache can be reduced by dividing the cache into an active and inactive partition, which can be adjusted on a per-tile basis. Tile-generated or global power-down decisions can set the size of the partitions. | 05-13-2010 |
20100122032 | SELECTIVELY PERFORMING LOOKUPS FOR CACHE LINES - Embodiments of the present invention provide a system that selectively performs lookups for cache lines. During operation, the system maintains a lower-level cache and a higher-level cache in accordance with a set of rules that dictate conditions under which cache lines are held in the lower-level cache and the higher-level cache. The system next performs a lookup for cache line A in the lower-level cache. The system then discovers that the lookup for cache line A missed in the lower-level cache, but that cache line B is present in the lower-level cache. Next, in accordance with the set of rules, the system determines, without performing a lookup for cache line A in the higher-level cache, that cache line A is guaranteed not to be present and valid in the higher-level cache because cache line B is present in the lower-level cache. | 05-13-2010 |
20100122033 | MEMORY SYSTEM INCLUDING A SPIRAL CACHE - An integrated memory system with a spiral cache responds to requests for values at a first external interface coupled to a particular storage location in the cache in a time period determined by the proximity of the requested values to the particular storage location. The cache supports multiple outstanding in-flight requests directed to the same address using an issue table that tracks multiple outstanding requests and control logic that applies the multiple requests to the same address in the order received by the cache memory. The cache also includes a backing store request table that tracks push-back write operations issued from the cache memory when the cache memory is full and a new value is provided from the external interface, and the control logic prevents multiple copies of the same value from being loaded into the cache or a copy from being loaded before a pending push-back has been completed. | 05-13-2010 |
20100131712 | PSEUDO CACHE MEMORY IN A MULTI-CORE PROCESSOR (MCP) - Specifically, under the present invention, a cache memory unit can be designated as a pseudo cache memory unit for another cache memory unit within a common hierarchical level. For example, in the case of a cache miss at cache memory unit "X" on cache level L2 of a hierarchy, a request is sent to a cache memory unit on cache level L3 (external), as well as one or more other cache memory units on cache level L2. The L2 level cache memory units return search results as a hit or a miss. They typically do not search L3 nor write back with the L3 result (e.g., even if the result is a miss). To this extent, only the immediate origin of the request is written back with L3 results, if all L2s miss. As such, the other L2 level cache memory units serve the original L2 cache memory unit as pseudo caches. | 05-27-2010 |
20100146211 | Shader Complex with Distributed Level One Cache System and Centralized Level Two Cache - A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources. | 06-10-2010 |
20100153646 | MEMORY HIERARCHY WITH NON-VOLATILE FILTER AND VICTIM CACHES - Various embodiments of the present invention are generally directed to an apparatus and method for non-volatile caching of data in a memory hierarchy of a data storage device. In accordance with some embodiments, a pipeline memory structure is provided to store data for use by a controller. The pipeline has a plurality of hierarchical cache levels each with an associated non-volatile filter cache and a non-volatile victim cache. Data retrieved from each cache level are respectively promoted to the associated non-volatile filter cache. Data replaced in each cache level are respectively demoted to the associated non-volatile victim cache. | 06-17-2010 |
20100153647 | Cache-To-Cache Cast-In - A data processing system includes a first processing unit and a second processing unit coupled by an interconnect fabric. The first processing unit has a first processor core and associated first upper and first lower level caches, and the second processing unit has a second processor core and associated second upper and lower level caches. In response to a data request, a victim cache line is selected for castout from the first lower level cache. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that a lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache. | 06-17-2010 |
20100161904 | CACHE HIERARCHY WITH BOUNDS ON LEVELS ACCESSED - The present invention is directed to a system managing data in a multilevel cache memory system. Certain cache data is designated and stored only in particular levels of the multilevel cache, bypassing other levels of the multilevel cache. In a multiprocessor environment, the present invention includes cache coherency operations or messages that pertain to data stored only in certain levels of a multilevel cache. | 06-24-2010 |
20100169576 | SYSTEM AND METHOD FOR SIFT IMPLEMENTATION AND OPTIMIZATION - A method is provided to implement a Scale Invariant Feature Transform algorithm in a shared memory multiprocessing system. The method comprises building differences of Gaussian (DoG) images for an input image, detecting keypoints in the DoG images, assigning orientations to the keypoints, computing keypoint descriptors, and performing matrix operations. In the method, building differences of Gaussian (DoG) images for an input image and detecting keypoints in the DoG images are executed for all scales of the input image in parallel, and orientation assignment and keypoint descriptor computation are executed for all octaves of the input image in parallel. | 07-01-2010 |
20100180081 | Adaptive Data Prefetch System and Method - A data processing system includes a processor, a unit that includes a multi-level cache, a prefetch system and a memory. The data processing system can operate in a first mode and a second mode. The prefetch system can change behavior in response to a desired power consumption policy set by an external agent or automatically via hardware based on on-chip power/performance thresholds. | 07-15-2010 |
20100185816 | Multiple Cache Line Size - A mechanism which allows pages of flash memory to be read directly into cache. The mechanism enables different cache line sizes for different cache levels in a cache hierarchy, and optionally, multiple line size support, simultaneously or as an initialization option, in the highest level (largest/slowest) cache. Such a mechanism improves performance and reduces cost for some applications. | 07-22-2010 |
20100191912 | Systems and Methods for Memory Management on Print Devices - Systems and methods disclosed permit flexible optimization of printer cache memories by specifying criteria for determining cache membership for objects derived from a print data stream, wherein the objects may be associated with distinct reference counts. In some embodiments, the method may comprise the steps of: assigning an initial value to the reference count associated with an object, if the object is not present in the cache; incrementing the reference count by a first weight, if the object is already present in the cache; decrementing the reference count by a second weight, in response to an end-of-page event; and removing the object from the cache if the reference count is below a threshold. | 07-29-2010 |
20100191913 | RECONFIGURATION OF EMBEDDED MEMORY HAVING A MULTI-LEVEL CACHE - A method of operating an embedded memory having (i) a local memory, (ii) a system memory, and (iii) a multi-level cache memory coupled between a processor and the system memory. According to one embodiment of the method, a two-level cache memory is configured to function as a single-level cache memory by excluding the level-two (L2) cache from the cache-transfer path between the processor and the system memory. The excluded L2-cache is then mapped as an independently addressable memory unit within the embedded memory that functions as an extension of the local memory, a separate additional local memory, or an extension of the system memory. | 07-29-2010 |
20100191914 | REGION COHERENCE ARRAY HAVING HINT BITS FOR A CLUSTERED SHARED-MEMORY MULTIPROCESSOR SYSTEM - A system and method for a multilevel region coherence protocol for use in Region Coherence Arrays (RCAs) deployed in clustered shared-memory multiprocessor systems which optimize cache-to-cache transfers (interventions) by using region hint bits in each RCA to allow memory requests for lines of a region of the memory to be optimally sent to only a determined portion of the clustered shared-memory multiprocessor system without broadcasting the requests to all processors in the system. A sufficient number of region hint bits are used to uniquely identify each level of the system's interconnect hierarchy to optimally predict which level of the system likely includes a processor that has cached copies of lines of data from the region. | 07-29-2010 |
20100223429 | Hybrid Caching Techniques and Garbage Collection Using Hybrid Caching Techniques - Hybrid caching techniques and garbage collection using hybrid caching techniques are provided. A determination of a measure of a characteristic of a data object is performed, the characteristic being indicative of an access pattern associated with the data object. A selection of one caching structure, from a plurality of caching structures, is performed in which to store the data object based on the measure of the characteristic. Each individual caching structure in the plurality of caching structures stores data objects that have a similar measure of the characteristic with regard to each of the other data objects in that individual caching structure. The data object is stored in the selected caching structure and at least one processing operation is performed on the data object stored in the selected caching structure. | 09-02-2010 |
20100235576 | Handling Castout Cache Lines In A Victim Cache - A victim cache memory includes a cache array, a cache directory of contents of the cache array, and a cache controller that controls operation of the victim cache memory. The cache controller, responsive to receiving a castout command identifying a victim cache line castout from another cache memory, causes the victim cache line to be held in the cache array. If the other cache memory is a higher level cache in the cache hierarchy of the processor core, the cache controller marks the victim cache line in the cache directory so that it is less likely to be evicted by a replacement policy of the victim cache, and otherwise, marks the victim cache line in the cache directory so that it is more likely to be evicted by the replacement policy of the victim cache. | 09-16-2010 |
20100235577 | VICTIM CACHE LATERAL CASTOUT TARGETING - A data processing system includes a plurality of processing units coupled by an interconnect fabric. In response to a data request, a victim cache line is selected for castout from a first lower level cache of a first processing unit, and a target lower level cache of one of the plurality of processing units is selected based upon architectural proximity of the target lower level cache to a home system memory to which the address of the victim cache line is assigned. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that the target lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache. | 09-16-2010 |
20100235578 | Cached Memory System and Cache Controller for Embedded Digital Signal Processor - A cached memory system that can handle high-rate input data and ensure that an embedded DSP can meet real-time constraints is described. The cached memory system includes a cache memory located close to a processor core, an on-chip memory at the next higher memory level, and an external main memory at the topmost memory level. A cache controller handles paging of instructions and data between the cache memory and the on-chip memory for cache misses. A direct memory exchange (DME) controller handles user-controlled paging between the on-chip memory and the external memory. A user/programmer can arrange to have the instructions and data required by the processor core to be present in the on-chip memory well in advance of when they are actually needed by the processor core. | 09-16-2010 |
20100250853 | Prefetch engine based translation prefetching - A method and system for prefetching in a computer system are provided. The method in one aspect includes using a prefetch engine to perform prefetch instructions and to translate unmapped data. Misses to address translations during the prefetch are handled and resolved. The method also includes storing the resolved translations in a respective cache translation table. A system for prefetching in one aspect includes a prefetch engine operable to receive instructions to prefetch data from the main memory. The prefetch engine is also operable to search cache address translation for prefetch data and perform address mapping translation, if the prefetch data is unmapped. The prefetch engine is further operable to prefetch the data and store the address mapping in one or more cache memories, if the data is unmapped. | 09-30-2010 |
20100262782 | Lateral Castout Target Selection - In response to a data request of a first processing unit among a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and selects the lower level cache of a second of the plurality of processing units as an intended destination of a lateral castout (LCO) command by randomized round-robin selection. The first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and the intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held in the lower level cache of one of the plurality of processing units other than the first processing unit. | 10-14-2010 |
20100262783 | Mode-Based Castout Destination Selection - In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing. | 10-14-2010 |
20100262784 | Empirically Based Dynamic Control of Acceptance of Victim Cache Lateral Castouts - A second lower level cache receives an LCO command issued by a first lower level cache on an interconnect fabric. The LCO command indicates an address of a victim cache line to be castout from the first lower level cache and indicates that the second lower level cache is an intended destination of the victim cache line. The second lower level cache determines whether to accept the victim cache line from the first lower level cache based at least in part on the address of the victim cache line indicated by the LCO command. In response to determining not to accept the victim cache line, the second lower level cache provides a coherence response to the LCO command refusing the identified victim cache line. In response to determining to accept the victim cache line, the second lower level cache updates an entry corresponding to the identified victim cache line. | 10-14-2010 |
20100268882 | LOAD REQUEST SCHEDULING IN A CACHE HIERARCHY - A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry is allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. The CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, the CIU may issue stored load requests in a specific priority order. | 10-21-2010 |
20100268883 | Information Handling System with Immediate Scheduling of Load Operations and Fine-Grained Access to Cache Memory - An information handling system (IHS) includes a processor with a cache memory system. The processor includes a processor core with an L1 cache memory that couples to an L2 cache memory. The processor includes an arbitration mechanism that controls load and store requests to the L2 cache memory. The arbitration mechanism includes control logic that enables a load request to interrupt a store request that the L2 cache memory is currently servicing. When the L2 cache memory finishes servicing the interrupting load request, the L2 cache memory may return to servicing the interrupted store request at the point of interruption. The control logic determines the size requirement of each load operation or store operation. When the cache memory system performs a store operation or load operation, the memory system accesses the portion of a cache line it needs to perform the operation instead of accessing an entire cache line. | 10-21-2010 |
20100268884 | Updating Partial Cache Lines in a Data Processing System - A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line. | 10-21-2010 |
20100268885 | SPECIFYING AN ACCESS HINT FOR PREFETCHING LIMITED USE DATA IN A CACHE HIERARCHY - A system and method for specifying an access hint for prefetching limited use data. A processing unit receives a data cache block touch (DCBT) instruction having an access hint indicating to the processing unit that a program executing on the data processing system may soon access a cache block addressed within the DCBT instruction. The access hint is contained in a code point stored in a subfield of the DCBT instruction. In response to detecting that the code point is set to a specific value, the data addressed in the DCBT instruction is prefetched into an entry in the lower level cache. The entry may then be updated as a least recently used entry of a plurality of entries in the lower level cache. In response to a new cache block being fetched to the cache, the prefetched cache block is cast out of the cache. | 10-21-2010 |
20100268886 | SPECIFYING AN ACCESS HINT FOR PREFETCHING PARTIAL CACHE BLOCK DATA IN A CACHE HIERARCHY - A system and method for specifying an access hint for prefetching only a subsection of cache block data, for more efficient system interconnect usage by the processor core. A processing unit receives a data cache block touch (DCBT) instruction containing an access hint and identifying a specific size portion of data to be prefetched. Both the access hint and a value corresponding to an amount of data to be prefetched are contained in separate subfields of the DCBT instruction. In response to detecting that the code point is set to a specific value, only the specific size of data identified in a sub-field of the DCBT and addressed in the DCBT instruction is prefetched into an entry in the lower level cache. | 10-21-2010 |
20100274971 | Multi-Core Processor Cache Coherence For Reduced Off-Chip Traffic - Technologies are generally described herein for maintaining cache coherency within a multi-core processor. A first cache entry to be evicted from a first cache may be identified. The first cache entry may include a block of data and a first tag indicating an owned state. An owner eviction message for the first cache entry may be broadcasted from the first cache. A second cache entry in a second cache may be identified. The second cache entry may include the block of data and a second tag indicating a shared state. The broadcasted owner eviction message may be detected with the second cache. An ownership acceptance message for the second cache entry may be broadcasted from the second cache. The broadcasted ownership acceptance message may be detected with the first cache. The second tag in the second cache entry may be transformed from the shared state to the owned state. | 10-28-2010 |
20100299481 | HIERARCHICAL READ-COMBINING LOCAL MEMORIES - The present disclosure relates to a system for hierarchical read-combining memory having a multicore processor operably coupled to a memory controller. The memory controller is configured for receiving a plurality of requests for data from one or more processing cores of the multicore processor, selectively holding a request for data from the plurality of requests for an undetermined or indefinite amount of time, and selectively combining a plurality of requests for the same data into a single read-combined data request. The present disclosure further relates to a method for hierarchical read-combining data requests of a multicore processor and a computer accessible medium having stored thereon computer executable instructions for performing a procedure for hierarchical read-combining data requests of a multicore processor. | 11-25-2010 |
20100299482 | METHOD AND APPARATUS FOR DETERMINING CACHE STORAGE LOCATIONS BASED ON LATENCY REQUIREMENTS - A method for determining whether to store binary information in a fast way or a slow way of a cache is disclosed. The method includes receiving a block of binary information to be stored in a cache memory having a plurality of ways. The plurality of ways includes a first subset of ways and a second subset of ways, wherein a cache access by a first execution core from one of the first subset of ways has a lower latency time than a cache access from one of the second subset of ways. The method further includes determining, based on a predetermined access latency and one or more parameters associated with the block of binary information, whether to store the block of binary information into one of the first set of ways or one of the second set of ways. | 11-25-2010 |
20100306470 | Methods and Apparatus for Issuing Memory Barrier Commands in a Weakly Ordered Storage System - Efficient techniques are described for enforcing order of memory accesses. A memory access request is received from a device which is not configured to generate memory barrier commands. A surrogate barrier is generated in response to the memory access request. A memory access request may be a read request. In the case of a memory write request, the surrogate barrier is generated before the write request is processed. The surrogate barrier may also be generated in response to a memory read request conditional on a preceding write request to the same address as the read request. Coherency is enforced within a hierarchical memory system as if a memory barrier command was received from the device which does not produce memory barrier commands. | 12-02-2010 |
20100306471 | D-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable. | 12-02-2010 |
20100306472 | I-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable. | 12-02-2010 |
20100306473 | CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO D-CACHE REPLACEMENT SCHEME - A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times. | 12-02-2010 |
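The abstract above reduces to a simple threshold test: a line whose recorded history shows at least three accesses during its previous residency is loaded into the L1, otherwise the data is handed straight to the processor. A minimal sketch of that decision follows; the HistoryBasedFilter structure and its fields are hypothetical, and the history bookkeeping is simplified to a per-address counter.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

struct HistoryBasedFilter {
    static constexpr int kThreshold = 3;                // "at least three times"
    std::unordered_map<uint64_t, int> accessHistory;    // count from previous residency
    std::unordered_set<uint64_t> l1;                    // lines currently in the L1

    // Called when a requested line arrives in the L2.
    // Returns true if the line is also installed in the L1.
    bool deliver(uint64_t lineAddr) {
        auto it = accessHistory.find(lineAddr);
        int previousAccesses = (it != accessHistory.end()) ? it->second : 0;
        if (previousAccesses >= kThreshold) {
            l1.insert(lineAddr);   // reuse expected: cache the line in the L1
            return true;
        }
        return false;              // low reuse: hand the data directly to the core
    }

    // Bookkeeping hook: record each access so the next call sees the history.
    void recordAccess(uint64_t lineAddr) { ++accessHistory[lineAddr]; }
};
```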
20100306474 | CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO I-CACHE REPLACEMENT SCHEME - A method of providing history based done logic for instructions includes receiving an instruction in a cache line in a L2 cache; and loading the cache line into an L1 cache with a history count that indicates the number of read references of the previous access. | 12-02-2010 |
20100318741 | MULTIPROCESSOR COMPUTER CACHE COHERENCE PROTOCOL - A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L | 12-16-2010 |
20100325358 | Data storage protocols to determine items stored and items overwritten in linked data stores - A storage apparatus and method for storing a plurality of items is disclosed. The storage apparatus is configured to receive a first access request and a second access request for accessing respective items in a same clock cycle. The storage apparatus comprises: two stores each for storing a subset of the plurality of items, the first access request being routed to a first store and said second access request to a second store; miss detecting circuitry for detecting a miss where a requested item is not stored in the accessed store; item retrieving circuitry for retrieving an item whose access generated a miss from a further store; updating circuitry for selecting an item to overwrite in a respective one of the two stores in dependence upon an access history of the respective store, the updating circuitry being responsive to the miss detecting circuitry detecting the miss in an access to the first store and to at least one further condition to update both of the two stores with the item retrieved from the further store by overwriting the selected items. | 12-23-2010 |
20110010501 | EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1. | 01-13-2011 |
20110022802 | Controlling data accesses to hierarchical data stores to retain access order - Data storage circuitry for controlling access to data stored in a memory is disclosed. The data storage circuitry comprises: a data store for storing a subset of the data stored in the memory; access circuitry for receiving access requests and for outputting the requested data, at least some of the received access requests being ordered access requests requiring the accessed data to be output in a same order as the access requests are received in; control circuitry for controlling access to the data; and retrieval circuitry for retrieving the data from the memory; wherein the control circuitry is responsive to an access request received from the access circuitry to access the data store and in response to detecting a miss in the data store when the requested data is not stored in the data store to transmit the access request to the retrieval circuitry; the retrieval circuitry being configured to retrieve requested data from the memory in response to the access request and to store the data in the data store and being responsive to no asserted output inhibit signal associated with the data access request to transmit the retrieved data to the access circuitry for output and being responsive to an asserted output inhibit signal associated with the data access request not to transmit the retrieved data to the access circuitry; the data storage circuitry further comprising detection circuitry for detecting an earlier ordered access request that misses in the data store and a later ordered access request that hits while the earlier ordered access request is pending, the data storage circuitry being configured to halt the later ordered access request and in response to receipt of a subsequent ordered access request while the earlier ordered request is still pending to assert an output inhibit signal associated with the subsequent ordered access request and in response to detection of completion of the earlier ordered access request to deassert the output inhibit signal. | 01-27-2011 |
20110022803 | Two Partition Accelerator and Application of Tiered Flash to Cache Hierarchy in Partition Acceleration - An approach is provided to identify a disabled processing core and an active processing core from a set of processing cores included in a processing node. Each of the processing cores is assigned a cache memory. The approach extends a memory map of the cache memory assigned to the active processing core to include the cache memory assigned to the disabled processing core. A first amount of data that is used by a first process is stored by the active processing core to the cache memory assigned to the active processing core. A second amount of data is stored by the active processing core to the cache memory assigned to the inactive processing core using the extended memory map. | 01-27-2011 |
20110047332 | STORAGE SYSTEM, CACHE CONTROL DEVICE, AND CACHE CONTROL METHOD - A storage system includes a storage device that stores data, a cache memory that caches the data, an information storage unit that stores data configuration information indicating a configuration of the data and state information indicating a cache state of the data in the cache memory, a candidate data selection unit, a first determination unit and a data-to-be-written selection unit. The candidate data selection unit selects, according to the state information, candidate data from the data cached in the cache memory. The first determination unit determines, according to the data configuration information, whether data relating to the candidate data is cached in the cache memory. The data-to-be-written selection unit selects, according to the determination made by the first determination unit, data to be written into the storage device, from the data cached in the cache memory. | 02-24-2011 |
20110055482 | SHARED CACHE RESERVATION - Various example embodiments are disclosed. According to an example embodiment, a shared cache may be configured to determine whether a word requested by one of the L1 caches is currently stored in the L2 shared cache, read the requested word from the main memory based on determining that the requested word is not currently stored in the L2 shared cache, determine whether at least one line in a way reserved for the requesting L1 cache is unused, store the requested word in the at least one line based on determining that the at least one line in the reserved way is unused, and store the requested word in a line of the L2 shared cache outside the reserved way based on determining that the at least one line in the reserved way is not unused. | 03-03-2011 |
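The reservation policy above can be paraphrased as: on an L2 miss, place the fetched block in the requesting L1's reserved way if that way still has an unused line, otherwise place it in a line outside the reserved way. The sketch below shows one possible reading with a single set and one reserved way per L1 cache; the layout and identifiers are assumptions.

```cpp
#include <array>
#include <cstdint>

constexpr int kWays = 8;

struct Line { bool valid = false; uint64_t tag = 0; };
struct Set  { std::array<Line, kWays> ways; };

struct SharedL2 {
    Set set;                                         // one set, for brevity
    std::array<int, 4> reservedWayOfL1{0, 1, 2, 3};  // way reserved per L1 cache

    // Install a block fetched from main memory on behalf of L1 cache `l1Id`.
    int install(int l1Id, uint64_t tag) {
        int reserved = reservedWayOfL1[l1Id];
        if (!set.ways[reserved].valid) {             // reserved way still unused
            set.ways[reserved] = {true, tag};
            return reserved;
        }
        for (int w = 0; w < kWays; ++w) {            // otherwise a line outside
            if (w == reserved) continue;             // the requester's reserved way
            if (!set.ways[w].valid) {
                set.ways[w] = {true, tag};
                return w;
            }
        }
        return -1;   // set full: a replacement policy would choose a victim here
    }
};
```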
20110072213 | INSTRUCTIONS FOR MANAGING A PARALLEL CACHE HIERARCHY - A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier. | 03-24-2011 |
20110078380 | MULTI-LEVEL CACHE PREFETCH - Methods and apparatus relating to multi-level cache prefetch are described. In some embodiments, a data parking logic updates a prefetch request with one or more bits based on the status of a request queue. The one or more bits may in turn cause the corresponding prefetched data to be stored in one of at least two caches. Other embodiments are also described and claimed. | 03-31-2011 |
20110078381 | Cache Operations and Policies For A Multi-Threaded Client - A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction that includes a cache operations modifier that identifies a level of the parallel cache hierarchy in which to cache data associated with the instruction; and implementing a cache replacement policy based on the cache operations modifier. | 03-31-2011 |
20110082982 | CONTENT DELIVERY NETWORK CACHE GROUPING - One or more content delivery networks (CDNs) that deliver content objects for others are disclosed. Content is propagated to edge servers through hosting and/or caching. End user computers are directed to an edge server for delivery of a requested content object by a universal resource indicator (URI). When a particular edge server does not have a copy of the content object from the URI, information is passed to another server, the ancestor or parent server, to find the content object. There can be different parent servers designated for different URIs. The parent server looks for the content object and, if not found, will go to another server, the grandparent server, and so on up a hierarchy within the group. Eventually, the topmost server in the hierarchy goes to the origin server to find the content object. The origin server may be hosted in the CDN or at a content provider across the Internet. Once the content object is located in the hierarchical chain, the content object is passed back down the chain to the edge server for delivery. Optionally, the various servers in the chain may cache or host the content object as it is relayed. | 04-07-2011 |
20110087841 | PROCESSOR AND CONTROL METHOD - A processor includes a first processing unit that has a first memory and performs processing, a second processing unit that performs processing, a second memory that holds status information specifying a status of data held in the first memory, and a control unit. Upon receiving a first access request for data of a first address from the second processing unit, when first status information of the data of the first address indicates that the data of the first address is held in the first memory in an exclusive state or an owned state, the control unit outputs a request for reading out the data of the first address to the first processing unit. Upon receiving a no-data-modification notification indicating that the data of the first address is not modified by the first processing unit, the control unit allows the second processing unit to access the data of the first address included in the second memory. | 04-14-2011 |
20110107031 | Extended Cache Capacity - A method, programmed medium and system are provided for enabling a core's cache capacity to be increased by using the caches of the disabled or non-enabled cores on the same chip. Caches of disabled or non-enabled cores on a chip are made accessible to store cachelines for those chip cores that have been enabled, thereby extending cache capacity of enabled cores. | 05-05-2011 |
20110107032 | CACHE RECONFIGURATION BASED ON RUN-TIME PERFORMANCE DATA OR SOFTWARE HINT - A method for reconfiguring a cache memory is provided. The method in one aspect may include analyzing one or more characteristics of an execution entity accessing a cache memory and reconfiguring the cache based on the one or more characteristics analyzed. Examples of analyzed characteristics may include but are not limited to data structure used by the execution entity, expected reference pattern of the execution entity, type of an execution entity, heat and power consumption of an execution entity, etc. Examples of cache attributes that may be reconfigured may include but are not limited to associativity of the cache memory, amount of the cache memory available to store data, coherence granularity of the cache memory, line size of the cache memory, etc. | 05-05-2011 |
20110119445 | HEAP/STACK GUARD PAGES USING A WAKEUP UNIT - A method and system for providing a memory access check on a processor including the steps of detecting accesses to a memory device including level-1 cache using a wakeup unit. The method includes invalidating level-1 cache ranges corresponding to a guard page, and configuring a plurality of wakeup address compare (WAC) registers to allow access to selected WAC registers. The method selects one of the plurality of WAC registers, and sets up a WAC register related to the guard page. The method configures the wakeup unit to interrupt on access of the selected WAC register. The method detects access of the memory device using the wakeup unit when a guard page is violated. The method generates an interrupt to the core using the wakeup unit, and determines the source of the interrupt. The method detects the activated WAC registers assigned to the violated guard page, and initiates a response. | 05-19-2011 |
20110119446 | CONDITIONAL LOAD AND STORE IN A SHARED CACHE - A method, system and computer program product are disclosed for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that one of the processor units. If an address in the memory cache is reserved for that one of the processors, the data are stored at this reserved address. | 05-19-2011 |
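The load-reserve/store-conditional mechanism above keeps one reservation register per processor unit inside the shared cache. The following sketch models that bookkeeping in software; the class name and methods are invented, and the actual store into the cache array is elided.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

class SharedCacheReservations {
    std::vector<std::optional<uint64_t>> reservation_;   // one register per processor
public:
    explicit SharedCacheReservations(int processorCount)
        : reservation_(processorCount) {}

    // load-reserve: record the address reserved for processor `p`.
    void loadReserve(int p, uint64_t addr) { reservation_[p] = addr; }

    // store-conditional: succeeds only if processor p's reservation still
    // names the target address; the store itself is elided here.
    bool storeConditional(int p, uint64_t addr) {
        if (reservation_[p] && *reservation_[p] == addr) {
            reservation_[p].reset();     // the reservation is consumed
            return true;                 // ... perform the cache store here ...
        }
        return false;                    // reservation lost or never established
    }

    // A store by any other processor to the same address clears reservations,
    // so a later store-conditional to that address will fail.
    void observeStore(uint64_t addr, int storingProcessor) {
        for (std::size_t p = 0; p < reservation_.size(); ++p)
            if (static_cast<int>(p) != storingProcessor &&
                reservation_[p] && *reservation_[p] == addr)
                reservation_[p].reset();
    }
};
```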
20110131377 | MULTI-CORE PROCESSING CACHE IMAGE MANAGEMENT - A multi-core processor chip comprises at least one shared cache having a plurality of ports and a plurality of address spaces and a plurality of processor cores. Each processor core is coupled to one of the plurality of ports such that each processor core is able to access the at least one shared cache simultaneously with another of the plurality of processor cores. Each processor core is assigned one of a unique application or a unique application task and the multi-core processor is operable to execute a partitioning operating system that temporally and spatially isolates each unique application and each unique application task such that each of the plurality of processor cores does not attempt to write to the same address space of the at least one shared cache at the same time as another of the plurality of processor cores. | 06-02-2011 |
20110138124 | TRACE MODE FOR CACHE MEMORY SYSTEM - A cache, including a cache memory, is configurable to operate in a cache mode and a trace mode. When the cache is operating in the cache mode, the cache memory stores a copy of a portion of data that is stored in another memory external to the cache, and a received data access request is processed by retrieving a copy of a portion of data identified in the received data access request from the cache memory (if the cache memory stores a copy of the portion of data), or by forwarding the data access request to a data access request processing means external to the cache (if the cache memory does not store a copy of the portion of data). When the cache is operating in the trace mode, data access requests received by the cache are monitored and information relating to a received data access request is captured and stored in the cache memory. | 06-09-2011 |
20110153942 | REDUCING IMPLEMENTATION COSTS OF COMMUNICATING CACHE INVALIDATION INFORMATION IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache, wherein each higher-level cache includes higher-level cache lines; and a lower-level cache including lower-level cache lines, where each of the lower-level cache lines may be configured to store data that corresponds to multiple higher-level cache lines. In response to invalidating a given lower-level cache line, the lower-level cache may be configured to convey a sequence including several invalidation packets to the processor cores via an interface, where each member of the sequence of invalidation packets corresponds to a respective higher-level cache line to be invalidated, and where the interface is narrower than an interface capable of concurrently conveying all invalidation information corresponding to the given lower-level cache line. Each invalidation packet may include invalidation information indicative of a location of the respective higher-level cache line within different ones of the processor cores. | 06-23-2011 |
20110153943 | Aggregate Data Processing System Having Multiple Overlapping Synthetic Computers - A first SMP computer has first and second processing units and a first system memory pool, a second SMP computer has third and fourth processing units and a second system memory pool, and a third SMP computer has at least fifth and sixth processing units and third, fourth and fifth system memory pools. The fourth system memory pool is inaccessible to the third, fourth and sixth processing units and accessible to at least the second and fifth processing units, and the fifth system memory pool is inaccessible to the first, second and sixth processing units and accessible to at least the fourth and fifth processing units. A first interconnect couples the second processing unit for load-store coherent, ordered access to the fourth system memory pool, and a second interconnect couples the fourth processing unit for load-store coherent, ordered access to the fifth system memory pool. | 06-23-2011 |
20110153944 | Secure Cache Memory Architecture - A variety of circuits, methods and devices are implemented for secure storage of sensitive data in a computing system. A first dataset that is stored in main memory is accessed and a cache memory is configured to maintain logical consistency between the main memory and the cache. In response to determining that a second dataset is a sensitive dataset, the cache memory is directed to store the second dataset in a memory location of the cache memory without maintaining logical consistency between the second dataset and main memory. | 06-23-2011 |
20110153945 | Apparatus and Method for Controlling the Exclusivity Mode of a Level-Two Cache - A method of controlling the exclusivity mode of a level-two cache includes generating level-two cache exclusivity control information at a processor in response to an exclusivity mode indicator, and utilizing the level-two cache exclusivity control information to configure the exclusivity mode of the level-two cache. | 06-23-2011 |
20110161585 | PROCESSING NON-OWNERSHIP LOAD REQUESTS HITTING MODIFIED LINE IN CACHE OF A DIFFERENT PROCESSOR - Methods and apparatus to efficiently process non-ownership load requests hitting modified line (M-line) in cache of a different processor are described. In one embodiment, a first agent changes the state of a first data and forwards it to a second, requesting agent who stores the first data in an alternative modified state. Other embodiments are also described. | 06-30-2011 |
20110161586 | Shared Memories for Energy Efficient Multi-Core Processors - Technologies are described herein related to multi-core processors that are adapted to share processor resources. An example multi-core processor can include a plurality of processor cores. The multi-core processor further can include a shared register file selectively coupled to two or more of the plurality of processor cores, where the shared register file is adapted to serve as a shared resource among the selected processor cores. | 06-30-2011 |
20110161587 | PROACTIVE PREFETCH THROTTLING - According to a method of data processing, a memory controller receives a plurality of data prefetch requests from multiple processor cores in the data processing system, where the plurality of data prefetch requests includes a data prefetch request issued by a particular processor core among the multiple processor cores. In response to receipt of the data prefetch request, the memory controller provides a coherency response indicating an excess number of data prefetch requests. In response to the coherency response, the particular processor core reduces a rate of issuance of data prefetch requests. | 06-30-2011 |
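The throttling loop above is a feedback protocol: the memory controller's coherency response can flag an excess of prefetches, and the issuing core responds by lowering its prefetch issue rate. A small sketch of one plausible core-side policy follows; the multiplicative back-off and additive recovery are assumptions, as the entry does not specify how the rate is reduced.

```cpp
struct CoherencyResponse {
    bool excessPrefetches;   // set by the memory controller when it is saturated
};

class CorePrefetcher {
    int issueRate_ = 16;     // prefetch requests allowed per interval (assumed unit)
public:
    int issueRate() const { return issueRate_; }

    // Called for each coherency response the core receives for its prefetches.
    void onPrefetchResponse(const CoherencyResponse& resp) {
        if (resp.excessPrefetches) {
            issueRate_ = (issueRate_ > 1) ? issueRate_ / 2 : 1;   // back off
        } else if (issueRate_ < 16) {
            ++issueRate_;                                         // recover slowly
        }
    }
};
```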
20110161588 | FORMATION OF AN EXCLUSIVE OWNERSHIP COHERENCE STATE IN A LOWER LEVEL CACHE - In response to a memory access request of a processor core that targets a target cache line, the lower level cache of a vertical cache hierarchy associated with the processor core supplies a copy of the target cache line to an upper level cache in the vertical cache hierarchy and retains a copy in a shared coherence state. The upper level cache holds the copy of the target cache line in a private shared ownership coherence state indicating that each cached copy of the target memory block is cached within the vertical cache hierarchy associated with the processor core. In response to the upper level cache signaling replacement of the copy of the target cache line in the private shared ownership coherence state, the lower level cache updates its copy of the target cache line to the exclusive ownership coherence state without coherency messaging with other vertical cache hierarchies. | 06-30-2011 |
20110161589 | SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS - A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line. | 06-30-2011 |
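The entry above chooses between a lateral castout (LCO) to a peer lower-level cache and a castout to system memory based on a confidence indicator attached to the victim line. The sketch below assumes the indicator is a small saturating counter compared against a threshold; that interpretation and the names used are illustrative only.

```cpp
#include <cstdint>

enum class CastoutTarget { PeerLowerLevelCache, SystemMemory };

struct VictimLine {
    uint64_t addr;
    unsigned confidence;   // assumed: a small saturating reuse-confidence counter
};

// High confidence of reuse keeps the victim on-chip via a lateral castout
// (LCO) command to a peer lower-level cache; low confidence sends it to memory.
CastoutTarget chooseCastout(const VictimLine& v, unsigned lcoThreshold = 2) {
    return (v.confidence >= lcoThreshold) ? CastoutTarget::PeerLowerLevelCache
                                          : CastoutTarget::SystemMemory;
}
```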
20110161590 | SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state. | 06-30-2011 |
20110161591 | INCREASED NAND FLASH MEMORY READ THROUGHPUT - A method of reading sequential pages of flash memory from alternating memory blocks comprises loading data from a first page into a first primary data cache and a second page into a second primary data cache simultaneously, the first and second pages loaded from different blocks of flash memory. Data from the first primary data cache is stored in a first secondary data cache, and data from the second primary data cache is stored in a second secondary data cache. Data is sequentially provided from the first and second secondary data caches by a multiplexer coupled to the first and second data caches. | 06-30-2011 |
20110167224 | CACHE MEMORY, MEMORY SYSTEM, DATA COPYING METHOD, AND DATA REWRITING METHOD - A cache memory according to an aspect of the present invention includes entries, each of which includes a tag address, line data, and a dirty flag. The cache memory includes: a command execution unit which rewrites, when a first command is instructed by a processor, a tag address included in at least one entry specified by the processor among the entries to a tag address corresponding to an address specified by the processor, and sets a dirty flag corresponding to the entry; and a write-back unit which writes, back to a main memory, the line data included in the entry in which the dirty flag is set. | 07-07-2011 |
20110173391 | System and Method to Access a Portion of a Level Two Memory and a Level One Memory - A system and method to access data from a portion of a level two memory or from a level one memory is disclosed. In a particular embodiment, the system includes a level one cache and a level two memory. A first portion of the level two memory is coupled to an input port and is addressable in parallel with the level one cache. | 07-14-2011 |
20110173392 | EVICT ON WRITE, A MANAGEMENT STRATEGY FOR A PREFETCH UNIT AND/OR FIRST LEVEL CACHE IN A MULTIPROCESSOR SYSTEM WITH SPECULATIVE EXECUTION - In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data, where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation blind first level cache. | 07-14-2011 |
20110173393 | CACHE MEMORY, MEMORY SYSTEM, AND CONTROL METHOD THEREFOR - A cache memory according to the present invention includes: a first port for input of a command from the processor; a second port for input of a command from a master other than the processor; a hit determining unit which, when a command is input to said first port or said second port, determines whether or not data corresponding to an address specified by the command is stored in said cache memory; and a first control unit which performs a process for maintaining coherency of the data stored in the cache memory and corresponding to the address specified by the command and data stored in the main memory, and outputs the input command to the main memory as a command output from the master, when the command is input to the second port and said hit determining unit determines that the data is stored in said cache memory. | 07-14-2011 |
20110185125 | RESOURCE SHARING TO REDUCE IMPLEMENTATION COSTS IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers may share access to an interconnect egress port coupled to the interconnect network, and may generate multiple concurrent requests to convey data via the shared port, where each of the requests is destined for a corresponding core, and where a datapath width of the port is less than a combined width of the multiple requests. A given tag unit may arbitrate among its controllers for access to the shared port, such that the requests are transmitted to corresponding cores serially rather than concurrently. | 07-28-2011 |
20110197030 | Latency Reduction for Cache Coherent Bus-Based Cache - In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response. | 08-11-2011 |
20110202726 | Apparatus and method for handling data in a cache - A data processing apparatus for forming a portion of a coherent cache system comprises at least one master device for performing data processing operations, and a cache coupled to the at least one master device and arranged to store data values for access by that at least one master device when performing the data processing operations. Cache coherency circuitry is responsive to a coherency request from another portion of the coherent cache system to cause a coherency action to be taken in respect of at least one data value stored in the cache. Responsive to an indication that the coherency action has resulted in invalidation of that at least one data value in the cache, refetch control circuitry is used to initiate a refetch of that at least one data value into the cache. Such a mechanism causes the refetch of data into the cache to be triggered by the coherency action performed in response to a coherency request from another portion of the coherent cache system, rather than relying on any actions taken by the at least one master device, thereby providing a very flexible and efficient mechanism for reducing cache latency in a coherent cache system. | 08-18-2011 |
20110202727 | Apparatus and Methods to Reduce Duplicate Line Fills in a Victim Cache - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache or the selected line is a write-through line. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. | 08-18-2011 |
20110208915 | Fused Store Exclusive/Memory Barrier Operation - In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point. | 08-25-2011 |
20110219190 | CACHE WITH RELOAD CAPABILITY AFTER POWER RESTORATION - A method and apparatus for repopulating a cache are disclosed. At least a portion of the contents of the cache are stored in a location separate from the cache. Power is removed from the cache and is restored some time later. After power has been restored to the cache, it is repopulated with the portion of the contents of the cache that were stored separately from the cache. | 09-08-2011 |
20110264860 | MULTI-MODAL DATA PREFETCHER - A microprocessor includes first and second cache memories occupying distinct hierarchy levels, the second backing the first. A prefetcher monitors load operations and maintains a recent history of the load operations from a cache line and determines whether the recent history indicates a clear direction. The prefetcher prefetches one or more cache lines into the first cache memory when the recent history indicates a clear direction and otherwise prefetches the one or more cache lines into the second cache memory. The prefetcher also determines whether the recent history indicates the load operations are large and, other things being equal, prefetches a greater number of cache lines when the load operations are large than when they are small. The prefetcher also determines whether the recent history indicates the load operations are received on consecutive clock cycles and, other things being equal, prefetches a greater number of cache lines when they arrive on consecutive clock cycles than when they do not. | 10-27-2011 |
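The multi-modal policy above adjusts both the destination (first-level versus second-level cache) and the aggressiveness of prefetching from the recent load history. The following sketch condenses that history into three flags and maps them to a decision; the specific line counts are invented, since the entry only states that more lines are prefetched in the large and consecutive-cycle cases.

```cpp
struct LoadHistory {
    bool clearDirection;      // offsets within the line move steadily up or down
    bool largeLoads;          // recent loads were wide accesses
    bool consecutiveCycles;   // loads arrived on back-to-back clock cycles
};

struct PrefetchDecision {
    bool intoL1;              // otherwise the prefetch targets the second-level cache
    int numLines;             // how many lines ahead to prefetch
};

PrefetchDecision decidePrefetch(const LoadHistory& h) {
    PrefetchDecision d;
    d.intoL1 = h.clearDirection;                 // clear direction: first-level cache
    d.numLines = 1;
    if (h.largeLoads)        d.numLines += 1;    // wider accesses: fetch further ahead
    if (h.consecutiveCycles) d.numLines += 1;    // denser accesses: fetch further ahead
    return d;
}
```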
20110264861 | METHODS AND SYSTEMS FOR UTILIZING BYTECODE IN AN ON-DEMAND SERVICE ENVIRONMENT INCLUDING PROVIDING MULTI-TENANT RUNTIME ENVIRONMENTS AND SYSTEMS - Execution of code in a multitenant runtime environment. A request to execute code corresponding to a tenant identifier (ID) is received in a multitenant environment. The multitenant database stores data for multiple client entities each identified by a tenant ID having one of one or more users associated with the tenant ID. Users of each of multiple client entities can only access data identified by a tenant ID associated with the respective client entity. The multitenant database is a hosted database provided by an entity separate from the client entities, and provides on-demand database service to the client entities. Source code corresponding to the code to be executed is retrieved from a multitenant database. The retrieved source code is compiled. The compiled code is executed in the multitenant runtime environment. The memory used by the compiled code is freed in response to completion of the execution of the compiled code. | 10-27-2011 |
20110271057 | CACHE ACCESS FILTERING FOR PROCESSORS WITHOUT SECONDARY MISS DETECTION - The disclosed embodiments provide a system that filters duplicate requests from an L | 11-03-2011 |
20110276762 | COORDINATED WRITEBACK OF DIRTY CACHELINES - A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue. | 11-10-2011 |
20110276763 | MEMORY BUS WRITE PRIORITIZATION - A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory. | 11-10-2011 |
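The arbitration rule above grants the memory bus to writes in preference to reads once the lowest level cache holds many dirty cachelines. A sketch of that rule with assumed high and low watermarks (and hysteresis between them, which the entry does not mention) is shown below.

```cpp
class MemoryBusArbiter {
    unsigned dirtyLines_ = 0;
    unsigned highWater_;
    unsigned lowWater_;
    bool writesPrioritized_ = false;
public:
    MemoryBusArbiter(unsigned highWater, unsigned lowWater)
        : highWater_(highWater), lowWater_(lowWater) {}

    // Updated as the lowest level cache reports its count of dirty cachelines.
    void setDirtyLineCount(unsigned n) {
        dirtyLines_ = n;
        if (dirtyLines_ >= highWater_)      writesPrioritized_ = true;
        else if (dirtyLines_ <= lowWater_)  writesPrioritized_ = false;
        // between the watermarks the previous mode is kept (hysteresis)
    }

    // Decide the next memory-bus grant given which queues have work pending.
    bool grantWriteNext(bool readPending, bool writePending) const {
        if (!writePending) return false;
        if (!readPending)  return true;
        return writesPrioritized_;     // contended case: driven by the dirty count
    }
};
```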
20110289276 | CACHE MEMORY APPARATUS - A cache memory apparatus includes an L1 cache memory, an L2 cache memory coupled to the L1 cache memory, and an arithmetic logic unit (ALU) within the L2 cache memory, the combined ALU and L2 cache memory being configured to perform therewithin at least one of an arithmetic operation and a logical bit mask operation; the cache memory apparatus being further configured to interact with at least one processor such that atomic memory operations bypass the L1 cache memory and go directly to the L2 cache memory. | 11-24-2011 |
20110302372 | SMT/ECO MODE BASED ON CACHE MISS RATE - A computer implemented method for managing an execution mode for a parallel processor is provided. A monitor identifies a first efficiency rate for a first contested resource of the parallel processor operating in a first operating mode. Responsive to identifying the first efficiency rate for the first contested resource, the monitor identifies whether the first efficiency rate for the contested resource of the parallel processor operating in the first operating mode exceeds a threshold. Responsive to identifying that the efficiency rate for the contested resource exceeds the threshold, an operation of the parallel processor is changed to a second operating mode. | 12-08-2011 |
20110320720 | Cache Line Replacement In A Symmetric Multiprocessing Computer - Cache line replacement in a symmetric multiprocessing computer, the computer having a plurality of processors, a main memory that is shared among the processors, a plurality of cache levels including at least one high level of private caches and a low level shared cache, and a cache controller that controls the shared cache, including receiving in the cache controller a memory instruction that requires replacement of a cache line in the low level shared cache; and selecting for replacement by the cache controller a least recently used cache line in the low level shared cache that has no copy stored in any higher level cache. | 12-29-2011 |
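The replacement policy above selects, as the victim in the shared low-level cache, the least recently used line that has no copy in any higher-level private cache. A compact sketch of that selection follows; the presence-tracking bit and LRU timestamps are assumed representations.

```cpp
#include <cstdint>
#include <vector>

struct SharedLine {
    uint64_t tag;
    uint64_t lruStamp;            // smaller value = older (less recently used)
    bool copyInPrivateCache;      // presence bit maintained by the coherence logic
};

// Returns the index of the victim line in the set, or -1 if every line also
// has a copy in a higher-level cache (a fallback policy would then apply).
int selectVictim(const std::vector<SharedLine>& set) {
    int victim = -1;
    for (int i = 0; i < static_cast<int>(set.size()); ++i) {
        if (set[i].copyInPrivateCache) continue;     // skip lines cached above
        if (victim < 0 || set[i].lruStamp < set[victim].lruStamp)
            victim = i;
    }
    return victim;
}
```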
20110320721 | DYNAMIC TRAILING EDGE LATENCY ABSORPTION FOR FETCH DATA FORWARDED FROM A SHARED DATA/CONTROL INTERFACE - A computer-implemented method for managing data transfer in a multi-level memory hierarchy that includes receiving a fetch request for allocation of data in a higher level memory, determining whether a data bus between the higher level memory and a lower level memory is available, bypassing an intervening memory between the higher level memory and the lower level memory when it is determined that the data bus is available, and transferring the requested data directly from the higher level memory to the lower level memory. | 12-29-2011 |
20120017049 | METHOD AND APPARATUS FOR IMPLEMENTING CACHE COHERENCY OF A PROCESSOR - An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner. | 01-19-2012 |
20120030428 | INFORMATION PROCESSING DEVICE, MEMORY MANAGEMENT DEVICE AND MEMORY MANAGEMENT METHOD - According to one embodiment, an information processing device includes a first determination section and a setting section. The first determination section determines inconsistency between first data and second data. The first data is stored in a nonvolatile semiconductor memory. The second data corresponds to the first data and is stored in a semiconductor memory. The setting section sets execution timing of write back based on access frequency information associated with the second data. | 02-02-2012 |
20120030429 | METHOD FOR COORDINATING UPDATES TO DATABASE AND IN-MEMORY CACHE - A computer method and system of caching. In a multi-threaded application, different threads execute respective transactions accessing a data store (e.g. database) from a single server. The method and system represent status of datastore transactions using respective certain (e.g. Future) parameters. | 02-02-2012 |
20120042126 | METHOD FOR CONCURRENT FLUSH OF L1 AND L2 CACHES - The present invention provides a method and apparatus for use with a hierarchical cache system. The method may include concurrently flushing one or more first caches and a second cache of a multi-level cache. Each first cache is smaller and at a lower level in the multi-level cache than the second level cache. | 02-16-2012 |
20120042127 | CACHE PARTITIONING - A method and apparatus for partitioning a cache includes determining an allocation of a subcache out of a plurality of subcaches within the cache for association with a compute unit out of a plurality of compute units. Data is processed by the compute unit, and the compute unit evicts a line. The evicted line is written to the subcache associated with the compute unit. | 02-16-2012 |
20120059995 | Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. | 03-08-2012 |
20120066455 | HYBRID PREFETCH METHOD AND APPARATUS - A hybrid prefetch method and apparatus is disclosed. A processor includes a hybrid prefetch unit configured to generate addresses for accessing data from a system memory. The hybrid prefetch unit includes a first prediction unit configured to generate a first memory address according to a first prefetch algorithm and a second prediction unit configured to generate a second memory address according to a second prefetch algorithm. The hybrid prefetcher further includes an arbitration unit configured to select one of the first and second memory addresses and further configured to provide the selected one of the first and second memory addresses during a prefetch operation. | 03-15-2012 |
20120072667 | VARIABLE LINE SIZE PREFETCHER FOR MULTIPLE MEMORY REQUESTORS - A prefetch unit generates prefetch addresses in response to an initial received memory read request, an address associated with the initial received memory read request, a line length of the requestor of the initial received memory read request, and a request type width of the initial received memory read request. Prefetch operations are generated using the generated prefetch addresses, wherein each generated prefetch address is stored in a prefetch buffer slot that is selected by a prefetch FIFO (First In First Out) prefetch counter. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request. | 03-22-2012 |
20120072668 | SLOT/SUB-SLOT PREFETCH ARCHITECTURE FOR MULTIPLE MEMORY REQUESTORS - A prefetch unit generates a prefetch address in response to an address associated with a memory read request received from the first or second cache. The prefetch unit includes a prefetch buffer that is arranged to store the prefetch address in an address buffer of a selected slot of the prefetch buffer, where each slot of the prefetch unit includes a buffer for storing a prefetch address, and two sub-slots. Each sub-slot includes a data buffer for storing data that is prefetched using the prefetch address stored in the slot, and one of the two sub-slots of the slot is selected in response to a portion of the generated prefetch address. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request. | 03-22-2012 |
20120079202 | MULTISTREAM PREFETCH BUFFER - A prefetching system receives a memory read request having an associated address. In response to a determination that a most significant portion of the associated address is not present within slots of an array for storing the most significant portion of predicted addresses, a prefetch FIFO (First In-First Out) counter is modified to point to a next slot of the array and a new predicted address is generated in response to the received most significant portion of the associated address and is placed in the next slot of the array. The prefetch FIFO counter cycles through the slots of the array before wrapping around to a first slot of the array for storing the most significant portion of predicted addresses. | 03-29-2012 |
20120079203 | Transaction Info Bypass for Nodes Coupled to an Interconnect Fabric - A shared resource within a module may be accessed by a request from an external requester. An external transaction request may be received from an external requester outside the module for access to the shared resource that includes control information, not all of which is needed to access the shared resource. The external transaction request may be modified to form a modified request by removing a portion of the locally unneeded control information and storing the unneeded portion of control information as an entry in a bypass buffer. A reply received from the shared resource may be modified by appending the stored portion of control information from the entry in the bypass buffer before sending the modified reply to the external requester. | 03-29-2012 |
20120084511 | INEFFECTIVE PREFETCH DETERMINATION AND LATENCY OPTIMIZATION - A processor of an information handling system (IHS) initiates an L3 cache prefetch operation in response to a demand load during instruction processing. The processor selects an L3 cache prefetch at random for tracking as a target prefetched instruction. The processor initiates an L1 cache target prefetch operation and stores the resultant target prefetched instruction in the L1 cache. If a demand load arrives, the processor analyses the target prefetched instruction for effectiveness and determines the source of the prefetch data. If a demand does not arrive, the processor tests to determine if the particular prefetched instruction timed out in the cache and identifies the ineffectiveness of the prefetch operation. The processor samples multiple prefetch operations at random and generates a history of prefetch effectiveness and other useful prefetch information. The processor stores the prefetch effectiveness information to enable reduction or removal of ineffective prefetch operations. | 04-05-2012 |
20120089782 | METHOD FOR MANAGING AND TUNING DATA MOVEMENT BETWEEN CACHES IN A MULTI-LEVEL STORAGE CONTROLLER CACHE - A method for managing data movement in a multi-level cache system having a primary cache and a secondary cache. The method includes determining whether an unallocated space of the primary cache has reached a minimum threshold; selecting at least one outgoing data block from the primary cache when the primary cache reached the minimum threshold; initiating a de-stage process for de-staging the outgoing data block from the primary cache; and terminating the de-stage process when the unallocated space of the primary cache has reached an upper threshold. The de-stage process further includes determining whether a cache hit has occurred in the secondary cache before; storing the outgoing data block in the secondary cache when the cache hit has occurred in the secondary cache before; generating and storing metadata regarding the outgoing data block; and deleting the outgoing data block from the primary cache. | 04-12-2012 |
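The de-stage procedure above is driven by two thresholds on the primary cache's unallocated space and by whether an outgoing block previously hit in the secondary cache. The sketch below captures that control flow in simplified form; dirty-data handling and the metadata generation step are only hinted at in comments.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_set>

struct TieredControllerCache {
    std::deque<uint64_t> primary;               // LRU order, front = coldest block
    std::unordered_set<uint64_t> secondary;     // second-level (e.g., flash) cache
    std::unordered_set<uint64_t> secondaryHitHistory;   // blocks that hit in tier 2 before
    std::size_t capacity = 1024;                // primary cache capacity in blocks

    std::size_t freeSpace() const { return capacity - primary.size(); }

    // De-stage when free space falls below `minFree`, and keep de-staging
    // until `targetFree` (the upper threshold) is reached.
    void destageIfNeeded(std::size_t minFree, std::size_t targetFree) {
        if (freeSpace() >= minFree) return;
        while (freeSpace() < targetFree && !primary.empty()) {
            uint64_t block = primary.front();   // coldest outgoing block
            primary.pop_front();
            if (secondaryHitHistory.count(block))
                secondary.insert(block);        // previously useful: keep in tier 2
            // metadata describing the de-staged block would be generated here
        }
    }
};
```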
20120102269 | USING SPECULATIVE CACHE REQUESTS TO REDUCE CACHE MISS DELAYS - The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy. | 04-26-2012 |
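The idea above is to overlap the penalty of a likely cache miss by sending a speculative request to the next memory level while the first-level lookup is still in progress. The sketch below pairs that with a simple PC-indexed miss predictor; the predictor design is an assumption, since the entry does not say how "likely to miss" is determined.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

class MissPredictor {
    std::array<uint8_t, 1024> counters_{};                 // 2-bit saturating counters
    static std::size_t index(uint64_t pc) { return (pc >> 2) % 1024; }
public:
    bool likelyMiss(uint64_t pc) const { return counters_[index(pc)] >= 2; }
    void update(uint64_t pc, bool missed) {
        uint8_t& c = counters_[index(pc)];
        if (missed) { if (c < 3) ++c; }
        else        { if (c > 0) --c; }
    }
};

// Issued alongside the L1 tag lookup: when a miss is predicted, the request to
// the next memory level overlaps the L1 probe instead of waiting for the miss.
void accessWithSpeculation(MissPredictor& pred, uint64_t pc, uint64_t addr,
                           void (*sendSpeculativeL2Request)(uint64_t)) {
    if (pred.likelyMiss(pc))
        sendSpeculativeL2Request(addr);   // fired in parallel with the L1 lookup
    // ... the normal L1 lookup proceeds here; pred.update(pc, missed) is called
    //     once the actual hit/miss outcome of the reference is known ...
}
```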
20120110266 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed. | 05-03-2012 |
20120117326 | APPARATUS AND METHOD FOR ACCESSING CACHE MEMORY - The present invention relates to an apparatus and a method for accessing a cache memory. The cache memory comprises a level-one memory and a level-two memory. The apparatus for accessing the cache memory according to the present invention comprises a register unit and a control unit. The control unit receives a first read command and a reject datum of the level-one memory and stores the reject datum of the level-one memory to the register unit. Then the control unit reads and stores a stored datum of the level-two memory to the level-one memory according to the first read command. | 05-10-2012 |
20120124291 | Secondary Cache Memory With A Counter For Determining Whether to Replace Cached Data - A selective cache includes a set configured to receive data evicted from a number of primary sets of a primary cache. The selective cache also includes a counter associated with the set. The counter is configured to indicate a frequency of access to data within the set. A decision whether to replace data in the set with data from one of the primary sets is based on a value of the counter. | 05-17-2012 |
20120137074 | METHOD AND APPARATUS FOR STREAM BUFFER MANAGEMENT INSTRUCTIONS - A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy. | 05-31-2012 |
20120137075 | System and Method for a Cache in a Multi-Core Processor - The invention relates to a multi-core processor system, in particular a single-package multi-core processor system, comprising at least two processor cores, preferably at least four processor cores, each of said at least two cores, preferably at least four processor cores, having a local LEVEL-1 cache, a tree communication structure combining the multiple LEVEL-1 caches, the tree having at least one node, preferably at least three nodes for a four processor core multi-core processor, and TAG information is associated to data managed within the tree, usable in the treatment of the data. | 05-31-2012 |
20120144118 | METHOD AND APPARATUS FOR SELECTIVELY PERFORMING EXPLICIT AND IMPLICIT DATA LINE READS ON AN INDIVIDUAL SUB-CACHE BASIS - A method and apparatus are described for selectively performing explicit and implicit data line reads. A controller, located in a cache, individually monitors the data resource availability for each of a plurality of sub-caches also located in the cache. The controller receives a data line request, generates an individual implicit tag request for each of the sub-caches that currently have sufficient data resources to perform an implicit data line read, and generates an individual explicit tag request for each of the sub-caches that do not currently have sufficient data resources to perform an implicit data line read. Each tag request includes an address of the requested data line and an indicator (represented by at least one bit) of whether the tag request is an explicit or implicit tag request. | 06-07-2012 |
20120151144 | METHOD AND SYSTEM FOR DETERMINING A CACHE MEMORY CONFIGURATION FOR TESTING - A method and computer device for determining the cache memory configuration. The method includes allocating an amount of cache memory from a first memory level of the cache memory, and determining a read transfer time for the allocated amount of cache memory. The allocated amount of cache memory then is increased and the read transfer time for the increased allocated amount of cache memory is determined. The allocated amount of cache memory continues to be increased and the read transfer time determined for each allocated amount until all of the cache memory in all of the cache memory levels has been allocated. The cache memory configuration is determined based on the read transfer times from the allocated portions of the cache memory. The determined cache memory configuration includes the number of cache memory levels and the respective capacities of each cache memory level. | 06-14-2012 |
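The measurement procedure above can be demonstrated with a classic latency experiment: time dependent loads over working sets of increasing size and look for jumps in the per-access time, each jump marking the capacity of one cache level. The program below is a hedged, self-contained illustration of that idea rather than the patented method; it chases a randomized pointer ring to defeat hardware prefetching.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

// Average nanoseconds per access for a pointer chase over `bytes` of memory.
double measure(std::size_t bytes) {
    std::size_t n = std::max<std::size_t>(bytes / sizeof(std::size_t), 2);
    // Build a random cyclic permutation so hardware prefetchers cannot hide
    // the latency of the cache level being measured.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::shuffle(order.begin(), order.end(), std::mt19937{42});
    std::vector<std::size_t> next(n);
    for (std::size_t i = 0; i + 1 < n; ++i) next[order[i]] = order[i + 1];
    next[order[n - 1]] = order[0];                  // close the cycle

    volatile std::size_t idx = 0;
    const std::size_t iters = 5000000;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) idx = next[idx];   // dependent loads
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / iters;
}

int main() {
    // Working sets from 4 KiB to 64 MiB; jumps in the reported latency mark
    // the capacity of each cache level, yielding the number of levels and
    // their approximate sizes.
    for (std::size_t kb = 4; kb <= 64 * 1024; kb *= 2)
        std::printf("%8zu KiB : %6.2f ns/access\n", kb, measure(kb * 1024));
}
```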
20120159073 | METHOD AND APPARATUS FOR ACHIEVING NON-INCLUSIVE CACHE PERFORMANCE WITH INCLUSIVE CACHES - An apparatus and method for improving cache performance in a computer system having a multi-level cache hierarchy. For example, one embodiment of a method comprises: selecting a first line in a cache at level N for potential eviction; querying a cache at level M in the hierarchy to determine whether the first cache line is resident in the cache at level M, wherein M<N. | 06-21-2012 |
20120159074 | METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING DYNAMIC CACHE SIZING AND CACHE OPERATING VOLTAGE MANAGEMENT FOR OPTIMAL POWER PERFORMANCE - Embodiments of the invention relate to increased energy efficiency and conservation by reducing and increasing an amount of cache available for use by a processor, and an amount of power supplied to the cache and to the processor, based on the amount of cache actually being used by the processor to process data. For example, a power control unit (PCU) may monitor a last level cache (LLC) to identify if the size or amount of the cache being used by a processor to process data and to determine heuristics based on that amount. Based on the monitored amount of cache being used and the heuristics, the PCU causes a corresponding decrease or increase in an amount of the cache available for use by the processor, and a corresponding decrease or increase in an amount of power supplied to the cache and to the processor. | 06-21-2012 |
20120159075 | CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO D-CACHE REPLACEMENT SCHEME - A method of providing history based done logic includes receiving a cache line in a L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times. | 06-21-2012 |
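Read as an algorithm, the decision is small: a line whose previous L2 residency saw at least three accesses is installed into L1 on a hit, otherwise it is handed straight to the processor. The sketch below is a minimal illustration under that reading; the class, fields, and dictionary-backed caches are invented names, not the patented hardware.

```python
class HistoryBasedL2:
    """Sketch: a line's access history from its previous L2 residency decides
    whether a hit is also installed in L1 (threshold of three accesses)."""
    ACCESS_THRESHOLD = 3

    def __init__(self):
        self.lines = {}               # address -> data currently in L2
        self.previous_accesses = {}   # address -> accesses seen last residency
        self.current_accesses = {}    # address -> accesses this residency

    def fill(self, address, data):
        self.lines[address] = data
        self.current_accesses[address] = 0

    def evict(self, address):
        # Remember how heavily the line was used, for its next visit.
        self.previous_accesses[address] = self.current_accesses.pop(address, 0)
        del self.lines[address]

    def lookup(self, address, l1_cache):
        data = self.lines.get(address)
        if data is None:
            return None                                   # miss in L2
        self.current_accesses[address] += 1
        if self.previous_accesses.get(address, 0) >= self.ACCESS_THRESHOLD:
            l1_cache[address] = data                      # reuse history: install in L1
        # Otherwise the line is returned directly to the processor and
        # is not allocated into L1.
        return data
```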
20120166729 | SUBCACHE AFFINITY - A method and apparatus for controlling affinity of subcaches is disclosed. When a core compute unit evicts a line of victim data, a prioritized search for space allocation on available subcaches is executed, in order of proximity between the subcache and the compute unit. The victim data may be injected into an adjacent subcache if space is available. Otherwise, a line may be evicted from the adjacent subcache to make room for the victim data or the victim data may be sent to the next closest subcache. To retrieve data, a core compute unit sends a Tag Lookup Request message directly to the nearest subcache as well as to a cache controller, which controls routing of messages to all of the subcaches. A Tag Lookup Response message is sent back to the cache controller to indicate if the requested data is located in the nearest sub-cache. | 06-28-2012 |
20120179872 | GLOBAL INSTRUCTIONS FOR SPIRAL CACHE MANAGEMENT - A method of operation of a pipelined cache memory supports global operations within the cache. The cache may be a spiral cache, with a move-to-front M2F network for moving values from a backing store to a front-most tile coupled to a processor or lower-order level of a memory hierarchy and a spiral push-back network for pushing out modified values to the backing-store. The cache controller manages application of global commands by propagating individual commands to the tiles. The global commands may provide zeroing, flushing and reconciling of the given tiles. Commands for interrupting and resuming interrupted global commands may be implemented, to reduce halting or slowing of processing while other global operations are in process. A line detector within each tile supports reconcile and flush operations, and a line patcher in the controller provides for initializing address ranges with no processor intervention. | 07-12-2012 |
20120191913 | Distributed User Controlled Multilevel Block and Global Cache Coherence with Accurate Completion Status - This invention permits user controlled cache coherence operations with the flexibility to do these operations on all levels of cache together or each level independently. In the case of an all level operation, the user does not have to monitor and sequence each phase of the operation. This invention also provides a way for users to track completion of these operations. This is critical for multi-core/multi-processor devices. Multiple cores may be accessing the end point and the user/application needs to be able to identify when the operation from one core is complete, before permitting other cores to access that data or code. | 07-26-2012 |
20120191914 | PERFORMANCE AND POWER IMPROVEMENT ON DMA WRITES TO LEVEL TWO COMBINED CACHE/SRAM THAT IS CAUSED IN LEVEL ONE DATA CACHE AND LINE IS VALID AND DIRTY - This invention optimizes DMA writes to directly addressable level two memory that is cached in level one when the line is valid and dirty. When the level two controller detects that a line is valid and dirty in level one, the level two memory need not update its copy of the data. Level one memory will replace the level two copy with a victim writeback at a future time. Thus the level two memory need not write a copy. This limits the number of DMA writes to level two directly addressable memory and thus improves performance and minimizes dynamic power. This also frees the level two memory for other master/requestors. | 07-26-2012 |
20120191915 | EFFICIENT LEVEL TWO MEMORY BANKING TO IMPROVE PERFORMANCE FOR MULTIPLE SOURCE TRAFFIC AND ENABLE DEEPER PIPELINING OF ACCESSES BY REDUCING BANK STALLS - The level two memory of this invention supports coherency data transfers with level one cache and DMA data transfers. The width of DMA transfers is 16 bytes. The width of level one instruction cache transfers is 32 bytes. The width of level one data transfers is 64 bytes. The width of level two allocates is 128 bytes. DMA transfers are interspersed with CPU traffic and have similar requirements of efficient throughput and reduced latency. An additional challenge is that these two data streams (CPU and DMA) require access to the level two memory at the same time. This invention is a banking technique for the level two memory to facilitate efficient data transfers. | 07-26-2012 |
20120191916 | OPTIMIZING TAG FORWARDING IN A TWO LEVEL CACHE SYSTEM FROM LEVEL ONE TO LEVEL TWO CONTROLLERS FOR CACHE COHERENCE PROTOCOL FOR DIRECT MEMORY ACCESS TRANSFERS - A second level memory controller uses shadow tags | 07-26-2012 |
20120198160 | Efficient Cache Allocation by Optimizing Size and Order of Allocate Commands Based on Bytes Required by CPU - This invention is a data processing system having a multi-level cache system. The multi-level cache system includes at least one first level cache and a second level cache. Upon a cache miss in both the at least one first level cache and the second level cache, the data processing system evicts and allocates a cache line within the second level cache. The data processing system determines from the miss address whether the request falls within the low half or the high half of the allocated cache line. The data processing system first requests from external memory the half of the cache line containing the miss. Upon receipt, the data is supplied to the at least one first level cache and the CPU. The data processing system then requests from external memory the other half of the second level cache line. | 08-02-2012 |
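The ordering can be shown in a few lines: determine which half of the allocated line the miss address falls in, then issue the external requests with that half first. The 128-byte line size and the request representation are assumptions made for the sketch.

```python
L2_LINE_SIZE = 128            # assumed second level allocate width
HALF = L2_LINE_SIZE // 2

def split_allocate_requests(miss_address):
    """Return (address, length) external-memory requests ordered so the half
    containing the missing bytes is fetched first."""
    line_base = miss_address & ~(L2_LINE_SIZE - 1)
    offset = miss_address - line_base
    low_half = (line_base, HALF)
    high_half = (line_base + HALF, HALF)
    if offset < HALF:
        return [low_half, high_half]   # miss falls in the low half
    return [high_half, low_half]       # miss falls in the high half

# Example: a miss at 0x1A64 (offset 100 >= 64) fetches the high half first.
print(split_allocate_requests(0x1A64))
```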
20120198161 | NON-BLOCKING, PIPELINED WRITE ALLOCATES WITH ALLOCATE DATA MERGING IN A MULTI-LEVEL CACHE SYSTEM - This invention handles write request cache misses. The cache controller stores write data, sends a read request to external memory for a corresponding cache line, merges the write data with data returned from the external memory and stores merged data in the cache. The cache controller includes buffers with plural entries storing the write address, the write data, the position of the write data within a cache line and a unique identification number. This stored data enables the cache controller to proceed to service other access requests while waiting for a response from the external memory. | 08-02-2012 |
20120198162 | Hazard Prevention for Data Conflicts Between Level One Data Cache Line Allocates and Snoop Writes - A comparator compares the address of DMA writes in the final entry of the FIFO stack to all pending read addresses in a monitor memory. If there is no match, then the DMA access is permitted to proceed. If the DMA write is to a cache line with a pending read, the DMA write access is stalled together with any DMA accesses behind the DMA write in the FIFO stack. DMA read accesses are not compared but may stall behind a stalled DMA write access. These stalls occur if the cache read was potentially cacheable. This is possible for some monitored accesses but not all. If a DMA write is stalled, the comparator releases it to complete once there are no pending reads to the same cache line. | 08-02-2012 |
20120198163 | Level One Data Cache Line Lock and Enhanced Snoop Protocol During Cache Victims and Writebacks to Maintain Level One Data Cache and Level Two Cache Coherence - This invention assures cache coherence in a multi-level cache system upon eviction of a higher level cache line. A victim buffer stores data of evicted lines. On a DMA access that may be cached in the higher level cache, the lower level cache sends a snoop write. The address of this snoop write is compared with the victim buffer. On a hit in the victim buffer the write completes in the victim buffer. When the victim data passes to the next cache level it is written into a second victim buffer to be retired when the data is committed to cache. DMA write addresses are compared to addresses in this second victim buffer. On a match the write takes place in the second victim buffer. On a failure to match the controller sends a snoop write. | 08-02-2012 |
20120198164 | Programmable Address-Based Write-Through Cache Control - This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data, the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit. | 08-02-2012 |
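A rough sketch of that lookup, assuming a programmable list of address ranges and dictionary-backed cache levels; all names are illustrative rather than taken from the application.

```python
WRITE_THROUGH, WRITE_BACK = "WT", "WB"

class MemoryAttributeRegister:
    """Analogue of the memory mapped control register: programmable address
    ranges, each mapped to a write policy."""
    def __init__(self):
        self.regions = []                       # (start, end, policy)

    def set_region(self, start, end, policy):
        self.regions.append((start, end, policy))

    def policy_for(self, address, default=WRITE_BACK):
        for start, end, policy in self.regions:
            if start <= address < end:
                return policy
        return default

class CacheLevel:
    def __init__(self, name):
        self.name, self.lines = name, {}

    def write(self, address, data):
        if address in self.lines:               # only update on a hit
            self.lines[address] = data
            return True
        return False

def handle_store(address, data, cache_levels, main_memory, mar):
    """cache_levels is ordered L1 first. Write-through regions update every
    level holding the line and memory; write-back regions stop at the first
    level that services the write."""
    policy = mar.policy_for(address)
    serviced = False
    for level in cache_levels:
        if level.write(address, data):
            serviced = True
            if policy == WRITE_BACK:
                return                          # first servicing level only
    if policy == WRITE_THROUGH or not serviced:
        main_memory[address] = data             # WT (or a full miss) reaches memory
```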
20120198165 | Mechanism to Update the Status of In-Flight Cache Coherence In a Multi-Level Cache Hierarchy - Separate buffers store snoop writes and direct memory access writes. A multiplexer selects one of these for input to a FIFO buffer. The FIFO buffer is split into multiple FIFOs including: a command FIFO; an address FIFO; and a write data FIFO. Each snoop command is compared with an allocated line set and way and deleted on a match to avoid data corruption. Each snoop command is also compared with a victim address. If the snoop address matches a victim address, logic redirects the snoop command to a victim buffer and the snoop write is completed in the victim buffer. | 08-02-2012 |
20120198166 | Memory Attribute Sharing Between Differing Cache Levels of Multilevel Cache - The level one memory controller maintains a local copy of the cacheability bit of each memory attribute register. The level two memory controller is the initiator of all configuration read/write requests from the CPU. Whenever a configuration write is made to a memory attribute register, the level one memory controller updates its local copy of the memory attribute register. | 08-02-2012 |
20120198167 | SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state. | 08-02-2012 |
20120203968 | COORDINATED WRITEBACK OF DIRTY CACHELINES - A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue. | 08-09-2012 |
20120203969 | MEMORY BUS WRITE PRIORITIZATION - A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory. | 08-09-2012 |
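The arbitration rule reduces to a small decision function; the dirty-line threshold and the queue representation below are assumptions chosen only to illustrate the priority flip.

```python
def grant_bus(read_queue, write_queue, dirty_cacheline_count,
              dirty_threshold=1024):
    """Pick the next memory-bus operation. Writes win when the lowest level
    cache holds many dirty lines (so castouts are drained before they back
    up); otherwise reads keep priority to minimize load latency."""
    if write_queue and (not read_queue or dirty_cacheline_count >= dirty_threshold):
        return ("write", write_queue.pop(0))
    if read_queue:
        return ("read", read_queue.pop(0))
    return None

# Example: with few dirty lines the read goes first; past the threshold the
# pending write is granted the bus instead.
print(grant_bus(["rd@0x100"], ["wr@0x200"], dirty_cacheline_count=10))
print(grant_bus(["rd@0x100"], ["wr@0x200"], dirty_cacheline_count=5000))
```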
20120210068 | SYSTEMS AND METHODS FOR A MULTI-LEVEL CACHE - A multi-level cache comprises a plurality of cache levels, each configured to cache I/O request data pertaining to I/O requests of a different respective type and/or granularity. A cache device manager may allocate cache storage space to each of the cache levels. Each cache level maintains respective cache metadata that associates I/O request data with a respective cache address. The cache levels monitor I/O requests within a storage stack, apply selection criteria to identify cacheable I/O requests, and service cacheable I/O requests using the cache storage device. | 08-16-2012 |
20120210069 | SHARED CACHE FOR A TIGHTLY-COUPLED MULTIPROCESSOR - Computing apparatus ( | 08-16-2012 |
20120215983 | DATA CACHING METHOD - Data caching for use in a computer system including a lower cache memory and a higher cache memory. The higher cache memory receives a fetch request. The higher cache memory then determines the state of the entry to be replaced next. If the state of the entry to be replaced next indicates that the entry is exclusively owned or modified, that state is changed such that a following cache access is processed at a higher speed compared to an access processed if the state stayed unchanged. | 08-23-2012 |
20120221793 | SYSTEMS AND METHODS FOR RECONFIGURING CACHE MEMORY - A microprocessor system is disclosed that includes a first data cache that is shared by a first group of one or more program threads in a multi-thread mode and used by one program thread in a single-thread mode. A second data cache is shared by a second group of one or more program threads in the multi-thread mode and is used as a victim cache for the first data cache in the single-thread mode. | 08-30-2012 |
20120221794 | Computer Cache System With Stratified Replacement - Methods for selecting a line to evict from a data storage system are provided. A computer system implementing a method for selecting a line to evict from a data storage system is also provided. The methods include selecting an uncached class line for eviction prior to selecting a cached class line for eviction. | 08-30-2012 |
20120226866 | DYNAMIC MIGRATION OF VIRTUAL MACHINES BASED ON WORKLOAD CACHE DEMAND PROFILING - A computer-implemented method comprises obtaining a cache hit ratio for each of a plurality of virtual machines, and identifying, from among the plurality of virtual machines, a first virtual machine having a cache hit ratio that is less than a threshold ratio. The identified first virtual machine is then migrated from a first physical server having a first cache size to a second physical server having a second cache size that is greater than the first cache size. Optionally, a virtual machine having a cache hit ratio that is less than a threshold ratio is identified on a class-specific basis, such as for L1 cache, L2 cache and L3 cache. | 09-06-2012 |
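Assuming the hit ratios and cache sizes are already known per virtual machine and per server, the migration decision can be sketched as follows; the data layout and the 0.80 threshold are placeholders, not values from the application.

```python
def plan_migrations(vms, servers, hit_ratio_threshold=0.80):
    """vms: list of dicts with 'name', 'hit_ratio', 'server'.
    servers: dict name -> {'cache_size': bytes}.
    Returns (vm, target_server) pairs for VMs whose hit ratio is below the
    threshold, targeting a server with a larger cache than the current one."""
    plans = []
    for vm in vms:
        if vm['hit_ratio'] >= hit_ratio_threshold:
            continue                                  # cache demand is satisfied
        current_size = servers[vm['server']]['cache_size']
        bigger = [s for s, info in servers.items()
                  if info['cache_size'] > current_size]
        if bigger:
            target = max(bigger, key=lambda s: servers[s]['cache_size'])
            plans.append((vm['name'], target))
    return plans

servers = {'hostA': {'cache_size': 8 << 20}, 'hostB': {'cache_size': 32 << 20}}
vms = [{'name': 'vm1', 'hit_ratio': 0.65, 'server': 'hostA'},
       {'name': 'vm2', 'hit_ratio': 0.93, 'server': 'hostA'}]
print(plan_migrations(vms, servers))   # [('vm1', 'hostB')]
```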
20120226867 | Binary tree based multilevel cache system for multicore processors - A binary tree based multi-level cache system for multi-core processors is described, along with two possible implementations, the LogN and LogN+1 models, which maintain a true pyramid. | 09-06-2012 |
20120233407 | CACHE PHASE DETECTOR AND PROCESSOR CORE - A cache phase detector included in a processor core according to example embodiments includes a counting unit and a signal generating unit. The counting unit generates a critical section miscount by counting a request from the processor core resulting in a tag miss and a valid cache line based on a tag miss signal and a cache line valid signal. The signal generating unit compares the critical section miscount from the counting unit with a reference value, and generates a cache phase change signal if the critical section miscount is greater than the reference value. | 09-13-2012 |
20120239883 | RESOURCE SHARING TO REDUCE IMPLEMENTATION COSTS IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers in a given tag unit may share access to a resource that may include one or more of an interconnect egress port coupled to the interconnect network, an interconnect ingress port coupled to the interconnect network, a test controller, or a data storage structure. | 09-20-2012 |
20120246407 | METHOD AND SYSTEM TO IMPROVE UNALIGNED CACHE MEMORY ACCESSES - A method and system to improve unaligned cache memory accesses. In one embodiment of the invention, a processing unit has logic to facilitate access of at least two cache memory lines of a cache memory in a single read operation. By doing so, it avoids additional read operations or cycles to read the required data that is cached in more than one cache memory line. Embodiments of the invention facilitate the streaming of unaligned vector loads without requiring substantially more power than streaming aligned vector loads. For example, in one embodiment of the invention, the streaming of unaligned vector loads consumes less than two times the power requirements of streaming aligned vector loads. | 09-27-2012 |
20120254540 | METHOD AND SYSTEM FOR OPTIMIZING PREFETCHING OF CACHE MEMORY LINES - A method and system to optimize prefetching of cache memory lines in a processing unit. The processing unit has logic to determine whether a vector memory operand is cached in two or more adjacent cache memory lines. In one embodiment of the invention, the determination of whether the vector memory operand is cached in two or more adjacent cache memory lines is based on the size and the starting address of the vector memory operand. In one embodiment of the invention, the pre-fetching of the two or more adjacent cache memory lines that cache the vector memory operand is performed using a single instruction that uses one issue slot and one data cache memory execution slot. By doing so, it avoids additional software prefetching instructions or operations to read a single vector memory operand when the vector memory operand is cached in more than one cache memory line. | 10-04-2012 |
20120254541 | METHODS AND APPARATUS FOR UPDATING DATA IN PASSIVE VARIABLE RESISTIVE MEMORY - Methods and apparatus for updating data in passive variable resistive memory (PVRM) are provided. In one example, a method for updating data stored in PVRM is disclosed. The method includes updating a memory block of a plurality of memory blocks in a cache hierarchy without invalidating the memory block. The updated memory block may be copied from the cache hierarchy to a write through buffer. Additionally, the method includes writing the updated memory block to the PVRM, thereby updating the data in the PVRM. | 10-04-2012 |
20120260041 | SIMULTANEOUS EVICTION AND CLEANING OPERATIONS IN A CACHE - Embodiments provide a method comprising receiving, at a cache associated with a central processing unit that is disposed on an integrated circuit, a request to perform a cache operation on the cache; in response to receiving and processing the request, determining that first data cached in a first cache line of the cache is to be written to a memory that is coupled to the integrated circuit; identifying a second cache line in the cache, the second cache line being complementary to the first cache line; transmitting a single memory instruction from the cache to the memory to write to the memory (i) the first data from the first cache line and (ii) second data from the second cache line; and invalidating the first data in the first cache line, without invalidating the second data in the second cache line. | 10-11-2012 |
20120265938 | PERFORMING A PARTIAL CACHE LINE STORAGE-MODIFYING OPERATION BASED UPON A HINT - Analyzing pre-processed code includes identifying at least one storage-modifying construct specifying a storage-modifying memory access to a memory hierarchy of a data processing system and determining if more than one granule of a cache line of data containing multiple granules that is targeted by the storage-modifying construct is subsequently referenced by said pre-processed code. Post-processed code including a storage-modifying instruction corresponding to the at least one storage-modifying construct in the pre-processed code is generated and stored. Generating the post-processed code includes marking the storage-modifying instruction with a partial cache line hint indicating that said storage-modifying instruction targets less than a full cache line of data within a memory hierarchy if the analyzing indicates only one granule of the target cache line will be accessed while the cache line is held in the cache memory and otherwise refraining from marking the storage-modifying instruction with the partial cache line hint. | 10-18-2012 |
20120272003 | EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A microprocessor configured to access an external memory includes a first-level cache, a second-level cache, and a bus interface unit (BIU) configured to interface the first-level and second-level caches to a bus used to access the external memory. The BIU is configured to prioritize requests from the first-level cache above requests from the second-level cache. The second-level cache is configured to generate a first request to the BIU to fetch a cache line from the external memory. The second-level cache is also configured to detect that the first-level cache has subsequently generated a second request to the second-level cache for the same cache line. The second-level cache is also configured to request the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted ownership of the bus to fulfill the first request. | 10-25-2012 |
20120272004 | EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A memory subsystem in a microprocessor includes a first-level cache, a second-level cache, and a prefetch cache configured to speculatively prefetch cache lines from a memory external to the microprocessor. The second-level cache and the prefetch cache are configured to allow the same cache line to be simultaneously present in both. If a request by the first-level cache for a cache line hits in both the second-level cache and in the prefetch cache, the prefetch cache invalidates its copy of the cache line and the second-level cache provides the cache line to the first-level cache. | 10-25-2012 |
20120297139 | MEMORY MANAGEMENT UNIT, APPARATUSES INCLUDING THE SAME, AND METHOD OF OPERATING THE SAME - A method of operating a memory management unit includes accessing a translation lookaside buffer (TLB), translating a page number of a virtual address into a frame number of a physical address when there is a match for the page number of the virtual address in the TLB, and executing a miss process when there is no match for the page number of the virtual address in the TLB. The miss process includes accessing a page table translation (PTT) cache, checking whether access information of a k-th level page table corresponding to a k-th page number that will be accessed in the virtual address is in the PTT cache, acquiring a base address of a physical page using the access information, and determining the frame number of the physical address corresponding to the page number of the virtual address using a page offset in the physical page. | 11-22-2012 |
20120311267 | EXTERNAL CACHE OPERATION BASED ON CLEAN CASTOUT MESSAGES - A processor transmits clean castout messages indicating that a cache line is not dirty and is no longer being stored by a lowest level cache of the processor. An external cache receives the clean castout messages and manages cache lines based in part on the clean castout messages. | 12-06-2012 |
20120317360 | Cache Streaming System - A system having a stream cache and a storage. The stream cache includes a stream cache controller adapted to control or mediate input data transmitted through the stream cache; and a stream cache memory. The stream cache memory is adapted to both store at least first portions of the input data, as determined by the stream cache controller, and to further output the stored first portions of the input data to a processor. The storage is adapted to receive and store second portions of the input data, as determined by the stream cache controller, and to further transmit the stored second portions of the input data for output to the processor. | 12-13-2012 |
20130013863 | Hybrid Caching Techniques and Garbage Collection Using Hybrid Caching Techniques - Hybrid caching techniques and garbage collection using hybrid caching techniques are provided. A determination of a measure of a characteristic of a data object is performed, the characteristic being indicative of an access pattern associated with the data object. A selection of one caching structure, from a plurality of caching structures, is performed in which to store the data object based on the measure of the characteristic. Each individual caching structure in the plurality of caching structures stores data objects having a similar measure of the characteristic with regard to each of the other data objects in that individual caching structure. The data object is stored in the selected caching structure and at least one processing operation is performed on the data object stored in the selected caching structure. | 01-10-2013 |
20130031308 | DEVICE DRIVER FOR USE IN A DATA STORAGE SYSTEM - A device driver includes an aggregator aggregating data blocks into one or more container objects suited for storage in an object store; and a logger for maintaining, in at least one log file, for each data block an identification of the container object in which the data block is stored, together with an identification of the location of the data block in the container object. | 01-31-2013 |
20130036269 | PLACEMENT OF DATA IN SHARDS ON A STORAGE DEVICE - A method, system and computer program product for placing data in shards on a storage device may include determining placement of a data set in one of a plurality of shards on the storage device. Each one of the shards may include a different at least one performance feature. Each different at least one performance feature may correspond to a different at least one predetermined characteristic associated with a particular set of data. The data set is cached in the one of the plurality of shards on the storage device that includes the at least one performance feature corresponding to the at least one predetermined characteristic associated with the data set being cached. | 02-07-2013 |
20130042068 | SHADOW REGISTERS FOR LEAST RECENTLY USED DATA IN CACHE - A cache for use in a central processing unit (CPU) of a computer includes a data array; a tag array configured to hold a list of addresses corresponding to each data entry held in the data array; a least recently used (LRU) array configured to hold data indicating least recently used data entries in the data array; a line fill buffer configured to receive data from an address in main memory that is located external to the cache in the event of a cache miss; and a shadow register associated with the line fill buffer, wherein the shadow register is configured to hold LRU data indicating a current state of the LRU array. | 02-14-2013 |
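One way to picture the shadow register is as a snapshot of the LRU state captured when the line fill buffer is loaded, so the fill can later complete against a consistent view of the set. The single-set model below is a sketch under that interpretation, with invented names throughout.

```python
class SetWithShadowLRU:
    """Single cache set: the shadow register snapshots the LRU state when a
    line fill starts, so the fill completes against a consistent view."""
    def __init__(self, ways):
        self.ways = ways
        self.lru_order = list(range(ways))    # front = LRU way, back = MRU way
        self.tags = [None] * ways
        self.data = [None] * ways
        self.fill_pending_tag = None          # tag being fetched from main memory
        self.shadow_lru = None                # snapshot of lru_order at miss time

    def access(self, tag):
        if tag in self.tags:                  # hit: promote the way to MRU
            way = self.tags.index(tag)
            self.lru_order.remove(way)
            self.lru_order.append(way)
            return self.data[way]
        # Miss: start a line fill and capture the current LRU state.
        self.fill_pending_tag = tag
        self.shadow_lru = list(self.lru_order)
        return None

    def fill_complete(self, data):
        victim_way = self.shadow_lru[0]       # victim chosen from the snapshot
        self.tags[victim_way] = self.fill_pending_tag
        self.data[victim_way] = data
        self.lru_order.remove(victim_way)
        self.lru_order.append(victim_way)     # the filled line becomes MRU
        self.fill_pending_tag = None
        self.shadow_lru = None
        return victim_way
```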
20130042069 | Apparatus And A Method For Obtaining A Blur Image - The present invention provides an apparatus and a method for obtaining a blur image in computer graphics that reduce the required memory. | 02-14-2013 |
20130046936 | DATA PROCESSING SYSTEM OPERABLE IN SINGLE AND MULTI-THREAD MODES AND HAVING MULTIPLE CACHES AND METHOD OF OPERATION - Systems and methods are disclosed for a computer system that includes a first load/store execution unit | 02-21-2013 |
20130054897 | Use of Cache Statistics to Ration Cache Hierarchy Access - A method, system and program are provided for controlling access to a specified cache level in a cache hierarchy in a multiprocessor system by evaluating cache statistics for a specified application at the specified cache level against predefined criteria to prevent the specified application from accessing the specified cache level if the specified application does not meet the predefined criteria. | 02-28-2013 |
20130067169 | DYNAMIC CACHE QUEUE ALLOCATION BASED ON DESTINATION AVAILABILITY - An apparatus for controlling operation of a cache includes a first command queue, a second command queue and an input controller configured to receive requests having a first command type and a second command type and to assign a first request having the first command type to the first command queue and a second request having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available. | 03-14-2013 |
20130080705 | MANAGING IN-LINE STORE THROUGHPUT REDUCTION - Various embodiments of the present invention manage a hierarchical store-through memory cache structure. A store request queue is associated with a processing core in multiple processing cores. At least one blocking condition is determined to have occurred at the store request queue. Multiple non-store requests and a set of store requests associated with a remaining set of processing cores in the multiple processing cores are dynamically blocked from accessing a memory cache in response to the blocking condition having occurred. | 03-28-2013 |
20130086324 | INTELLIGENCE FOR CONTROLLING VIRTUAL STORAGE APPLIANCE STORAGE ALLOCATION - A change in workload characteristics detected at one tier of a multi-tiered cache is communicated to another tier of the multi-tiered cache. Multiple caching elements exist at different tiers, and at least one tier includes a cache element that is dynamically resizable. The communicated change in workload characteristics causes the receiving tier to adjust at least one aspect of cache performance in the multi-tiered cache. In one aspect, at least one dynamically resizable element in the multi-tiered cache is resized responsive to the change in workload characteristics. | 04-04-2013 |
20130086325 | DYNAMIC CACHE SYSTEM AND METHOD OF FORMATION - Embodiments of the present invention provide a dynamic cache system comprising: a multi-level inspector design that handles multi-level data formats; a cache function design that handles multi-level data formats; a cache size controller design that is able to handle the varying cache sizes based on characteristics such as hit-rates, usage patterns, etc.; a cache behavior controller design that handles different types of files; and a heterogeneous storage controller design that is configured to handle volumes of the storage based on the types of storage (RAM Disk, flash, HDD, etc.). Advantages of the system include (among others): caching for different types of data when different types of data need to be cached, and/or cache size can be allocated based on the cache level (which itself can be established). | 04-04-2013 |
20130086326 | SYSTEM AND METHOD FOR SUPPORTING A TIERED CACHE - A computer-implemented method and system can support a tiered cache, which includes a first cache and a second cache. The first cache operates to receive a request to at least one of update and query the tiered cache; and the second cache operates to perform at least one of an updating operation and a querying operation with respect to the request via at least one of a forward strategy and a listening scheme. | 04-04-2013 |
20130097383 | METHODS FOR PROVIDING A RESPONSE AND SYSTEMS THEREOF - A method, computer readable medium, and system for generating a response includes determining from which of a plurality of levels of cache to retrieve a response. The determination is based on a number of matches between current user session data associated with a current request and stored user session data rewritten into each of one or more metadata data variables for the response when a current request for the response matches at least one prior stored request for the response. The response from the determined level of the plurality of levels of cache is provided. | 04-18-2013 |
20130111133 | DYNAMICALLY ADJUSTED THRESHOLD FOR POPULATION OF SECONDARY CACHE | 05-02-2013 |
20130111134 | MANAGEMENT OF PARTIAL DATA SEGMENTS IN DUAL CACHE SYSTEMS | 05-02-2013 |
20130111135 | VARIABLE CACHE LINE SIZE MANAGEMENT | 05-02-2013 |
20130111136 | VARIABLE CACHE LINE SIZE MANAGEMENT | 05-02-2013 |
20130111137 | PROCESSOR-CACHE SYSTEM AND METHOD | 05-02-2013 |
20130111138 | MULTI-CORE PROCESSOR SYSTEM, COMPUTER PRODUCT, AND CONTROL METHOD | 05-02-2013 |
20130111139 | Optimizing a Cache Back Invalidation Policy | 05-02-2013 |
20130117509 | TECHNIQUE TO SHARE INFORMATION AMONG DIFFERENT CACHE COHERENCY DOMAINS - A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device. | 05-09-2013 |
20130124800 | APPARATUS AND METHOD FOR REDUCING PROCESSOR LATENCY - There is provided a data processing system comprising a central processing unit, a processor cache memory operably coupled to the central processing unit and an external connection operably coupled to the central processing unit and processor cache memory in which a portion of the data processing system is arranged to load data directly from the external connection into the processor cache memory and modify a source address of said directly loaded data. There is also provided a method of improving latency in a data processing system having a central processing unit operably coupled to a processor cache memory and an external connection operably coupled to the central processing unit and processor cache memory, comprising loading data directly from the external connection into the processor cache memory and modifying a source address for said data to become indicative of a location other than from the external connection. | 05-16-2013 |
20130132674 | METHOD AND SYSTEM FOR DISTRIBUTING TIERED CACHE PROCESSING ACROSS MULTIPLE PROCESSORS - A data storage system having at least one cache and at least two processors balances the load of data access operations by directing certain processes in each data access operation to one of the processors. Each processor may be optimized for its specific processes. One processor may be dedicated to receiving and servicing data access requests; another processor may be dedicated to background tasks and cache management. | 05-23-2013 |
20130132675 | DATA PROCESSING APPARATUS HAVING A CACHE CONFIGURED TO PERFORM TAG LOOKUP AND DATA ACCESS IN PARALLEL, AND A METHOD OF OPERATING THE DATA PROCESSING APPARATUS - A data processing apparatus has a cache with a data array and a tag array. The tag array stores address tag portions associated with the data values in the data array. The cache performs a tag lookup, comparing a tag portion of a received address with a set of tag entries in the tag array. The data array includes a partial tag store storing a partial tag value in association with each data entry. In parallel with the tag lookup, a partial tag value of the received address is compared with partial tag values stored in association with a set of data entries in said data array. A data value is read out if a match condition occurs. Exclusivity circuitry ensures that at most one partial tag value of said partial tag values stored in association with said set of data entries can generate said match condition. | 05-23-2013 |
20130138887 | SELECTIVELY DROPPING PREFETCH REQUESTS BASED ON PREFETCH ACCURACY INFORMATION - The disclosed embodiments relate to a system that selectively drops a prefetch request at a cache. During operation, the system receives the prefetch request at the cache. Next, the system identifies a prefetch source for the prefetch request, and then uses accuracy information for the identified prefetch source to determine whether to drop the prefetch request. In some embodiments, the accuracy information includes accuracy information for different prefetch sources. In this case, determining whether to drop the prefetch request involves first identifying a prefetch source for the prefetch request, and then using accuracy information for the identified prefetch source to determine whether to drop the prefetch request. | 05-30-2013 |
20130145095 | METHOD AND SYSTEM FOR INTEGRATING THE FUNCTIONS OF A CACHE SYSTEM WITH A STORAGE TIERING SYSTEM - A tiered data storage system having a cache employs a tiering management subsystem to analyze data access patterns over time, and a cache management subsystem to monitor individual input/output operations and replicate data in the cache. The tiering management subsystem determines a distribution of data between tiers and determines what data should be cached while the cache management subsystem moves data into the cache. The tiered data storage system may analyze individual input/output operations to determine if data should be consolidated from multiple regions in one or more data storage tiers into a single region. | 06-06-2013 |
20130151777 | Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Hit Rate - A mechanism is provided for dynamic cache allocation using a cache hit rate. A first cache hit rate is monitored in a first subset utilizing a first allocation policy of N sets of a lower level cache. A second cache hit rate is also monitored in a second subset utilizing a second allocation policy different from the first allocation policy of the N sets of the lower level cache. A periodic comparison of the first cache hit rate to the second cache hit rate is made to identify a third allocation policy for a third subset of the N-sets of the lower level cache. The third allocation policy for the third subset is then periodically adjusted to at least one of the first allocation policy or the second allocation policy based on the comparison of the first cache hit rate to the second cache hit rate. | 06-13-2013 |
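The monitor-and-follow structure resembles set dueling: dedicate small disjoint samples of sets to each fixed policy, compare their hit rates periodically, and point the remaining sets at the winner. The sketch below illustrates that structure with abstract policy labels; the sampling interval and comparison period are assumptions, not values from the application.

```python
class DuelingPolicySelector:
    """Monitor two fixed-policy subsets of a lower level cache and steer the
    remaining (follower) sets toward whichever policy is hitting more."""
    def __init__(self, num_sets, sample_every=32, period=100_000):
        self.period = period
        self.hits = {"A": 0, "B": 0}
        self.accesses = {"A": 0, "B": 0}
        self.follower_policy = "A"
        self.policy_of_set = {}                 # only the sampled sets appear here
        for s in range(num_sets):
            if s % sample_every == 0:
                self.policy_of_set[s] = "A"
            elif s % sample_every == 1:
                self.policy_of_set[s] = "B"
        self.counter = 0

    def policy_for_set(self, set_index):
        return self.policy_of_set.get(set_index, self.follower_policy)

    def record_access(self, set_index, hit):
        policy = self.policy_of_set.get(set_index)
        if policy is not None:                  # only the sampled sets train
            self.accesses[policy] += 1
            self.hits[policy] += int(hit)
        self.counter += 1
        if self.counter >= self.period:
            self._choose_winner()

    def _choose_winner(self):
        rate = {p: self.hits[p] / max(1, self.accesses[p]) for p in ("A", "B")}
        self.follower_policy = "A" if rate["A"] >= rate["B"] else "B"
        self.hits = {"A": 0, "B": 0}
        self.accesses = {"A": 0, "B": 0}
        self.counter = 0
```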
20130151778 | Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Bandwidth - A mechanism is provided for dynamic cache allocation using bandwidth. A bandwidth between a higher level cache and a lower level cache is monitored. Responsive to bandwidth usage between the higher level cache and the lower level cache being below a predetermined low bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a first allocation policy. Responsive to bandwidth usage between the higher level cache and the lower level cache being above a predetermined high bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a second allocation policy. | 06-13-2013 |
20130151779 | Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted. | 06-13-2013 |
20130151780 | Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted. | 06-13-2013 |
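A sketch of the reference-counter bookkeeping the two entries above describe: counters are seeded according to the operation type that caused the allocation, bumped on every access, and the member with the lowest counter is victimized. The initial weights per operation type are invented for illustration.

```python
INITIAL_WEIGHT = {"demand_load": 3, "store": 2, "prefetch": 1}   # assumed weights

class WeightedHistoryCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.members = {}       # address -> data
        self.ref_count = {}     # address -> weighted reference counter

    def access(self, address):
        if address in self.members:
            self.ref_count[address] += 1        # every hit strengthens the member
            return self.members[address]
        return None

    def allocate(self, address, data, op_type):
        if len(self.members) >= self.capacity:
            # Victimize the member with the lowest weighted reference count.
            victim = min(self.ref_count, key=self.ref_count.get)
            del self.members[victim]
            del self.ref_count[victim]
        self.members[address] = data
        self.ref_count[address] = INITIAL_WEIGHT.get(op_type, 1)
```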
20130159627 | SYSTEM AND METHOD FOR MANAGING A CACHE USING FILE SYSTEM METADATA - Systems and methods for management of a cache are disclosed. In general, embodiments described herein store access counts in file system metadata associated with files in the cache. By encoding access counts in the file system metadata, file I/O operations are reduced. Preferably, the reference count is encoded in an access count timestamp in the file system metadata. The access counts can be decoded based on the difference between the access count time stamp and a base time value, with larger differences reflecting a larger access count. The cache can be aged by advancing the base time value, thereby causing the access count for a file to drop. The base time value can also be stored in file system metadata, thereby reducing file I/O operations when performing aging. | 06-20-2013 |
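The timestamp encoding is easy to demonstrate with ordinary file metadata. The sketch below uses the file's atime as the carrier and one second per access count; both choices, and the helper names, are assumptions rather than the application's layout. Aging is a single addition to the base time and touches no files.

```python
import os
import time

SECONDS_PER_COUNT = 1   # assumed unit: one second of atime offset per access

def decode_count(path, base_time):
    """Access count = (atime - base_time) / unit, floored at zero."""
    delta = os.stat(path).st_atime - base_time
    return max(0, int(delta // SECONDS_PER_COUNT))

def record_access(path, base_time):
    """Bump the encoded count by pushing atime one unit further from the
    base time; mtime is preserved."""
    count = decode_count(path, base_time) + 1
    st = os.stat(path)
    os.utime(path, (base_time + count * SECONDS_PER_COUNT, st.st_mtime))

def age_cache(base_time, amount=1):
    """Advancing the base time lowers every file's decoded count by `amount`
    without touching any file."""
    return base_time + amount * SECONDS_PER_COUNT

if __name__ == "__main__":
    base = int(time.time()) - 10_000
    with open("cache_entry.bin", "wb") as f:
        f.write(b"payload")
    os.utime("cache_entry.bin", (base, base))      # start at count 0
    for _ in range(3):
        record_access("cache_entry.bin", base)
    print(decode_count("cache_entry.bin", base))   # 3
    base = age_cache(base, amount=2)
    print(decode_count("cache_entry.bin", base))   # 1 after aging
```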
20130166845 | METHOD AND DEVICE FOR RECOVERING DESCRIPTION INFORMATION, AND METHOD AND DEVICE FOR CACHING DATA IN DATABASE - A method for recovering description information, or a method for caching data in a database, includes: judging whether a database is closed normally after the last operation; if the database is not closed normally, traversing each data block in a level-2 cache, where corresponding disk location information is saved in a header of each data block; obtaining a data block in a disk according to the disk location information; and when the obtained data block in the disk is the same as a corresponding data block in the level-2 cache, establishing description information according to location information of the data block in the disk and location information of the data block in the level-2 cache, where the description information is used to describe correspondence between the location information of data in the disk and the location information of data in the level-2 cache. | 06-27-2013 |
20130166846 | Hierarchy-aware Replacement Policy - Some implementations disclosed herein provide techniques and arrangements for a hierarchy-aware replacement policy for a last-level cache. A detector may be used to provide the last-level cache with information about blocks in a lower-level cache. For example, the detector may receive a notification identifying a block evicted from the lower-level cache. The notification may include a category associated with the block. The detector may identify a request that caused the block to be filled into the lower-level cache. The detector may determine whether one or more statistics associated with the category satisfy a threshold. In response to determining that the one or more statistics associated with the category satisfy the threshold, the detector may send an indication to the last-level cache that the block is a candidate for eviction from the last-level cache. | 06-27-2013 |
20130173860 | NEAR NEIGHBOR DATA CACHE SHARING - Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip. | 07-04-2013 |
20130173861 | NEAR NEIGHBOR DATA CACHE SHARING - Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip. | 07-04-2013 |
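A minimal sketch of the lookup order described in the two entries above: the core's own L1, then a neighboring core's L1, and only then the shared next level. The two-core topology and dictionary-backed caches are illustrative assumptions.

```python
class Core:
    def __init__(self, core_id):
        self.core_id = core_id
        self.l1 = {}             # address -> data
        self.neighbor = None     # adjacent core whose L1 may be probed

    def load(self, address, l2):
        if address in self.l1:                        # local L1 hit
            return self.l1[address]
        if self.neighbor and address in self.neighbor.l1:
            # Neighbor hit: the miss is serviced without going to L2,
            # effectively enlarging the L1 capacity seen by both cores.
            return self.neighbor.l1[address]
        data = l2.get(address)                        # fall back to the shared level
        if data is not None:
            self.l1[address] = data
        return data

l2 = {0x40: "payload"}
a, b = Core(0), Core(1)
a.neighbor, b.neighbor = b, a
b.l1[0x40] = "payload"
print(a.load(0x40, l2))   # served from core 1's L1, not from L2
```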
20130179639 | TECHNIQUE FOR PRESERVING CACHED INFORMATION DURING A LOW POWER MODE - A technique to retain cached information during a low power mode, according to at least one embodiment. In one embodiment, information stored in a processor's local cache is saved to a shared cache before the processor is placed into a low power mode, such that other processors may access information from the shared cache instead of causing the low power mode processor to return from the low power mode to service an access to its local cache. | 07-11-2013 |
20130185512 | MANAGEMENT OF PARTIAL DATA SEGMENTS IN DUAL CACHE SYSTEMS - For movement of partial data segments within a computing storage environment having lower and higher levels of cache by a processor, a whole data segment containing one of the partial data segments is promoted to both the lower and higher levels of cache. Requested data of the whole data segment is split and positioned at a Most Recently Used (MRU) portion of a demotion queue of the higher level of cache. Unrequested data of the whole data segment is split and positioned at a Least Recently Used (LRU) portion of the demotion queue of the higher level of cache. The unrequested data is pinned in place until a write of the whole data segment to the lower level of cache completes. | 07-18-2013 |
20130205088 | MULTI-STAGE CACHE DIRECTORY AND VARIABLE CACHE-LINE SIZE FOR TIERED STORAGE ARCHITECTURES - A method in accordance with the invention includes providing first, second, and third storage tiers, wherein the first storage tier acts as a cache for the second storage tier, and the second storage tier acts as a cache for the third storage tier. The first storage tier uses a first cache line size corresponding to an extent size of the second storage tier. The second storage tier uses a second cache line size corresponding to an extent size of the third storage tier. The second cache line size is significantly larger than the first cache line size. The method further maintains, in the first storage tier, a first cache directory indicating which extents from the second storage tier are cached in the first storage tier, and a second cache directory indicating which extents from the third storage tier are cached in the second storage tier. | 08-08-2013 |
20130205089 | Cache Device and Methods Thereof - A cache device, coupled to a processing device, a plurality of system components and an external memory control module, capable of exchanging all types of traffic streams from the processing device and the plurality of system components to the external memory control module. The cache device includes a plurality of cache units, comprising a plurality of cache lines and corresponding to a plurality of cache sets; a data accessing unit, coupled to the processing device, the plurality of system components, the plurality of cache units and the external memory control module, capable of exchanging data of the processing device, the plurality of cache units and an external memory device coupled to the external memory control module according to at least one request signal from the processing device and the plurality of system components. | 08-08-2013 |
20130205090 | MULTI-CORE PROCESSOR HAVING HIERARCHICAL COMMUNICATION ARCHITECTURE - Disclosed is a multi-core processor having hierarchical communication architecture. The multi-core processor having hierarchical communication architecture is configured to include clusters in which cores are clustered; a lowest level memory shared among the cores included in the clusters; a middle level memory shared among the clusters; and a highest level memory shared by all the clusters. In accordance with an exemplary embodiment of the present invention, it is possible to improve the performance of the applications by reducing the communication overhead between respective cores and supporting data and functional parallelization. | 08-08-2013 |
20130219122 | MULTI-STAGE CACHE DIRECTORY AND VARIABLE CACHE-LINE SIZE FOR TIERED STORAGE ARCHITECTURES - A method in accordance with the invention includes providing first, second, and third storage tiers, wherein the first storage tier acts as a cache for the second storage tier, and the second storage tier acts as a cache for the third storage tier. The first storage tier uses a first cache line size corresponding to an extent size of the second storage tier. The second storage tier uses a second cache line size corresponding to an extent size of the third storage tier. The second cache line size is significantly larger than the first cache line size. The method further maintains, in the first storage tier, a first cache directory indicating which extents from the second storage tier are cached in the first storage tier, and a second cache directory indicating which extents from the third storage tier are cached in the second storage tier. | 08-22-2013 |
20130246708 | FILTERING PRE-FETCH REQUESTS TO REDUCE PRE-FETCHING OVERHEAD - The disclosed embodiments provide a system that filters pre-fetch requests to reduce pre-fetching overhead. During operation, the system executes an instruction that involves a memory reference that is directed to a cache line in a cache. Upon determining that the memory reference will miss in the cache, the system determines whether the instruction frequently leads to cache misses. If so, the system issues a pre-fetch request for one or more additional cache lines. Otherwise, no pre-fetch request is sent. Filtering pre-fetch requests based on instructions' likelihood to miss reduces pre-fetching overhead while preserving the performance benefits of pre-fetching. | 09-19-2013 |
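The filter amounts to a per-instruction (per-PC) miss counter that gates whether a miss issues a prefetch at all. The threshold, prefetch degree, and next-line pattern in this sketch are assumptions made for illustration.

```python
from collections import defaultdict

class PrefetchFilter:
    def __init__(self, miss_threshold=4):
        self.miss_threshold = miss_threshold
        self.miss_count = defaultdict(int)    # instruction PC -> observed misses

    def on_cache_miss(self, pc, miss_address, line_size=64, degree=2):
        """Called when the memory reference at `pc` misses. Returns the
        addresses to prefetch, or an empty list if the request is filtered."""
        self.miss_count[pc] += 1
        if self.miss_count[pc] < self.miss_threshold:
            return []                         # instruction rarely misses: skip prefetch
        return [miss_address + i * line_size for i in range(1, degree + 1)]

pf = PrefetchFilter()
for _ in range(5):
    requests = pf.on_cache_miss(pc=0x400123, miss_address=0x10000)
print(requests)   # next-line prefetches are issued once the PC misses often enough
```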
20130254485 | COORDINATED PREFETCHING IN HIERARCHICALLY CACHED PROCESSORS - Processors and methods for coordinating prefetch units at multiple cache levels. A single, unified training mechanism is utilized for training on streams generated by a processor core. Prefetch requests are sent from the core to lower level caches, and a packet is sent with each prefetch request. The packet identifies the stream ID of the prefetch request and includes relevant training information for the particular stream ID. The lower level caches generate prefetch requests based on the received training information. | 09-26-2013 |
20130254486 | SPECULATIVE CACHE MODIFICATION - In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing a speculative cache modification design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a cache communicably interfaced with the data bus; a pipeline communicably interfaced with the data bus, in which the pipeline is to receive a store instruction corresponding to a cache line to be written to cache; caching logic to perform a speculative cache write of the cache line into the cache before the store instruction retires from the pipeline; and cache line validation logic to determine if the cache line written into the cache is valid or invalid, in which the cache line validation logic is to invalidate the cache line speculatively written into the cache when determined invalid and further in which the store instruction is allowed to retire from the pipeline when the cache line is determined to be valid. | 09-26-2013 |
20130262768 | ADAPTIVE SELF-REPAIRING CACHE - A method for operating a cache that includes both robust cells and standard cells may include receiving a data to be written to the cache, determining whether a type of the data is unmodified data or modified data, and writing the data to robust cells or standard cells as a function of the type of the data. A processor includes a core that includes a cache including both robust cells and standard cells for receiving data, wherein the data is written to robust cells or standard cells as a function of whether a type of the data is determined to be unmodified data or modified data. | 10-03-2013 |
20130262769 | DATA CACHE BLOCK DEALLOCATE REQUESTS - A data processing system includes a processor core supported by upper and lower level caches. In response to executing a deallocate instruction in the processor core, a deallocation request is sent from the processor core to the lower level cache, the deallocation request specifying a target address associated with a target cache line. In response to receipt of the deallocation request at the lower level cache, a determination is made if the target address hits in the lower level cache. In response to determining that the target address hits in the lower level cache, the target cache line is retained in a data array of the lower level cache and a replacement order field in a directory of the lower level cache is updated such that the target cache line is more likely to be evicted from the lower level cache in response to a subsequent cache miss. | 10-03-2013 |
20130262770 | DATA CACHE BLOCK DEALLOCATE REQUESTS IN A MULTI-LEVEL CACHE HIERARCHY - In response to executing a deallocate instruction, a deallocation request specifying a target address of a target cache line is sent from a processor core to a lower level cache. In response, a determination is made if the target address hits in the lower level cache. If so, the target cache line is retained in a data array of the lower level cache, and a replacement order field of the lower level cache is updated such that the target cache line is more likely to be evicted in response to a subsequent cache miss in a congruence class including the target cache line. In response to the subsequent cache miss, the target cache line is cast out to the lower level cache with an indication that the target cache line was a target of a previous deallocation request of the processor core. | 10-03-2013 |
20130275682 | APPARATUS AND METHOD FOR IMPLEMENTING A MULTI-LEVEL MEMORY HIERARCHY OVER COMMON MEMORY CHANNELS - A system and method are described for integrating a memory and storage hierarchy including a non-volatile memory tier within a computer system. In one embodiment, PCMS memory devices are used as one tier in the hierarchy, sometimes referred to as “far memory.” Higher performance memory devices such as DRAM are placed in front of the far memory and are used to mask some of the performance limitations of the far memory. These higher performance memory devices are referred to as “near memory.” | 10-17-2013 |
20130282983 | STORAGE SYSTEM WITH DATA MANAGEMENT MECHANISM AND METHOD OF OPERATION THEREOF - A method of operation of a storage system includes: accessing a storage tier manager coupled to a first tier storage and a second tier storage; identifying a low activity region in the first tier storage and a high activity region in the second tier storage; and exchanging a physical block region corresponding to the high activity region with the physical block region corresponding to the low activity region by the storage tier manager. | 10-24-2013 |
20130290636 | MANAGING MEMORY - Methods, and apparatus to cause performance of such methods, for managing memory. The methods include requesting a particular unit of data from a first level of memory. If the particular unit of data is not available from the first level of memory, the methods further include determining whether a free unit of data exists in the first level of memory, evicting a unit of data from the first level of memory if a free unit of data does not exist in the first level of memory, and requesting the particular unit of data from a second level of memory. If the particular unit of data is not available from the second level of memory, the methods further include reading the particular unit of data from a third level of memory. The methods still further include writing the particular unit of data to the first level of memory. | 10-31-2013 |
20130290637 | PER PROCESSOR BUS ACCESS CONTROL IN A MULTI-PROCESSOR CPU - A technique to provide hardware protection for bus accesses for a processor in a multiple processor environment where at least two zones are established to separate or segregate processor functionality. In one implementation, control registers within a cache memory that supports the multiple processors are loaded with addresses associated with access rights for a particular processor. Then, when an access request is generated, the registers are checked to authorize the access. | 10-31-2013 |
20130290638 | TRACKING OWNERSHIP OF DATA ASSETS IN A MULTI-PROCESSOR SYSTEM - A technique to provide ownership tracking of data assets in a multiple processor environment. Ownership tracking allows a data asset to be identified to a particular processor and tracked as the data asset travels within a system or sub-system. In one implementation, the sub-system is a cache memory that provides cache support to multiple processors. By utilizing flag bits attached to the data asset, ownership identification is attached to the data asset to identify which processor owns the data asset. | 10-31-2013 |
20130297876 | CACHE CONTROL TO REDUCE TRANSACTION ROLL BACK - In one embodiment, a microprocessor is provided. The microprocessor includes a cache that is controlled by a cache controller. The cache controller is configured to replace cachelines in the cache based on a replacement scheme that prioritizes the replacement of cachelines that are less likely to cause roll back of a transaction of the microprocessor. | 11-07-2013 |
20130297877 | MANAGING BUFFER MEMORY - A computing system comprises: one or more processors; and a memory system including one or more first level memories. Each first level memory is coupled to a corresponding one of the processors. Each processor is configured to execute instructions in an instruction set, at least some of the instructions in the instruction set accessing chunks of memory in the memory system. Each processor includes a plurality of storage locations, with at least some of the instructions each specifying a set of storage locations including: a first storage location in a first of the processors storing a unique identifier of a first chunk, and a second storage location in the first processor storing a reusable identifier of a storage area in the corresponding first level memory storing the first chunk. | 11-07-2013 |
20130297878 | GATHER AND SCATTER OPERATIONS IN MULTI-LEVEL MEMORY HIERARCHY - Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed. | 11-07-2013 |
20130304991 | DATA PROCESSING APPARATUS HAVING CACHE AND TRANSLATION LOOKASIDE BUFFER - A data processing apparatus has a cache and a translation lookaside buffer (TLB). A way table is provided for identifying which of a plurality of cache ways stores required data. Each way table entry corresponds to one of the TLB entries of the TLB and identifies, for each memory location of the page associated with the corresponding TLB entry, which cache way stores the data associated with that memory location. Also, the cache may be capable of servicing M access requests in the same processing cycle. An arbiter may select pending access requests for servicing by the cache in a way that ensures that the selected pending access requests specify a maximum of N different virtual page addresses, where N is less than M. | 11-14-2013 |
20130304992 | MULTI-CPU SYSTEM AND COMPUTING SYSTEM HAVING THE SAME - A multi-CPU data processing system, comprising: a multi-CPU processor, comprising: a first CPU configured with at least a first core, a first cache, and a first cache controller configured to access the first cache; and a second CPU configured with at least a second core, and a second cache controller configured to access a second cache, wherein the first cache is configured from a shared portion of the second cache. | 11-14-2013 |
20130311721 | STORAGE SYSTEM - In an exemplary storage system, a processor assigns an unused process to a read request designating an area of a logical volume. The processor determines whether the data designated by the read request is in a cache memory, based on a first identifier for identifying the area designated by the read request. When the designated data is not in the cache memory and a part of physical volumes providing the logical volume is a first kind of physical volume, the processor stores the first identifier associated with an identifier for identifying an area allocated in the cache memory. When the designated data is not in the cache memory and a part of the physical volumes is a second kind of physical volume, the processor stores a second identifier for identifying the process assigned to the read request associated with an identifier for identifying an area allocated in the cache memory. | 11-21-2013 |
20130311722 | CACHE SYSTEM AND A METHOD OF OPERATING A CACHE MEMORY - In one embodiment, a computer cache is extended with structures that can ( | 11-21-2013 |
20130318302 | CACHE CONTROLLER BASED ON QUALITY OF SERVICE AND METHOD OF OPERATING THE SAME - A cache controller includes an entry list determination module and a cache replacement module. The entry list determination module is configured to receive a quality of service (QoS) value of a process, and output a replaceable entry list based on the received QoS value. The cache replacement module is configured to write data in an entry included in the replaceable entry list. The process is one of a plurality of processes, each having a QoS value, and the replaceable entry list is one of a plurality of replaceable entry lists, each including a plurality of entries and each corresponding to one of the QoS values. The total number of entries is allocated among the processes based on their QoS values. | 11-28-2013 |
20130318303 | APPLICATION-RESERVED CACHE FOR DIRECT I/O - Described are embodiments of mediums, methods, and systems for application-reserved use of cache for direct I/O. A method for using application-reserved cache may include reserving, by one of a plurality of cores of a processor, use of a first portion of one of a plurality of levels of cache for an application executed by the one of the plurality of cores, and transferring, by the one of the plurality of cores, data associated with the application from an input/output (I/O) device of a computing device directly to the first portion of the one of the plurality of levels of the cache. Other embodiments may be described and claimed. | 11-28-2013 |
20130326143 | Caching Frequently Used Addresses of a Page Table Walk - An architecture and method are described for performing memory management. A page walker cache is provided to cache data used during the page walk process. This cache structure speeds up the page walk process, which significantly reduces the expense of performing a page walk. The page walker cache also reduces the cost associated with usage of memory access bandwidths. | 12-05-2013 |
20130326144 | METHODS FOR MANAGING OWNERSHIP OF REDUNDANT DATA AND SYSTEMS THEREOF - A storage system, according to one embodiment, includes a processor and logic integrated with and/or executable by the processor. The logic is configured to: search for an instance of a file or portion thereof on a second storage tier; at least one of: associate the instance of the file or portion thereof on the second storage tier with a first user when the instance of the file or portion thereof is not associated with any user, and replicate the instance of the file or portion thereof on the second storage tier and associate the replicated instance of the file or portion thereof on the second storage tier with the first user; and disassociate an instance of the file on a first storage tier from the first user. | 12-05-2013 |
20130326145 | METHODS AND APPARATUS FOR EFFICIENT COMMUNICATION BETWEEN CACHES IN HIERARCHICAL CACHING DESIGN - In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing efficient communication between caches in hierarchical caching design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a lower level cache communicably interfaced with the data bus; a higher level cache communicably interfaced with the data bus; one or more data buffers and one or more dataless buffers. The data buffers in such an embodiment being communicably interfaced with the data bus, and each of the one or more data buffers having a buffer memory to buffer a full cache line, one or more control bits to indicate state of the respective data buffer, and an address associated with the full cache line. The dataless buffers in such an embodiment being incapable of storing a full cache line and having one or more control bits to indicate state of the respective dataless buffer and an address for an inter-cache transfer line associated with the respective dataless buffer. In such an embodiment, inter-cache transfer logic is to request the inter-cache transfer line from the higher level cache via the data bus and is to further write the inter-cache transfer line into the lower level cache from the data bus. | 12-05-2013 |
20130339608 | MULTILEVEL CACHE HIERARCHY FOR FINDING A CACHE LINE ON A REMOTE NODE - Embodiments relate to accessing a cache line on a multi-level cache system having a system memory. Based on a request for exclusive ownership of a specific cache line at the local node, requests are concurrently sent by the local node to the system memory and to remote nodes of the plurality of nodes for the specific cache line. The specific cache line is found in a specific remote node. The specific remote node is one of the remote nodes. The specific cache line is removed from the specific remote node for exclusive ownership by another node. Based on the specific node having the specific cache line in a ghost state, any subsequent fetch request initiated for the specific cache line from the specific node encounters the ghost state. When the ghost state is encountered, the subsequent fetch request is directed only to nodes of the plurality of nodes. | 12-19-2013 |
20130339609 | MULTILEVEL CACHE HIERARCHY FOR FINDING A CACHE LINE ON A REMOTE NODE - Embodiments relate to accessing a cache line on a multi-level cache system having a system memory. Based on a request for exclusive ownership of a specific cache line at the local node, requests are concurrently sent by the local node to the system memory and to remote nodes of the plurality of nodes for the specific cache line. The specific cache line is found in a specific remote node. The specific remote node is one of the remote nodes. The specific cache line is removed from the specific remote node for exclusive ownership by another node. Based on the specific node having the specific cache line in a ghost state, any subsequent fetch request initiated for the specific cache line from the specific node encounters the ghost state. When the ghost state is encountered, the subsequent fetch request is directed only to nodes of the plurality of nodes. | 12-19-2013 |
20130346694 | PROBE FILTER FOR SHARED CACHES - The claimed subject matter provides a method and apparatus for filtering probes to shared caches. One embodiment of the method includes filtering a probe of one or more of a plurality of first caches based on a plurality of first bits associated with a line indicated by the probe. Each of the plurality of first bits is associated with a different subset of the plurality of first caches and each first bit indicates whether the line is resident in a corresponding subset of the plurality of first caches. A second bit indicates whether the line is resident in more than one of the plurality of first caches in at least one of the subsets of the plurality of first caches. | 12-26-2013 |
20130346695 | INTEGRATED CIRCUIT WITH HIGH RELIABILITY CACHE CONTROLLER AND METHOD THEREFOR - An integrated circuit includes a register including a field for defining a high reliability mode of the integrated circuit and a cache and memory controller coupled to the register and responsive to the high reliability mode to access a memory to store, in a row of the memory, a first multiple number of cache lines, a first multiple number of tags corresponding to the first multiple number of cache lines, and reliability data corresponding to at least the first multiple number of cache lines. | 12-26-2013 |
20130346696 | METHOD AND APPARATUS FOR PROVIDING SHARED CACHES - A method and apparatus for providing shared caches. A cache memory system may be operated in a first mode or a second mode. When the cache memory system is operated in the first mode, a first cache and a second cache of the cache memory system may be operated independently. When the cache memory system is operated in the second mode, the first cache and the second cache may be shared. In the second mode, at least one bit of a memory address may overlap between the tag bits and the set index bits. | 12-26-2013 |
20130346697 | MULTILEVEL CACHE SYSTEM - Fetching a cache line into a plurality of caches of a multilevel cache system. The multilevel cache system includes at least a first cache, a second cache on a next higher level and a memory, the first cache being arranged to hold a subset of information of the second cache, the second cache being arranged to hold a subset of information of a next higher level cache or memory if no higher level cache exists. A fetch request is sent from one cache to the next cache in the multilevel cache system. The cache line is fetched in a particular state into one of the caches, and in another state into at least one of the other caches. | 12-26-2013 |
20140013052 | CRITERIA FOR SELECTION OF DATA FOR A SECONDARY CACHE - Host read operations affecting a first logical block address of a data storage device are tracked. The data storage device includes a main storage and a non-volatile cache that mirrors a portion of data of the main storage. One or more criteria associated with the host read operations are determined. The criteria are indicative of future read requests of a second logical block address associated with the first logical block address. Data of at least the second logical block address is copied from the main storage to the non-volatile cache if the criteria meet a threshold. | 01-09-2014 |
20140013053 | DETERMINING A CRITERION FOR MOVEMENT OF DATA FROM A PRIMARY CACHE TO A SECONDARY CACHE - A new segment of data is copied to a volatile, primary cache based on a host data read access request. The primary cache mirrors a first portion of a non-volatile main storage. A criterion is determined for movement of data from the primary cache to a non-volatile, secondary cache that mirrors a second portion of the main storage. The criterion gives higher priority to segments having addresses not yet selected for reading by the host. In response to the new segment of data being copied to the primary cache, a selected segment of data is copied from the primary cache to the secondary cache if the selected segment satisfies the criterion. | 01-09-2014 |
20140013054 | STORING DATA STRUCTURES IN CACHE - A method and system for implementing a data structure cache are provided herein. The method includes identifying a data structure. The method also includes identifying a plurality of frequently accessed data blocks in the data structure. Additionally, the method includes reserving a portion of a cache for storage of the frequently accessed data blocks. Furthermore, the method includes storing the frequently accessed data blocks in the reserved portion of the cache. | 01-09-2014 |
20140013055 | ENSURING CAUSALITY OF TRANSACTIONAL STORAGE ACCESSES INTERACTING WITH NON-TRANSACTIONAL STORAGE ACCESSES - A data processing system implements a weak consistency memory model for a distributed shared memory system. The data processing system concurrently executes, on a plurality of processor cores, one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions. The one or more non-transactional memory instructions include a non-transactional store instruction. The data processing system commits the memory transaction to the distributed shared memory system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction. | 01-09-2014 |
20140013056 | MOVEABLE LOCKED LINES IN A MULTI-LEVEL CACHE - A processor includes a multi-level cache hierarchy where a lock property is associated with a cache line. The cache line retains the lock property and may move back and forth within the cache hierarchy. The cache line may be evicted from the cache hierarchy after the lock property is removed. | 01-09-2014 |
20140025892 | CONVERTING MEMORY ACCESSES NEAR BARRIERS INTO PREFETCHES - Methods, apparatuses, and processors for reducing memory latency in the presence of barriers. When a barrier operation is executed, subsequent memory access operations are delayed until the barrier operation retires. While the memory access operation is delayed, the memory access operation is converted into a prefetch request and sent to the L2 cache. Then, data corresponding to the prefetch request is retrieved and stored in the L1 data cache. When the memory access operation wakes up, the data for the operation will already be stored in the L1 data cache, reducing the memory latency of the operation. | 01-23-2014 |
20140032844 | SYSTEMS AND METHODS FOR FLUSHING A CACHE WITH MODIFIED DATA - Systems and methods for flushing a cache with modified data are disclosed. Responsive to a request to flush data from a cache with modified data to a next level cache that does not include the cache with modified data, the cache with modified data is accessed using an index and a way, and an address associated with the index and the way is secured. Using the address, the cache with modified data is accessed a second time and an entry that is associated with the address is retrieved from the cache with modified data. The entry is placed into a location of the next level cache. | 01-30-2014 |
20140032845 | SYSTEMS AND METHODS FOR SUPPORTING A PLURALITY OF LOAD ACCESSES OF A CACHE IN A SINGLE CYCLE - A method for supporting a plurality of load accesses is disclosed. A plurality of requests to access a data cache is received, and in response, a tag memory is accessed that maintains a plurality of copies of tags for each entry in the data cache. Tags are identified that correspond to individual requests. The data cache is accessed based on the tags that correspond to the individual requests. A plurality of requests to access the same block of the plurality of blocks causes an access arbitration that is executed in the same clock cycle as the access of the tag memory. | 01-30-2014 |
20140032846 | SYSTEMS AND METHODS FOR SUPPORTING A PLURALITY OF LOAD AND STORE ACCESSES OF A CACHE - Systems and methods for supporting a plurality of load and store accesses of a cache are disclosed. Responsive to a request of a plurality of requests to access a block of a plurality of blocks of a load cache, the block of the load cache and a logically and physically paired block of a store coalescing cache are accessed in parallel. The data that is accessed from the block of the load cache is overwritten by the data that is accessed from the block of the store coalescing cache by merging on a per byte basis. Access is provided to the merged data. | 01-30-2014 |
20140040551 | REWIND ONLY TRANSACTIONS IN A DATA PROCESSING SYSTEM SUPPORTING TRANSACTIONAL STORAGE ACCESSES - In a multiprocessor data processing system having a distributed shared memory system, a memory transaction that is a rewind-only transaction (ROT) and that includes one or more transactional memory access instructions and a transactional abort instruction is executed. In response to execution of the one or more transactional memory access instructions, one or more memory accesses to the distributed shared memory system indicated by the one or more transactional memory access instructions are performed. In response to execution of the transactional abort instruction, execution results of the one or more transactional memory access instructions are discarded and control is passed to a fail handler. | 02-06-2014 |
20140040552 | MULTI-CORE COMPUTE CACHE COHERENCY WITH A RELEASE CONSISTENCY MEMORY ORDERING MODEL - A method includes storing, with a first programmable processor, shared variable data to cache lines of a first cache of the first processor. The method further includes executing, with the first programmable processor, a store-with-release operation, executing, with a second programmable processor, a load-with-acquire operation, and loading, with the second programmable processor, the value of the shared variable data from a cache of the second programmable processor. | 02-06-2014 |
20140047184 | TUNABLE MULTI-TIERED STT-MRAM CACHE FOR MULTI-CORE PROCESSORS - A multi-core processor is presented. The multi-core processor includes a first spin transfer torque magnetoresistive random-access memory (STT-MRAM) cache associated with a first core of the multi-core processor and tuned according to first attributes and a second STT-MRAM cache associated with a second core of the multi-core processor and tuned according to second attributes. | 02-13-2014 |
20140052918 | SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR MANAGING CACHE MISS REQUESTS - A system, method, and computer program product are provided for managing miss requests. In use, a miss request is received at a unified miss handler from one of a plurality of distributed local caches. Additionally, the miss request is managed, utilizing the unified miss handler. | 02-20-2014 |
20140052919 | SYSTEM TRANSLATION LOOK-ASIDE BUFFER INTEGRATED IN AN INTERCONNECT - System TLBs are integrated within an interconnect, and use and share a transport network to connect to a shared walker port. Transactions are able to pass STLB allocation information through a second initiator side interconnect, in a way that interconnects can be cascaded, so as to allow initiators to control a shared STLB within the first interconnect. Within the first interconnect, multiple STLBs share an intermediate-level translation cache that improves performance when there is locality between requests to the two STLBs. | 02-20-2014 |
20140052920 | DOMAIN STATE - Method and apparatus to efficiently maintain cache coherency by reading/writing a domain state field associated with a tag entry within a cache tag directory. A value may be assigned to a domain state field of a tag entry in a cache tag directory. The cache tag directory may belong to a hierarchy of cache tag directories. | 02-20-2014 |
20140068192 | PROCESSOR AND CONTROL METHOD OF PROCESSOR - A processor includes a plurality of CPU cores, each having an L1 cache memory, that execute processing and issue requests, and an L2 cache memory connected to the plurality of CPU cores. The L2 cache memory is configured, when a request which requests target data held by none of the L1 cache memories contained in the plurality of CPU cores is a load request that permits sharing with other CPU cores, to respond to the CPU core having sent the request with non-exclusive information that indicates that the target data is non-exclusive data, together with the target data; and, when the request is a load request that forbids sharing with other CPU cores, to respond to the CPU core having sent the request with exclusive information that indicates that the target data is exclusive, together with the target data. | 03-06-2014 |
20140075120 | CACHE LINE HISTORY TRACKING USING AN INSTRUCTION ADDRESS REGISTER FILE - Embodiments relate to tracking cache lines. An aspect of embodiments includes performing an operation by a processor. Another aspect of embodiments includes fetching a cache line based on the operation. Yet another aspect of embodiments includes storing in an instruction address register file at least one of (i) an operation identifier identifying the operation and (ii) a memory location identifier identifying a level of memory from which the cache line is populated. | 03-13-2014 |
20140082286 | Prefetching Method and Apparatus - A method and apparatus for determining data to be prefetched based on previous cache miss history is disclosed. In one embodiment, a processor includes a first cache memory and a controller circuit. The controller circuit is configured to load data from a first address into the first cache memory responsive to a cache miss corresponding to the first address. The controller circuit is further configured to determine, responsive to a cache miss for the first address, if a previous cache miss occurred at a second address. Responsive to determining that the previous cache miss occurred at the second address, the controller circuit is configured to load data from a second address into the first cache. | 03-20-2014 |
20140082287 | CACHE MEMORY PREFETCHING - According to exemplary embodiments, a computer program product, system, and method for prefetching in memory include determining a missed access request for a first line in a first cache level and accessing an entry in a prefetch table, wherein the entry corresponds to a memory block, wherein the entry includes segments of the memory block. Further, the embodiment includes determining a demand segment of the segments in the entry, the demand segment corresponding to a segment of the memory block that includes the first line, reading a first field in the demand segment to determine if a second line in the demand segment is spatially related with respect to accesses of the demand segment and reading a second field in the demand segment to determine if a second segment in the entry is temporally related to the demand segment. | 03-20-2014 |
20140089586 | ARITHMETIC PROCESSING UNIT, INFORMATION PROCESSING DEVICE, AND ARITHMETIC PROCESSING UNIT CONTROL METHOD - An L2 cache control unit searches a cache memory according to a memory access request which is provided from a request storage unit 0 through a CPU core unit, and retains, in request storage units 1 and 2, any memory access request for which a cache miss has occurred. A bank abort generation unit counts, for each bank, the number of memory access requests to the main storage device, and instructs the L2 cache control unit to interrupt access when any of the counted numbers of memory access requests exceeds a specified value. According to the instruction, the L2 cache control unit interrupts the processing of the memory access request retained in the request storage unit 0. A main memory control unit issues the memory access request retained in the request storage unit 2 to the main storage device. | 03-27-2014 |
20140089587 | PROCESSOR, INFORMATION PROCESSING APPARATUS AND CONTROL METHOD OF PROCESSOR - An entry information storing unit | 03-27-2014 |
20140095791 | PERFORMANCE-DRIVEN CACHE LINE MEMORY ACCESS - According to one aspect of the present disclosure a system and technique for performance-driven cache line memory access is disclosed. The system includes: a processor, a cache hierarchy coupled to the processor, and a memory coupled to the cache hierarchy. The system also includes logic executable to, responsive to receiving a request for a cache line: divide the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; service the high priority data request; and delay servicing of the low priority data request until a low priority condition has been satisfied. | 04-03-2014 |
20140095792 | CACHE CONTROL DEVICE AND PIPELINE CONTROL METHOD - A cache control device includes an entering unit, a first searching unit, a reading unit, a second searching unit, and a rewriting unit. The entering unit alternately enters, into a pipeline, a load request for reading a directory received from a processor and a store request for rewriting a directory received from the processor. When the first searching unit determines that the directory targeted by the load request is present in the first cache memory or the second cache memory, the reading unit reads the directory from the cache memory in which the directory is present. When the second searching unit determines that the directory targeted by the store request is present in the first cache memory, the rewriting unit rewrites the directory that is stored in the first cache memory. | 04-03-2014 |
20140108729 | SYSTEMS AND METHODS FOR LOAD CANCELING IN A PROCESSOR THAT IS CONNECTED TO AN EXTERNAL INTERCONNECT FABRIC - Systems and methods for load canceling in a processor that is connected to an external interconnect fabric are disclosed. As a part of a method for load canceling in a processor that is connected to an external bus, and responsive to a flush request and a corresponding cancellation of pending speculative loads from a load queue, a type of one or more of the pending speculative loads that are positioned in the instruction pipeline external to the processor, is converted from load to prefetch. Data corresponding to one or more of the pending speculative loads that are positioned in the instruction pipeline external to the processor is accessed and returned to cache as prefetch data. The prefetch data is retired in a cache location of the processor. | 04-17-2014 |
20140108730 | SYSTEMS AND METHODS FOR NON-BLOCKING IMPLEMENTATION OF CACHE FLUSH INSTRUCTIONS - Systems and methods for non-blocking implementation of cache flush instructions are disclosed. As a part of a method, data is accessed that is received in a write-back data holding buffer from a cache flushing operation, the data is flagged with a processor identifier and a serialization flag, and responsive to the flagging, the cache is notified that the cache flush is completed. Subsequent to the notifying, access is provided to data then present in the write-back data holding buffer to determine if data then present in the write-back data holding buffer is flagged. | 04-17-2014 |
20140108731 | Energy Optimized Cache Memory Architecture Exploiting Spatial Locality - Aspects of the present invention provide a “SuperTag” cache that manages cache at three granularities: (i) coarse grain, multi-block “super blocks,” (ii) single cache blocks and (iii) fine grain, fractional block “data segments.” Since contiguous blocks have the same tag address, by tracking multi-block super blocks, the SuperTag cache inherently increases per-block tag space, allowing higher compressibility without incurring high area overheads. To improve compression ratio, the SuperTag cache uses variable-packing compression allowing variable-size compressed blocks without requiring costly compactions. The SuperTag cache also stores data segments dynamically. In addition, the SuperTag cache is able to further improve the compression ratio by co-compressing contiguous blocks. As a result, the SuperTag cache improves energy efficiency and performance for memory-intensive applications over conventional compressed caches. | 04-17-2014 |
20140108732 | CACHE LAYER OPTIMIZATIONS FOR VIRTUALIZED ENVIRONMENTS - Embodiments of the invention relate to optimizing the storage of data in a multi-cache level environment. In one aspect, data is classified into primary and secondary cache sections. Data is differentiated based on an inherent sharing characteristic of the data within a system comprising virtual machines. The data is then placed into the classified sections of the cache storage layer and/or persistent data, reflective of how the data is shared among virtual disk images accessed by virtual machines. | 04-17-2014 |
20140108733 | DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed. | 04-17-2014 |
20140108734 | METHOD AND APPARATUS FOR SAVING PROCESSOR ARCHITECTURAL STATE IN CACHE HIERARCHY - A processor includes a first processing unit and a first level cache associated with the first processing unit and operable to store data for use by the first processing unit during normal operation of the first processing unit. The first processing unit is operable to store first architectural state data for the first processing unit in the first level cache responsive to receiving a power down signal. A method for controlling power to a processor including a hierarchy of cache levels includes storing first architectural state data for a first processing unit of the processor in a first level of the cache hierarchy responsive to receiving a power down signal and flushing contents of the first level including the first architectural state data to a first lower level of the cache hierarchy prior to powering down the first level of the cache hierarchy and the first processing unit. | 04-17-2014 |
20140115256 | SYSTEM AND METHOD FOR EXCLUSIVE READ CACHING IN A VIRTUALIZED COMPUTING ENVIRONMENT - A technique for efficient cache management demotes a unit of data from a higher cache level to a lower cache level in a cache hierarchy when the higher level cache evicts the unit of data. In a virtualization computing environment, eviction of the unit of data may be inferred by observing privileged memory and disk operations performed by a guest operating system and trapped by virtualization software for execution. When the unit of data is inferred to be evicted, the unit of data is demoted by transferring the unit of data into the lower cache level. This technique enables exclusive caching without direct involvement or modification of the guest operating system. In alternative embodiments, a pseudo-driver installed within the guest operating system explicitly tracks memory operations and transmits page eviction information to the lower level cache, which is able to cache evicted pages while maintaining cache exclusivity. | 04-24-2014 |
20140122805 | SELECTIVE POISONING OF DATA DURING RUNAHEAD - Embodiments related to selecting a runahead poison policy from a plurality of runahead poison policies during microprocessor operation are provided. The example method includes causing the microprocessor to enter runahead upon detection of a runahead event and implementing a first runahead poison policy selected from a plurality of runahead poison policies operative to manage runahead poison injection during runahead. The example method also includes during microprocessor operation, selecting a second runahead poison policy operative to manage runahead poison injection differently from the first runahead poison policy. | 05-01-2014 |
20140129773 | HIERARCHICAL CACHE STRUCTURE AND HANDLING THEREOF - A hierarchical cache structure comprises at least one higher level cache comprising a unified cache array for data and instructions and at least two lower level caches, each split into an instruction cache and a data cache. An instruction cache and a data cache of a split second level cache are connected to a third level cache; and an instruction cache of a split first level cache is connected to the instruction cache of the split second level cache, and a data cache of the split first level cache is connected to the instruction cache and the data cache of the split second level cache. | 05-08-2014 |
20140129774 | HIERARCHICAL CACHE STRUCTURE AND HANDLING THEREOF - A hierarchical cache structure includes at least one real indexed higher level cache with a directory and a unified cache array for data and instructions, and at least two lower level caches, each split into an instruction cache and a data cache. An instruction cache of a split real indexed second level cache includes a directory and a corresponding cache array connected to the real indexed third level cache. A data cache of the split second level cache includes a directory connected to the third level cache. An instruction cache of a split virtually indexed first level cache is connected to the second level instruction cache. A cache array of a data cache of the first level cache is connected to the cache array of the second level instruction cache and to the cache array of the third level cache. A directory of the first level data cache is connected to the second level instruction cache directory and to the third level cache directory. | 05-08-2014 |
20140136784 | ENHANCED CACHE COORDINATION IN A MULTI-LEVEL CACHE - Embodiments of the present invention provide a method, system and computer program product for enhanced cache coordination in a multi-level cache. In an embodiment of the invention, a method for enhanced cache coordination in a multi-level cache is provided. The method includes receiving a processor memory request to access data in a multi-level cache and servicing the processor memory request with data in either an L1 cache or an L2 cache of the multi-level cache. The method additionally includes marking a cache line in the L1 cache and also a corresponding cache line in the L2 cache as most recently used responsive to determining that the processor memory request is serviced from the cache line in the L1 cache and that the cache line in the L1 cache is not currently marked most recently used. | 05-15-2014 |
20140136785 | ENHANCED CACHE COORDINATION IN A MULTILEVEL CACHE - Embodiments of the present invention provide a method, system and computer program product for enhanced cache coordination in a multi-level cache. In an embodiment of the invention, a method for enhanced cache coordination in a multi-level cache is provided. The method includes receiving a processor memory request to access data in a multi-level cache and servicing the processor memory request with data in either an L1 cache or an L2 cache of the multi-level cache. The method additionally includes marking a cache line in the L1 cache and also a corresponding cache line in the L2 cache as most recently used responsive to determining that the processor memory request is serviced from the cache line in the L1 cache and that the cache line in the L1 cache is not currently marked most recently used. | 05-15-2014 |
20140143493 | Bypassing a Cache when Handling Memory Requests - The described embodiments include a computing device that handles memory requests. In some embodiments, when a memory request is to be sent to a cache in the computing device or to be bypassed to a next lower level of a memory hierarchy in the computing device based on expected memory request resolution times, a bypass mechanism is configured to send the memory request to the cache or bypass the memory request to the next lower level of the memory hierarchy. | 05-22-2014 |
20140143494 | SYSTEMS AND METHODS FOR IMPROVING PROCESSOR EFFICIENCY THROUGH CACHING - Certain embodiments herein relate to using tagless access buffers (TABs) to optimize energy efficiency in various computing systems. Candidate memory references in an L | 05-22-2014 |
20140156929 | NETWORK-ON-CHIP USING REQUEST AND REPLY TREES FOR LOW-LATENCY PROCESSOR-MEMORY COMMUNICATION - A Network-On-Chip (NOC) organization comprises a die having a cache area and a core area, a plurality of core tiles arranged in the core area in a plurality of subsets, and at least one cache memory bank arranged in the cache area, whereby the at least one cache memory bank is distinct from each of the plurality of core tiles. The NOC organization further comprises an interconnect fabric comprising a request tree to connect, to a first cache memory bank of the at least one cache memory bank, each core tile of a first one of the subsets, the first subset corresponding to the first cache memory bank, such that each core tile of the first subset is connected to the first cache memory bank only, and to allow guiding data packets from each core tile of the first subset to the first cache memory bank, and a reply tree to connect the first cache memory bank to each core tile of the first subset, and to allow guiding data packets from the first cache memory bank to a core tile of the first subset. | 06-05-2014 |
20140156930 | CACHING OF VIRTUAL TO PHYSICAL ADDRESS TRANSLATIONS - A data processing apparatus comprising: at least one initiator device for issuing transactions, a hierarchical memory system comprising a plurality of caches and a memory, and memory access control circuitry. The initiator device identifies storage locations using virtual addresses and the memory system stores data using physical addresses; the memory access control circuitry is configured to control virtual address to physical address translations. The plurality of caches comprise a first cache and a second cache. The first cache is configured to store a plurality of address translations of virtual to physical addresses that the initiator device has requested. The second cache is configured to store a plurality of address translations of virtual to physical addresses that it is predicted that the initiator device will subsequently request. The first and second caches are arranged in parallel with each other such that the first and second caches can be accessed during a same access cycle. | 06-05-2014 |
20140156931 | STATE ENCODING FOR CACHE LINES - A method and apparatus for state encoding of cache lines is described. Some embodiments of the method and apparatus support probing, in response to a first probe of a cache line in a first cache, a copy of the cache line in a second cache when the cache line is stale and the cache line is associated with a copy of the cache line stored in the second cache that can bypass notification of the first cache in response to modifying the copy of the cache line. | 06-05-2014 |
20140156932 | ELIMINATING FETCH CANCEL FOR INCLUSIVE CACHES - A method and apparatus for eliminating fetch cancels for inclusive caches are presented. Some embodiments of the apparatus include a first cache configurable to issue fetch or prefetch requests to a second cache that is inclusive of the first cache. The first cache is not permitted to cancel issued fetch or prefetch requests to the second cache. Some embodiments of the method include preventing the first cache(s) from canceling issued fetch or prefetch requests to a second cache that is inclusive of the first cache(s). | 06-05-2014 |
20140164703 | CACHE SWIZZLE WITH INLINE TRANSPOSITION - A method and circuit arrangement selectively swizzle data in one or more levels of cache memory coupled to a processing unit based upon one or more swizzle-related page attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a memory access request for data in a memory page, such that attributes associated with the memory page in the data structure may be used to control whether data is swizzled, and if so, how the data is to be formatted in association with handling the memory access request. | 06-12-2014 |
20140164704 | CACHE SWIZZLE WITH INLINE TRANSPOSITION - A method and circuit arrangement selectively swizzle data in one or more levels of cache memory coupled to a processing unit based upon one or more swizzle-related page attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a memory access request for data in a memory page, such that attributes associated with the memory page in the data structure may be used to control whether data is swizzled, and if so, how the data is to be formatted in association with handling the memory access request. | 06-12-2014 |
20140164705 | PREFETCH WITH REQUEST FOR OWNERSHIP WITHOUT DATA - A method performed by a processor is described. The method includes executing an instruction. The instruction has an address as an operand. The executing of the instruction includes sending a signal to cache coherence protocol logic of the processor. In response to the signal, the cache coherence protocol logic issues a request for ownership of a cache line at the address. The cache line is not in a cache of the processor. The request for ownership also indicates that the cache line is not to be sent to the processor. | 06-12-2014 |
20140164706 | MULTI-CORE PROCESSOR HAVING HIERARCHICAL CACHE ARCHITECTURE - Disclosed is a multi-core processor having a hierarchical cache architecture. A multi-core processor may comprise a plurality of cores, a plurality of first caches independently connected to each of the plurality of cores, at least one second cache respectively connected to at least one of the plurality of first caches, a plurality of third caches respectively connected to at least one of the plurality of cores, and at least one fourth cache respectively connected to at least one of the plurality of third caches. Therefore, overhead in communications between cores may be reduced, and processing speed of applications may be increased by supporting data-level parallelization. | 06-12-2014 |
20140173207 | Power Gating A Portion Of A Cache Memory - In an embodiment, a processor includes multiple tiles, each including a core and a tile cache hierarchy. This tile cache hierarchy includes a first level cache, a mid-level cache (MLC) and a last level cache (LLC), and each of these caches is private to the tile. A controller coupled to the tiles includes a cache power control logic to receive utilization information regarding the core and the tile cache hierarchy of a tile and to cause the LLC of the tile to be independently power gated, based at least in part on this information. Other embodiments are described and claimed. | 06-19-2014 |
20140173208 | METHODS AND APPARATUS FOR MULTI-LEVEL CACHE HIERARCHIES - A multi-level cache structure in accordance with one embodiment includes a first cache structure and a second cache structure. The second cache structure is hierarchically above the first cache. The second cache includes a tag array comprising a plurality of tag entries corresponding to respective addresses of data within a system memory; a selector array associated with the tag array; and a data array configured to store a subset of the data. The selector array is configured to specify, for each corresponding tag entry, whether the data array includes the data corresponding to that tag entry. | 06-19-2014 |
20140181402 | SELECTIVE CACHE MEMORY WRITE-BACK AND REPLACEMENT POLICIES - A method of managing cache memory includes assigning a caching priority designator to an address that addresses information stored in a memory system. The information is stored in a cacheline of a first level of cache memory in the memory system. The cacheline is evicted from the first level of cache memory. A second level in the memory system to which to write back the information is determined based at least in part on the caching priority designator. The information is written back to the second level. | 06-26-2014 |
20140181403 | CACHE POLICIES FOR UNCACHEABLE MEMORY REQUESTS - Systems, processors, and methods for keeping uncacheable data coherent. A processor includes a multi-level cache hierarchy, and uncacheable load memory operations can be cached at any level of the cache hierarchy. If an uncacheable load misses in the L2 cache, then allocation of the uncacheable load will be restricted to a subset of the ways of the L2 cache. If an uncacheable store memory operation hits in the L1 cache, then the hit cache line can be updated with the data from the memory operation. If the uncacheable store misses in the L1 cache, then the uncacheable store is sent to a core interface unit. | 06-26-2014 |
20140181404 | INFORMATION COHERENCY MAINTENANCE SYSTEMS AND METHODS - Systems and methods for coherency maintenance are presented. The systems and methods include utilization of multiple information state tracking approaches or protocols at different memory or storage levels. In one embodiment, a first coherency maintenance approach (e.g., similar to a MESI protocol, etc.) can be implemented at one storage level while a second coherency maintenance approach (e.g., similar to a MOESI protocol, etc.) can be implemented at another storage level. Information at a particular storage level or tier can be tracked by a set of local state indications and a set of essence state indications. The essence state indication can be tracked “externally” from a storage layer or tier directory (e.g., in a directory of another cache level, in a hub between cache levels, etc.). One storage level can control operations based upon the local state indications and another storage level can control operations based at least in part upon an essence state indication. | 06-26-2014 |
20140189239 | PROCESSORS HAVING VIRTUALLY CLUSTERED CORES AND CACHE SLICES - A processor of an aspect includes a plurality of logical processors each having one or more corresponding lower level caches. A shared higher level cache is shared by the plurality of logical processors. The shared higher level cache includes a distributed cache slice for each of the logical processors. The processor includes logic to direct an access that misses in one or more lower level caches of a corresponding logical processor to a subset of the distributed cache slices in a virtual cluster that corresponds to the logical processor. Other processors, methods, and systems are also disclosed. | 07-03-2014 |
20140189240 | Apparatus and Method For Reduced Core Entry Into A Power State Having A Powered Down Core Cache - A method performed by a multi-core processor is described. The method includes, while a core is executing program code, reading a dirty cache line from the core's last level cache and sending the dirty cache line from the core for storage external to the core, where the dirty cache line has not been evicted from the cache nor requested by another core or processor. | 07-03-2014 |
20140189241 | Method and Apparatus to Write Modified Cache Data To A Backing Store While Retaining Write Permissions - A method is described that includes performing the following for a transactional operation in response to a request from a processing unit that is directed to a cache identifying a cache line: reading the cache line and, if the cache line is in a Modified cache coherency protocol state, forwarding the cache line to circuitry that will cause the cache line to be written to deeper storage, and changing another instance of the cache line that is available to the processing unit for the transactional operation to an Exclusive cache coherency state. | 07-03-2014 |
20140195737 | Flush Engine - Techniques are disclosed related to flushing one or more data caches. In one embodiment an apparatus includes a processing element, a first cache associated with the processing element, and a circuit configured to copy modified data from the first cache to a second cache in response to determining an activity level of the processing element. In this embodiment, the apparatus is configured to alter a power state of the first cache after the circuit copies the modified data. The first cache may be at a lower level in a memory hierarchy relative to the second cache. In one embodiment, the circuit is also configured to copy data from the second cache to a third cache or a memory after a particular time interval. In some embodiments, the circuit is configured to copy data while one or more pipeline elements of the apparatus are in a low-power state. | 07-10-2014 |
20140201447 | DATA PROCESSING APPARATUS AND METHOD FOR HANDLING PERFORMANCE OF A CACHE MAINTENANCE OPERATION - A data processing apparatus has data processing circuitry for performing data processing operations on data, and a hierarchical cache structure for storing at least a subset of the data for access by the data processing circuitry. The hierarchical cache structure has first and second level caches, and data evicted from the first level cache is routed to the second level cache under the control of second level cache access control circuitry. Cache maintenance circuitry performs a cache maintenance operation in both the first level cache and the second level cache. The access control circuitry is responsive to maintenance indication data to modify the eviction handling operation performed in response to the evicted data, so as to cause the required cache maintenance for the second level cache to be incorporated within the eviction handling operation. | 07-17-2014 |
20140201448 | MANAGEMENT OF PARTIAL DATA SEGMENTS IN DUAL CACHE SYSTEMS - For movement of partial data segments within a computing storage environment having lower and higher levels of cache by a processor, a whole data segment containing one of the partial data segments is promoted to both the lower and higher levels of cache. Requested data of the whole data segment is split and positioned at a Most Recently Used (MRU) portion of a demotion queue of the higher level of cache. | 07-17-2014 |
20140208031 | APPARATUS AND METHOD FOR MEMORY-HIERARCHY AWARE PRODUCER-CONSUMER INSTRUCTIONS - An apparatus and method are described for efficiently transferring data from a producer core to a consumer core within a central processing unit (CPU). For example, one embodiment comprises a method for transferring a chunk of data from a producer core of a CPU to a consumer core of the CPU, comprising: writing data to a buffer within the producer core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the buffer to a cache accessible by both the producer core and the consumer core; and upon the consumer core detecting that data is available in the cache, providing the data to the consumer core from the cache upon receipt of a read signal from the consumer core. | 07-24-2014 |
20140208032 | USE OF FLASH CACHE TO IMPROVE TIERED MIGRATION PERFORMANCE - For data processing in a computing storage environment by a processor device, the computing storage environment incorporating at least high-speed and lower-speed caches, and tiered levels of storage, and at a time in which at least one data segment is to be migrated from one level to another level of the tiered levels of storage, a data migration mechanism is initiated by copying data resident in the lower-speed cache corresponding to the at least one data segment to be migrated to a target on the other level, and reading remaining data, not previously copied from the lower-speed cache, from a source on the one level, and writing the remaining data to the target. | 07-24-2014 |
20140215157 | MONITORING MULTIPLE MEMORY LOCATIONS FOR TARGETED STORES IN A SHARED-MEMORY MULTIPROCESSOR - A system and method for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. More specifically, the disclosed embodiments provide a system that notifies a waiting thread when a targeted store is directed to monitored memory locations. During operation, the system receives a targeted store which is directed to a specific cache in a shared-memory multiprocessor system. In response, the system examines a destination address for the targeted store to determine whether the targeted store is directed to a monitored memory location which is being monitored for a thread associated with the specific cache. If so, the system informs the thread about the targeted store. | 07-31-2014 |
20140215158 | Executing Requests from Processing Elements with Stacked Memory Devices - Executing requests from processing elements with stacked memory devices includes receiving a request from a processing element, determining which of multiple memory devices contains information pertaining to the request, forwarding the request to a selected memory device of the memory devices, and responding to the processing element with the information in response to receiving the information from the selected memory device. | 07-31-2014 |
20140237186 | FILTERING SNOOP TRAFFIC IN A MULTIPROCESSOR COMPUTING SYSTEM - Filtering snoop traffic in a multiprocessor computing system, each processor in the multiprocessor computing system coupled to a high level cache and a low level cache, including: receiving a snoop message that identifies an address in shared memory targeted by a write operation; identifying a set in the high level cache that maps to the address in shared memory; determining whether the high level cache includes an entry associated with the address in shared memory; responsive to determining that the high level cache does not include an entry corresponding to the address in shared memory: determining whether the set in the high level cache has been bypassed by an entry in the low level cache; and responsive to determining that the set in the high level cache has not been bypassed by an entry in the low level cache, discarding the snoop message. | 08-21-2014 |
20140237187 | ADAPTIVE MULTILEVEL BINNING TO IMPROVE HIERARCHICAL CACHING - A device driver calculates a tile size for a plurality of cache memories in a cache hierarchy. The device driver calculates a storage capacity of a first cache memory. The device driver calculates a first tile size based on the storage capacity of the first cache memory and one or more additional characteristics. The device driver calculates a storage capacity of a second cache memory. The device driver calculates a second tile size based on the storage capacity of the second cache memory and one or more additional characteristics, where the second tile size is different than the first tile size. The device driver transmits the second tile size to a second coalescing binning unit. One advantage of the disclosed techniques is that data locality and cache memory hit rates are improved where tile size is optimized for each cache level in the cache hierarchy. | 08-21-2014 |
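To make the per-level tile-size calculation above concrete, the following C++ sketch derives a tile size from a cache level's storage capacity. All names, the occupancy fraction, and the power-of-two rounding are illustrative assumptions made for this sketch, not details taken from the application.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>

// Hypothetical tile-size calculation: pick the largest power-of-two square
// tile (in pixels) whose working set fits within a fraction of the cache.
// bytesPerPixel and the occupancy fraction are illustrative parameters.
std::uint32_t tileSizeForCache(std::uint64_t cacheCapacityBytes,
                               std::uint32_t bytesPerPixel,
                               double occupancyFraction = 0.5) {
    const double budget = cacheCapacityBytes * occupancyFraction;
    const double side = std::sqrt(budget / bytesPerPixel);
    // Round down to a power of two so tiles align with binning hardware.
    std::uint32_t tile = 1;
    while ((tile * 2) <= static_cast<std::uint32_t>(side)) tile *= 2;
    return tile;
}

int main() {
    // A 256 KiB L2 and a 2 MiB L3, 4 bytes per pixel: each level gets its own tile size.
    std::cout << "L2 tile: " << tileSizeForCache(256 << 10, 4) << "\n";
    std::cout << "L3 tile: " << tileSizeForCache(2 << 20, 4) << "\n";
    return 0;
}
```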
20140244932 | METHOD AND APPARATUS FOR CACHING AND INDEXING VICTIM PRE-DECODE INFORMATION - The present invention provides a method and apparatus for caching pre-decode information. Some embodiments of the apparatus include a first pre-decode array configured to store pre-decode information for an instruction cache line that is resident in a first cache in response to the instruction cache line being evicted from one or more second cache(s). Some embodiments of the apparatus also include a second array configured to store a plurality of bits associated with the first cache. Subsets of the bits are configured to store pointers to the pre-decode information associated with the instruction cache line. | 08-28-2014 |
20140258621 | NON-DATA INCLUSIVE COHERENT (NIC) DIRECTORY FOR CACHE - Embodiments relate to a non-data inclusive coherent (NIC) directory for a symmetric multiprocessor (SMP) of a computer. An aspect includes determining a first eviction entry of a highest-level cache in a multilevel caching structure of the first processor node of the SMP. Another aspect includes determining that the NIC directory is not full. Another aspect includes determining that the first eviction entry of the highest-level cache is owned by a lower-level cache in the multilevel caching structure. Another aspect includes, based on the NIC directory not being full and based on the first eviction entry of the highest-level cache being owned by the lower-level cache, installing an address of the first eviction entry of the highest-level cache in a first new entry in the NIC directory. Another aspect includes invalidating the first eviction entry in the highest-level cache. | 09-11-2014 |
20140258622 | PREFETCHING OF DATA AND INSTRUCTIONS IN A DATA PROCESSING APPARATUS - A data processing apparatus includes a processor and a hierarchical data storage system, including a memory and a cache, for storing the data and the instructions in storage locations identified by physical addresses. The apparatus includes address translation circuitry for mapping the virtual addresses to the physical addresses and load store circuitry receiving access requests from the processor. The load store circuitry accesses the translation circuitry to identify physical addresses that correspond to virtual addresses of the received data access requests, and to access the corresponding physical addresses in the hierarchical data storage system. Preload circuitry receives preload requests from the processor indicating virtual addresses of storage locations that are to be preloaded. Prefetch circuitry monitors at least some of the accesses performed by the load store circuitry and predicts addresses to be accessed subsequently, and transmits the predicted addresses to the preload circuitry as preload requests. | 09-11-2014 |
20140258623 | MECHANISM FOR COPYING DATA IN MEMORY - An improved mechanism for copying data in memory is described which uses aliasing. In an embodiment, data is accessed from a first location in a memory and stored in a cache line associated with a second, different location in the memory. In response to a subsequent request for data from the second location in the memory, the cache returns the data stored in the cache line associated with the second location in the memory. The method may be implemented using additional hardware logic in the cache which is arranged to receive an aliasing request from a processor which identifies both the first and second locations in memory and triggers the accessing of data from the first location for storing in a cache line associated with the second location. | 09-11-2014 |
20140281234 | SERVING MEMORY REQUESTS IN CACHE COHERENT HETEROGENEOUS SYSTEMS - Apparatus, computer readable medium, and method of servicing memory requests are presented. A read request for a memory block from a requester processor having a processor type may be serviced by providing exclusive access to the requested memory block to the requester processor when the requested memory block was modified the last time it was accessed by a previous requester processor having a same processor type as the processor type of the requester processor. Exclusive access to the requested memory block may be provided to the requester processor based on whether the requested memory block was modified by a previous processor having a same type as the requester processor at least once in the last several times the memory block was in a cache of the previous processor. Exclusive access to the requested memory block may be provided to the requester processor based on a region of the memory block. | 09-18-2014 |
20140281235 | System and Method for Software/Hardware Coordinated Adaptive Performance Monitoring - System and method embodiments are provided for coordinated hardware and software performance monitoring to determine a suitable polling time for memory cache during run time. The system and method adapt to the time-varying software workload by determining a next polling time based on captured local characteristics of memory access pattern over time. The adaptive adjustment of polling time according to system performance and dynamic software workload allows capturing more accurately the local characteristics of memory access pattern. An embodiment method includes capturing, at a current polling instance, hardware performance parameters to manage the memory cache, and adjusting a time interval for a next polling instance according to the hardware performance parameters. The captured hardware performance parameters are used to compute performance metrics, which are then used to determine the time interval for the next polling instance. | 09-18-2014 |
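The adaptive polling idea lends itself to a small illustration. The sketch below is a hypothetical heuristic, not the patented method: it shortens the next polling interval when the monitored miss rate shifts quickly and lengthens it when the workload looks stable; the constants and the function name are invented for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>

// Hypothetical adaptive polling: shrink the interval when the observed miss
// rate is changing quickly (workload shift), stretch it when the metric is
// stable. The constants and function name are illustrative only.
double nextPollingIntervalMs(double currentIntervalMs,
                             double previousMissRate,
                             double currentMissRate) {
    const double change = std::abs(currentMissRate - previousMissRate);
    const double next = (change > 0.05) ? currentIntervalMs * 0.5   // volatile workload
                                        : currentIntervalMs * 1.5;  // stable workload
    return std::clamp(next, 1.0, 1000.0);  // keep the interval within sane bounds
}

int main() {
    double interval = 100.0;
    interval = nextPollingIntervalMs(interval, 0.10, 0.25);  // big shift: poll sooner
    std::cout << interval << " ms\n";                        // 50 ms
    interval = nextPollingIntervalMs(interval, 0.25, 0.26);  // stable: back off
    std::cout << interval << " ms\n";                        // 75 ms
    return 0;
}
```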
20140281236 | SYSTEMS AND METHODS FOR IMPLEMENTING TRANSACTIONAL MEMORY - Systems and methods for implementing transactional memory access. An example method may comprise initiating a memory access transaction; executing a transactional read operation, using a first buffer associated with a memory access tracking logic, with respect to a first memory location, and/or a transactional write operation, using a second buffer associated with the memory access tracking logic, with respect to a second memory location; executing a non-transactional read operation with respect to a third memory location, and/or a non-transactional write operation with respect to a fourth memory location; responsive to detecting, by the memory access tracking logic, access by a device other than the processor to the first memory location or the second memory location, aborting the memory access transaction; and completing, irrespectively of the state of the third memory location and the fourth memory location, the memory access transaction responsive to failing to detect a transaction aborting condition. | 09-18-2014 |
20140281237 | BROADCAST CACHE COHERENCE ON PARTIALLY-ORDERED NETWORK - A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; and issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request. | 09-18-2014 |
20140281238 | SYSTEMS AND METHODS FOR ACCESSING CACHE MEMORY - Systems and methods for providing data from a cache memory to requestors includes a number of cache memory levels arranged in a hierarchy. The method includes receiving a request for fetching data from the cache memory and determining one or more addresses in a cache memory level which is one level higher than a current cache memory level using one or more prediction algorithms. Further, the method includes pre-fetching the one or more addresses from the high cache memory level and determining if the data is available in the addresses. If data is available in the one or more addresses then data is fetched from the high cache level, else addresses of a next level which is higher than the high cache memory level are determined and pre-fetched. Furthermore, the method includes providing the fetched data to the requestor. | 09-18-2014 |
20140281239 | ADAPTIVE HIERARCHICAL CACHE POLICY IN A MICROPROCESSOR - A method for determining an inclusion policy includes determining a ratio of a capacity of a large cache to a capacity of a core cache in a cache subsystem of a processor and selecting an inclusive policy as the inclusion policy for the cache subsystem in response to the cache ratio exceeding an inclusion threshold. The method may further include selecting a non-inclusive policy in response to the cache ratio not exceeding the inclusion threshold and, responsive to a cache transaction resulting in a cache miss, performing an inclusion operation that invokes the inclusion policy. | 09-18-2014 |
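The ratio-against-threshold selection is simple to illustrate. Below is a minimal C++ sketch, assuming invented names (such as `selectInclusionPolicy`) and an illustrative threshold of 8x, of how an inclusive or non-inclusive policy could be chosen from the large-cache-to-core-cache capacity ratio.

```cpp
#include <cstdint>
#include <iostream>

enum class InclusionPolicy { Inclusive, NonInclusive };

// Hypothetical helper: choose a policy from the capacity ratio of a large
// (e.g. last-level) cache to a core (e.g. L2) cache. The threshold of 8x
// is an illustrative value, not one taken from the application.
InclusionPolicy selectInclusionPolicy(std::uint64_t largeCacheBytes,
                                      std::uint64_t coreCacheBytes,
                                      double inclusionThreshold = 8.0) {
    const double ratio =
        static_cast<double>(largeCacheBytes) / static_cast<double>(coreCacheBytes);
    return (ratio > inclusionThreshold) ? InclusionPolicy::Inclusive
                                        : InclusionPolicy::NonInclusive;
}

int main() {
    // 8 MiB LLC versus 1 MiB core cache: ratio 8 is not strictly greater than
    // the threshold, so the non-inclusive policy is chosen here.
    auto policy = selectInclusionPolicy(8ull << 20, 1ull << 20);
    std::cout << (policy == InclusionPolicy::Inclusive ? "inclusive" : "non-inclusive")
              << "\n";
    return 0;
}
```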
20140281240 | Instructions To Mark Beginning and End Of Non Transactional Code Region Requiring Write Back To Persistent Storage - A processor is described having an interface to non-volatile random access memory and logic circuitry. The logic circuitry is to identify cache lines modified by a transaction which views the non-volatile random access memory as the transaction's persistence storage. The logic circuitry is also to identify cache lines modified by a software process other than a transaction that also views said non-volatile random access memory as persistence storage. | 09-18-2014 |
20140281241 | CACHED EVALUATION OF PATHS THROUGH GRAPH-BASED DATA REPRESENTATION - Embodiments of the invention generally provide a method, a computing system, and a computer-readable medium configured to request, cache, and generate translations of paths through graph-based data representations. The computer-implemented method includes receiving a first request for translation, wherein the first request specifies a first path configured to identify first payload data. The computer-implemented method further includes determining whether a graph object stored in the local cache memory includes a first translation associated with the first path. If the local cache memory does not include the first translation, then the first translation is obtained from a remote computing device and stored in the graph object. If the local cache memory does include the first translation associated with the first path, then the first translation is obtained from the local cache memory. The computer-implemented method also includes obtaining the first payload data based on the first translation. | 09-18-2014 |
20140281242 | METHODS, SYSTEMS AND APPARATUS FOR PREDICTING THE WAY OF A SET ASSOCIATIVE CACHE - A method for predicting a way of a set associative shadow cache is disclosed. As a part of a method, a request to fetch a first far taken branch instruction of a first cache line from an instruction cache is received, and responsive to a hit in the instruction cache, a predicted way is selected from a way array using a way that corresponds to the hit in the instruction cache. A second cache line is selected from a shadow cache using the predicted way and the first cache line and the second cache line are forwarded in the same clock cycle. | 09-18-2014 |
20140281243 | MULTIPLE-CORE COMPUTER PROCESSOR - A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores. | 09-18-2014 |
20140289470 | Computing Device Having Optimized File System and Methods for Use Therein - A computing device having an optimized file system and methods for use therein. File system optimizations include sector-aligned writes, anchored cluster searches, anchored index searches, companion caches dedicated to particular file management data types and predictive cache updates, all of which expedite processing on the computing device. The file system optimizations are especially advantageous for data collection systems where an embedded device is tasked with logging to a target memory data received in a continuous data stream and where none of the streamed data is deleted until after the target memory has been offloaded to another device. | 09-25-2014 |
20140304473 | METHOD AND SYSTEM FOR CACHE TIERING - A method and system for storing data for retrieval by an application running on a computer system including providing a tiered caching system including at least one cache tier and a base tier, storing data in at least one of said at least one cache tier and said base tier based on a policy, and presenting an application view of said data to the application by a means to organize data. The invention optionally provides an additional overflow tier, and preferably includes multiple cache tiers. | 10-09-2014 |
20140310468 | METHODS AND APPARATUS FOR IMPROVING PERFORMANCE OF SEMAPHORE MANAGEMENT SEQUENCES ACROSS A COHERENT BUS - Techniques are described for a multi-processor having two or more processors that increase the opportunity for a load-exclusive command to take a cache line in an Exclusive state, which results in increased performance when a store-exclusive is executed. A new bus operation, read prefer exclusive, is used as a hint to other caches that a requesting master is likely to store to the cache line, and, if possible, the other cache should give the line up. In most cases, this will result in the other master giving the line up and the requesting master taking the line Exclusive. In most cases, two or more processors are not performing a semaphore management sequence to the same address at the same time. Thus, a requesting master's load-exclusive is able to take a cache line in the Exclusive state an increased number of times. | 10-16-2014 |
20140317351 | METHOD AND APPARATUS FOR PREVENTING NON-TEMPORAL ENTRIES FROM POLLUTING SMALL STRUCTURES USING A TRANSIENT BUFFER - A method for preventing non-temporal entries from entering small critical structures is disclosed. The method comprises transferring a first entry from a higher level memory structure to an intermediate buffer. It further comprises determining a second entry to be evicted from the intermediate buffer and a corresponding value associated with the second entry. Subsequently, responsive to a determination that the second entry is frequently accessed, the method comprises installing the second entry into a lower level memory structure. Finally, the method comprises installing the first entry into a slot previously occupied by the second entry in the intermediate buffer. | 10-23-2014 |
20140325152 | QUADTREE BASED DATA MANAGEMENT FOR MEMORY - A method for managing memory. The method includes executing a memory management function, and reading data from memory into a particular size array structure using the memory management function based on using quadtree structure sub-functions to scan the particular size array structure for filtering the data iteratively. | 10-30-2014 |
20140325153 | MULTI-HIERARCHY INTERCONNECT SYSTEM AND METHOD FOR CACHE SYSTEM - A multi-hierarchy interconnect system for a cache system having a tag memory and a data memory includes an address interconnect scheduling device and a data interconnect scheduling device. The address interconnect scheduling device performs a tag bank arbitration to schedule address requests to a plurality of tag banks of the tag memory. The data interconnect scheduling device performs a data bank arbitration to schedule data requests to a plurality of data banks of the data memory. In addition, a multi-hierarchy interconnect method for a cache system having a tag memory and a data memory includes: performing a tag bank arbitration to schedule address requests to a plurality of tag banks of the tag memory, and performing a data bank arbitration to schedule data requests to a plurality of data banks of the data memory. | 10-30-2014 |
20140325154 | Private Memory Regions and Coherence Optimizations - A system for optimizing cache coherence message traffic volume is disclosed. The system includes a plurality of caches in a multi-level memory hierarchy and a plurality of agents. Each agent is associated with a cache. The system includes one or more monitoring engines. Each agent in the plurality of agents is associated with a monitoring engine. The agents can execute a processor level software instruction causing a memory region to be private to the agent. Each of the agents is configured to execute a memory access for data on an associated cache and to send a request for data up the hierarchy on a cache miss. The monitoring engine is configured to intercept requests for data from an agent and to prevent snooping for the cache line in peer caches when the cache line is associated with a memory region represented as private to the agent. | 10-30-2014 |
20140325155 | MANAGING RESOURCES USING RESOURCE EXPIRATION DATA - Resource management techniques, such as cache optimization, are employed to organize resources within caches such that the most requested content (e.g., the most popular content) is more readily available. A service provider utilizes content expiration data as indicative of resource popularity. As resources are requested, the resources propagate through a cache server hierarchy associated with the service provider. More frequently requested resources are maintained at edge cache servers based on shorter expiration data that is reset with each repeated request. Less frequently requested resources are maintained at higher levels of a cache server hierarchy based on longer expiration data associated with cache servers higher on the hierarchy. | 10-30-2014 |
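The expiration-as-popularity scheme can be sketched as an edge cache whose short TTL is reset on every hit, so frequently requested resources stay resident at the edge while cold ones expire back to a higher tier. The class, the TTL value, and the method names below are illustrative assumptions, not the service provider's implementation.

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical edge-cache entry: a short expiration that is reset on every
// request, so frequently requested resources remain at the edge while rarely
// requested ones expire and are served from a higher cache tier.
struct EdgeEntry {
    std::string body;
    Clock::time_point expires;
};

class EdgeCache {
public:
    explicit EdgeCache(std::chrono::seconds ttl) : ttl_(ttl) {}

    bool lookup(const std::string& url, std::string& out) {
        auto it = entries_.find(url);
        if (it == entries_.end() || Clock::now() >= it->second.expires) return false;
        it->second.expires = Clock::now() + ttl_;  // repeated request resets expiry
        out = it->second.body;
        return true;
    }

    void insert(const std::string& url, std::string body) {
        entries_[url] = EdgeEntry{std::move(body), Clock::now() + ttl_};
    }

private:
    std::chrono::seconds ttl_;
    std::unordered_map<std::string, EdgeEntry> entries_;
};

int main() {
    EdgeCache edge(std::chrono::seconds(30));  // edge tier uses a short TTL
    edge.insert("/popular.js", "...payload...");
    std::string body;
    std::cout << (edge.lookup("/popular.js", body) ? "edge hit" : "go to parent tier") << "\n";
    return 0;
}
```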
20140325156 | Long Latency Tolerant Decoupled Memory Hierarchy for Simpler and Energy Efficient Designs - A decoupled memory execution verification method is provided that includes executing load and store commands separately using an appropriately programmed computer, where the load and store commands are independent of correctness and where the load commands and the store commands are re-executed in-order at memory retirement to verify correctness, thereby providing an energy efficient power decoupled execution of memory (e-PDEMI). | 10-30-2014 |
20140337580 | GATHER AND SCATTER OPERATIONS IN MULTI-LEVEL MEMORY HIERARCHY - Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed. | 11-13-2014 |
20140351517 | VALIDATION OF CACHE LOCKING USING INSTRUCTION FETCH AND EXECUTION - A technique for locking a cache memory device (or portion thereof) which includes the following actions: (i) writing full traversal branching instructions in a cache way of a cache memory device; and (ii) subsequent to the writing step, locking the cache way. The locking action is performed by adjusting cache locking data to indicate that data in the cache way will not be overwritten during normal operations of the cache memory device. The writing action and the locking action are performed by a machine. | 11-27-2014 |
20140351518 | MULTI-LEVEL CACHE TRACKING TABLE - Disclosed herein are a computing system, integrated circuit, and method to enhance retrieval of cached data. A tracking table is used to initiate a search for data from a location specified in the table, if the data is not in a first level of a multi-level cache hierarchy. | 11-27-2014 |
20140351519 | SYSTEM AND METHOD FOR PROVIDING CACHE-AWARE LIGHTWEIGHT PRODUCER CONSUMER QUEUES - Aspects of the disclosure pertain to a system and method for providing cache-aware lightweight producer consumer queues. The system is a multiprocessor system configured for specifying separate cache attributes for inner (e.g., local) cache and outer (e.g., shared) cache for promoting lower system overhead. Separate cache attributes are specified such that shared variables are cacheable only in a cache level shared by multiple processors. | 11-27-2014 |
20140351520 | VALIDATION OF CACHE LOCKING USING INSTRUCTION FETCH AND EXECUTION - A technique for locking a cache memory device (or portion thereof) which includes the following actions: (i) writing full traversal branching instructions in a cache way of a cache memory device; and (ii) subsequent to the writing step, locking the cache way. The locking action is performed by adjusting cache locking data to indicate that data in the cache way will not be overwritten during normal operations of the cache memory device. The writing action and the locking action are performed by a machine. | 11-27-2014 |
20140359221 | DETECTING MULTIPLE STRIDE SEQUENCES FOR PREFETCHING - The present application describes some embodiments of a prefetcher that tracks multiple stride sequences for prefetching. Some embodiments of the prefetcher implement a method including generating a sum-of-strides for each of a plurality of stride lengths that are larger than one by summing a number of previous strides that is equal to the stride length. Some embodiments of the method also include prefetching data in response to repetition of one or more of the sum-of-strides for one or more of the plurality of stride lengths. | 12-04-2014 |
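A rough illustration of tracking sums of strides for several window lengths follows. The code keeps a short stride history and, when the most recent length-L sum equals the previous length-L sum, suggests a prefetch one sum ahead. The history depth, class name, and trigger condition are invented for this sketch and are not details from the application.

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <optional>

// Hypothetical multi-stride tracker: for each window length L > 1, sum the
// last L strides; if that sum matches the sum seen one window earlier, the
// pattern is treated as repeating and a prefetch address is suggested.
class MultiStridePrefetcher {
public:
    std::optional<std::uint64_t> observe(std::uint64_t addr) {
        if (haveLast_) strides_.push_back(static_cast<std::int64_t>(addr - lastAddr_));
        lastAddr_ = addr;
        haveLast_ = true;
        if (strides_.size() > kHistory) strides_.pop_front();

        for (std::size_t len = 2; len * 2 <= strides_.size(); ++len) {
            std::int64_t recent = 0, previous = 0;
            for (std::size_t i = 0; i < len; ++i) {
                recent   += strides_[strides_.size() - 1 - i];
                previous += strides_[strides_.size() - 1 - len - i];
            }
            if (recent == previous)  // the sum-of-strides repeats for this length
                return addr + static_cast<std::uint64_t>(recent);
        }
        return std::nullopt;
    }

private:
    static constexpr std::size_t kHistory = 16;
    std::deque<std::int64_t> strides_;
    std::uint64_t lastAddr_ = 0;
    bool haveLast_ = false;
};

int main() {
    MultiStridePrefetcher pf;
    // Accesses with an alternating +8/+24 pattern: the length-2 sums repeat at 32.
    for (std::uint64_t a : {100, 108, 132, 140, 164, 172}) {
        if (auto target = pf.observe(a))
            std::cout << "prefetch 0x" << std::hex << *target << std::dec << "\n";
    }
    return 0;
}
```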
20140372700 | Data Filter Cache Designs for Enhancing Energy Efficiency and Performance in Computing Systems - Certain embodiments herein relate to, among other things, designing data cache systems to enhance energy efficiency and performance of computing systems. A data filter cache herein may be designed to store a portion of data stored in a level one (L1) data cache. The data filter cache may reside between the L1 data cache and a register file in the primary compute unit. The data filter cache may therefore be accessed before the L1 data cache when a request for data is received and processed. Upon a data filter cache hit, access to the L1 data cache may be avoided. The smaller data filter cache may therefore be accessed earlier in the pipeline than the larger L1 data cache to promote improved energy utilization and performance. The data filter cache may also be accessed speculatively based on various conditions to increase the chances of having a data filter cache hit. | 12-18-2014 |
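The filter-cache idea can be shown with a tiny direct-mapped structure that is probed before the L1 data cache; only on a filter miss would the larger L1 array need to be accessed. The sizes, field names, and class name below are illustrative assumptions for the sketch, not the design described in the application.

```cpp
#include <array>
#include <cstdint>
#include <iostream>
#include <optional>

// Hypothetical data filter cache: a tiny direct-mapped structure holding a
// handful of recently used L1 lines. It is probed first; only on a miss does
// the larger, more energy-hungry L1 data cache need to be accessed.
class DataFilterCache {
public:
    static constexpr std::size_t kLines = 8;       // illustrative capacity
    static constexpr std::uint64_t kLineBytes = 64;

    std::optional<std::uint64_t> probe(std::uint64_t addr) const {
        const std::uint64_t line = addr / kLineBytes;
        const Entry& e = entries_[line % kLines];
        if (e.valid && e.line == line) return e.data;   // filter-cache hit
        return std::nullopt;                            // fall back to L1
    }

    void fill(std::uint64_t addr, std::uint64_t data) {
        const std::uint64_t line = addr / kLineBytes;
        entries_[line % kLines] = Entry{true, line, data};
    }

private:
    struct Entry { bool valid = false; std::uint64_t line = 0, data = 0; };
    std::array<Entry, kLines> entries_{};
};

int main() {
    DataFilterCache dfc;
    dfc.fill(0x1000, 42);                    // line brought in alongside an L1 fill
    if (auto v = dfc.probe(0x1008))          // same 64-byte line: L1 access avoided
        std::cout << "filter hit, value " << *v << "\n";
    if (!dfc.probe(0x2000))
        std::cout << "filter miss, access L1\n";
    return 0;
}
```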
20140372701 | METHODS, DEVICES, AND SYSTEMS FOR DETECTING RETURN ORIENTED PROGRAMMING EXPLOITS - Methods, devices, and systems for detecting return-oriented programming (ROP) exploits are disclosed. A system includes a processor, a main memory, and a cache memory. A cache monitor develops an instruction loading profile by monitoring accesses to cached instructions found in the cache memory and misses to instructions not currently in the cache memory. A remedial action unit terminates execution of one or more of the valid code sequences if the instruction loading profile is indicative of execution of an ROP exploit involving one or more valid code sequences. The instruction loading profile may be a hit/miss ratio derived from monitoring cache hits relative to cache misses. The ROP exploits may include code snippets that each include an executable instruction and a return instruction from valid code sequences. | 12-18-2014 |
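A hedged sketch of the hit/miss-ratio profile follows: instruction-cache hits and misses are counted over a window, and a ratio below a floor raises an alert. The window size, the threshold, and the class name are invented values for illustration, not figures from the application.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical instruction-loading profile: track instruction-cache hits and
// misses over a window; an unusually low hit/miss ratio (many short jumps
// into cold code, as a gadget chain tends to produce) raises an alert.
class RopMonitor {
public:
    void recordHit()  { ++hits_;   check(); }
    void recordMiss() { ++misses_; check(); }
    bool alerted() const { return alerted_; }

private:
    void check() {
        if (hits_ + misses_ < kWindow) return;
        const double ratio = misses_ ? static_cast<double>(hits_) / misses_
                                     : static_cast<double>(kWindow);
        if (ratio < kThreshold) alerted_ = true;  // profile looks like ROP execution
        hits_ = misses_ = 0;                      // start a new observation window
    }

    static constexpr std::uint64_t kWindow = 1024;  // illustrative window size
    static constexpr double kThreshold = 4.0;       // illustrative hit/miss floor
    std::uint64_t hits_ = 0, misses_ = 0;
    bool alerted_ = false;
};

int main() {
    RopMonitor mon;
    for (int i = 0; i < 300; ++i) mon.recordMiss();  // unusually miss-heavy window
    for (int i = 0; i < 724; ++i) mon.recordHit();
    std::cout << (mon.alerted() ? "possible ROP exploit\n" : "profile normal\n");
    return 0;
}
```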
20140379985 | MULTI-LEVEL AGGREGATION TECHNIQUES FOR MEMORY HIERARCHIES - Embodiments include a method, system, and computer program product for providing an aggregation hierarchy that is related to memory hierarchies. In one embodiment, the method includes determining capacity of a first level memory of a memory hierarchy for processing data relating to completion of an aggregation process and generating a per thread local look-up table in said first level memory upon determining said capacity. Upon the first level memory reaching capacity, a plurality of per thread partitions to store remaining data to complete the aggregation process in a second level memory of the memory hierarchy is generated such that each of said per-thread partitions includes an identical amount of data portion on each thread. The method also includes storing the per thread partitions in said second level memory and providing a single global look up table for each of the identical data portions. | 12-25-2014 |
20150012707 | SYSTEM AND CONTROL PROTOCOL OF LAYERED LOCAL CACHING FOR ADAPTIVE BIT RATE SERVICES - A system for layered local caching of downstream shared media in a hierarchical tree network arrangement includes a first network node on a first distribution network having a first caching controller. The first network node is configured to store a video segment transmitted on the first distribution network based on a first instruction received by the first caching controller from a central caching controller communicatively coupled to the first distribution network. The central caching controller is located upstream from the first network node. The system includes a second network node on a second distribution network having a second caching controller and communicatively coupled to the first network node. The second network node is configured to store a video segment transmitted on the second distribution network based on a second instruction received by the second caching controller from the first caching controller. | 01-08-2015 |
20150019812 | REPLICATION BETWEEN SITES USING KEYS ASSOCIATED WITH MODIFIED DATA - Systems and methods are disclosed for replicating data stored in an in-memory data cache to a remote site. An example system includes an in-memory data cache and an in-memory keys cache. The system also includes a key insert module that detects a modification to the in-memory data cache, identifies one or more keys of the plurality of keys based on the modification, and inserts the identified one or more keys into the in-memory keys cache. The system further includes an update module that retrieves from the in-memory keys cache a set of keys, retrieves from the in-memory data cache modified data associated with the set of keys, and transmits to a remote site a modification list including the set of keys and the modified data associated with the set of keys. | 01-15-2015 |
20150019813 | MEMORY HIERARCHY USING ROW-BASED COMPRESSION - A system includes a first memory and a device coupleable to the first memory. The device includes a second memory to cache data from the first memory. The second memory includes a plurality of rows, each row including a corresponding set of compressed data blocks of non-uniform sizes and a corresponding set of tag blocks. Each tag block represents a corresponding compressed data block of the row. The device further includes decompression logic to decompress data blocks accessed from the second memory. The device further includes compression logic to compress data blocks to be stored in the second memory. | 01-15-2015 |
20150026404 | Least Recently Used Mechanism for Cache Line Eviction from a Cache Memory - A mechanism for evicting a cache line from a cache memory includes first selecting for eviction a least recently used cache line of a group of invalid cache lines. If all cache lines are valid, selecting for eviction a least recently used cache line of a group of cache lines in which no cache line of the group of cache lines is also stored within a higher level cache memory such as the L1 cache, for example. Lastly, if all cache lines are valid and there are no non-inclusive cache lines, selecting for eviction the least recently used cache line stored in the cache memory. | 01-22-2015 |
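The three-step victim selection described above maps naturally onto a small helper. In the sketch below (the set layout, field names, and ages are assumptions made for illustration), invalid ways are preferred first, then valid ways not also held in the L1, then the plain least recently used way.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical L2 set model: each way tracks validity, an LRU age (larger =
// older), and whether the line is also present in the L1 above it.
struct Way {
    bool valid = false;
    bool inL1 = false;        // duplicated in the higher-level cache
    std::uint32_t lruAge = 0; // larger means less recently used
};

// Eviction order sketched from the abstract: (1) oldest invalid way,
// (2) otherwise oldest valid way not also held in L1, (3) otherwise oldest way.
std::size_t selectVictim(const std::vector<Way>& set) {
    auto oldest = [&](auto pred) -> int {
        int best = -1;
        for (std::size_t i = 0; i < set.size(); ++i)
            if (pred(set[i]) && (best < 0 || set[i].lruAge > set[best].lruAge))
                best = static_cast<int>(i);
        return best;
    };
    int v = oldest([](const Way& w) { return !w.valid; });
    if (v < 0) v = oldest([](const Way& w) { return w.valid && !w.inL1; });
    if (v < 0) v = oldest([](const Way& w) { return w.valid; });
    return static_cast<std::size_t>(v);
}

int main() {
    std::vector<Way> set = {
        {true, true, 5}, {true, false, 3}, {true, true, 9}, {true, false, 1}};
    // All ways valid; way 1 is the oldest line not also cached in L1.
    std::cout << "victim way: " << selectVictim(set) << "\n";
    return 0;
}
```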
20150026405 | SYSTEM AND METHOD FOR PROVIDING A SECOND LEVEL CONNECTION CACHE FOR USE WITH A DATABASE ENVIRONMENT - Described herein is a system and method for providing a level 2 connection cache for use with a database environment. In accordance with an embodiment, a second level, or level 2 (L2), connection cache is used to cache no-session connections for use with a database. When a connection is requested, a no-session connection (NSC) can be retrieved from the cache and a database session is attached. Later, when the connection is closed, the database session is logged off and the no-session connection returned to the cache for subsequent use. | 01-22-2015 |
20150032962 | THREE-DIMENSIONAL PROCESSING SYSTEM HAVING MULTIPLE CACHES THAT CAN BE PARTITIONED, CONJOINED, AND MANAGED ACCORDING TO MORE THAN ONE SET OF RULES AND/OR CONFIGURATIONS - Three-dimensional processing systems are provided which have multiple layers of conjoined chips, wherein one or more chip layers include processor cores that share cache hierarchies over multiple chip layers. The caches can be partitioned, conjoined, and managed according to various sets of rules and configurations. | 01-29-2015 |
20150032963 | DYNAMIC SELECTION OF CACHE LEVELS - An apparatus having a first circuit and a second circuit is disclosed. The first circuit is configured to generate an access request having a first address. The second circuit is configured to (i) initiate a change in a load value of a cache system in response to the access request. The cache system has a plurality of levels. The load value represents a work load on the cache system. The second circuit is further configured to (ii) generate a second address from the first address in response to the load value and (iii) route the access request to one of the levels in the cache system in response to the second address. | 01-29-2015 |
20150032964 | HANDLING VIRTUAL MEMORY ADDRESS SYNONYMS IN A MULTI-LEVEL CACHE HIERARCHY STRUCTURE - Handling virtual memory address synonyms in a multi-level cache hierarchy structure. The multi-level cache hierarchy structure having a first level, L1 cache, the L1 cache being operatively connected to a second level, L2 cache split into an L2 data cache directory and an L2 instruction cache. The L2 data cache directory including directory entries having information of data currently stored in the L1 cache, the L2 cache being operatively connected to a third level, L3 cache. The first level cache is virtually indexed while the second and third levels are physically indexed. Counter bits are allocated in a directory entry of the L2 data cache directory for storing a counter number. The directory entry corresponds to at least one first L1 cache line. A first search is performed in the L1 cache for a requested virtual memory address, wherein the virtual memory address corresponds to a physical memory address tag at a second L1 cache line. | 01-29-2015 |
20150046651 | METHOD FOR STORING MODIFIED INSTRUCTION DATA IN A SHARED CACHE - A processor may include a cache configured to store instructions and memory data for the processor. The cache may store instructions in which a relative address, such as for a branch instruction has been calculated, such that the instruction stored in the cache is modified from how the instruction is stored in main memory. The cache may include additional information in the tag to identify an instruction entry versus a memory data entry. When receiving a cache request, the cache may look at a type tag in addition to an address tag to determine if the request is a hit or a miss based upon the request being for an instruction from an instruction fetch unit or for memory data from a memory management unit. A cache entry may be invalidated and evicted if the address matches but the data type does not match. | 02-12-2015 |
20150046652 | WRITE COMBINING CACHE MICROARCHITECTURE FOR SYNCHRONIZATION EVENTS - A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to an RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete. | 02-12-2015 |
20150046653 | METHODS AND SYSTEMS FOR DETERMINING A CACHE SIZE FOR A STORAGE SYSTEM - Technology for operating a cache sizing system is disclosed. In various embodiments, the technology monitors input/output (IO) accesses to a storage system within a monitor period; tracks an access map for storage addresses within the storage system during the monitor period; and counts a particular access condition of the IO accesses based on the access map during the monitor period. When sizing a cache of the storage system that enables the storage system to provide a specified level of service, the counting is for computing a working set size (WSS) estimate of the storage system. | 02-12-2015 |
20150046654 | CONTROLLING A DYNAMICALLY INSTANTIATED CACHE - A change in workload characteristics detected at one tier of a multi-tiered cache is communicated to another tier of the multi-tiered cache. Multiple caching elements exist at different tiers, and at least one tier includes a cache element that is dynamically resizable. The communicated change in workload characteristics causes the receiving tier to adjust at least one aspect of cache performance in the multi-tiered cache. In one aspect, at least one dynamically resizable element in the multi-tiered cache is resized responsive to the change in workload characteristics. | 02-12-2015 |
20150052303 | SYSTEMS AND METHODS FOR ACQUIRING DATA FOR LOADS AT DIFFERENT ACCESS TIMES FROM HIERARCHICAL SOURCES USING A LOAD QUEUE AS A TEMPORARY STORAGE BUFFER AND COMPLETING THE LOAD EARLY - A method for acquiring cache line data associated with a load from respective hierarchical cache data storage components. As a part of the method, a store queue is accessed for one or more portions of a cache line associated with a load, and, if the one or more portions of the cache line is held in the store queue, the one or more portions of the cache line is stored in a load queue location associated with the load. The load is completed if the one or more portions of the cache line stored in the load queue location includes all portions of the cache line associated with the load. | 02-19-2015 |
20150052304 | SYSTEMS AND METHODS FOR READ REQUEST BYPASSING A LAST LEVEL CACHE THAT INTERFACES WITH AN EXTERNAL FABRIC - Methods for read request bypassing a last level cache which interfaces with an external fabric are disclosed. A method includes identifying a read request for a read transaction, generating a phantom read transaction identifier for the read transaction and forwarding the read transaction with the phantom read transaction identifier beyond a last level cache before detection of a hit or miss with respect to the read transaction. The phantom read transaction identifier acts as a pointer to a real read transaction identifier. | 02-19-2015 |
20150058567 | HIERARCHICAL WRITE-COMBINING CACHE COHERENCE - A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to an RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers. | 02-26-2015 |
20150058568 | HIERARCHICAL STORAGE FOR LSM-BASED NoSQL STORES - Logically arranged hierarchy or tiered storage may comprise a layer of storage being a faster access storage (e.g. solid state drive (SSD)) and another (e.g., next) layer being a traditional disk (e.g. HDD). In one embodiment, compaction occurs within the higher layer, e.g., until there is no more room and then during the compaction sequence the data may be moved down to the lower layer. In another embodiment, compaction and migration to a lower layer may occur within the higher layer, e.g., based on one or more policies, even if the higher layer is not full. In one embodiment, the data between layers are maintained as disjoint. In one embodiment, the more recent versions are always in the higher layer and the older versions are always in the lower layer. | 02-26-2015 |
20150058569 | NON-DATA INCLUSIVE COHERENT (NIC) DIRECTORY FOR CACHE - Embodiments relate to a non-data inclusive coherent (NIC) directory for a symmetric multiprocessor (SMP) of a computer. An aspect includes determining a first eviction entry of a highest-level cache in a multilevel caching structure of the first processor node of the SMP. Another aspect includes determining that the NIC directory is not full. Another aspect includes determining that the first eviction entry of the highest-level cache is owned by a lower-level cache in the multilevel caching structure. Another aspect includes, based on the NIC directory not being full and based on the first eviction entry of the highest-level cache being owned by the lower-level cache, installing an address of the first eviction entry of the highest-level cache in a first new entry in the NIC directory. Another aspect includes invalidating the first eviction entry in the highest-level cache. | 02-26-2015 |
20150058570 | METHOD OF CONSTRUCTING SHARE-F STATE IN LOCAL DOMAIN OF MULTI-LEVEL CACHE COHERENCY DOMAIN SYSTEM - A method of constructing a Share-F state in a local domain of a multi-level cache coherency domain system includes: 1) when access to S state remote data at the same address is requested, determining an accessed data copy by querying a remote proxy directory RDIR, and determining whether the data copy is in an inter-node S state and an intra-node F state; 2) directly forwarding the data copy to a requester, and recording the data copy of the current requester as an inter-node Cache coherency domain S state and an intra-node Cache coherency domain F state; and 3) after data forwarding is completed, recording, in a remote data directory RDIR, an intra-node processor losing an F permission state as the inter-node Cache coherency domain S state and the intra-node Cache coherency domain F state. | 02-26-2015 |
20150067259 | MANAGING SHARED CACHE BY MULTI-CORE PROCESSOR - Systems and methods for managing a shared cache by a multi-core processor. An example processing system comprises: a plurality of processing cores, each processing core communicatively coupled to a last level cache (LLC) slice; and cache control logic coupled to the plurality of processing cores, the cache control logic configured to perform one of: making an LLC slice of an inactive processing core available to an active processing core or power gating the LLC slice, based on estimating cache requirements by active processing cores. | 03-05-2015 |
20150081974 | STATISTICAL CACHE PROMOTION - Storing data in a cache is disclosed. It is determined that a data record is not stored in a cache. A random value is generated using a threshold value. It is determined whether to store the data record in the cache based at least in part on the generated random value. | 03-19-2015 |
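The probabilistic admission decision can be sketched in a few lines: on a miss, a record is promoted into the cache only if a random draw against the threshold succeeds, so only records requested often enough tend to be cached. The class name and the 1-in-threshold scheme below are illustrative assumptions about one way such a policy could look, not the patented mechanism.

```cpp
#include <iostream>
#include <random>
#include <string>

// Hypothetical statistical promotion: on a cache miss, admit the record with
// probability 1/threshold instead of always. Records requested frequently
// eventually win a draw and get cached; one-off requests usually do not.
class StatisticalPromoter {
public:
    explicit StatisticalPromoter(unsigned threshold)
        : threshold_(threshold), rng_(std::random_device{}()) {}

    bool shouldCache(const std::string& /*key*/) {
        std::uniform_int_distribution<unsigned> dist(1, threshold_);
        return dist(rng_) == 1;  // admit roughly once per `threshold_` misses
    }

private:
    unsigned threshold_;
    std::mt19937 rng_;
};

int main() {
    StatisticalPromoter promoter(10);  // illustrative: about 10% admission on miss
    int admitted = 0;
    for (int i = 0; i < 1000; ++i)
        if (promoter.shouldCache("record-42")) ++admitted;
    std::cout << admitted << " of 1000 misses led to promotion\n";
    return 0;
}
```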
20150089139 | METHOD AND APPARATUS FOR CACHE OCCUPANCY DETERMINATION AND INSTRUCTION SCHEDULING - An apparatus and method for determining whether data needed for one or more operations is stored in a cache and scheduling the operations for execution based on the determination. For example, one embodiment of a processor comprises: a hierarchy of cache levels for caching data including at least a level 1 (L1) cache; cache occupancy determination logic to determine whether data associated with one or more subsequent operations is stored in one of the cache levels; and scheduling logic to schedule execution of the subsequent operations based on the determination of whether data associated with the subsequent operations is stored in the cache levels. | 03-26-2015 |
20150095577 | PARTITIONING SHARED CACHES - Technology is provided for partitioning a shared unified cache in a multi-processor computer system. The technology can receive a request to allocate a portion of a shared unified cache memory for storing only executable instructions, partition the cache memory into multiple partitions, and allocate one of the partitions for storing only executable instructions. The technology can further determine the size of the portion of the cache memory to be allocated for storing only executable instructions as a function of the size of the multi-processor's L1 instruction cache and the number of cores in the multi-processor. | 04-02-2015 |
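One plausible reading of sizing the instruction-only partition from the L1 instruction-cache size and the core count is sketched below; the scale factor and the cap at half the shared cache are invented parameters, not values from the application.

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>

// Hypothetical sizing rule: reserve room in the shared unified cache for the
// aggregate L1 instruction footprint of all cores, scaled by a factor and
// capped at half of the shared cache. The scale factor is illustrative only.
std::uint64_t instructionPartitionBytes(std::uint64_t l1ICacheBytes,
                                        unsigned coreCount,
                                        std::uint64_t sharedCacheBytes,
                                        unsigned scaleFactor = 2) {
    const std::uint64_t wanted = l1ICacheBytes * coreCount * scaleFactor;
    return std::min<std::uint64_t>(wanted, sharedCacheBytes / 2);
}

int main() {
    // 32 KiB L1I per core, 8 cores, 8 MiB shared cache.
    std::uint64_t bytes = instructionPartitionBytes(32 << 10, 8, 8ull << 20);
    std::cout << (bytes >> 10) << " KiB reserved for instructions\n";  // 512 KiB
    return 0;
}
```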
20150100731 | Techniques for Moving Checkpoint-Based High-Availability Log and Data Directly From a Producer Cache to a Consumer Cache - A technique of operating a data processing system, includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system. | 04-09-2015 |
20150100732 | Moving Checkpoint-Based High-Availability Log and Data Directly From a Producer Cache to a Consumer Cache - A technique of operating a data processing system, includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system. | 04-09-2015 |
20150106568 | MULTI-TIERED CACHING FOR DATA STORAGE MANAGEMENT IN A DEVICE - A data storage device includes one or more storage media that include multiple physical storage locations. The device also includes at least one cache memory having a logical space that includes a plurality of separately managed logical block address (LBA) ranges. Additionally, a controller is included in the device. The controller is configured to receive data extents addressed by a first LBA and a logical block count. The controller is also configured to identify at least one separately managed LBA range of the plurality of separately managed LBA ranges in the at least one cache memory based on LBAs associated with at least some of the received data extents. The controller stores the at least some of the received data extents in substantially monotonically increasing LBA order in at least one physical storage location, of the at least one cache memory, assigned to the identified at least one LBA range. | 04-16-2015 |
20150113220 | EFFICIENT ONE-PASS CACHE-AWARE COMPRESSION - Exemplary method, system, and computer program product embodiments for efficient one-pass cache-aware compression are provided. In one embodiment, by way of example only, an output of a fast compressor is provided to Huffman encoding for achieving the one-pass cache-aware compression by using a predetermined Huffman tree, upon the fast compressor determining a final representation of each data byte. | 04-23-2015 |
20150121009 | METHOD AND APPARATUS FOR REFORMATTING PAGE TABLE ENTRIES FOR CACHE STORAGE - A device for and method of storing page table entries in a first cache. A first page table entry is received having a fragment field that contains address information for a requested first page and at least a second page logically adjacent to the first page. A second page table entry is generated from the first page table entry to be stored with the first page table entry. The second page table entry provides address information for the second page. The second page table entry has a configuration that is compatible with the first cache. | 04-30-2015 |
20150127908 | Logging Addresses of High-Availability Data Via a Non-Blocking Channel - A technique for operating a data processing system includes determining whether a cache line that is to be victimized from a cache includes high availability (HA) data that has not been logged. In response to determining that the cache line that is to be victimized from the cache includes HA data that has not been logged, an address for the HA data is written to an HA dirty address data structure, e.g., a dirty address table (DAT), in a first memory via a first non-blocking channel. The cache line that is victimized from the cache is written to a second memory via a second non-blocking channel. | 05-07-2015 |
20150127909 | Logging Addresses of High-Availability Data - A technique for operating a high-availability (HA) data processing system includes, in response to receiving an HA logout indication at a cache, initiating a walk of the cache to locate cache lines in the cache that include HA data. In response to determining that a cache line includes HA data, an address of the cache line is logged in a first portion of a buffer in the cache. In response to the first portion of the buffer reaching a determined fill level, contents of the first portion of the buffer are logged to another memory. In response to all cache lines in the cache being walked, the cache walk is terminated. | 05-07-2015 |
20150127910 | Techniques for Logging Addresses of High-Availability Data Via a Non-Blocking Channel - A technique for operating a data processing system includes determining whether a cache line that is to be victimized from a cache includes high availability (HA) data that has not been logged. In response to determining that the cache line that is to be victimized from the cache includes HA data that has not been logged, an address for the HA data is written to an HA dirty address data structure, e.g., a dirty address table (DAT), in a first memory via a first non-blocking channel. The cache line that is victimized from the cache is written to a second memory via a second non-blocking channel. | 05-07-2015 |
20150143045 | CACHE CONTROL APPARATUS AND METHOD - Provided are a cache control apparatus and method for reducing a miss penalty. The cache control apparatus includes a first level cache configured to store data in a memory, a second level cache connected to the first level cache, and configured to be accessed by a processor when the first level cache fails to call data according to a data request instruction, a prefetch buffer connected to the first and second level caches, and configured to temporarily store data transferred from the first and second level caches to a core, and a write buffer connected to the first level cache, and configured to receive address information and data of the first level cache. | 05-21-2015 |
20150143046 | SYSTEMS AND METHODS FOR REDUCING FIRST LEVEL CACHE ENERGY BY ELIMINATING CACHE ADDRESS TAGS - Methods and systems which, for example, reduce energy usage in cache memories are described. Cache location information regarding the location of cachelines which are stored in a tracked portion of a memory hierarchy is stored in a cache location table. Address tags are stored with corresponding location information in the cache location table to associate the address tag with the cacheline and its cache location information. When a cacheline is moved to a new location in the memory hierarchy, the cache location table is updated so that the cache location information indicates where the cacheline is located within the memory hierarchy. | 05-21-2015 |
20150143047 | SYSTEMS AND METHODS FOR DIRECT DATA ACCESS IN MULTI-LEVEL CACHE MEMORY HIERARCHIES - Methods and systems for direct data access in, e.g., multi-level cache memory systems are described. A cache memory system includes a cache location buffer configured to store cache location entries, wherein each cache location entry includes an address tag and a cache location table which are associated with a respective cacheline stored in a cache memory. The system also includes a first cache memory configured to store cachelines, each cacheline having data and an identity of a corresponding cache location entry in the cache location buffer, and a second cache memory configured to store cachelines, each cacheline having data and an identity of a corresponding cache location entry in the cache location buffer. Responsive to a memory access request for a cacheline, the cache location buffer generates access information using one of the cache location tables which enables access to the cacheline without performing a tag comparison at the one of the first and second cache memories. | 05-21-2015 |
20150143048 | MULTI-CPU SYSTEM AND COMPUTING SYSTEM HAVING THE SAME - A multi-CPU data processing system, comprising: a multi-CPU processor, comprising: a first CPU configured with at least a first core, a first cache, and a first cache controller configured to access the first cache; and a second CPU configured with at least a second core, and a second cache controller configured to access a second cache, wherein the first cache is configured from a shared portion of the second cache. | 05-21-2015 |
20150149721 | SELECTIVE VICTIMIZATION IN A MULTI-LEVEL CACHE HIERARCHY - Systems, methods, and apparatuses for implementing selective victimization to reduce power and utilized bandwidth in a multi-level cache hierarchy. Each set of an upper-level cache includes a counter that keeps track of the number of times the set was accessed. These counters are periodically decremented by another counter that tracks the total number of accesses to the cache. If a given set counter is below a certain threshold value, clean victims are dropped from this given set instead of being sent to a lower-level cache. Also, a separate counter is used to track the total number of outstanding requests for the cache as a proxy for bus-bandwidth in order to gauge the total amount of traffic in the system. The cache will implement selective victimization whenever there is a large amount of traffic in the system. | 05-28-2015 |
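A rough model of the counters involved is sketched below: per-set access counters are decayed as total accesses accumulate, and a clean victim is dropped only when overall traffic is high and the victim's set is cold. All thresholds, the geometry, and the class name are illustrative assumptions rather than values from the application.

```cpp
#include <array>
#include <cstdint>
#include <iostream>

// Hypothetical selective-victimization filter: per-set access counters are
// decayed as total accesses grow; when overall traffic is high, clean victims
// from rarely used sets are dropped rather than written to the next level.
class VictimFilter {
public:
    void recordAccess(std::size_t set) {
        ++setCount_[set];
        if (++totalAccesses_ % kDecayPeriod == 0)
            for (auto& c : setCount_) c >>= 1;          // periodic decay
    }

    bool dropCleanVictim(std::size_t set, unsigned outstandingRequests) const {
        const bool busSaturated = outstandingRequests > kTrafficThreshold;
        const bool coldSet = setCount_[set] < kSetThreshold;
        return busSaturated && coldSet;  // skip forwarding to the lower level
    }

private:
    static constexpr std::size_t kSets = 256;          // illustrative geometry
    static constexpr std::uint32_t kDecayPeriod = 4096;
    static constexpr std::uint32_t kSetThreshold = 4;
    static constexpr unsigned kTrafficThreshold = 32;
    std::array<std::uint32_t, kSets> setCount_{};
    std::uint64_t totalAccesses_ = 0;
};

int main() {
    VictimFilter filter;
    filter.recordAccess(17);  // set 17 touched only once
    std::cout << (filter.dropCleanVictim(17, 64) ? "drop clean victim\n"
                                                 : "send to lower level\n");
    return 0;
}
```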
20150149722 | DELAYING CACHE DATA ARRAY UPDATES - Systems, methods, and apparatuses for reducing writes to the data array of a cache. A cache hierarchy includes one or more L1 caches and a L2 cache inclusive of the L2 cache(s). When a request from the L1 cache misses in the L2 cache, the L2 cache sends a fill request to memory. When the fill data returns from memory, the L2 cache delays writing the fill data to its data array. Instead, this cache line is written to the L1 cache and a clean-evict bit corresponding to the cache line is set in the L1 cache. When the L1 cache evicts this cache line, the L1 cache will write back the cache line to the L2 cache even if the cache line has not been modified. | 05-28-2015 |
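The clean-evict behavior can be modeled with two maps standing in for the L1 and the L2 data array: a fill bypasses the L2 array and marks the L1 line, and the L2 write happens only when the L1 later evicts the line. This is a schematic sketch with invented names, not the patented microarchitecture.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Hypothetical model of the delayed L2 data-array update: a fill from memory
// goes straight to the L1 and is tagged clean-evict; the L2 data array is only
// written when the L1 later evicts that line, even if it was never modified.
struct L1Line {
    std::uint64_t data = 0;
    bool cleanEvict = false;  // write back to L2 on eviction even when clean
};

class TwoLevelCache {
public:
    void fillFromMemory(std::uint64_t addr, std::uint64_t data) {
        l1_[addr] = L1Line{data, /*cleanEvict=*/true};  // skip the L2 data-array write
    }

    void evictFromL1(std::uint64_t addr) {
        auto it = l1_.find(addr);
        if (it == l1_.end()) return;
        if (it->second.cleanEvict)
            l2Data_[addr] = it->second.data;  // deferred write into the L2 data array
        l1_.erase(it);
    }

    bool inL2(std::uint64_t addr) const { return l2Data_.count(addr) != 0; }

private:
    std::unordered_map<std::uint64_t, L1Line> l1_;
    std::unordered_map<std::uint64_t, std::uint64_t> l2Data_;
};

int main() {
    TwoLevelCache caches;
    caches.fillFromMemory(0x40, 7);  // fill bypasses the L2 data array
    std::cout << std::boolalpha << caches.inL2(0x40) << "\n";  // false
    caches.evictFromL1(0x40);        // clean-evict triggers the delayed L2 write
    std::cout << caches.inL2(0x40) << "\n";                    // true
    return 0;
}
```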
20150293845 | MULTI-LEVEL MEMORY HIERARCHY - Described is a system and method for a multi-level memory hierarchy. Each level is based on different attributes including, for example, power, capacity, bandwidth, reliability, and volatility. In some embodiments, the different levels of the memory hierarchy may use an on-chip stacked dynamic random access memory (providing fast, high-bandwidth, low-energy access to data) and an off-chip non-volatile random access memory (providing low-power, high-capacity storage), in order to provide higher-capacity, lower-power, and higher-bandwidth performance. The multi-level memory may present a unified interface to a processor so that specific memory hardware and software implementation details are hidden. The multi-level memory enables the illusion of a single-level memory that satisfies multiple conflicting constraints. A comparator receives a memory address from the processor, processes the address and reads from or writes to the appropriate memory level. In some embodiments, the memory architecture is visible to the software stack to optimize memory utilization. | 10-15-2015 |
20150293846 | Processing Of Finite Automata Based On Memory Hierarchy - At least one processor may be operatively coupled to a plurality of memories and a node cache and configured to walk nodes of a per-pattern non-deterministic finite automaton (NFA). Nodes of the per-pattern NFA may be stored amongst one or more of the plurality of memories based on a node distribution determined as a function of hierarchical levels mapped to the plurality of memories and per-pattern NFA storage allocation settings configured for the hierarchical levels, optimizing run time performance of the walk. | 10-15-2015 |
20150331608 | ELECTRONIC SYSTEM WITH TRANSACTIONS AND METHOD OF OPERATION THEREOF - An electronic system includes: a storage unit configured to store a data array; a control unit configured to: determine availability of the data array; reorder access to the data array; and provide access to the data array. | 11-19-2015 |
20150331796 | MANAGING MEMORY TRANSACTIONS IN A DISTRIBUTED SHARED MEMORY SYSTEM SUPPORTING CACHING ABOVE A POINT OF COHERENCY - In response to execution in a memory transaction of a transactional load instruction that speculatively binds to a value held in a store-through upper level cache, a processor core sets a flag, transmits a transactional load operation to a store-in lower level cache that tracks a target cache line address of a target cache line containing the value, monitors, during a core TM tracking interval, the target cache line address for invalidation messages from the store-in lower level cache until the store-in lower level cache signals that the store-in lower level cache has assumed responsibility for tracking the target cache line address, and responsive to receipt during the core TM tracking interval of an invalidation message indicating presence of a conflicting snooped operation, resets the flag. At termination of the memory transaction, the processor core fails the memory transaction responsive to the flag being reset. | 11-19-2015 |
20150331798 | MANAGING MEMORY TRANSACTIONS IN A DISTRIBUTED SHARED MEMORY SYSTEM SUPPORTING CACHING ABOVE A POINT OF COHERENCY - In response to execution in a memory transaction of a transactional load instruction that speculatively binds to a value held in a store-through upper level cache, a processor core sets a flag, transmits a transactional load operation to a store-in lower level cache that tracks a target cache line address of a target cache line containing the value, monitors, during a core TM tracking interval, the target cache line address for invalidation messages from the store-in lower level cache until the store-in lower level cache signals that the store-in lower level cache has assumed responsibility for tracking the target cache line address, and responsive to receipt during the core TM tracking interval of an invalidation message indicating presence of a conflicting snooped operation, resets the flag. At termination of the memory transaction, the processor core fails the memory transaction responsive to the flag being reset. | 11-19-2015 |
20150331804 | CACHE LOOKUP BYPASS IN MULTI-LEVEL CACHE SYSTEMS - Techniques described herein are generally related to retrieval of data in computer systems having multi-level caches. The multi-level cache may include at least a first cache and a second cache. The first cache may be configured to receive a request for a cache line. The request may be associated with an instruction executing on a tile of the computer system. A suppression status of the instruction may be determined by a first cache controller to determine whether look-up of the first cache is suppressed based upon the determined suppression status. The request for the cache line may be forwarded to the second cache by the first cache controller after the look-up of the first cache is suppressed. | 11-19-2015 |
20150347299 | PLACEMENT POLICY FOR MEMORY HIERARCHIES - A placement policy enables the selective storage of cachelines in a multi-level cache hierarchy: Reuse behavior of a cacheline is tracked during execution of an application in both a first level cache memory and a second level cache memory. A cache placement policy for the cacheline is determined based on the tracked reuse behavior. | 12-03-2015 |
20150347302 | MANAGEMENT OF SHARED PIPELINE RESOURCE USAGE BASED ON LEVEL INFORMATION - The execution or processing of an application can be adapted or modified based on a level of a cache in which a requested data block resides, by extracting level information from a cache hierarchy. When a request for a data block is made by a core to a cache memory system, the cache memory system extracts a level of a cache memory in which the data block resides from information stored in the cache memory system. The core is informed of the level of the cache memory in which the data block resides, and uses this information to adapt its processing of the application. | 12-03-2015 |
20150347592 | HIERARCHICAL IN-MEMORY SORT ENGINE - A local sorting module includes a set of storage elements storing binary vectors configured in a one-dimensional (1D) or two-dimensional (2D) array structure and separated by respective comparators configured to conditionally compare and sort the binary vectors. The comparators may perform a sort using a compare-and-flip or a compare-and-swap operation. Local sorting modules may be coupled with a global sorting module for enabling a tournament sort algorithm to output values stored in storage elements one at a time until all data is outputted in a predetermined sorting order. | 12-03-2015 |
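The hierarchical sort-engine entry above couples locally sorted modules with a global tournament that emits one value at a time. A software sketch of that output stage follows, using Python's heapq as a stand-in for the hardware comparator tree; the function name tournament_output is an assumption.

```python
# Local modules hold already-sorted vectors (plain lists here); the global
# "tournament" repeatedly emits the overall minimum, one value per step, until
# every module is drained, yielding the full data set in sorted order.
import heapq

def tournament_output(local_modules):
    """Yield values one at a time in global sorted order from sorted modules."""
    # Seed the tournament with the head of each non-empty module.
    heap = [(mod[0], i, 0) for i, mod in enumerate(local_modules) if mod]
    heapq.heapify(heap)
    while heap:
        value, mod_idx, pos = heapq.heappop(heap)
        yield value
        if pos + 1 < len(local_modules[mod_idx]):
            heapq.heappush(heap, (local_modules[mod_idx][pos + 1], mod_idx, pos + 1))

if __name__ == "__main__":
    modules = [[1, 4, 9], [2, 3, 8], [0, 7]]
    print(list(tournament_output(modules)))   # [0, 1, 2, 3, 4, 7, 8, 9]
```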
20150370711 | METHOD AND APPARATUS FOR CACHE MEMORY DATA PROCESSING - Apparatus and methods are disclosed that enable the allocation of a cache portion of a memory buffer to be utilized by an on-cache function controller (OFC) to execute processing functions on “main line” data. A particular method may include receiving, at a memory buffer, a request from a memory controller for allocation of a cache portion of the memory buffer. The method may also include acquiring, by an on-cache function controller (OFC) of the memory buffer, the requested cache portion of the memory buffer. The method may further include executing, by the OFC, a processing function on data stored at the cache portion of the memory buffer. | 12-24-2015 |
20150378900 | CO-PROCESSOR MEMORY ACCESSES IN A TRANSACTIONAL MEMORY - Buffering a memory operand from a co-processor associated with a processor as a speculative store in a store accumulator of the processor, the processor including a cache and the store accumulator, the store accumulator buffering memory operands for writing to a higher level cache. Executing a transactional memory transaction, including the following. Suspending storing of memory operands in the store accumulator entries into the next higher level cache. Initiating a co-processor operation. Accumulating co-processor memory operands into corresponding locations of one or more queue entries of the accumulator associated with the transaction, and not storing the co-processor memory operands into the cache. Based on the transaction ending, storing accumulated co-processor memory operands from the one or more store accumulator entries associated with the transaction into the next higher level cache. | 12-31-2015 |
20150378901 | CO-PROCESSOR MEMORY ACCESSES IN A TRANSACTIONAL MEMORY - Buffering a memory operand from a co-processor associated with a processor as a speculative store in a store accumulator of the processor, the processor including a cache and the store accumulator, the store accumulator buffering memory operands for writing to a higher level cache. Executing a transactional memory transaction, including the following. Suspending storing of memory operands in the store accumulator entries into the next higher level cache. Initiating a co-processor operation. Accumulating co-processor memory operands into corresponding locations of one or more queue entries of the accumulator associated with the transaction, and not storing the co-processor memory operands into the cache. Based on the transaction ending, storing accumulated co-processor memory operands from the one or more store accumulator entries associated with the transaction into the next higher level cache. | 12-31-2015 |
20150378910 | TRANSACTIONAL EXECUTION IN A MULTI-PROCESSOR ENVIRONMENT THAT MONITORS MEMORY CONFLICTS IN A SHARED CACHE - A higher level shared cache of a hierarchical cache of a multi-processor system utilizes transaction identifiers to manage memory conflicts in corresponding transactions. The higher level cache is shared with two or more processors. Transaction indicators are set in the higher level cache corresponding to the cache lines being accessed. The transaction aborts if a memory conflict with the transaction's cache lines from another transaction is detected. | 12-31-2015 |
20150378911 | ALLOWING NON-CACHEABLE LOADS WITHIN A TRANSACTION - A computer allows non-cacheable loads or stores in a hardware transactional memory environment. Transactional loads or stores, by a processor, are monitored in a cache for TX conflicts. The processor accepts a request to execute a transactional execution (TX) transaction. Based on processor execution of a cacheable load or store instruction for loading or storing first memory data of the transaction, the computer can perform a cache miss operation on the cache. Based on processor execution of a non-cacheable load instruction for loading second memory data of the transaction, the computer can not-perform the cache miss operation on the cache based on a cache line associated with the second memory data being not-cached, and load an address of the second memory data into a non-cache-monitor. The TX transaction can be aborted based on the non-cache monitor detecting a memory conflict from another processor. | 12-31-2015 |
20150378919 | SELECTIVE PREFETCHING FOR A SECTORED CACHE - A memory subsystem includes memory hierarchy that performs selective prefetching based on prefetch hints. A lower level memory detects a cache miss for a requested cache line that is part of a superline. The lower level memory generates a request vector for the cache line that triggered the cache miss, including a field for each cache line of the superline. The request vector includes a demand request for the cache line that caused the cache miss, and the lower level memory modifies the request vector with prefetch hint information. The prefetch hint information can indicate a prefetch request for one or more other cache lines in the superline. The lower level memory sends the request vector to the higher level memory with the prefetch hint information, and the higher level memory services the demand request and selectively either services a prefetch hint or drops the prefetch hint. | 12-31-2015 |
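The sectored-cache entry above builds a per-superline request vector mixing one demand with optional prefetch hints. The sketch below models that vector and the higher-level memory's freedom to drop hints; the field values and function names are illustrative assumptions.

```python
# A superline groups several cache lines; the lower-level memory marks the
# missing line as a DEMAND and neighbouring lines as PREFETCH hints, and the
# higher-level memory must service the demand but may drop the hints.
LINES_PER_SUPERLINE = 4
DEMAND, PREFETCH, NONE = "demand", "prefetch", "none"

def build_request_vector(miss_line_index, hint_indices):
    vector = [NONE] * LINES_PER_SUPERLINE
    vector[miss_line_index] = DEMAND
    for i in hint_indices:
        if vector[i] == NONE:
            vector[i] = PREFETCH
    return vector

def service_request_vector(vector, bandwidth_available):
    served = []
    for i, kind in enumerate(vector):
        if kind == DEMAND:
            served.append(i)                  # always serviced
        elif kind == PREFETCH and bandwidth_available:
            served.append(i)                  # serviced only opportunistically
    return served

if __name__ == "__main__":
    vec = build_request_vector(miss_line_index=1, hint_indices=[2, 3])
    print(service_request_vector(vec, bandwidth_available=False))  # [1]
    print(service_request_vector(vec, bandwidth_available=True))   # [1, 2, 3]
```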
20160004638 | Dynamically Configurable Memory - A device includes a memory including ways and a processor in communication with the memory. The processor is configured to execute logic. The logic can monitor a parameter of the processor or a device connected with the processor. The logic can allocate, based on the parameter, a number of ways and a size of ways of the memory for use by the processor. The logic can power down an unallocated number of ways and unused portions of the ways of the memory. | 01-07-2016 |
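A minimal sketch of the way-allocation decision described above; the mapping from the monitored parameter to a way count is a made-up illustration, as is the allocate_ways name.

```python
# Based on a monitored parameter (utilization here), pick how many ways stay
# enabled and report how many can be powered down.
TOTAL_WAYS = 8

def allocate_ways(utilization):
    """Return (enabled_ways, powered_down_ways) for a 0.0-1.0 utilization figure."""
    enabled = max(1, round(utilization * TOTAL_WAYS))
    return enabled, TOTAL_WAYS - enabled

if __name__ == "__main__":
    print(allocate_ways(0.3))   # (2, 6): most ways powered down at low utilization
    print(allocate_ways(0.9))   # (7, 1)
```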
20160004639 | SYSTEM AND METHOD FOR A CACHE IN A MULTI-CORE PROCESSOR - The invention relates to a multi-core processor system, in particular a single-package multi-core processor system, comprising at least two processor cores, preferably at least four processor cores, each of said at least two cores, preferably at least four processor cores, having a local LEVEL-1 cache; a tree communication structure combining the multiple LEVEL-1 caches, the tree having at least one node, preferably at least three nodes for a four-core multi-core processor; and TAG information associated with data managed within the tree, usable in the treatment of the data. | 01-07-2016 |
20160019146 | WRITE BACK COORDINATION NODE FOR CACHE LATENCY CORRECTION - A coordinating node acts as a write back cache, isolating local cache storage endpoints from latencies associated with accessing geographically remote cloud cache and storage resources. | 01-21-2016 |
20160019147 | DECOUPLING DATA AND METADATA IN HIERARCHICAL CACHE SYSTEM - A coordinating node creates virtual storage from a hierarchy of local and remote cache storage resources by maintaining global logical block address (LBA) metadata maps. A size of the metadata maps at each level of the hierarchy is independent of an amount of data allocated to a respective cache store at each level. | 01-21-2016 |
20160019148 | HASH DISCRIMINATOR PROCESS FOR HIERARCHICAL CACHE SYSTEM - A coordinating node maintains globally consistent logical block address (LBA) metadata for a hierarchy of caches, which may be implemented in local and cloud based storage resources. Associated storage endpoints initially determine a hash associated with each access request, but forward the access request to the coordinating node to determine a unique discriminator for each hash. | 01-21-2016 |
20160019149 | HISTORY BASED MEMORY SPECULATION FOR PARTITIONED CACHE MEMORIES - A cache memory that selectively enables and disables speculative reads from system memory is disclosed. The cache memory may include a plurality of partitions, and a plurality of registers. Each register may be configured to store data indicative of a source of returned data for previous requests directed to a corresponding partition. Circuitry may be configured to receive a request for data to a given partition. The circuitry may be further configured to read contents of a register corresponding to the given partition, and initiate a speculative read dependent upon the contents of the register. | 01-21-2016 |
20160019151 | USING L1 CACHE AS RE-ORDER BUFFER - A method is shown that eliminates the need for a dedicated reorder buffer register bank or memory space in a multi-level cache system. As data requests from the L2 cache may be returned out of order, the L1 cache uses its cache memory to buffer the out-of-order data and provides the data to the requesting processor in the correct order from the buffer. | 01-21-2016 |
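A sketch of the reordering behaviour described above, modelling the L1 data array as a dictionary keyed by request id; the InOrderReturnBuffer name and request-id scheme are assumptions for illustration.

```python
# Out-of-order fills returning from the L2 are parked in the L1 array and handed
# to the core strictly in request order, so no separate reorder-buffer RAM is needed.
class InOrderReturnBuffer:
    def __init__(self):
        self.next_to_deliver = 0      # request id the core expects next
        self.parked = {}              # request id -> data, held in the L1 array

    def on_l2_return(self, request_id, data):
        """Accept a possibly out-of-order fill; return data now deliverable, in order."""
        self.parked[request_id] = data
        delivered = []
        while self.next_to_deliver in self.parked:
            delivered.append(self.parked.pop(self.next_to_deliver))
            self.next_to_deliver += 1
        return delivered

if __name__ == "__main__":
    buf = InOrderReturnBuffer()
    print(buf.on_l2_return(1, "B"))   # [] - request 0 has not returned yet
    print(buf.on_l2_return(0, "A"))   # ['A', 'B'] - released in request order
```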
20160019184 | NO-LOCALITY HINT VECTOR MEMORY ACCESS PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS - A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction is to indicate a packed data register of the plurality of packed data registers that is to have source packed memory indices. The source packed memory indices are to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices. | 01-21-2016 |
20160026569 | ZERO CYCLE CLOCK INVALIDATE OPERATION - A method to eliminate the delay of a block invalidate operation in a multi-CPU environment by overlapping the block invalidate operation with normal CPU accesses, thus making the delay transparent. A range check is performed on each CPU access while a block invalidate operation is in progress, and an access that maps to within the address range of the block invalidate operation will be treated as a cache miss to ensure that the requesting CPU will receive valid data. | 01-28-2016 |
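The range-check idea above is small enough to sketch directly; the BlockInvalidateFilter class below is an illustrative model, not the patented hardware.

```python
# While a block invalidate covering [start, end) is in flight, every CPU access
# is range-checked; an address inside the range is treated as a cache miss so
# the requester always receives valid data from below the cache.
class BlockInvalidateFilter:
    def __init__(self):
        self.active = False
        self.start = 0
        self.end = 0

    def begin(self, start, end):
        self.active, self.start, self.end = True, start, end

    def finish(self):
        self.active = False

    def force_miss(self, addr):
        """True if this access must bypass the cache during the invalidate."""
        return self.active and self.start <= addr < self.end
```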
20160026580 | CACHE LINE CROSSING LOAD TECHNIQUES FOR A CACHING SYSTEM - A technique for handling an unaligned load operation includes detecting a cache line crossing load operation that is associated with a first cache line and a second cache line. In response to a cache including the first cache line but not including the second cache line, the second cache line is reloaded into the cache in a same set as the first cache line. In response to reloading the second cache line in the cache, a cache line crossing link indicator associated with the first cache line is asserted to indicate that both the first and second cache lines include portions of a desired data element. | 01-28-2016 |
20160034400 | DATA PREFETCH RAMP IMPLEMENATION BASED ON MEMORY UTILIZATION - A technique for data prefetching for a multi-core chip includes determining memory utilization of the multi-core chip. In response to the memory utilization of the multi-core chip exceeding a first level, data prefetching for the multi-core chip is modified from a first data prefetching arrangement to a second data prefetching arrangement to minimize unused prefetched cache lines. In response to the memory utilization of the multi-core chip not exceeding the first level, the first data prefetching arrangement is maintained. The first and second data prefetching arrangements are different. | 02-04-2016 |
20160034403 | ACCESS SUPPRESSION IN A MEMORY DEVICE - A memory device and a method of operating the memory device are provided. The memory device comprises a plurality of storage units and access control circuitry. The access control circuitry is configured to receive an access request and, in response to the access request, to initiate an access procedure in each of the plurality of storage units. The access control circuitry is configured to receive an access kill signal after the access procedure has been initiated and, in response to the access kill signal, to initiate an access suppression to suppress the access procedure in at least one of the plurality of storage units. Hence, by initiating the access procedures in all storage units in response to the access request, e.g. without waiting for a further indication of a specific storage unit in which to carry out the access procedure, the overall access time for the memory device is kept low, while by enabling at least one of the access procedures to be suppressed later in response to the access kill signal, dynamic power consumption of the memory device can be reduced. | 02-04-2016 |
20160041907 | SYSTEMS AND METHODS TO MANAGE TIERED CACHE DATA STORAGE - Systems and methods for managing records stored in a storage cache are provided. A cache index is created and maintained to track where records are stored in buckets in the storage cache. The cache index maps the memory locations of the cached records to the buckets in the cache storage and can be quickly traversed by a metadata manager to determine whether a requested record can be retrieved from the cache storage. Bucket addresses stored in the cache index include a generation number of the bucket that is used to determine whether the cached record is stale. The generation number allows a bucket manager to evict buckets in the cache without having to update the bucket addresses stored in the cache index. Further, the bucket manager is tiered thus allowing efficient use of differing filter functions and even different types of memories as may be desired in a given implementation. | 02-11-2016 |
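The generation-number trick in the entry above lends itself to a short sketch; the TieredCache structure and method names below are illustrative assumptions, and the filter-function tiering is omitted for brevity.

```python
# The cache index stores (bucket_id, generation) for each record; when the
# bucket manager recycles a bucket it just bumps that bucket's generation,
# which implicitly invalidates every stale index entry without rewriting the index.
class TieredCache:
    def __init__(self, num_buckets):
        self.bucket_generation = [0] * num_buckets
        self.buckets = [dict() for _ in range(num_buckets)]
        self.index = {}            # record key -> (bucket_id, generation)

    def put(self, key, value, bucket_id):
        self.buckets[bucket_id][key] = value
        self.index[key] = (bucket_id, self.bucket_generation[bucket_id])

    def get(self, key):
        entry = self.index.get(key)
        if entry is None:
            return None
        bucket_id, gen = entry
        if gen != self.bucket_generation[bucket_id]:
            return None            # bucket was recycled: the index entry is stale
        return self.buckets[bucket_id].get(key)

    def evict_bucket(self, bucket_id):
        self.buckets[bucket_id].clear()
        self.bucket_generation[bucket_id] += 1   # no index rewrite needed
```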
20160041908 | SYSTEMS AND METHODS FOR MAINTAINING THE COHERENCY OF A STORE COALESCING CACHE AND A LOAD CACHE - A method for maintaining the coherency of a store coalescing cache and a load cache is disclosed. As a part of the method, responsive to a write-back of an entry from a level one store coalescing cache to a level two cache, the entry is written into the level two cache and into the level one load cache. The writing of the entry into the level two cache and into the level one load cache is executed at the speed of access of the level two cache. | 02-11-2016 |
20160041930 | SYSTEMS AND METHODS FOR SUPPORTING A PLURALITY OF LOAD ACCESSES OF A CACHE IN A SINGLE CYCLE - A method for supporting a plurality of load accesses is disclosed. A plurality of requests to access a data cache is accessed, and in response, a tag memory is accessed that maintains a plurality of copies of tags for each entry in the data cache. Tags are identified that correspond to individual requests. The data cache is accessed based on the tags that correspond to the individual requests. A plurality of requests to access the same block of the plurality of blocks causes an access arbitration that is executed in the same clock cycle as is the access of the tag memory. | 02-11-2016 |
20160048452 | DYNAMIC HIERARCHICAL MEMORY CACHE AWARENESS WITHIN A STORAGE SYSTEM - A computing device-implemented method for implementing dynamic hierarchical memory cache (HMC) awareness within a storage system is described. Specifically, when performing dynamic read operations within a storage system, a data module evaluates a data prefetch policy according to a strategy of determining if data exists in a hierarchical memory cache and thereafter amending the data prefetch policy, if warranted. The system then uses the data prefetch policy to perform a read operation from the storage device to minimize future data retrievals from the storage device. Further, in a distributed storage environment that includes multiple storage nodes cooperating to satisfy data retrieval requests, dynamic hierarchical memory cache awareness can be implemented for every storage node without degrading the overall performance of the distributed storage environment. | 02-18-2016 |
20160055100 | SYSTEM AND METHOD FOR REVERSE INCLUSION IN MULTILEVEL CACHE HIERARCHY - A processing system having multilevel cache employs techniques for identifying and selecting valid candidate cache lines for eviction from a lower level cache of an inclusive cache hierarchy, so as to reduce invalidations resulting from an eviction of a cache line in a lower level cache that also resides in a higher level cache. In response to an eviction trigger for a lower level cache, a cache controller identifies candidate cache lines for eviction from the cache lines residing in the lower level cache based on the replacement policy. The cache controller uses residency metadata to identify the candidate cache line as a valid candidate if it does not also reside in the higher cache and as an invalid candidate if it does reside in the higher cache. The cache controller prevents eviction of invalid candidates, so as to avoid unnecessary invalidations in the higher cache while maintaining inclusiveness. | 02-25-2016 |
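The reverse-inclusion entry above amounts to filtering eviction candidates by residency metadata. A hedged sketch follows; pick_victim and its arguments are illustrative names, not the patented cache controller interface.

```python
# When the inclusive lower-level cache must evict, prefer victims that do not
# also live in the higher-level cache, avoiding a back-invalidation there.
def pick_victim(candidates, resides_in_higher_cache):
    """candidates: addresses in replacement-policy order; resides_in_higher_cache: set of addresses."""
    for addr in candidates:
        if addr not in resides_in_higher_cache:
            return addr                  # valid candidate: no back-invalidation needed
    return candidates[0]                 # every candidate is also in the higher cache; fall back

if __name__ == "__main__":
    lru_order = [0x100, 0x140, 0x180]
    print(hex(pick_victim(lru_order, {0x100})))   # 0x140: skips the higher-cache-resident line
```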
20160062891 | CACHE BACKING STORE FOR TRANSACTIONAL MEMORY - In response to a transactional store request, the higher level cache transmits, to the lower level cache, a backup copy of an unaltered target cache line in response to a target real address hitting in the higher level cache, updates the target cache line with store data to obtain an updated target cache line, and records the target real address as belonging to a transaction footprint of the memory transaction. In response to a conflicting access to the transaction footprint prior to completion of the memory transaction, the higher level cache signals failure of the memory transaction to the processor core, invalidates the updated target cache line in the higher level cache, and causes the backup copy of the target cache line in the lower level cache to be restored as a current version of the target cache line. | 03-03-2016 |
20160062892 | CACHE BACKING STORE FOR TRANSACTIONAL MEMORY - In response to a transactional store request, the higher level cache transmits, to the lower level cache, a backup copy of an unaltered target cache line in response to a target real address hitting in the higher level cache, updates the target cache line with store data to obtain an updated target cache line, and records the target real address as belonging to a transaction footprint of the memory transaction. In response to a conflicting access to the transaction footprint prior to completion of the memory transaction, the higher level cache signals failure of the memory transaction to the processor core, invalidates the updated target cache line in the higher level cache, and causes the backup copy of the target cache line in the lower level cache to be restored as a current version of the target cache line. | 03-03-2016 |
20160062899 | THREAD-BASED CACHE CONTENT SAVING FOR TASK SWITCHING - Embodiments relate to thread-based cache content savings for task switching in a computer processor. An aspect includes determining a cache entry in a cache of the computer processor that is owned by the first thread, wherein the determination is made based on a hardware thread identifier (ID) of the first thread matching a hardware thread ID in the cache entry. Another aspect includes determining whether the determined cache entry is eligible for prefetching. Yet another aspect includes, based on determining that the determined cache entry is eligible for prefetching, setting a marker in the cache entry to active. | 03-03-2016 |
20160062905 | HIERARCHICAL CACHE STRUCTURE AND HANDLING THEREOF - A hierarchical cache structure includes at least one real indexed higher level cache with a directory and a unified cache array for data and instructions, and at least two lower level caches, each split in an instruction cache and a data cache. An instruction cache of a split real indexed second level cache includes a directory and a corresponding cache array connected to the real indexed third level cache. A data cache of the split second level cache includes a directory connected to the third level cache. An instruction cache of a split virtually indexed first level cache is connected to the second level instruction cache. A cache array of a data cache of the first level cache is connected to the cache array of the second level instruction cache and to the cache array of the third level cache. A directory of the first level data cache is connected to the second level instruction cache directory and to the third level cache directory. | 03-03-2016 |
20160070651 | INSTRUCTION AND LOGIC FOR A CACHE PREFETCHER AND DATALESS FILL BUFFER - A processor includes a cache hierarchy and an execution unit. The cache hierarchy includes a lower level cache and a higher level cache. The execution unit includes logic to issue a memory operation to access the cache hierarchy. The lower level cache includes logic to determine that a requested cache line of the memory operation is unavailable in the lower level cache, determine that a line fill buffer of the lower level cache is full, and initiate prefetching of the requested cache line from the higher level cache based upon the determination that the line fill buffer of the lower level cache is full. The line fill buffer is to forward miss requests to the higher level cache. | 03-10-2016 |
20160070659 | Concurrently Executing Critical Sections in Program Code in a Processor - In the described embodiments, entities in a computing device selectively write specified values to a lock variable in a local cache and one or more lower levels of a memory hierarchy to enable multiple entities to concurrently execute corresponding critical sections of program code that are protected by a same lock. | 03-10-2016 |
20160071601 | SEMICONDUCTOR MEMORY DEVICE - A semiconductor memory device includes: a memory cell array including memory strings, one of the memory strings including memory cells; word lines commonly connected to the memory strings; and a controller configured to execute a write operation and a read operation on a page, the page being stored in memory cells connected to one of the word lines. The controller is configured to measure a cell current flowing in the memory string, and adjust a write voltage applied to a word line, based on a result of the cell current. | 03-10-2016 |
20160092373 | INSTRUCTION AND LOGIC FOR ADAPTIVE DATASET PRIORITIES IN PROCESSOR CACHES - A processor includes a front end, a cache, and a cache controller. The front end includes logic to receive an instruction defining a priority dataset. The priority dataset includes ranges of memory addresses each corresponding to a respective priority level. The cache controller includes logic to detect a miss in the cache for a requested cache value, determine a candidate cache victim from the cache, determine a priority of the requested cache value and the candidate cache victim according to the priority dataset, and evict the candidate cache victim based on a determination that the priority of the candidate cache victim is less than or equal to the priority of the requested cache value. | 03-31-2016 |
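The priority-dataset comparison above reduces to a range lookup plus a threshold test. The sketch below assumes an arbitrary example dataset and illustrative function names; it is a simplified model, not the processor logic itself.

```python
# A "priority dataset" maps address ranges to priority levels; on a miss the
# controller evicts the candidate victim only if its priority is less than or
# equal to that of the requested line.
PRIORITY_DATASET = [
    (0x0000, 0x0FFF, 1),   # (range_start, range_end, priority)
    (0x1000, 0x1FFF, 3),
    (0x2000, 0x2FFF, 2),
]

def priority_of(addr):
    for start, end, prio in PRIORITY_DATASET:
        if start <= addr <= end:
            return prio
    return 0                               # default priority for unranked addresses

def may_evict(victim_addr, requested_addr):
    return priority_of(victim_addr) <= priority_of(requested_addr)

if __name__ == "__main__":
    print(may_evict(victim_addr=0x1200, requested_addr=0x2100))  # False: victim priority 3 > 2
    print(may_evict(victim_addr=0x0200, requested_addr=0x2100))  # True:  victim priority 1 <= 2
```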
20160098354 | SYSTEM INCLUDING HIERARCHICAL MEMORY MODULES HAVING DIFFERENT TYPES OF INTEGRATED CIRCUIT MEMORY DEVICES - Volatile memory devices corresponding to a first memory hierarchy may be on a first memory module that is coupled to a memory controller by a first signal path. A nonvolatile memory device corresponding to a second memory hierarchy may be on a second memory module that is coupled to the first memory module by a second signal path. Memory transactions for the nonvolatile memory device may be transferred from the memory controller to the first memory hierarchy using the first signal path, and data associated with an accumulation of the memory transactions may be written from the first memory hierarchy to the second memory hierarchy using the second signal path and a first and second control signal. The first control signal may be generated in view of a detection of wear and the second control signal may be generated in view of a detection of a defect. | 04-07-2016 |
20160110286 | DATA WRITING METHOD AND MEMORY SYSTEM - A data writing method and a memory system are disclosed. The method is applied to a memory system including at least a memory controller and a memory device, and the method includes: receiving, by the memory controller, change information sent by a cache, wherein the change information is information that is generated after the cache divides a first to-be-written cache line of a last level cache (LLC) into at least one data block and that is used to indicate whether data in each of the at least one data block is changed; for each data block in which data is changed, sending, by the memory controller according to the change information, a corresponding column address and corresponding data to the memory device; and for a data block in which data is not changed, skipping, by the memory controller according to the change information, the write for that data block. | 04-21-2016 |
20160132273 | TIERED CACHING AND MIGRATION IN DIFFERING GRANULARITIES - For data processing in a distributed computing storage environment by a processor device, the distributed computing environment incorporating at least high-speed and lower-speed caches, and managed tiered levels of storage, groups of data segments and clumped hot ones of the data segments are migrated between the tiered levels of storage such that uniformly hot ones of the groups of data segments are migrated to use a Solid State Drive (SSD) portion of the tiered levels of storage; uniformly hot groups of data segments are determined using a first heat map for a selected one of the groups of data segments; and a second heat map is used to determine the clumped hot groups. | 05-12-2016 |
20160132434 | STORE CACHE FOR TRANSACTIONAL MEMORY - A method to merge one or more non-transactional stores and one or more thread-specific transactional stores into one or more cache line templates in a store buffer in a store cache. The method receives a thread-specific non-transactional store address and a first data, maps the store address to a first cache line template, and merges the first data into the first cache line template, according to a store policy. The method further receives a thread-specific transactional store address and a second data, maps the thread-specific store address into a second cache line template, according to a store policy. The method further writes back a copy of a cache line template to a cache and invalidates a third cache line template, which frees the third cache line template from a store address mapping. | 05-12-2016 |
20160132435 | SPINLOCK RESOURCES PROCESSING - According to an example, in a spinlock processing method, a value of a spinlock cache variable may be read from a cache and the value of the spinlock cache variable may be written into a register. A determination may be made as to whether the value of the spinlock cache variable is an initial value. If yes, the value of the spinlock cache variable in the register may be updated. A determination may also be made as to whether the spinlock cache variable is accessed by a core after the value of the spinlock cache variable is written into the register. If yes, a value of a spinlock cache variable may be obtained from the cache. If no, the updated value of the spinlock cache variable may be written into the cache. Moreover, an access speed of the cache may be larger than an access speed of an L2 cache. | 05-12-2016 |
20160139829 | PROGRAMMABLE ORDERING AND PREFETCH - An input/output bridge controls access to a memory by a number of devices. The bridge enforces ordering of access requests according to a register storing an order configuration, which can be programmed to accommodate a given application. When suspending an access request as a result of enforcing an order configuration, the bridge may also cause a prefetch at the memory for the suspended access request. Subsequently, following the completion of a previous access request meeting the order configuration, the suspended access request is released. Due to the prefetch, an access operation can be completed with minimal delay. | 05-19-2016 |
20160147662 | NESTED CACHE COHERENCY PROTOCOL IN A TIERED MULTI-NODE COMPUTER SYSTEM - A computer system comprising multiple nodes, each node comprising a plurality of processors and a local cache hierarchy, suppresses local cache coherency operations within a node or global cache coherency operations between nodes based on whether the coherency request is a global or local request, and on the state of the cache line at the node. | 05-26-2016 |
20160147666 | Multilevel Cache-Based Data Read/Write Method and Apparatus, and Computer System - A multilevel cache-based data read/write method and a computer system. The method includes acquiring a query address of a physical memory data block in which data is to be read/written, acquiring a cache location attribute of the physical memory data block, querying whether a cache is hit until one cache is hit or all caches are missed, where the querying is performed according to the query address in descending order of levels of caches storable for the physical memory data block, and the levels of the caches are indicated by the cache location attribute, and if one cache is hit, reading/writing the data in the query address of the physical memory data block in the hit cache; or, if all caches are missed, reading/writing the data in the query address of the physical memory data block in a memory. | 05-26-2016 |
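The multilevel read/write method above probes only the cache levels permitted by the block's cache location attribute, falling back to memory on a full miss. A hedged sketch follows; the probe order shown (closest level first) and the read_block signature are illustrative assumptions.

```python
# The physical memory block carries a "cache location attribute" listing which
# cache levels may hold it; lookup probes only those levels and falls back to memory.
def read_block(query_addr, cacheable_levels, caches, memory):
    """caches: {level: {addr: data}}; cacheable_levels: e.g. [1, 2] from the attribute."""
    for level in sorted(cacheable_levels):            # illustrative probe order
        if query_addr in caches.get(level, {}):
            return caches[level][query_addr]          # hit: read from this cache
    return memory[query_addr]                         # all listed caches missed

if __name__ == "__main__":
    caches = {1: {}, 2: {0x40: "from-L2"}, 3: {0x40: "stale"}}
    memory = {0x40: "from-DRAM"}
    print(read_block(0x40, cacheable_levels=[1, 2], caches=caches, memory=memory))  # from-L2
```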
20160154736 | CACHE CONTROLLING METHOD FOR MEMORY SYSTEM AND CACHE SYSTEM THEREOF | 06-02-2016 |
20160162406 | Systems, Methods, and Apparatuses to Decompose a Sequential Program Into Multiple Threads, Execute Said Threads, and Reconstruct the Sequential Execution - Systems, methods, and apparatuses for decomposing a sequential program into multiple threads, executing these threads, and reconstructing the sequential execution of the threads are described. A plurality of data cache units (DCUs) store locally retired instructions of speculatively executed threads. A merging level cache (MLC) merges data from the lines of the DCUs. An inter-core memory coherency module (ICMC) globally retire instructions of the speculatively executed threads in the MLC. | 06-09-2016 |
20160170883 | APPARATUS AND METHOD FOR CONSIDERING SPATIAL LOCALITY IN LOADING DATA ELEMENTS FOR EXECUTION | 06-16-2016 |
20160170884 | CACHE SYSTEM WITH A PRIMARY CACHE AND AN OVERFLOW CACHE THAT USE DIFFERENT INDEXING SCHEMES | 06-16-2016 |
20160170886 | MULTI-CORE PROCESSOR SUPPORTING CACHE CONSISTENCY, METHOD, APPARATUS AND SYSTEM FOR DATA READING AND WRITING BY USE THEREOF | 06-16-2016 |
20160179672 | MIRRORING A CACHE HAVING A MODIFIED CACHE STATE | 06-23-2016 |
20160196210 | CACHE MEMORY SYSTEM AND PROCESSOR SYSTEM | 07-07-2016 |
20160203083 | SYSTEMS AND METHODS FOR PROVIDING DYNAMIC CACHE EXTENSION IN A MULTI-CLUSTER HETEROGENEOUS PROCESSOR ARCHITECTURE | 07-14-2016 |
20160253259 | MIXED CACHE MANAGEMENT | 09-01-2016 |
20160378372 | PERFORMANCE OF VIRTUAL MACHINE FAULT TOLERANCE MICRO-CHECKPOINTING USING TRANSACTIONAL MEMORY - Techniques disclosed herein generally describe providing fault tolerance in a virtual machine cluster using hardware transactional memory. According to one embodiment, a micro-checkpointing tool suspends execution of a virtual machine instance on a primary server. The micro-checkpointing tool identifies one or more memory pages associated with the virtual machine instance that were modified since a previous synchronization. The micro-checkpointing tool maps a first task to an operation to be performed on a memory of the primary server, where the first task is to resume the virtual machine instance. The micro-checkpointing tool also maps a second task to an operation to be performed on the memory of the primary server, where the second task is to copy the identified memory pages associated with the virtual machine instance to a secondary server. The first and second tasks are then performed on the memory. | 12-29-2016 |
20160378652 | CACHE MEMORY SYSTEM AND PROCESSOR SYSTEM - A cache memory system has a group of layered memories that has two or more memories having different characteristics, an access information storage which stores address conversion information from a virtual address into a physical address, and stores at least one of information on access frequency or information on access restriction, for data to be accessed with an access request, and a controller to select a specific memory from the group of layered memories and perform access control, based on at least one of the information on access frequency and the information on access restriction in the access information storage, for data to be accessed with an access request from the processor, wherein the information on access restriction in the access information storage comprises at least one of read-only information, write-only information, readable and writable information, and dirty information indicating that write-back to a lower layer memory is not yet performed. | 12-29-2016 |
20160378654 | Optimizing Performance of Tiered Storage - Embodiments of the invention relate to a storage system organized into a hierarchy of storage tiers, with at least one tier reflecting a high performance tier and at least one tier reflecting a lower performance tier. The high performance tier has a capacity restriction, so only a limited quantity of blocks and pages may be placed in the tier. Assessments are conducted and a preferred selection of blocks and pages is recommended for placement; the recommendation is based on the assessment. The recommendation is converted to an actual placement, resulting in placement of at least one block, and in one embodiment at least one page, in the high performance tier. | 12-29-2016 |
20160378655 | HOT PAGE SELECTION IN MULTI-LEVEL MEMORY HIERARCHIES - Systems, apparatuses, and methods for sorting memory pages in a multi-level heterogeneous memory architecture. The system may classify pages into a first "hot" category or a second "cold" category. The system may attempt to place the "hot" pages into the memory level(s) closest to the system's processor cores. The system may track parameters associated with each page, with the parameters including number of accesses, types of accesses, power consumed per access, temperature, wearability, and/or other parameters. Based on these parameters, the system may generate a score for each page. Then, the system may compare the score of each page to a threshold. If the score of a given page is greater than the threshold, the given page may be designated as "hot". If the score of the given page is less than the threshold, the given page may be designated as "cold". | 12-29-2016 |
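The scoring step in the entry above can be sketched as a weighted sum compared against a threshold. The weights, threshold, and parameter names below are placeholder assumptions, not values from the patent.

```python
# Fold each page's tracked parameters into a single score and compare it to a
# threshold; pages above the threshold are "hot" and favoured for near memory.
WEIGHTS = {"accesses": 1.0, "writes": 1.5, "energy_per_access": -0.5}
HOT_THRESHOLD = 100.0

def page_score(stats):
    return sum(WEIGHTS[k] * stats.get(k, 0.0) for k in WEIGHTS)

def classify_pages(page_stats):
    """Return (hot_pages, cold_pages); hot pages go to the level nearest the cores."""
    hot, cold = [], []
    for page, stats in page_stats.items():
        (hot if page_score(stats) > HOT_THRESHOLD else cold).append(page)
    return hot, cold

if __name__ == "__main__":
    stats = {
        0xA000: {"accesses": 200, "writes": 10, "energy_per_access": 2.0},
        0xB000: {"accesses": 20,  "writes": 1,  "energy_per_access": 2.0},
    }
    print(classify_pages(stats))   # ([40960], [45056]): 0xA000 is hot, 0xB000 is cold
```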
20160378671 | CACHE MEMORY SYSTEM AND PROCESSOR SYSTEM - A cache memory includes a first cache memory that is accessible per cache line, and a second cache memory that is accessible per word, the second cache memory being positioned in the same cache layer as the first cache memory. This improves the average access speed of the first cache memory and also improves access efficiency through per-word data access, thereby reducing power consumption. | 12-29-2016 |
20180024941 | ADAPTIVE TABLEWALK TRANSLATION STORAGE BUFFER PREDICTOR | 01-25-2018 |