Jeffrey A. Stuecheli, Austin US

Jeffrey A. Stuecheli, Austin, TX US

Patent application number	Description	Published
20080209135	DATA PROCESSING SYSTEM, METHOD AND INTERCONNECT FABRIC SUPPORTING DESTINATION DATA TAGGING - A data processing system includes a plurality of communication links and a plurality of processing units including a local master processing unit. The local master processing unit includes interconnect logic that couples the processing unit to one or more of the plurality of communication links and an originating master coupled to the interconnect logic. The originating master originates an operation by issuing a write-type request on at least one of the one or more communication links, receives from a snooper in the data processing system a destination tag identifying a route to the snooper, and, responsive to receipt of the combined response and the destination tag, initiates a data transfer including a data payload and a data tag identifying the route provided within the destination tag.	08-28-2008
20080222648	DATA PROCESSING SYSTEM AND METHOD OF DATA PROCESSING SUPPORTING TICKET-BASED OPERATION TRACKING - A data processing system includes a plurality of processing units coupled by a plurality of communication links for point-to-point communication such that at least some of the communication between multiple different ones of the processing units is transmitted via intermediate processing units among the plurality of processing units. The communication includes operations having a request and a combined response representing a system response to the request. At least each intermediate processing unit includes one or more masters that initiate first operations, a snooper that receives at least second operations initiated by at least one other of the plurality of processing units, a physical queue that stores master tags of first operations initiated by the one or more masters within that processing unit, and a ticketing mechanism that assigns to second operations observed at the intermediate processing unit a ticket number indicating an order of observation with respect to other second operations observed by the intermediate processing unit. The ticketing mechanism provides the ticket number assigned to an operation to the snooper for processing with a combined response of the operation.	09-11-2008
20080250208	System and Method for Improving the Page Crossing Performance of a Data Prefetcher - A system and method for improving the page crossing performance of a data prefetcher is presented. A prefetch engine tracks times at which a data stream terminates due to a page boundary. When a certain percentage of data streams terminate at page boundaries, the prefetch engine sets an aggressive profile flag. In turn, when the data prefetch engine receives a real address that corresponds to the beginning/end of a new page, and the aggressive profile flag is set, the prefetch engine uses an aggressive startup profile to generate and schedule prefetches on the assumption that the real address is highly likely to be the continuation of a long data stream. As a result, the system and method minimize latency when crossing real page boundaries when a program is predominately accessing long streams.	10-09-2008
20080301377	DATA PROCESSING SYSTEM, CACHE SYSTEM AND METHOD FOR UPDATING AN INVALID COHERENCY STATE IN RESPONSE TO SNOOPING AN OPERATION - A cache coherent data processing system includes at least first and second coherency domains. In a first cache memory within the first coherency domain of the data processing system, a coherency state field associated with a storage location and an address tag is set to a first data-invalid coherency state that indicates that the address tag is valid and that the storage location does not contain valid data. In response to snooping an exclusive access operation, the exclusive access request specifying a target address matching the address tag and indicating a relative domain location of a requestor that initiated the exclusive access operation, the first cache memory updates the coherency state field from the first data-invalid coherency state to a second data-invalid coherency state that indicates that the address tag is valid, that the storage location does not contain valid data, and whether a target memory block associated with the address tag is cached within the first coherency domain upon successful completion of the exclusive access operation based upon the relative location of the requestor.	12-04-2008
20080307137	DATA PROCESSING SYSTEM, METHOD AND INTERCONNECT FABRIC FOR SYNCHRONIZED COMMUNICATION IN A DATA PROCESSING SYSTEM - A data processing system includes a plurality of processing units, including at least a local master and a local hub, which are coupled for communication via a communication link. The local master includes a master capable of initiating an operation, a snooper capable of receiving an operation, and interconnect logic coupled to a communication link coupling the local master to the local hub. The interconnect logic includes request logic that synchronizes internal transmission of a request of the master to the snooper with transmission, via the communication link, of the request to the local hub.	12-11-2008
20090006766	DATA PROCESSING SYSTEM AND METHOD FOR PREDICTIVELY SELECTING A SCOPE OF BROADCAST OF AN OPERATION UTILIZING A HISTORY-BASED PREDICTION - According to a method of data processing, a predictor is maintained that indicates a historical scope of broadcast for one or more previous operations transmitted on an interconnect of a data processing system. A scope of broadcast of a subsequent operation is predictively selected by reference to the predictor.	01-01-2009
20090265293	Access speculation predictor implemented via idle command processing resources - An access speculation predictor is provided that may be implemented using idle command processing resources, such as registers of idle finite state machines (FSMs) in a memory controller. The access speculation predictor may predict whether to perform speculative retrieval of data for a data request from a main memory of the data processing system based on history information stored for a memory region targeted by the data request. In particular, a first address may be extracted from the data request and compared to memory regions associated with second addresses stored in address registers of a plurality of FSMs of the memory controller. A FSM whose memory region includes the first address may be selected. History information for the memory region may be obtained from the selected FSM. The history information may be used to control whether to speculatively retrieve the data for the data request from a main memory.	10-22-2009
20090327612	Access Speculation Predictor with Predictions Based on a Domain Indicator of a Cache Line - An access speculation predictor may predict whether to perform speculative retrieval of data for a data request from a main memory based on whether or not a domain indicator in the data request indicates that the cache line corresponding to the data has a special invalid state or not. In particular, a first address and a domain indicator are extracted from first data request. The first address is used to select a finite state machine (FSM) of a memory controller based on memory regions associated with the FSMs of the memory controller. Speculative retrieval of data for the first data request from main memory is controlled based on whether the domain indicator identifies the special invalid state or not and, if the domain indicator identifies that the cache line does not have the special invalid state, based on information stored in registers associated with the selected FSM.	12-31-2009
20090327615	Access Speculation Predictor with Predictions Based on a Scope Predictor - An access speculation predictor may predict whether to perform speculative retrieval of data for a data request from a main memory based on whether a scope predictor indicates whether a local or global request is predicted to be necessary to obtain the data for the data request. In particular, a first address and a scope predictor may be extracted from a first data request. A determination may be made as to whether a memory controller receiving the first data request is local to a source of the first data request or not. Speculative retrieval of the data for the first data request from a main memory may be controlled based on whether the memory controller is local to the source of the first data request and whether the scope predictor identifies whether a local or a global request is predicted to be necessary.	12-31-2009
20090327619	Access Speculation Predictor with Predictions Based on Memory Region Prior Requestor Tag Information - An access speculation predictor may predict whether to perform speculative retrieval of data for a data request from a main memory based on whether or not a current requestor tag matches a previous requestor tag. In particular, a first address and a first requester tag may be extracted from a first data request and a finite state machine (FSM) of a memory controller may be selected whose memory region includes the first address. A second requester tag, that identifies a previous requester that attempted to access the memory region association with the selected FSM, may be retrieved from a register associated with the selected FSM and compared to the first requester tag. Speculatively retrieving the data for the first data request from a main memory may be controlled based on results of the comparison of the first requester tag to the second requester tag.	12-31-2009
20100100682	Victim Cache Replacement - A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core that specifies a non-modifying access to a target coherency granule, a determination is made whether the memory access request hits or misses in a directory of the lower level victim cache. In response to determining that the memory access request hits in the lower level victim cache in a data-valid coherence state, the lower level victim cache provides the target coherency granule of the memory access request to the upper level cache. The lower level victim cache preserves the target coherency granule in the lower level victim cache in a shared coherence state if the memory access request is of a first type and invalidates the target coherency granule if the memory access request is of a second type.	04-22-2010
20100100683	Victim Cache Prefetching - A processing unit for a multiprocessor data processing system includes a processor core and a cache hierarchy coupled to the processor core to provide low latency data access. The cache hierarchy includes an upper level cache coupled to the processor core and a lower level victim cache coupled to the upper level cache. In response to a prefetch request of the processor core that misses in the upper level cache, the lower level victim cache determines whether the prefetch request misses in the directory of the lower level victim cache and, if so, allocates a state machine in the lower level victim cache that services the prefetch request by issuing the prefetch request to at least one other processing unit of the multiprocessor data processing system.	04-22-2010
20100153650	Victim Cache Line Selection - A cache memory includes a cache array including a plurality of congruence classes each containing a plurality of cache lines, where each cache line belongs to one of multiple classes which include at least a first class and a second class. The cache memory also includes a cache directory of the cache array that indicates class membership. The cache memory further includes a cache controller that selects a victim cache line for eviction from a congruence class. If the congruence class contains a cache line belonging to the second class, the cache controller preferentially selects as the victim cache line a cache line of the congruence class belonging to the second class based upon access order. If the congruence class contains no cache line belonging to the second class, the cache controller selects as the victim cache line a cache line belonging to the first class based upon access order.	06-17-2010
20100262778	Empirically Based Dynamic Control of Transmission of Victim Cache Lateral Castouts - In response to a data request, a victim cache line is selected for castout from a lower level cache, and a target lower level cache of one of the plurality of processing units is selected. A determination is made whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, a different target lower level cache is selected. The first processing unit thereafter issues a LCO command on the interconnect fabric. The LCO command identifies the victim cache line to be castout and indicates that the target lower level cache is an intended destination of the victim cache line. In response to a successful coherence response to the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.	10-14-2010
20100262783	Mode-Based Castout Destination Selection - In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing.	10-14-2010
20100262784	Empirically Based Dynamic Control of Acceptance of Victim Cache Lateral Castouts - A second lower level cache receives an LCO command issued by a first lower level cache on an interconnect fabric. The LCO command indicates an address of a victim cache line to be castout from the first lower level cache and indicates that the second lower level cache is an intended destination of the victim cache line. The second lower level cache determines whether to accept the victim cache line from the first lower level cache based at least in part on the address of the victim cache line indicated by the LCO command. In response to determining not to accept the victim cache line, the second lower level cache provides a coherence response to the LCO command refusing the identified victim cache line. In response to determining to accept the victim cache line, the second lower level cache updates an entry corresponding to the identified victim cache line.	10-14-2010
20100268882	LOAD REQUEST SCHEDULING IN A CACHE HIERARCHY - A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry in allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, CIU may issue stored load requests in a specific priority order.	10-21-2010
20110047352	MEMORY COHERENCE DIRECTORY SUPPORTING REMOTELY SOURCED REQUESTS OF NODAL SCOPE - A data processing system includes at least a first through third processing nodes coupled by an interconnect fabric. The first processing node includes a master, a plurality of snoopers capable of participating in interconnect operations, and a node interface that receives a request of the master and transmits the request of the master to the second processing unit with a nodal scope of transmission limited to the second processing node. The second processing node includes a node interface having a directory. The node interface of the second processing node permits the request to proceed with the nodal scope of transmission if the directory does not indicate that a target memory block of the request is cached other than in the second processing node and prevents the request from succeeding if the directory indicates that the target memory block of the request is cached other than in the second processing node.	02-24-2011
20110161589	SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS - A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.	06-30-2011
20120144105	Method and Apparatus for Performing Refresh Operations in High-Density Memories - A method for performing refresh operations is disclosed. In response to a completion of a memory operation, a determination is made whether or not a refresh backlog count is greater than a first predetermined value. In a determination that the refresh backlog count is greater than the first predetermined value, a refresh operation is performed as soon as possible. In a determination that the refresh backlog count is not greater than the first predetermined value, a refresh operation is performed after a delay of an idle count value.	06-07-2012
20120203976	MEMORY COHERENCE DIRECTORY SUPPORTING REMOTELY SOURCED REQUESTS OF NODAL SCOPE - A data processing system includes at least a first through third processing nodes coupled by an interconnect fabric. The first processing node includes a master, a plurality of snoopers capable of participating in interconnect operations, and a node interface that receives a request of the master and transmits the request of the master to the second processing unit with a nodal scope of transmission limited to the second processing node. The second processing node includes a node interface having a directory. The node interface of the second processing node permits the request to proceed with the nodal scope of transmission if the directory does not indicate that a target memory block of the request is cached other than in the second processing node and prevents the request from succeeding if the directory indicates that the target memory block of the request is cached other than in the second processing node.	08-09-2012
20120206984	METHOD AND APPARATUS FOR PERFORMING REFRESH OPERATIONS IN HIGH-DENSITY MEMORIES - A method for performing refresh operations is disclosed. In response to a completion of a memory operation, a determination is made whether or not a refresh backlog count is greater than a first predetermined value. In a determination that the refresh backlog count is greater than the first predetermined value, a refresh operation is performed as soon as possible. In a determination that the refresh backlog count is not greater than the first predetermined value, a refresh operation is performed after a delay of an idle count value.	08-16-2012
20120330802	METHOD AND APPARATUS FOR SUPPORTING MEMORY USAGE ACCOUNTING - An apparatus for providing memory energy accounting within a data processing system having multiple chiplets is disclosed. The apparatus includes a system memory, a memory access collection module, a memory throttle counter, and a memory credit accounting module. The memory access collection module receives a first set of signals from a first cache memory within a chiplet and a second set of signals from a second cache memory within the chiplet. The memory credit accounting module tracks the usage of the system memory on a per user basis according to the results of cache accesses extracted from the first and second set of signals from the first and second cache memories within the chiplet.	12-27-2012
20120330803	METHOD AND APPARATUS FOR SUPPORTING MEMORY USAGE THROTTLING - An apparatus for providing system memory usage throttling within a data processing system having multiple chiplets is disclosed. The apparatus includes a system memory, a memory access collection module, a memory credit accounting module and a memory throttle counter. The memory access collection module receives a first set of signals from a first cache memory within a chiplet and a second set of signals from a second cache memory within the chiplet. The memory credit accounting module tracks the usage of the system memory on a per user virtual partition basis according to the results of cache accesses extracted from the first and second set of signals from the first and second cache memories within the chiplet. The memory throttle counter for provides a throttle control signal to prevent any access to the system memory when the system memory usage has exceeded a predetermined value.	12-27-2012
20120331231	METHOD AND APPARATUS FOR SUPPORTING MEMORY USAGE THROTTLING - An apparatus for providing system memory usage throttling within a data processing system having multiple chiplets is disclosed. The apparatus includes a system memory, a memory access collection module, a memory credit accounting module and a memory throttle counter. The memory access collection module receives a first set of signals from a first cache memory within a chiplet and a second set of signals from a second cache memory within the chiplet. The memory credit accounting module tracks the usage of the system memory on a per user virtual partition basis according to the results of cache accesses extracted from the first and second set of signals from the first and second cache memories within the chiplet. The memory throttle counter for provides a throttle control signal to prevent any access to the system memory when the system memory usage has exceeded a predetermined value.	12-27-2012
20130117513	MEMORY QUEUE HANDLING TECHNIQUES FOR REDUCING IMPACT OF HIGH LATENCY MEMORY OPERATIONS - Techniques for handling queuing of memory accesses prevent passing excessive requests that implicate a region of memory subject to a high latency memory operation, such as a memory refresh operation, memory scrubbing or an internal bus calibration event, to a re-order queue of a memory controller. The memory controller includes a queue for storing pending memory access requests, a re-order queue for receiving the requests, and a control logic implementing a queue controller that determines if there is a collision between a received request and an ongoing high-latency memory operation. If there is a collision, then transfer of the request to the re-order queue may be rejected outright, or a count of existing queued operations that collide with the high latency operation may be used to determine if queuing the new request will exceed a threshold number of such operations.	05-09-2013
20130128682	MEMORY SYSTEM WITH DYNAMIC REFRESHING - An embodiment provided is a memory system with dynamic refreshing that includes a memory device with memory cells. The system also includes a refresh module in communication with the memory device and with a memory controller, the refresh module configured for receiving a refresh command from the memory controller and for refreshing a number of the memory cells in the memory device in response to receiving the refresh command. The number of memory cells refreshed in response to receiving the refresh command is responsive to at least one of a desired bandwidth characteristic and a desired latency characteristic.	05-23-2013
20130138878	Method for Scheduling Memory Refresh Operations Including Power States - A method for performing refresh operations on a rank of memory devices is disclosed. After the completion of a memory operation, a determination is made whether or not a refresh backlog count value is less than a predetermined value and the rank of memory devices is being powered down. If the refresh backlog count value is less than the predetermined value and the rank of memory devices is being powered down, an Idle Count threshold value is set to a maximum value such that a refresh operation will be performed after a maximum delay time. If the refresh backlog count value is not less than the predetermined value or the rank of memory devices is not in a powered down state, the Idle Count threshold value is set based on the slope of an Idle Delay Function such that a refresh operation will be performed accordingly.	05-30-2013
20130151577	Performing Arithmetic Operations Using Both Large and Small Floating Point Values - Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.	06-13-2013
20130151578	Performing Arithmetic Operations Using Both Large and Small Floating Point Values - Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.	06-13-2013
20130151777	Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Hit Rate - A mechanism is provided for dynamic cache allocation using a cache hit rate. A first cache hit rate is monitored in a first subset utilizing a first allocation policy of N sets of a lower level cache. A second cache hit rate is also monitored in a second subset utilizing a second allocation policy different from the first allocation policy of the N sets of the lower level cache. A periodic comparison of the first cache hit rate to the second cache hit rate is made to identify a third allocation policy for a third subset of the N-sets of the lower level cache. The third allocation policy for the third subset is then periodically adjusted to at least one of the first allocation policy or the second allocation policy based on the comparison of the first cache hit rate to the second cache hit rate.	06-13-2013
20130151778	Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Bandwidth - A mechanism is provided for dynamic cache allocation using bandwidth. A bandwidth between a higher level cache and a lower level cache is monitored. Responsive to bandwidth usage between the higher level cache and the lower level cache being below a predetermined low bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a first allocation policy. Responsive to bandwidth usage between the higher level cache and the lower level cache being above a predetermined high bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a second allocation policy.	06-13-2013
20130151779	Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted.	06-13-2013
20130151780	Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted.	06-13-2013
20130173858	Method for Scheduling Memory Refresh Operations Including Power States - A method for performing refresh operations on a rank of memory devices is disclosed. After the completion of a memory operation, a determination is made whether or not a refresh backlog count value is less than a predetermined value and the rank of memory devices is being powered down. If the refresh backlog count value is less than the predetermined value and the rank of memory devices is being powered down, an Idle Count threshold value is set to a maximum value such that a refresh operation will be performed after a maximum delay time. If the refresh backlog count value is not less than the predetermined value or the rank of memory devices is not in a powered down state, the Idle Count threshold value is set based on the slope of an Idle Delay Function such that a refresh operation will be performed accordingly.	07-04-2013
20130212330	MEMORY RECORDER QUEUE BIASING PRECEDING HIGH LATENCY OPERATIONS - A memory system and data processing system for controlling memory refresh operations in dynamic random access memories. The memory controller comprises logic that: tracks a time remaining before a scheduled time for performing a high priority, high latency operation a first memory rank of the memory system; responsive to the time remaining reaching a pre-established early notification time before the schedule time for performing the high priority, high latency operation, biases the re-order queue containing memory access operations targeting the plurality of ranks to prioritize scheduling of any first memory access operations that target the first memory rank. The logic further: schedules the first memory access operations to the first memory rank for early completion relative to other memory access operations in the re-order queue that target other memory ranks; and performs the high priority, high latency operation at the first memory rank at the scheduled time.	08-15-2013
20130232320	PERSISTENT PREFETCH DATA STREAM SETTINGS - A prefetch unit includes a transience register and a length register. The transience register hosts an indication of transient for data stream prefetching. The length register hosts an indication of a stream length for data stream prefetching. The prefetch unit monitors the transience register and the length register. The prefetch unit generates prefetch requests of data streams with a transient property up to the stream length limit when the transience register indicates transient and the length register indicates the stream length limit for data stream prefetching. A cache controller coupled with the prefetch unit implements a cache replacement policy and cache coherence protocols. The cache controller writes data supplied from memory responsive to the prefetch requests into cache with an indication of transient. The cache controller victimizes cache lines with an indication of transient independent of the cache replacement policy.	09-05-2013
20140052936	MEMORY QUEUE HANDLING TECHNIQUES FOR REDUCING IMPACT OF HIGH-LATENCY MEMORY OPERATIONS - Techniques for handling queuing of memory accesses prevent passing excessive requests that implicate a region of memory subject to a high latency memory operation, such as a memory refresh operation, memory scrubbing or an internal bus calibration event, to a re-order queue of a memory controller. The memory controller includes a queue for storing pending memory access requests, a re-order queue for receiving the requests, and a control logic implementing a queue controller that determines if there is a collision between a received request and an ongoing high-latency memory operation. If there is a collision, then transfer of the request to the re-order queue may be rejected outright, or a count of existing queued operations that collide with the high latency operation may be used to determine if queuing the new request will exceed a threshold number of such operations.	02-20-2014
20140082272	Memory Reorder Queue Biasing Preceding High Latency Operations - A method for controlling memory refresh operations in dynamic random access memories. The method includes determining a count of deferred memory refresh operations for a first memory rank. Responsive to the count approaching a high priority threshold, issuing an early high priority refresh notification for the first memory rank, which indicates the pre-determined time for performing a high priority memory refresh operation at the first memory rank. Responsive to the early high priority refresh notification, the behavior of a read reorder queue is dynamically modified to give priority scheduling to at least one read command targeting the first memory rank, and one or more of the at least one read command is executed on the first memory rank according to the priority scheduling. Priority scheduling removes these commands from the re-order queue before the refresh operation is initiated at the first memory rank.	03-20-2014
20140143611	SELECTIVE POSTED DATA ERROR DETECTION BASED ON REQUEST TYPE - In a data processing system, a selection is made, based at least on an access type of a memory access request, between at least a first timing and a second timing of data transmission with respect to completion of error detection processing on a target memory block of the memory access request. In response to receipt of the memory access request and selection of the first timing, data from the target memory block is transmitted to a requestor prior to completion of error detection processing on the target memory block. In response to receipt of the memory access request and selection of the second timing, data from the target memory block is transmitted to the requestor after and in response to completion of error detection processing on the target memory block.	05-22-2014
20140143613	SELECTIVE POSTED DATA ERROR DETECTION BASED ON REQUEST TYPE - In a data processing system, a selection is made, based at least on an access type of a memory access request, between at least a first timing and a second timing of data transmission with respect to completion of error detection processing on a target memory block of the memory access request. In response to receipt of the memory access request and selection of the first timing, data from the target memory block is transmitted to a requestor prior to completion of error detection processing on the target memory block. In response to receipt of the memory access request and selection of the second timing, data from the target memory block is transmitted to the requestor after and in response to completion of error detection processing on the target memory block.	05-22-2014
20140250275	SELECTION OF POST-REQUEST ACTION BASED ON COMBINED RESPONSE AND INPUT FROM THE REQUEST SOURCE - A data structure includes a plurality of entries each corresponding to a different systemwide combined response of a data processing system. A particular entry includes identifiers of multiple possible actions that can be taken in response to a systemwide combined response. Master logic issues a memory access request on a system fabric of the data processing system. The master logic, responsive to receiving the systemwide combined response and a selection of one of the multiple possible actions from a source of the memory access request prior to receipt of the systemwide combined response, selects the particular entry based on the systemwide combined response and selects one of the multiple possible actions identified in the particular entry based on the received selection. The master logic services the memory access request in accordance with the systemwide combined response by performing the selected one of the multiple possible actions.	09-04-2014
20140250276	SELECTION OF POST-REQUEST ACTION BASED ON COMBINED RESPONSE AND INPUT FROM THE REQUEST SOURCE - A data structure includes a plurality of entries each corresponding to a different systemwide combined response of a data processing system. A particular entry includes identifiers of multiple possible actions that can be taken in response to a systemwide combined response. Master logic issues a memory access request on a system fabric of the data processing system. The master logic, responsive to receiving the systemwide combined response and a selection of one of the multiple possible actions from a source of the memory access request prior to receipt of the systemwide combined response, selects the particular entry based on the systemwide combined response and selects one of the multiple possible actions identified in the particular entry based on the received selection. The master logic services the memory access request in accordance with the systemwide combined response by performing the selected one of the multiple possible actions.	09-04-2014
20140304558	TRANSIENT CONDITION MANAGEMENT UTILIZING A POSTED ERROR DETECTION PROCESSING PROTOCOL - In a data processing system, a memory subsystem detects whether or not at least one potentially transient condition is present that would prevent timely servicing of one or more memory access requests directed to the associated system memory. In response to detecting at least one such potentially transient condition, the memory system identifies a first read request affected by the at least one potentially transient condition. In response to identifying the read request, the memory subsystem signals to a request source to issue a second read request for the same target address by transmitting to the request source dummy data and a data error indicator.	10-09-2014
20140304573	TRANSIENT CONDITION MANAGEMENT UTILIZING A POSTED ERROR DETECTION PROCESSING PROTOCOL - In a data processing system, a memory subsystem detects whether or not at least one potentially transient condition is present that would prevent timely servicing of one or more memory access requests directed to the associated system memory. In response to detecting at least one such potentially transient condition, the memory system identifies a first read request affected by the at least one potentially transient condition. In response to identifying the read request, the memory subsystem signals to a request source to issue a second read request for the same target address by transmitting to the request source dummy data and a data error indicator.	10-09-2014
20140310471	PROVISION OF EARLY DATA FROM A LOWER LEVEL CACHE MEMORY - In response to snooping a read-type memory access request of a requestor on a system fabric of a data processing system, a memory channel interface forwards the request to a memory buffer and starts a timer. In response to the forwarded request, the memory buffer performs a lookup of a target address of the request in a memory controller cache. In response to the target address hitting in a coherence state permitting provision of early data, the memory buffer provides a response indicating early data and provides a copy of a target memory block of the request to the memory channel interface. The memory channel interface, responsive to receipt prior to expiration of the timer of the response indicating early data, transmits the copy of the target memory block to the requestor via the system fabric prior to receiving a combined response of the data processing system to the request.	10-16-2014
20140310472	PROVISION OF EARLY DATA FROM A LOWER LEVEL CACHE MEMORY - In response to snooping a read-type memory access request of a requestor on a system fabric of a data processing system, a memory channel interface forwards the request to a memory buffer and starts a timer. In response to the forwarded request, the memory buffer performs a lookup of a target address of the request in a memory controller cache. In response to the target address hitting in a coherence state permitting provision of early data, the memory buffer provides a response indicating early data and provides a copy of a target memory block of the request to the memory channel interface. The memory channel interface, responsive to receipt prior to expiration of the timer of the response indicating early data, transmits the copy of the target memory block to the requestor via the system fabric prior to receiving a combined response of the data processing system to the request.	10-16-2014
20140310477	MODIFICATION OF PREFETCH DEPTH BASED ON HIGH LATENCY EVENT - A prefetch stream is established in a prefetch unit of a memory controller for a system memory at a lowest level of a volatile memory hierarchy of the data processing system based on a memory access request received from a processor core. The memory controller receives an indication of an upcoming high latency event affecting access to the system memory. In response to the indication, the memory controller temporarily increases a prefetch depth of the prefetch stream with respect to the system memory and issues, to the system memory, a plurality of prefetch requests in accordance with the temporarily increased prefetch depth in advance of the upcoming high latency event	10-16-2014
20140310478	MODIFICATION OF PREFETCH DEPTH BASED ON HIGH LATENCY EVENT - A prefetch stream is established in a prefetch unit of a memory controller for a system memory at a lowest level of a volatile memory hierarchy of the data processing system based on a memory access request received from a processor core. The memory controller receives an indication of an upcoming high latency event affecting access to the system memory. In response to the indication, the memory controller temporarily increases a prefetch depth of the prefetch stream with respect to the system memory and issues, to the system memory, a plurality of prefetch requests in accordance with the temporarily increased prefetch depth in advance of the upcoming high latency event	10-16-2014
20140310486	DYNAMIC RESERVATIONS IN A UNIFIED REQUEST QUEUE - A unified request queue includes multiple entries for servicing multiple types of requests. Each of the entries of the unified request queue is generally allocable to requests of any of the multiple request types. A number of entries in the unified request queue is reserved for a first request type among the multiple types of requests. The number of entries reserved for the first request type is dynamically varied based on a number of requests of the first request type rejected by the unified request queue due to allocation of entries in the unified request queue to other requests.	10-16-2014
20140310487	DYNAMIC RESERVATIONS IN A UNIFIED REQUEST QUEUE - A unified request queue includes multiple entries for servicing multiple types of requests. Each of the entries of the unified request queue is generally allocable to requests of any of the multiple request types. A number of entries in the unified request queue is reserved for a first request type among the multiple types of requests. The number of entries reserved for the first request type is dynamically varied based on a number of requests of the first request type rejected by the unified request queue due to allocation of entries in the unified request queue to other requests.	10-16-2014
20140372704	LEAST-RECENTLY-USED (LRU) TO FIRST-DIRTY-MEMBER DISTANCE-MAINTAINING CACHE CLEANING SCHEDULER - A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line.	12-18-2014
20140372705	LEAST-RECENTLY-USED (LRU) TO FIRST-DIRTY-MEMBER DISTANCE-MAINTAINING CACHE CLEANING SCHEDULER - A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line.	12-18-2014
20140379989	COHERENT ATTACHED PROCESSOR PROXY HAVING HYBRID DIRECTORY - A coherent attached processor proxy (CAPP) includes transport logic having a first interface configured to support communication with a system fabric of a primary coherent system and a second interface configured to support communication with an attached processor (AP) that is external to the primary coherent system and that includes a cache memory that holds copies of memory blocks belonging to a coherent address space of the primary coherent system. The CAPP further includes one or more master machines that initiate memory access requests on the system fabric of the primary coherent system on behalf of the AP, one or more snoop machines that service requests snooped on the system fabric, and a CAPP directory having a precise directory having a plurality of entries each associated with a smaller data granule and a coarse directory having a plurality of entries each associated with a larger data granule.	12-25-2014
20140379997	COHERENT ATTACHED PROCESSOR PROXY HAVING HYBRID DIRECTORY - A coherent attached processor proxy (CAPP) includes transport logic having a first interface configured to support communication with a system fabric of a primary coherent system and a second interface configured to support communication with an attached processor (AP) that is external to the primary coherent system and that includes a cache memory that holds copies of memory blocks belonging to a coherent address space of the primary coherent system. The CAPP further includes one or more master machines that initiate memory access requests on the system fabric of the primary coherent system on behalf of the AP, one or more snoop machines that service requests snooped on the system fabric, and a CAPP directory having a precise directory having a plurality of entries each associated with a smaller data granule and a coarse directory having a plurality of entries each associated with a larger data granule.	12-25-2014
20150074162	Performing Arithmetic Operations Using Both Large and Small Floating Point Values - Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.	03-12-2015

Patent applications by Jeffrey A. Stuecheli, Austin, TX US

Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees

Jeffrey A. Stuecheli, Austin US

Jeffrey A. Stuecheli, Austin, TX US