Patent application number | Description | Published |
20130282666 | METHOD AND SYSTEM FOR IMPLEMENTING A REDO REPEATER - Disclosed are methods and apparatuses to provide a redo repeater that allows for no data loss protection without the performance impact to the primary database even when a significant geographical distance separates the primary and standby databases. The Repeater is a lightweight entity that receives redo from the primary database with the purpose of redistributing that redo throughout the primary/standby system configuration. The Repeater is able to extend no data loss protection and switchover functionality to terminal standby databases even though the primary database does not need to have a direct connection with those destinations. | 10-24-2013 |
20130282667 | METHOD AND SYSTEM FOR IMPLEMENTING A CONDITIONAL REDO REPEATER - Disclosed are methods and apparatuses to provide a redo repeater that allows for no data loss protection without the performance impact to the primary database even when a significant geographical distance separates the primary and standby databases. The Repeater is a lightweight entity that receives redo from the primary database with the purpose of redistributing that redo throughout the primary/standby system configuration. The Repeater is able to extend no data loss protection and switchover functionality to terminal standby databases even though the primary database does not need to have a direct connection with those destinations. | 10-24-2013 |
20140258224 | AUTOMATIC RECOVERY OF A FAILED STANDBY DATABASE IN A CLUSTER - A method, system, and computer program product. The method for non-intrusive redeployment of a standby database facility comprises configuring a database system having a shared lock manager process to synchronize two or more concurrent access instances, then granting lock requests for access to a cache of database blocks. At some moment in time, the shared lock manager process may fail, and a monitor process detects the failure or other stoppage of the shared lock manager process. A new shared lock manager process and other processes are started, at least one of which serves for identifying the database blocks in the cache that have not yet been written to the database. The identified blocks are formed into a recovery set of redo operations. During this time, incoming requests for access to the cache of database blocks are briefly blocked, at least until the recovery set of redo operations has been formed. | 09-11-2014 |
Patent application number | Description | Published |
20080235294 | No data loss system with reduced commit latency - Techniques for reducing commit latency in a database system having a primary database system and a standby database system that is receiving a stream of redo data items from the primary. The standby sends an acknowledgment for a received item of redo data before the standby writes the redo data item to a redo log for the stream. When a no more redo event occurs in the standby, the standby sets a “no data lost flag” in the redo log if the stream of redo data items has no gaps and all of the redo data items received in the standby have been written to the redo log. The database system may operate in a first mode in which an acknowledgment is sent as just described and a second mode in which an acknowledgment is sent after the redo data item has been written to the redo log. | 09-25-2008 |
20120030508 | DATABASE SYSTEM CONFIGURED FOR AUTOMATIC FAILOVER WITH USER-LIMITED DATA LOSS - Techniques used in an automatic failover configuration having a primary database system, a standby database system, and an observer. In the automatic failover configuration, the primary database system remains available even in the absence of both the standby and the observer as long as the standby and the observer become absent sequentially. The failover configuration may use asynchronous transfer modes to transfer redo to the standby and permits automatic failover only when the observer is present and the failover will not result in data loss due to the asynchronous transfer mode beyond a specified maximum. The database systems and the observer have copies of failover configuration state and the techniques include techniques for propagating the most recent version of the state among the databases and the observer and techniques for using carefully-ordered writes to ensure that state changes are propagated in a fashion which prevents divergence. | 02-02-2012 |
20140258241 | ZERO AND NEAR-ZERO DATA LOSS DATABASE BACKUP AND RECOVERY - A method, system and computer program product for low loss database backup and recovery. The method commences by transmitting, by a first server to a third server, a copy of a database snapshot backup, the transmitting commencing at a first time. The method continues by capturing, by the first server, a stream of database redo data, the capturing commencing before or upon transmitting the database snapshot backup, and continuing until a third time. The stream of database redo data is received by an intermediate server, after which the intermediate server transmits the stream of database redo data to the third server. Now, the third server has the database snapshot backups and the database redo data. The third server can send to a fourth server all or a portion of the database redo data to be applied to the copy of the database snapshot backup restored there to create a restored database. | 09-11-2014 |
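
The "no data lost flag" condition described in application 20080235294 above can be illustrated with a small sketch. This is not the patented implementation: it assumes that redo data items carry consecutive integer sequence numbers, and the function name `no_data_lost` and its parameters are hypothetical names chosen for this example.

```python
def no_data_lost(received, written):
    """Return True when the flag could be set under the assumptions of this sketch.

    received: sequence numbers of redo items the standby has received (assumed integers).
    written:  sequence numbers of redo items already written to the redo log.
    """
    if not received:
        # Nothing was received, so there is nothing that could be missing.
        return True
    ordered = sorted(received)
    # The stream has no gaps when consecutive sequence numbers differ by exactly one.
    gap_free = all(b - a == 1 for a, b in zip(ordered, ordered[1:]))
    # Every received item must also have reached the redo log.
    all_written = set(received) <= set(written)
    return gap_free and all_written
```

In the mode that acknowledges before writing, `written` may lag behind `received`, which is why both conditions are checked when a "no more redo" event occurs.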
Patent application number | Description | Published |
20100281471 | METHODS AND APPARATUSES FOR COMPILER-CREATING HELPER THREADS FOR MULTI-THREADING - Methods and apparatuses for compiler-created helper threads for multi-threading are described herein. In one embodiment, an exemplary process includes identifying a region of a main thread that likely has one or more delinquent loads, the one or more delinquent loads representing loads which likely suffer cache misses during an execution of the main thread, analyzing the region for one or more helper threads with respect to the main thread, and generating code for the one or more helper threads, the one or more helper threads being speculatively executed in parallel with the main thread to perform one or more tasks for the region of the main thread. Other methods and apparatuses are also described. | 11-04-2010 |
20130061240 | TWO WAY COMMUNICATION SUPPORT FOR HETEROGENOUS PROCESSORS OF A COMPUTER PLATFORM - A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU. | 03-07-2013 |
20130219096 | PROGRAMMABLE EVENT DRIVEN YIELD MECHANISM WHICH MAY ACTIVATE OTHER THREADS - Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors. | 08-22-2013 |
Patent application number | Description | Published |
20080244549 | METHOD AND APPARATUS FOR EXPLOITING THREAD-LEVEL PARALLELISM - According to one example embodiment, the subject matter disclosed herein uses partial recurrence relaxation for parallelizing DOACROSS loops on multi-core computer architectures. By one example definition, a DOACROSS loop may be a loop that allows successive iterations to execute by overlapping; that is, all iterations must impose a partial execution order. According to one embodiment, the inventive subject matter may be used to transform the dependence structure of a given loop with recurrences for a maximal degree of thread-level parallelism (TLP), where the threads can be mapped either onto different logical processors (in a hyperthreaded processor) or onto different physical cores (or processors) in a multi-core processor. | 10-02-2008 |
20110067011 | TRANSFORMATION OF SINGLE-THREADED CODE TO SPECULATIVE PRECOMPUTATION ENABLED CODE - In one embodiment a thread management method identifies in a main program a set of instructions that can be dynamically activated as speculative precomputation threads. A wait/sleep operation is performed on the speculative precomputation threads between thread creation and activation, and progress of non-speculative threads is gauged through monitoring a set of global variables, allowing the speculative precomputation threads to determine their relative progress with respect to non-speculative threads. | 03-17-2011 |
20110153983 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 06-23-2011 |
20120166761 | VECTOR CONFLICT INSTRUCTIONS - A processing core implemented on a semiconductor chip is described having first execution unit logic circuitry that includes first comparison circuitry to compare each element in a first input vector against every element of a second input vector. The processing core also has second execution logic circuitry that includes second comparison circuitry to compare a first input value against every data element of an input vector. | 06-28-2012 |
20140095843 | Systems, Apparatuses, and Methods for Performing Conflict Detection and Broadcasting Contents of a Register to Data Element Positions of Another Register - Systems, apparatuses, and methods of performing in a computer processor broadcasting data in response to a single vector packed broadcasting instruction that includes a source writemask register operand, a destination vector register operand, and an opcode. In some embodiments, the data of the source writemask register is zero extended prior to broadcasting. | 04-03-2014 |
20140096119 | LOOP VECTORIZATION METHODS AND APPARATUS - Loop vectorization methods and apparatus are disclosed. An example method includes setting a dynamic adjustment value of a vectorization loop; executing the vectorization loop to vectorize a loop by grouping iterations of the loop into one or more vectors; identifying a dependency between iterations of the loop; and setting the dynamic adjustment value based on the identified dependency. | 04-03-2014 |
20140115594 | MECHANISM TO SCHEDULE THREADS ON OS-SEQUESTERED SEQUENCERS WITHOUT OPERATING SYSTEM INTERVENTION - Method, apparatus and system embodiments to schedule OS-independent “shreds” without intervention of an operating system. For at least one embodiment, the shred is scheduled for execution by a scheduler routine rather than the operating system. A scheduler routine may run on each enabled sequencer. The schedulers may retrieve shred descriptors from a queue system. The sequencer associated with the scheduler may then execute the shred described by the descriptor. Other embodiments are also described and claimed. | 04-24-2014 |
20140149802 | Apparatus And Method To Obtain Information Regarding Suppressed Faults - A processor includes an execution unit, a fault mask coupled to the execution unit, and a suppress mask coupled to the execution unit. The fault mask is to store a first plurality of bit values to indicate which elements of a multi-element vector have an associated fault generated in response to execution of an instruction on the element in the execution unit. The suppress mask is to store a second plurality of bit values to indicate which of the elements are to have an associated fault suppressed. The processor also includes counter logic to increment a counter in response to an indication of a first fault associated with the first element and received from the fault mask, and an indication of a first suppression associated with the first element and received from the suppress mask. Other embodiments are described as claimed. | 05-29-2014 |
20140189307 | METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE VECTOR ADDRESS CONFLICT RESOLUTION WITH VECTOR POPULATION COUNT FUNCTIONALITY - Instructions and logic provide SIMD address conflict resolution with vector population count functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store a variable second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of bits set to one for corresponding data fields. Responsive to decoding a vector population count instruction, execution units count the number of bits set to one for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector population count instructions can be used with variable sized elements and conflict masks to generate iteration counts and completion masks to be used each iteration to resolve dependencies in gather-modify-scatter SIMD operations. | 07-03-2014 |
20140189308 | METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE VECTOR ADDRESS CONFLICT DETECTION FUNCTIONALITY - Instructions and logic provide SIMD address conflict detection functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store an offset for a data element in a memory. A destination register has corresponding data fields, each of these data fields to store a variable second plurality of bits to store a conflict mask having a mask bit for each offset. Responsive to decoding a vector conflict instruction, execution units compare the offset in each data field with every less significant data field to determine if they hold a matching offset, and in corresponding conflict masks in the destination register, set any mask bits corresponding to a less significant data field with a matching offset. Vector address conflict detection can be used with variable sized elements and to generate conflict masks to resolve dependencies in gather-modify-scatter SIMD operations. | 07-03-2014 |
20140281401 | Systems, Apparatuses, and Methods for Determining a Trailing Least Significant Masking Bit of a Writemask Register - The execution of a KZBTZ instruction finds a trailing least significant zero bit position in a first input mask and sets an output mask to have the values of the first input mask, but with all bit positions closer to the most significant bit position than the trailing least significant zero bit position in the first input mask set to zero. In some embodiments, a second input mask is used as a writemask such that bit positions of the first input mask are not considered in the trailing least significant zero bit position calculation depending upon a corresponding bit position in the second input mask. | 09-18-2014 |
20140344553 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 11-20-2014 |
20150052333 | Systems, Apparatuses, and Methods for Stride Pattern Gathering of Data Elements and Stride Pattern Scattering of Data Elements - Embodiments of systems, apparatuses, and methods for performing gather and scatter stride instructions in a computer processor are described. In some embodiments, the execution of a gather stride instruction causes a conditional storage of strided data elements from memory into the destination register according to at least some of the bit values of a writemask. | 02-19-2015 |
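
The conflict detection described in application 20140189308 above (compare each offset with every less significant data field and set a mask bit for each match) can be sketched in scalar form. This is an illustrative model only, not the claimed hardware logic; the function name `conflict_masks` and the use of a Python list to stand in for the vector register are assumptions of this example.

```python
def conflict_masks(offsets):
    """For each position i, build a bit mask with bit j set when j < i
    and offsets[j] equals offsets[i] (a match in a less significant field)."""
    masks = []
    for i, off in enumerate(offsets):
        mask = 0
        for j in range(i):  # only less significant positions are compared
            if offsets[j] == off:
                mask |= 1 << j
        masks.append(mask)
    return masks


# Element 0 is the least significant data field.
masks = conflict_masks([5, 7, 5, 5])
# Counting the set bits in each mask (a population count) gives, per element,
# how many earlier elements touch the same offset.
counts = [bin(m).count("1") for m in masks]
```

Here `masks` is `[0, 0, 1, 5]` and `counts` is `[0, 0, 1, 2]`: the last element conflicts with positions 0 and 2, so a gather-modify-scatter loop would need to process it after both of them.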
Patent application number | Description | Published |
20120254588 | SYSTEMS, APPARATUSES, AND METHODS FOR BLENDING TWO SOURCE OPERANDS INTO A SINGLE DESTINATION USING A WRITEMASK - Embodiments of systems, apparatuses, and methods for performing a blend instruction in a computer processor are described. In some embodiments, the execution of a blend instruction causes a data element-by-element selection of data elements of first and second source operands using the corresponding bit positions of a writemask as a selector between the first and second operands and storage of the selected data elements into the destination at the corresponding position in the destination. | 10-04-2012 |
20120254589 | SYSTEM, APPARATUS, AND METHOD FOR ALIGNING REGISTERS - Embodiments of systems, apparatuses, and methods for performing an align instruction in a computer processor are described. In some embodiments, the execution of an align instruction causes the selective storage of data elements of two concatenated sources in a destination. | 10-04-2012 |
20120254591 | SYSTEMS, APPARATUSES, AND METHODS FOR STRIDE PATTERN GATHERING OF DATA ELEMENTS AND STRIDE PATTERN SCATTERING OF DATA ELEMENTS - Embodiments of systems, apparatuses, and methods for performing gather and scatter stride instructions in a computer processor are described. In some embodiments, the execution of a gather stride instruction causes a conditional storage of strided data elements from memory into the destination register according to at least some of the bit values of a writemask. | 10-04-2012 |
20120254592 | SYSTEMS, APPARATUSES, AND METHODS FOR EXPANDING A MEMORY SOURCE INTO A DESTINATION REGISTER AND COMPRESSING A SOURCE REGISTER INTO A DESTINATION MEMORY LOCATION - Embodiments of systems, apparatuses, and methods for performing an expand and/or compress instruction in a computer processor are described. In some embodiments, the execution of an expand instruction causes the selection of elements from a source that are to be sparsely stored in a destination based on values of the writemask and stores each selected data element of the source as a sparse data element into a destination location, wherein the destination locations correspond to each writemask bit position that indicates that the corresponding data element of the source is to be stored. | 10-04-2012 |
20120254593 | SYSTEMS, APPARATUSES, AND METHODS FOR JUMPS USING A MASK REGISTER - Embodiments of systems, apparatuses, and methods for performing a jump instruction in a computer processor are described. In some embodiments, the execution of a jump instruction causes a conditional jump to an address of a target instruction when all of the bits of a writemask are zero, wherein the address of the target instruction is calculated using an instruction pointer of the instruction and the relative offset. | 10-04-2012 |
20130305020 | VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF - A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams. | 11-14-2013 |
20130318511 | VECTORIZATION OF SCALAR FUNCTIONS INCLUDING VECTORIZATION ANNOTATIONS AND VECTORIZED FUNCTION SIGNATURES MATCHING - Methods and apparatuses associated with vectorization of scalar callee functions are disclosed herein. In various embodiments, compiling a first program may include generating one or more vectorized versions of a scalar callee function of the first program, based at least in part on vectorization annotations of the first program. Additionally, compiling may include generating one or more vectorized function signatures respectively associated with the one or more vectorized versions of the scalar callee function. The one or more vectorized function signatures may enable an appropriate vectorized version of the scalar callee function to be matched and invoked for a generic call from a caller function of a second program to a vectorized version of the scalar callee function. | 11-28-2013 |
20130326192 | BROADCAST OPERATION ON MASK REGISTER - Embodiments of systems, apparatuses, and methods for performing a mask broadcast instruction in a computer processor are described. In some embodiments, the execution of a mask broadcast instruction causes a broadcast of a data element of the source operand to a destination register of the destination operand according to the broadcast size. | 12-05-2013 |
20140149724 | VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF - A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams. | 05-29-2014 |
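
The writemask-driven blend (application 20120254588) and expand (application 20120254592) operations listed above can be modeled with a short scalar sketch. It is only an illustration under stated assumptions: the abstracts do not say which source a set mask bit selects or what happens to unselected destination positions, so the sketch assumes that a set bit picks the second source in `blend` and that `expand` leaves unselected positions unchanged. The function names and list-based registers are hypothetical.

```python
def blend(src1, src2, writemask):
    """Element-by-element selection: bit i of writemask picks src2[i] when set,
    otherwise src1[i] (the polarity is an assumption of this sketch)."""
    return [b if (writemask >> i) & 1 else a
            for i, (a, b) in enumerate(zip(src1, src2))]


def expand(source, dest, writemask):
    """Store consecutive source elements into the destination positions whose
    writemask bit is set; other positions keep their old value (assumed)."""
    out = list(dest)
    k = 0  # index of the next unused source element
    for i in range(len(out)):
        if (writemask >> i) & 1:
            out[i] = source[k]
            k += 1
    return out


blended = blend([1, 2, 3, 4], [10, 20, 30, 40], 0b0101)
expanded = expand([7, 8], [0, 0, 0, 0], 0b1010)
```

With these inputs `blended` is `[10, 2, 30, 4]` and `expanded` is `[0, 7, 0, 8]`; bit 0 of the mask corresponds to the first list element.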