Document | Title | Date |
20080288819 | Computing System with Transactional Memory Using Millicode Assists - A computing system processes memory transactions for parallel processing of multiple threads of execution with millicode assists. The computing system transactional memory support provides a Transaction Table in memory and a method of fast detection of potential conflicts between multiple transactions. Special instructions may mark the boundaries of a transaction and identify memory locations applicable to a transaction. A ‘private to transaction’ (PTRAN) tag, directly addressable as part of the main data storage memory location, enables a quick detection of potential conflicts with other transactions that are concurrently executing on another thread of said computing system. The tag indicates whether (or not) a data entry in memory is part of a speculative memory state of an uncommitted transaction that is currently active in the system. Program millicode provides transactional memory functions including creating and updating transaction tables, committing transactions and controlling the rollback of transactions which fail. | 11-20-2008 |
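The PTRAN-tag idea in the entry above — a per-location ownership tag checked before each transactional write — can be illustrated with a toy sketch. This is not the patent's millicode; every class and method name here is invented for illustration:

```python
class Word:
    def __init__(self, value=0):
        self.value = value
        self.ptran = None  # None = not owned; else the owning transaction id

class Transaction:
    def __init__(self, tid, memory):
        self.tid, self.memory, self.undo = tid, memory, {}

    def write(self, addr, value):
        w = self.memory[addr]
        # Fast conflict check: the tag sits beside the data itself.
        if w.ptran is not None and w.ptran != self.tid:
            raise RuntimeError(f"conflict with transaction {w.ptran}")
        w.ptran = self.tid          # claim the word for this transaction
        self.undo.setdefault(addr, w.value)  # keep old value for rollback
        w.value = value

    def commit(self):
        for addr in self.undo:
            self.memory[addr].ptran = None   # clear speculative-state tags
        self.undo.clear()

    def rollback(self):
        for addr, old in self.undo.items():
            self.memory[addr].value = old
            self.memory[addr].ptran = None
        self.undo.clear()

mem = [Word() for _ in range(8)]
t1, t2 = Transaction(1, mem), Transaction(2, mem)
t1.write(3, 42)
try:
    t2.write(3, 99)       # word 3 is tagged private to t1 -> conflict
    conflict = False
except RuntimeError:
    conflict = True
t1.commit()
```

The point of the sketch is that conflict detection is a single tag comparison at the written location, with no search through other transactions' tables.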
20090019308 | Method and Apparatus for Data Recovery System Using Storage Based Journaling - A storage system maintains a journal and a snapshot of one or more data volumes. Two journal entry types are maintained, an AFTER journal entry and a BEFORE journal entry. Two modes of data recovery are provided: “fast” recovery and “undo-able” recovery. A combination of both recovery modes allows the user to quickly recover a targeted data state. | 01-15-2009 |
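The AFTER/BEFORE journal pairing described above can be modeled in a few lines. This is an illustrative sketch, not the patent's entry format; the record layout and function names are invented:

```python
snapshot = {"a": 1, "b": 2}
journal = []   # (kind, key, old_or_new_value) records in time order

def journaled_write(volume, key, value):
    journal.append(("BEFORE", key, volume.get(key)))  # undo record
    volume[key] = value
    journal.append(("AFTER", key, value))             # redo record

volume = dict(snapshot)
journaled_write(volume, "a", 10)
journaled_write(volume, "c", 3)

def fast_recover(snapshot, journal, upto):
    # "Fast" mode: roll the snapshot forward by replaying AFTER entries.
    state = dict(snapshot)
    for kind, key, value in journal[:upto]:
        if kind == "AFTER":
            state[key] = value
    return state

def undo_recover(state, journal, back_to):
    # "Undo-able" mode: roll the current state backward via BEFORE entries.
    state = dict(state)
    for kind, key, value in reversed(journal[back_to:]):
        if kind == "BEFORE":
            if value is None:
                state.pop(key, None)   # key did not exist before the write
            else:
                state[key] = value
    return state
```

Combining the two modes lets a user overshoot a target point cheaply with redo, then back up precisely with undo.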
20090044051 | EXTRACTING LOG AND TRACE BUFFERS IN THE EVENT OF SYSTEM CRASHES - A system and program storage device for extracting data of a buffer after a failure of an operating system. An application is registered prior to the failure. The registering includes identifying a buffer in which the data to be extracted is stored prior to the failure. The buffer is reserved to maintain the data residing in the buffer as unchanged from initiation to completion of a fast reboot of the operating system. The fast reboot is responsive to the failure. An in-memory file is generated during the fast reboot, points to the data residing in the buffer, and is stored in volatile memory and not in persistent storage. The data is extracted via an instruction which is executed by the application after completion of the fast reboot, and which operates on the in-memory file. | 02-12-2009 |
20090132853 | Hardware-error tolerant computing - Embodiments include a computing system, a device, and a method. A computing system includes a processor subsystem having an adjustable operating parameter. The computing system also includes an information store operable to save a sequence of instructions. The computing system further includes a controller module. The controller module includes a monitor circuit for detecting an incidence of an operating-parameter-caused error corresponding to an execution of an instruction of the sequence of instructions by the processor subsystem. The controller further includes a control circuit for adjusting the adjustable operating parameter based upon an error-tolerant performance criterion. | 05-21-2009 |
20090132854 | METHOD AND APPARATUS TO LAUNCH WRITE QUEUE READ DATA IN A MICROPROCESSOR RECOVERY UNIT - A method of checkpointing a microprocessor by providing, in parallel, a current read value from a queue and a next read value from the queue, and then selectively passing one of the current read value and next read value to a capture latch based on an instruction completion signal. The capture latch can directly drive the checkpoint register circuitry in the recovery unit of the microprocessor. If the queue is empty, a pair of multiplexers connected to the input of the register queue array are used to pass the input data value. The instruction completion signal may indicate whether all instructions in an instruction group have successfully completed. | 05-21-2009 |
20090217091 | DATA BACKING UP FOR NETWORKED STORAGE DEVICES USING DE-DUPLICATION TECHNIQUE - A technique of backing up data for networked storage devices using de-duplication is disclosed in which a communication device divides a to-be-stored new file into data blocks, defines and updates a statistical value representative of a history of reference to each data block within previous files, and transmits the statistical value to another communication device. The communication device, upon reception of the statistical value, selects a preloaded data block, based on the received statistical value, and transmits to another communication device a copying request for making a copy of a real data block identical to the preloaded data block. The communication device, upon reception of the copy, stores the copy as the preloaded data block. | 08-27-2009 |
20090276662 | Stored Memory Recovery System - Various embodiments of systems and methods for preserving saved memory states to which a computer system can be restored are disclosed. In certain embodiments, the systems and methods intercept write operations to protected memory locations and redirect them to alternate memory locations. Embodiments of the systems and methods include creation of a table for each memory state. Certain embodiments additionally include a recovery capability, by which the protected memory in the computer system is capable of being restored or recovered to a recovery point that represents a saved memory state. Further embodiments relate to systems and methods for preventing protected memory locations from being overwritten that utilize a plurality of memory state values. | 11-05-2009 |
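The intercept-and-redirect scheme above has a compact shape: writes to protected locations land in an alternate table, and restoring a saved state means discarding that table. A minimal sketch, with invented names (the patent publishes no code):

```python
class ProtectedMemory:
    def __init__(self, data):
        self.base = list(data)   # the saved, protected memory state
        self.redirect = {}       # addr -> value at the alternate location

    def write(self, addr, value):
        self.redirect[addr] = value   # redirected; base stays intact

    def read(self, addr):
        return self.redirect.get(addr, self.base[addr])

    def restore(self):
        self.redirect.clear()         # recover to the saved memory state

m = ProtectedMemory([0, 1, 2, 3])
m.write(1, 99)
after_write = m.read(1)
m.restore()
after_restore = m.read(1)
```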
20090300415 | Computer System and Method for Performing Integrity Detection on the Same - The present invention proposes a computer system and a method capable of performing integrity detection, comprising: a running mode unit which comprises an integrity detection boot variable to determine whether or not to initiate an integrity detection boot mode by judging said running mode unit; and an EFI integrity detection unit. | 12-03-2009 |
20090300416 | REMEDYING METHOD FOR TROUBLES IN VIRTUAL SERVER SYSTEM AND SYSTEM THEREOF - According to the invention, a managing server, using a snapshot-appended information table which stores management information for identifying snapshots of a virtual server, a setting change table which stores setting change information on the virtual server, and a policy table which stores policies to be met by the virtual server, acquires the setting change information from the setting change table, selects the setting change information items from the acquired setting change information matching policies stored in the policy table, acquires management information on the snapshots of the virtual server from the snapshot-appended information table, identifies a snapshot of the virtual server with reference to the acquired management information, changes the identified snapshot of the virtual server based on the selected setting change information items, and rolls back the virtual server according to the changed snapshot. | 12-03-2009 |
20090313503 | SYSTEMS AND METHODS OF EVENT DRIVEN RECOVERY MANAGEMENT - Systems and methods of event driven recovery management are disclosed. In one embodiment, a method of providing event driven recovery management includes continually copying one or more data blocks that are generated from a computing device, associating at least one event marker with the copies of the one or more data blocks, and allowing access to the copies of the one or more data blocks according to the at least one event marker in order to provide event driven recovery. For purposes of this disclosure, an event marker, a book mark, an application consistency point, and/or a business event are interchangeably used, depending on the context. | 12-17-2009 |
20100037096 | System-directed checkpointing implementation using a hypervisor layer - While system-directed checkpointing can be implemented in various ways, for example by adding checkpointing support in the memory controller or in the operating system in otherwise standard computers, implementation at the hypervisor level enables the necessary state information to be captured efficiently while providing a number of ancillary advantages over those prior-art methods. This disclosure details procedures for realizing those advantages through relatively minor modifications to normal hypervisor operations. Specifically, by capturing state information in a guest-operating-system-specific manner, any guest operating system can be rolled back independently and resumed without losing either program or input/output (I/O) continuity and without affecting the operation of the other operating systems or their associated applications supported by the same hypervisor. Similarly, by managing I/O queues as described in this disclosure, rollback can be accomplished without requiring I/O operations to be repeated and I/O device failures can be circumvented without losing any I/O data in the process. | 02-11-2010 |
20100037097 | VIRTUAL COMPUTER SYSTEM, ERROR RECOVERY METHOD IN VIRTUAL COMPUTER SYSTEM, AND VIRTUAL COMPUTER CONTROL PROGRAM - A virtual computer system executes a virtual computer control program on a physical computer and thereby runs guest programs on the respective logical partitions. The virtual computer control program includes an error recovery module to periodically recover from an error in a cache memory, an error interruption handler module, responsive to an interruption notice caused by an error which has occurred in the cache memory, to recover from an error in the cache memory, and an error data initialization module to recover from an error in the cache memory with shutdown or restart of one of the logical partitions as the trigger. The virtual computer control program thus conducts recovery processing for errors in the cache memory. | 02-11-2010 |
20100058110 | ISOTROPIC PROCESSOR - The present disclosure is directed toward a method for restoring a computer processor to a previous state. Described is a processor/memory architecture that may store successive instructions/data into a pushdown stack. As instructions are loaded and executed, the loading and executing of new instructions may be suspended. The instruction execution and memory stack then may be restored to a previous processor state in terms of instructions, processor memory state, register values, etc. | 03-04-2010 |
20100095152 | Checkpointing A Hybrid Architecture Computing System - A method, apparatus, and program product checkpoint an application in a parallel computing system of the type that includes a plurality of hybrid nodes. Each hybrid node includes a host element and a plurality of accelerator elements. Each host element may include at least one multithreaded processor, and each accelerator element may include at least one multi-element processor. In a first hybrid node from among the plurality of hybrid nodes, checkpointing the application includes executing at least a portion of the application in the host element, configuring and executing at least one computation kernel in at least one accelerator element, and, in response to receiving a command to checkpoint the application, checkpointing the host element separately from the at least one accelerator element upon which the at least one computation kernel is executing. | 04-15-2010 |
20100174944 | COMPOSITE TASK FRAMEWORK - A primary task manager, which is a local task manager, can perform a distributed task on a local server. If the performing of the task with the local task manager succeeds, the distributed task can then be propagated to at least one secondary task manager, which is a remote task manager. The remote task manager is capable of performing the distributed task. If the performing of the task with the local task manager fails, an undo task that is associated with the distributed task can be performed. | 07-08-2010 |
20100199128 | FANOUT CONNECTIVITY STRUCTURE FOR USE IN FACILITATING PROCESSING WITHIN A PARALLEL COMPUTING ENVIRONMENT - A hierarchical fanout connectivity infrastructure is built and used to start a parallel application within a parallel computing environment. The connectivity infrastructure is passed to a checkpoint library, which employs the infrastructure and a defined sequence of events, to perform checkpoint, restart and/or migration operations on the parallel application. | 08-05-2010 |
20100223499 | FINGERPRINTING EVENT LOGS FOR SYSTEM MANAGEMENT TROUBLESHOOTING - A technique for automatically detecting and correcting configuration errors in a computing system. In a learning process, recurring event sequences, including e.g., registry access events, are identified from event logs, and corresponding rules are developed. In a detecting phase, the rules are applied to detected event sequences to identify violations and to recover from failures. Event sequences across multiple hosts can be analyzed. The recurring event sequences are identified efficiently by flattening a hierarchical sequence of the events such as is obtained from the Sequitur algorithm. A trie is generated from the recurring event sequences and edges of nodes of the trie are marked as rule edges or non-rule edges. A rule is formed from a set of nodes connected by rule edges. The rules can be updated as additional event sequences are analyzed. False positive suppression policies include a violation- consistency policy and an expected event disappearance policy. | 09-02-2010 |
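The trie-of-rules idea above — recurring event sequences become rules, and a detected sequence that follows rule edges but stops short is a violation — can be sketched simply. This is a toy, with invented names; the patent's Sequitur flattening and rule/non-rule edge marking are much richer:

```python
def build_trie(sequences):
    """Insert each recurring event sequence as a rule path in a trie."""
    root = {}
    for seq in sequences:
        node = root
        for ev in seq:
            node = node.setdefault(ev, {})
        node["$rule"] = True        # the sequence end marks a complete rule
    return root

def violates(trie, seq):
    """A sequence violates if it matches a rule prefix but never completes it."""
    node = trie
    for ev in seq:
        if ev not in node:
            return False            # not covered by any learned rule
        node = node[ev]
    return "$rule" not in node      # stopped partway along a rule path

rules = build_trie([["open_key", "write_reg", "close_key"]])
complete = violates(rules, ["open_key", "write_reg", "close_key"])
truncated = violates(rules, ["open_key", "write_reg"])
```

A truncated sequence like the second one is the fingerprint of a misconfiguration: an expected trailing event (here the hypothetical `close_key`) disappeared.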
20100251020 | METHOD AND APPARATUS FOR DATA RECOVERY USING STORAGE BASED JOURNALING - A storage system maintains a journal and a snapshot of one or more data volumes. Two journal entry types are maintained, an AFTER journal entry and a BEFORE journal entry. Two modes of data recovery are provided: “fast” recovery and “undo-able” recovery. A combination of both recovery modes allows the user to quickly recover a targeted data state. | 09-30-2010 |
20100262862 | DATA PROCESSING SYSTEM, DATA PROCESSING METHOD, AND COMPUTER - A computer for the stream data processing system includes a query recovery point management table. A recovery point management section determines a recovery point for the stream data processing system by identifying the oldest one of input tuples used for generating output tuples, which are managed, or an earlier tuple through the use of a query recovery point stored in the query recovery point management table, and transmits the determined recovery point for the stream data processing system to an additional computer. The additional computer stores the last-received recovery point for the stream data processing system in a checkpoint file. When the computer for the stream data processing system recovers from a fault, the additional computer transmits data succeeding the stored recovery point to the computer for the stream data processing system. | 10-14-2010 |
20100306586 | Storage apparatus and method of data processing - A storage apparatus includes a backup processing unit that stores data stored in a first memory into a second memory as backup data upon occurrence of a power failure, a restore processing unit that upon recovery from the power failure restores the backup data backed up in the second memory to the first memory and erases the backup data, and an erasure processing termination unit that terminates the erasure processing upon a power failure occurring during erasure processing for erasing the backup data stored in the second memory, and a re-backup processing unit that re-backs up data in the first memory corresponding to the backup data erased from the second memory before the erasure processing is terminated by the erasure processing termination unit to a location in the second memory subsequent to a last location that contains the backup data which has not been erased. | 12-02-2010 |
20110055630 | Safely Rolling Back Transactions In A Transactional Memory System With Concurrent Readers - A technique for safely rolling back transactional memory transactions without impacting concurrent readers of the uncommitted transaction data. An updater uses a transactional memory technique to perform a data update on data that is shared with a reader. The update is implemented as a transaction in which the updated data is initially uncommitted due to the transaction being subject to roll back. The reader is allowed to perform a data read on the uncommitted data during the transaction. Upon a rollback of the transaction, reclamation of memory locations used by the uncommitted data is deferred until a grace period has elapsed after which the reader can no longer be referencing the uncommitted data. | 03-03-2011 |
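The grace-period deferral above follows an RCU-like pattern: rolled-back memory is held until no reader can still reference it. A minimal sketch with invented names (the patent's mechanism is hardware/runtime-level, not this Python model):

```python
import threading

active_readers = set()
deferred_free = []          # blocks held until no reader can reference them
lock = threading.Lock()

def reader_enter(rid):
    with lock:
        active_readers.add(rid)

def reader_exit(rid):
    with lock:
        active_readers.discard(rid)
        if not active_readers:          # grace period has elapsed
            for block in deferred_free:
                block.clear()           # now safe to reclaim
            deferred_free.clear()

def rollback(uncommitted_block):
    with lock:
        if active_readers:
            deferred_free.append(uncommitted_block)  # readers present: defer
        else:
            uncommitted_block.clear()                # reclaim immediately

block = {"x": 1}
reader_enter("r1")
rollback(block)
held = bool(block)          # still intact while r1 may reference it
reader_exit("r1")
freed = not block           # reclaimed once the grace period ends
```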
20110072305 | RECOVERY METHOD MANAGEMENT DEVICE AND RECOVERY METHOD MANAGEMENT METHOD - A recovery method management method includes executing and completing a work on a work target of a system according to a work start command and a work completion command, creating working method information for each work target, acquiring before-work-start system information and after-work-completion system information of the system to create before-and-after-work change information for each work target, storing and managing the working method information and the before-and-after-work change information in a work information managing and storing unit for each work target, creating recovery method information for each similar recovery work target among the work targets on the basis of the working method information, creating before-and-after-recovery change information for each recovery work target on the basis of the before-and-after-work change information, and storing and managing the recovery method information created and the before-and-after-recovery change information created in a recovery method managing and storing unit for each recovery work target. | 03-24-2011 |
20110131447 | Automated modular and secure boot firmware update - A method, apparatus, system, and computer program product for an automated modular and secure boot firmware update. An updated boot firmware code module is received in a secure partition of a system, the updated boot firmware code module to replace one original boot firmware code module for the system. Only the one original boot firmware code module is automatically replaced with the updated boot firmware code module. The updated boot firmware code module is automatically executed with the plurality of boot firmware code modules for the system and without user intervention when the system is next booted. The updated boot firmware code module may be written to an update partition of a firmware volume, wherein the update partition of the firmware volume is read along with another partition of the firmware volume containing the plurality of boot firmware code modules when the system is booted. | 06-02-2011 |
20110131448 | PERFORMING A WORKFLOW HAVING A SET OF DEPENDENCY-RELATED PREDEFINED ACTIVITIES ON A PLURALITY OF TASK SERVERS - A technique of performing a workflow on a plurality of task servers involves starting a plurality of task server processes on the plurality of task servers. Each task server provides an operating system which is constructed and arranged to locally run a respective task server process. The technique further involves receiving a workflow which includes a set of dependency-related predefined activities, and placing task identifiers in a queue structure based on the received workflow. The task identifiers identify tasks to be performed in a distributed manner by the plurality of task server processes started on the plurality of task servers. Each task is a specific execution of a dependency-related predefined activity of the workflow. Progress in performing the workflow is made as the plurality of task server processes (i) claim task identifiers from the queue structure and (ii) perform the tasks identified by the claimed task identifiers. | 06-02-2011 |
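The claim-from-queue pattern in the entry above can be sketched with a shared queue of task identifiers and worker threads standing in for task server processes. The names and tasks are invented, and the workflow's dependency ordering is omitted for brevity:

```python
import queue
import threading

tasks = {"t1": lambda: "backup", "t2": lambda: "verify", "t3": lambda: "purge"}

q = queue.Queue()
for tid in tasks:
    q.put(tid)                     # placed according to the received workflow

results = {}
res_lock = threading.Lock()

def task_server_process():
    while True:
        try:
            tid = q.get_nowait()   # (i) claim a task identifier
        except queue.Empty:
            return
        out = tasks[tid]()         # (ii) perform the identified task locally
        with res_lock:
            results[tid] = out

workers = [threading.Thread(target=task_server_process) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because each identifier is claimed exactly once, work distributes itself across however many task server processes are running.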
20110167300 | DEVICE DRIVER ROLLBACK - Techniques for device driver management/installation are provided. In at least some embodiments, a device driver management system can be employed by a user to selectively rollback a currently installed device driver to one or a plurality of previously installed device driver(s). Additionally, the system can be employed by the user to revert to a pristine state of not having the device driver installed at all, for example, the NULL driver (e.g., in the situation in which the first driver installed on the device causes machine instability). The system stores information associated with driver(s) running on a specific device and allows a user to selectively revert to any one of a plurality of previously installed device driver(s), for example, if they experience a problem with a newer driver. Rollback point(s) can be stored, for example, in the system registry. | 07-07-2011 |
20110185227 | METHOD AND SYSTEM FOR VIRTUAL ON-DEMAND RECOVERY FOR REAL-TIME, CONTINUOUS DATA PROTECTION - A data management system or “DMS” provides an automated, continuous, real-time, substantially no downtime data protection service to one or more data sources associated with a set of application host servers. To facilitate the data protection service, a host driver embedded in an application server captures real-time data transactions, preferably in the form of an event journal that is provided to other DMS components. The driver functions to translate traditional file/database/block I/O and the like into a continuous, application-aware, output data stream. The host driver includes an event processor. When an authorized user determines that a primary copy of the data in the host server has become incorrect or corrupted, the event processor can perform a recovery operation to an entire data source or a subset of the data source using former point-in-time data in the DMS. The recovery operation may have two phases. First, the structure of the host data in primary storage is recovered to the intended recovering point-in-time. Thereafter, the actual data itself is recovered. The event processor enables such data recovery in an on-demand manner, in that it allows recovery to happen simultaneously while an application accesses and updates the recovering data. | 07-28-2011 |
20110185228 | REMEDYING METHOD FOR TROUBLES IN VIRTUAL SERVER SYSTEM AND SYSTEM THEREOF - According to the invention, a managing server, using a snapshot-appended information table which stores management information for identifying snapshots of a virtual server, a setting change table which stores setting change information on the virtual server, and a policy table which stores policies to be met by the virtual server, acquires the setting change information from the setting change table, selects the setting change information items from the acquired setting change information matching policies stored in the policy table, acquires management information on the snapshots of the virtual server from the snapshot-appended information table, identifies a snapshot of the virtual server with reference to the acquired management information, changes the identified snapshot of the virtual server based on the selected setting change information items, and rolls back the virtual server according to the changed snapshot. | 07-28-2011 |
20110264955 | SIGNAL TEST APPARATUS AND METHOD - A system and method for restoring a signal test apparatus to a previous state receives a time interval set by a user to create restore point files. The signal test apparatus tests signals of a test object and creates a restore point file according to the time interval. The restore point file stores signal test data of a test object when the restore point file is created. If the signal test apparatus needs to be restored to a previous state, the signal test data of a latest restore point file are acquired. The acquired signal test data are displayed on a display of the signal test apparatus. | 10-27-2011 |
20110296241 | ACCELERATING RECOVERY IN MPI ENVIRONMENTS - A method, system, and computer usable program product for accelerating recovery in an MPI environment are provided in the illustrative embodiments. A first portion of a distributed application executes using a first processor and a second portion using a second processor in a distributed computing environment. After a failure of operation of the first portion, the first portion is restored to a checkpoint. A first part of the first portion is distributed to a third processor and a second part to a fourth processor. A computation of the first portion is performed using the first and the second parts in parallel. A first message is computed in the first portion and sent to the second portion, the message having been initially computed after a time of the checkpoint. A second message is replayed from the second portion without computing the second message in the second portion. | 12-01-2011 |
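The message-replay step above — the surviving portion resends logged messages so the restored portion need not recompute them — can be sketched as follows. The classes are invented stand-ins; the patent concerns real MPI ranks, not these toy objects:

```python
class HealthyPortion:
    def __init__(self):
        self.sent_log = []          # messages sent since the checkpoint

    def send(self, msg, peer):
        self.sent_log.append(msg)
        peer.receive(msg)

    def replay_to(self, peer):
        for msg in self.sent_log:   # resend without recomputing anything
            peer.receive(msg)

class RecoveringPortion:
    def __init__(self):
        self.inbox = []

    def receive(self, msg):
        self.inbox.append(msg)

healthy, failed = HealthyPortion(), RecoveringPortion()
healthy.send("m1", failed)
healthy.send("m2", failed)
failed.inbox.clear()          # failure: restored to checkpoint, inbox lost
healthy.replay_to(failed)     # replay closes the gap with no recomputation
```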
20120005530 | System and Method for Communication Between Concurrent Transactions Using Transaction Communicator Objects - Transactional memory implementations may be extended to include special transaction communicator objects through which concurrent transactions can communicate. Changes by a first transaction to a communicator may be visible to concurrent transactions before the first transaction commits. Although isolation of transactions may be compromised by such communication, the effects of this compromise may be limited by tracking dependencies among transactions, and preventing any transaction from committing unless every transaction whose changes it has observed also commits. For example, mutually dependent or cyclically dependent transactions may commit or abort together. Transactions that do not communicate with each other may remain isolated. The system may provide a communicator-isolating transaction that ensures isolation even for accesses to communicators, which may be implemented using nesting transactions. True (e.g., read-after-write) dependencies, ordering (e.g., write-after-write) dependencies, and/or anti-dependencies (e.g., write-after-read dependencies) may be tracked, and a resulting dependency graph may be perused by the commit protocol. | 01-05-2012 |
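The commit constraint above — no transaction commits unless every transaction whose changes it observed also commits — amounts to closing a dependency graph. A toy sketch with invented names (the real commit protocol peruses a richer graph distinguishing true, ordering, and anti-dependencies):

```python
deps = {}   # txn -> set of txns whose uncommitted communicator writes it saw

def observe(reader, writer):
    deps.setdefault(reader, set()).add(writer)

def commit_group(txn):
    """A txn may only commit together with its transitive dependencies."""
    group, stack = set(), [txn]
    while stack:
        t = stack.pop()
        if t not in group:
            group.add(t)
            stack.extend(deps.get(t, ()))
    return group

observe("T2", "T1")   # T2 read T1's communicator write before T1 committed
observe("T1", "T2")   # and vice versa: the two are mutually dependent
group = commit_group("T1")
```

Mutually or cyclically dependent transactions thus land in one group and commit or abort together, exactly as the abstract describes.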
20120011401 | DYNAMICALLY MODELING AND SELECTING A CHECKPOINT SCHEME BASED UPON AN APPLICATION WORKLOAD - Illustrated is a system and method for executing a checkpoint scheme as part of processing a workload using an application. The system and method also includes identifying a checkpoint event that requires an additional checkpoint scheme. The system and method includes retrieving checkpoint data associated with the checkpoint event. It also includes building a checkpoint model based upon the checkpoint data. The system and method further includes identifying the additional checkpoint scheme, based upon the checkpoint model, the additional checkpoint scheme to be executed as part of the processing of the workload using the application. | 01-12-2012 |
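One classic way to build the kind of checkpoint model the entry above alludes to is Young's first-order approximation for the optimal checkpoint interval, sqrt(2 x checkpoint cost x MTBF). The formula is standard literature, not taken from the patent, and the scheme-selection rule here is an invented illustration:

```python
import math

def young_interval(checkpoint_cost, mtbf):
    """Young's first-order optimal checkpoint interval: sqrt(2*C*MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)

def pick_scheme(checkpoint_cost, mtbf, phase_length):
    """Choose a checkpoint scheme for the coming workload phase."""
    interval = young_interval(checkpoint_cost, mtbf)
    if interval >= phase_length:
        return "checkpoint-at-phase-end", interval
    return "periodic", interval

# E.g. a 10 s checkpoint cost against a 2000 s MTBF in a 500 s phase:
scheme, interval = pick_scheme(checkpoint_cost=10, mtbf=2000, phase_length=500)
```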
20120096311 | METHOD AND APPARATUS FOR AN IMPROVED FILE REPOSITORY - A method and apparatus for storing data, comprising monitoring a plurality of storage units within a mass storage area and detecting when a storage unit within the mass storage area is overloaded. The method further comprises randomly distributing the data on the overloaded storage unit to the other storage units within the mass storage area. | 04-19-2012 |
20120096312 | ARRANGEMENT FOR RECOVERY OF DATA BY NETWORK NODES BASED ON RETRIEVAL OF ENCODED DATA DISTRIBUTED AMONG THE NETWORK NODES - Distributed data, having been stored in a distributed storage system as a collection of distributed data elements, is recovered based on connection of multiple user nodes, each user node having stored selected distributed data elements as a corresponding portion of the distributed data during replication of the distributed data elements throughout the distributed storage system. Each distributed data element is identifiable by a corresponding unique object identifier (OID). Each user node includes a discovery resource for discovering reachable user nodes, a local cache configured for identifying at least the corresponding portion of the distributed data based on the respective OIDs, and an identification service module configured for resolving a data object to a corresponding OID, via the corresponding local cache, or based on sending a query to the reachable user nodes. Hence, user nodes can recover distributed data based on exchanging resolution information and OID information. | 04-19-2012 |
20120144235 | Reducing Application Downtime During Failover - Reducing application downtime during failover including identifying a critical line in the startup of an application, the critical line comprising the point in the startup of the application in which the application begins to use dependent resources; checkpointing the application at the critical line of startup; identifying a failure in the application; and restarting the application from the checkpointed application at the critical line. | 06-07-2012 |
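The "critical line" notion above — checkpoint exactly at the point where startup work ends and dependent-resource use begins — can be sketched like this. The class and fields are invented for illustration; the patent's mechanism operates on real processes, not a Python object:

```python
import copy

class App:
    def __init__(self):
        self.state = {"config": None, "resources": []}
        self.checkpoint = None

    def startup(self):
        self.state["config"] = {"port": 8080}        # expensive init work
        # Critical line: beyond here the app begins using dependent resources.
        self.checkpoint = copy.deepcopy(self.state)
        self.state["resources"].append("db-conn")    # dependent-resource use

    def restart_from_checkpoint(self):
        # Failover skips re-running startup entirely.
        self.state = copy.deepcopy(self.checkpoint)

app = App()
app.startup()
app.state["resources"].append("corrupted-handle")    # simulated failure state
app.restart_from_checkpoint()
```

Restarting from the critical-line checkpoint recovers all the startup work while discarding every dependent-resource interaction, which is where failures originate.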
20120166871 | Methods and Systems for Providing Fault Recovery to Side Effects Occurring During Data Processing - Embodiments may recover from faults by forming a new set of rows by removing rows associated with faulting save operations and repeating the saving and forming operations using the new set of rows until a set of rows that can be saved from the known start state without fault is determined. When the subset of successful rows is found, embodiments are able to provide assurance that no side effects (i.e., code or operations triggered by saving of a data to a particular location) have been executed on behalf of any of the failed rows (side effects from custom PL/SOQL code included) by deferring execution of triggers until an entire set of rows can be saved and committed. | 06-28-2012 |
20120198275 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for transforming a logical data object for storage in a storage device operable with at least one storage protocol, creating, reading, writing, optimization and restoring thereof. Transforming the logical data object comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size; transforming one or more sequentially obtained chunks of obtained data corresponding to the transforming logical data object; and sequentially storing the processed data chunks into said storage sections in accordance with a receive order of said chunks, wherein said storage sections serve as atomic elements of transformation/de-transformation operations during input/output transactions on the logical data object. The processing may comprise two or more data transformation techniques coordinated in time, concurrently executing autonomous sets of instructions, and provided in a manner preserving the sequence of processing and storing the processed data chunks. | 08-02-2012 |
20120210168 | METHOD AND SYSTEM FOR ERROR CORRECTION OF A STORAGE MEDIA - A data file on a storage media is processed during playback or execution to identify unreadable data. Replacement data corresponding to the unreadable data is obtained over a communications network, and the replacement data is used to playback or execute the data file as if the data file does not contain any unreadable data. | 08-16-2012 |
20120216073 | Restarting Processes - Techniques are disclosed that include a computer-implemented method, including storing information related to an initial state of a process upon being initialized, wherein execution of the process includes executing at least one execution phase and upon completion of the executing of the execution phase storing information representative of an end state of the execution phase; aborting execution of the process in response to a predetermined event; and resuming execution of the process from one of the saved initial and end states without needing to shut down the process. | 08-23-2012 |
20120216074 | IN-FLIGHT BLOCK MAP FOR A CLUSTERED REDIRECT-ON-WRITE FILESYSTEM - A cluster server manages allocation of free blocks to cluster clients performing writes in a clustered file system. The cluster server manages free block allocation with a free block map and an in-flight block map. The free block map is a data structure or hardware structure with data that indicates blocks or extents of the clustered file system that can be allocated to a client for the client to write data. The in-flight block map is a data structure or hardware structure with data that indicates blocks that have been allocated to clients, but remain in-flight. A block remains in-flight until the clustered file system metadata has been updated to reflect a write performed to that block by a client. After a consistency snapshot of the metadata is published to the storage resources, the data at the block will be visible to other nodes of the cluster. | 08-23-2012 |
20120226939 | ACCELERATING RECOVERY IN MPI ENVIRONMENTS - A computer usable program product for accelerating recovery in an MPI environment is provided in the illustrative embodiments. A first portion of a distributed application executes using a first processor and a second portion using a second processor in a distributed computing environment. After a failure of operation of the first portion, the first portion is restored to a checkpoint. A first part of the first portion is distributed to a third processor and a second part to a fourth processor. A computation of the first portion is performed using the first and the second parts in parallel. A first message is computed in the first portion and sent to the second portion, the message having been initially computed after a time of the checkpoint. A second message is replayed from the second portion without computing the second message in the second portion. | 09-06-2012 |
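The replay step this abstract describes — resending logged messages to the restored portion rather than recomputing them in the surviving portion — can be sketched as follows (a hypothetical in-memory model; the class and method names are invented for illustration, not taken from the patent):

```python
class SurvivorEndpoint:
    """Models the surviving second portion of the application: it logs every
    message it sends so that, after the first portion fails and is restored
    to its checkpoint, the messages can be replayed instead of recomputed."""
    def __init__(self):
        self.sent_log = []          # messages sent since the last checkpoint

    def send(self, msg, channel):
        self.sent_log.append(msg)   # log the message as it is sent
        channel.append(msg)

    def replay(self, channel):
        """Resend all logged messages to the restored peer, skipping the
        computation that originally produced them."""
        for msg in self.sent_log:
            channel.append(msg)

survivor = SurvivorEndpoint()
live_channel = []
survivor.send("partial-sum:42", live_channel)
# ... the first portion fails and is restored to its checkpoint ...
recovery_channel = []
survivor.replay(recovery_channel)
```

The restored portion receives the same message stream it saw before the failure, so only its own lost computation (parallelized across the spare processors) has to be redone.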
20120246513 | System-directed checkpointing implementation using a hypervisor layer - While system-directed checkpointing can be implemented in various ways, for example by adding checkpointing support in the memory controller or in the operating system in otherwise standard computers, implementation at the hypervisor level enables the necessary state information to be captured efficiently while providing a number of ancillary advantages over those prior-art methods. This disclosure details procedures for realizing those advantages through relatively minor modifications to normal hypervisor operations. Specifically, by capturing state information in a guest-operating-system-specific manner, any guest operating system can be rolled back independently and resumed without losing either program or input/output (I/O) continuity and without affecting the operation of the other operating systems or their associated applications supported by the same hypervisor. Similarly, by managing I/O queues as described in this disclosure, rollback can be accomplished without requiring I/O operations to be repeated and I/O device failures can be circumvented without losing any I/O data in the process. | 09-27-2012 |
20120254659 | METHOD AND SYSTEM FOR VIRTUAL ON-DEMAND RECOVERY - A data management system (“DMS”) provides an automated, continuous, real-time, substantially no downtime data protection service to one or more data sources. A host driver embedded in an application server captures real-time data transactions, preferably in the form of an event journal. The driver functions to translate traditional file/database/block I/O and the like into a continuous, application-aware, output data stream. The host driver includes an event processor that can perform a recovery operation to an entire data source or a subset of the data source using former point-in-time data in the DMS. The recovery operation may have two phases. First, the structure of the host data in primary storage is recovered to the intended recovering point-in-time. Thereafter, the actual data itself is recovered. The event processor enables such data recovery in an on-demand manner, by allowing recovery to happen simultaneously while an application accesses and updates the recovering data. | 10-04-2012 |
20120266018 | FAULT-TOLERANT COMPUTER SYSTEM, FAULT-TOLERANT COMPUTER SYSTEM CONTROL METHOD AND RECORDING MEDIUM STORING CONTROL PROGRAM FOR FAULT-TOLERANT COMPUTER SYSTEM - In a fault-tolerant computer system that includes a computer | 10-18-2012 |
20120297249 | Platform for Continuous Mobile-Cloud Services - Data that is collected and disseminated by mobile devices typically has to be processed, correlated with other data, aggregated, and then transmitted back to the mobile device users before the information becomes stale or otherwise irrelevant. These operations may be performed in a cloud-based solution that manages dataflow. The cloud-based solutions may be scalable and implemented in a fault-tolerant distributed system to support user-facing continuous sensing and processing services in the cloud-computing system. A system may monitor execution of data and shift workloads (i.e., balancing) in response to spatial and temporal load imbalances that occur in a continuous computing environment. A failure recovery protocol may be implemented that uses a checkpoint-based partial rollback recovery mechanism with selective re-execution, which may allow recovery of the continuous processing after an error while avoiding large amounts of downtime and re-execution. | 11-22-2012 |
20130055018 | DETECTION OF LOGICAL CORRUPTION IN PERSISTENT STORAGE AND AUTOMATIC RECOVERY THEREFROM - A method, system, and computer program product for restoring blocks of data stored at a corrupted data site using two or more mirror sites. The method commences by receiving a trigger event from a component within an application server environment where the trigger event indicates detection of a corrupted data site. The trigger is classified into at least one of a plurality of trigger event types, which trigger event type signals further processing for retrieving from at least two mirror sites, a first stored data block and a second stored data block corresponding to the same logical block identifier from the first mirror site. The retrieved blocks are compared to determine a match value, and when the match value is greater than a confidence threshold, then writing good data to the corrupted data site before performing consistency checks on blocks in physical or logical proximity to the corrupted data site. | 02-28-2013 |
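The compare-and-repair flow in this abstract can be sketched as follows (a simplified, hypothetical model: the "match value" is reduced to a byte-agreement ratio and the mirrors to dictionaries keyed by logical block identifier):

```python
def match_value(a: bytes, b: bytes) -> float:
    """Fraction of byte positions at which the two mirror copies agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def repair_block(corrupted_site: dict, mirrors: list, block_id: str,
                 confidence_threshold: float = 0.99) -> bool:
    """Fetch the same logical block from two mirrors; if they agree above
    the confidence threshold, write the good data to the corrupted site."""
    first = mirrors[0][block_id]
    second = mirrors[1][block_id]
    if match_value(first, second) > confidence_threshold:
        corrupted_site[block_id] = first   # overwrite with agreed-upon data
        return True
    return False                           # mirrors disagree: escalate

mirror_a = {"blk7": b"GOODDATA"}
mirror_b = {"blk7": b"GOODDATA"}
site = {"blk7": b"XXXXXXXX"}               # detected as corrupted
repaired = repair_block(site, [mirror_a, mirror_b], "blk7")
```

Only after the targeted block is repaired would the consistency checks on physically or logically adjacent blocks proceed.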
20130067277 | Method and System for Enabling Checkpointing Fault Tolerance Across Remote Virtual Machines - A checkpointing fault tolerance network architecture enables a backup computer system to be remotely located from a primary computer system. An intermediary computer system is situated between the primary computer system and the backup computer system to manage the transmission of checkpoint information to the backup VM in an efficient manner. The intermediary computer system is networked to the primary VM through a first connection and is networked to the backup VM through a second connection. The intermediary computer system identifies updated data corresponding to memory pages that have been less frequently modified by the primary VM and transmits such updated data to the backup VM through the first connection. In such manner, the intermediary computer system holds back updated data corresponding to more frequently modified memory pages, since such memory pages may be more likely to be updated again in the future. | 03-14-2013 |
20130103981 | Self-rescue method and device for damaged file system - The present application discloses a self-rescue method and device for a damaged file system. The method includes: a fault warning message is sent to a background server when it is found during boot of a device that a file system is damaged; the device receives an acknowledgement message from the background server, wherein the acknowledgement message contains a path and file name of a backup version selected by the background server according to a product type; and the device downloads a version file and reboots from the version file. The device, upon finding during boot that the file system is damaged, establishes network communications between the foreground and the background prior to switching to a large version, in order to actively acquire a version from the background server and reload it, so that the damaged file system is repaired automatically without manual intervention. | 04-25-2013 |
20130151895 | APPARATUS AND METHOD OF MANAGING DATABASES OF ACTIVE NODE AND STANDBY NODE OF MAIN MEMORY DATABASE MANAGEMENT SYSTEM - Databases of an active node and a standby node of a main memory database management system (MMDBMS) are managed so as to prevent loss of a transaction caused by failure of any one of the active node or the standby node. The MMDBMS is configured to prevent data mismatch between the active node and the standby node when failure of any one of the active node and the standby node occurs. In case of failure of one of the nodes, log information from the other node is obtained to recover the failed node. | 06-13-2013 |
20130159768 | SYSTEM AND METHOD FOR RESTORING DATA - A method for recovering data using metadata includes the steps of: receiving, at a storage location, data from a computing application; associating, at said storage location, metadata to said data received at said storage location; and storing said data and associated metadata in said storage location; wherein said data stored in said storage location is identifiable by said computing application using said metadata, thereby to recover said data in response to a data recovery event. | 06-20-2013 |
20130166951 | SYSTEM-DIRECTED CHECKPOINTING IMPLEMENTATION USING A HYPERVISOR LAYER - While system-directed checkpointing can be implemented in various ways, for example by adding checkpointing support in the memory controller or in the operating system in otherwise standard computers, implementation at the hypervisor level enables the necessary state information to be captured efficiently while providing a number of ancillary advantages over those prior-art methods. This disclosure details procedures for realizing those advantages through relatively minor modifications to normal hypervisor operations. Specifically, by capturing state information in a guest-operating-system-specific manner, any guest operating system can be rolled back independently and resumed without losing either program or input/output (I/O) continuity and without affecting the operation of the other operating systems or their associated applications supported by the same hypervisor. Similarly, by managing I/O queues as described herein, rollback can be accomplished without requiring I/O operations to be repeated and I/O device failures can be circumvented without losing any I/O data in the process. | 06-27-2013 |
20130173958 | Extending Cache In A Multi-Processor Computer - Methods, apparatuses, and computer program products of extending cache in a multi-processor computer are provided. Embodiments include detecting, by a donor processor, nonuse of a donor processor's cache; broadcasting to one or more processors in the multi-processor computer, by the donor processor, a donor-ready message indicating the donor processor's cache is available for ownership transferment; receiving from a first requesting processor, by the donor processor, a first ownership-request message requesting ownership of the donor processor's cache by the first requesting processor; transmitting to the first requesting processor, by the donor processor, an ownership-grant message indicating an intention of the donor processor to transfer ownership of the donor processor's cache to the first requesting processor; and receiving from the first requesting processor, by the donor processor, an ownership-claim message indicating that the first requesting processor intends to claim ownership of the donor processor's cache. | 07-04-2013 |
20130179729 | FAULT TOLERANT SYSTEM IN A LOOSELY-COUPLED CLUSTER ENVIRONMENT - An approach to providing failure protection in a loosely-coupled cluster environment. A node in the cluster generates checkpoints of application data in a consistent state for an application that is running on a first node in the cluster. The node sends the checkpoint to one or more of the other nodes in the cluster. The node may also generate log entries of changes in the application data that occur between checkpoints of the application data. The node may send the log entries to other nodes in the cluster. The node may similarly receive external checkpoints and external log entries from other nodes in the cluster. In response to a node failure, the node may start an application on the failed node and recover the application using the external checkpoints and external log entries for the application. | 07-11-2013 |
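The checkpoint-plus-log recovery this abstract describes can be sketched as follows (a hypothetical in-memory model: external checkpoints and log entries received from peers are held in dictionaries, and application state is a simple key/value map):

```python
class PeerRecoveryNode:
    """Holds external checkpoints and log entries received from other
    cluster nodes, and recovers a failed peer's application by replaying
    the logged changes onto the most recent consistent checkpoint."""
    def __init__(self):
        self.checkpoints = {}   # app name -> latest consistent state
        self.logs = {}          # app name -> changes since that checkpoint

    def receive_checkpoint(self, app, state):
        self.checkpoints[app] = dict(state)
        self.logs[app] = []     # a new checkpoint supersedes earlier logs

    def receive_log_entry(self, app, key, value):
        self.logs.setdefault(app, []).append((key, value))

    def recover(self, app):
        """Rebuild the failed peer's application state: start from the
        checkpoint, then replay changes made after it, in order."""
        state = dict(self.checkpoints.get(app, {}))
        for key, value in self.logs.get(app, []):
            state[key] = value
        return state

node = PeerRecoveryNode()
node.receive_checkpoint("inventory", {"widgets": 10})
node.receive_log_entry("inventory", "gadgets", 4)
node.receive_log_entry("inventory", "widgets", 7)
recovered = node.recover("inventory")
```

Because log entries capture only the changes between checkpoints, the node restarting the application replays far less data than a full state transfer would require.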
20130212433 | SELF-MANAGED PROCESSING DEVICE - A processing device may automatically provide protective services and may provide backup services for backing up and restoring user files, system files, configuration files, as well as other information. The processing device may be configured to check one or more performance conditions and perform an action to improve performance based on the one or more performance conditions. The processing device may monitor configuration and file changes and provide a user with a capability to persist or discard configuration changes and/or file changes made by an application during a session. The processing device may include a recovery button or switch, which when selected or pressed may cause the processing device to be restored to an operational state. The processing device may automatically detect instabilities and may automatically attempt to repair possible causes of the instabilities. The processing device may also include an additional chipset, which may perform backup and recovery services. | 08-15-2013 |
20130290780 | RECOVERY SEGMENTS - In one example, a method for implementing recovery segments includes sending an application message from a parent process executed by a first computing device to a child process executed by a second computing device and identifying a dependency created by the application message. This identified dependency is included in a dependence set of the child process and saved. A checkpoint is generated by the parent process and a checkpoint message that includes dependency information is sent from the parent process to the child process. The child process modifies the dependence set according to the dependency information and generates a second checkpoint that is saved in nonvolatile memory of the second computing device. Upon occurrence of a failure of the parent process, the child process reverts to a most recent checkpoint generated by the child process that does not include the effects of processing an orphan message. | 10-31-2013 |
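The final step — reverting to the most recent child checkpoint unaffected by an orphan message — can be sketched as follows (a hypothetical representation in which each checkpoint carries the dependence set that was current when it was generated):

```python
def latest_safe_checkpoint(checkpoints, orphan_messages):
    """Return the most recent checkpoint whose dependence set contains no
    orphan message (a message the failed parent sent but, after its own
    rollback, will effectively unsend)."""
    for cp in reversed(checkpoints):        # scan newest first
        if not (cp["deps"] & orphan_messages):
            return cp
    return None                             # no safe checkpoint remains

child_checkpoints = [
    {"id": 1, "deps": set()},               # taken before any parent message
    {"id": 2, "deps": {"msg-5"}},           # depends on parent message msg-5
]
safe = latest_safe_checkpoint(child_checkpoints, orphan_messages={"msg-5"})
```

If the parent's failure orphans `msg-5`, the child must fall back past checkpoint 2, since that checkpoint reflects the effects of processing the orphan message.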
20130290781 | LOW OVERHEAD FAULT TOLERANCE THROUGH HYBRID CHECKPOINTING AND REPLAY - A virtualized computer system provides fault tolerant operation of a primary virtual machine. In one embodiment, this system includes a backup computer system that stores a snapshot of the primary virtual machine and a log file containing non-deterministic events occurring in the instruction stream of the primary virtual machine. The primary virtual machine periodically updates the snapshot and the log file. Upon a failure of the primary virtual machine, the backup computer can instantiate a failover backup virtual machine by consuming the stored snapshot and log file. | 10-31-2013 |
20130290782 | LOW OVERHEAD FAULT TOLERANCE THROUGH HYBRID CHECKPOINTING AND REPLAY - A virtualized computer system provides fault tolerant operation of a primary virtual machine. In one embodiment, this system includes a backup computer system that stores a snapshot of the primary virtual machine and a log file containing non-deterministic events occurring in the instruction stream of the primary virtual machine. The primary virtual machine periodically updates the snapshot and the log file. Upon a failure of the primary virtual machine, the backup computer can instantiate a failover backup virtual machine by consuming the stored snapshot and log file. | 10-31-2013 |
20130311826 | TRANSPARENT CHECKPOINTING AND PROCESS MIGRATION IN A DISTRIBUTED SYSTEM - A distributed system for creating a checkpoint for a plurality of processes running on the distributed system. The distributed system includes a plurality of compute nodes with an operating system executing on each compute node. A checkpoint library resides at the user level on each of the compute nodes, and the checkpoint library is transparent to the operating system residing on the same compute node and to the other compute nodes. Each checkpoint library uses a windowed messaging logging protocol for checkpointing of the distributed system. Processes participating in a distributed computation on the distributed system may be migrated from one compute node to another compute node in the distributed system by re-mapping of hardware addresses using the checkpoint library. | 11-21-2013 |
20140006858 | UNIVERSAL PLUGGABLE CLOUD DISASTER RECOVERY SYSTEM | 01-02-2014 |
20140101484 | MANAGEMENT OF A DISTRIBUTED COMPUTING SYSTEM THROUGH REPLICATION OF WRITE AHEAD LOGS - Several methods and a system of a replicated service for write ahead logs are disclosed. In one embodiment, a method includes persisting a state of a distributed system through a write ahead log (WAL) interface. The method also includes maintaining a set of replicas of a WAL through a consensus protocol. In addition, the method includes providing a set of mechanisms for at least one of detection and a recovery from a hardware failure. The method further includes recovering a persistent state of a set of applications. In addition, the method includes maintaining the persistent state across a set of nodes through the hardware failover. In one embodiment, the system may include a WAL interface to persist a state of a distributed system. The system may also include a WAL replication servlet to maintain and/or recover a set of replicas of a WAL. | 04-10-2014 |
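The WAL persistence idea underlying this abstract can be sketched as follows (a minimal single-replica, file-based model; replication via a consensus protocol is omitted, and the class name is invented for illustration):

```python
import json
import os
import tempfile

class WriteAheadLog:
    """Minimal WAL: every state change is appended to the log and forced
    to stable storage before it takes effect, so the persistent state can
    be recovered after a crash by replaying the log from the beginning."""
    def __init__(self, path):
        self.path = path

    def append(self, key, value):
        with open(self.path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())    # the entry reaches disk before returning

    def replay(self):
        """Rebuild the persistent state by re-applying every logged record."""
        state = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    rec = json.loads(line)
                    state[rec["key"]] = rec["value"]
        return state

wal = WriteAheadLog(os.path.join(tempfile.mkdtemp(), "demo.wal"))
wal.append("balance", 100)
wal.append("balance", 250)          # a later record supersedes an earlier one
recovered_state = wal.replay()
```

In the patented system this log would itself be replicated across nodes through a consensus protocol, so that hardware failover does not lose the persistent state.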
20140108864 | DECOUPLED APPLICATION PROGRAM-OPERATING SYSTEM COMPUTING ARCHITECTURE - A method of application program-operating system decoupling includes performing, through an application program configured to execute on a client machine, a system call to a first operating system executing on a server machine over an interconnect configured to couple the server machine to the client machine. The method also includes serving the application program configured to execute on the client machine through the first operating system executing on the server machine in accordance with the system call. | 04-17-2014 |
20140143598 | SYNCHRONIZATION FRAMEWORK THAT RESTORES A NODE FROM BACKUP - Architecture for restoring nodes. After restoring a node, fix-up occurs to make the node appear as a different node than before the restore operation. The node appears as a new node that knows the data up to a certain point from when it had its prior identity. This allows new changes generated by the new node to flow to the other nodes in the topology, as well as having the changes that the prior identity sent to other nodes flow back to the new node. In other words, the architecture maintains information to create the new node in the topology while maintaining prior data knowledge. Additionally, item level metadata of associated data items is updated to correlate with the updated data items so that changes can be correctly enumerated and applied. This metadata update occurs across scopes of which the data items are included. | 05-22-2014 |
20140149793 | DATABASE CHANGE COMPENSATION AFTER A TRANSACTION COMMIT - A virtualization manager receives a request to perform a command in a virtual machine system and executes a plurality of transactions associated with the command, each of the plurality of transactions comprising one or more operations executed on entities in the virtual machine system. The virtualization manager commits changes made to the entities in the virtual machine system as a result of the plurality of transactions to a management database for the virtual machine system. In addition, the virtualization manager generates a business entity snapshot corresponding to a first transaction of the plurality of transactions, the business entity snapshot comprising state information for one or more entities in the virtual machine system affected by the first transaction. | 05-29-2014 |
20140189429 | METHOD AND SYSTEM FOR IMPLEMENTING CONSISTENCY GROUPS WITH VIRTUAL MACHINES - Disclosed is an approach for implementing disaster recovery for virtual machines. Consistency groups are implemented for virtual machines, where a consistency group links together two or more VMs. The consistency group includes any set of VMs which need to be managed on a consistent basis in the event of a disaster recovery scenario. | 07-03-2014 |
20140258777 | HARDWARE SUPPORTED MEMORY LOGGING - Logging changes to a physical memory region during a logging time interval includes: detecting a write operation to the physical memory region, wherein the write operation modifies an indirect representation that corresponds to a physical data line in the physical memory region; and recording log information associated with the write operation. | 09-11-2014 |
20140281709 | RECOVERY OF APPLICATION FROM SNAPSHOT - The targeted recovery of application-specific data corresponding to an application without performing recovery of the entire volume. The recovery is initiated by beginning to copy the prior state of the content of an application-specific data container from a prior snapshot to the application-specific data container in an operation volume accessible by the application. However, while the content of the application-specific data container is still being copied from the snapshot to the application-specific data container, the application is still permitted to perform read and write operations on the application-specific data container. Thus, the application-specific data container appears to the application to be fully accessible even though recovery of the content of the application-specific data container is still continuing in the background. | 09-18-2014 |
20140281710 | TRANSACTIONS FOR CHECKPOINTING AND REVERSE EXECUTION - A method of backstepping through a program execution includes dividing the program execution into a plurality of epochs, wherein the program execution is performed by an active core, determining, during a subsequent epoch of the plurality of epochs, that a rollback is to be performed, performing the rollback including re-executing a previous epoch of the plurality of epochs, wherein the previous epoch includes one or more instructions of the program execution stored by a checkpointing core, and adjusting a granularity of the plurality of epochs according to a frequency of the rollback. | 09-18-2014 |
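The granularity-adjustment step can be sketched as follows (thresholds and bounds are illustrative assumptions, not values from the patent): frequent rollbacks favor shorter epochs, since less work is re-executed per rollback, while rare rollbacks favor longer epochs, since fewer checkpoints must be recorded.

```python
def adjust_epoch_length(epoch_len: int, rollbacks: int, epochs: int,
                        high: float = 0.25, low: float = 0.05,
                        min_len: int = 16, max_len: int = 4096) -> int:
    """Tune how much execution an epoch covers, based on the observed
    frequency of rollbacks over the epochs executed so far."""
    rate = rollbacks / epochs if epochs else 0.0
    if rate > high:             # rolling back often: halve the epoch so
        epoch_len //= 2         # each rollback re-executes less work
    elif rate < low:            # rolling back rarely: double the epoch so
        epoch_len *= 2          # fewer checkpoints are taken
    return max(min_len, min(max_len, epoch_len))
```

With these illustrative numbers, a 30% rollback rate halves a 256-unit epoch to 128, while a 1% rate doubles it to 512; the bounds keep the granularity within a practical range.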
20140325272 | DATA TRANSFER AND RECOVERY PROCESS - A backup image generator can create a primary image and periodic delta images of all or part of a primary server. The images can be sent to a network attached storage device and a remote storage server. In the event of a failure of the primary server, the failure can be diagnosed to develop a recovery strategy. Based on the diagnosis, at least one delta image may be applied to a copy of the primary image to generate an updated primary image at either the network attached storage or the remote storage server. The updated primary image may be converted to a virtual server in a physical to virtual conversion at either the network attached storage device or remote storage server and users may be redirected to the virtual server. The updated primary image may also be restored to the primary server in a virtual to physical conversion. As a result, the primary data storage may be backed up, recovered, and restored in a timely manner, with the possibility of providing server and business continuity in the event of a failure. | 10-30-2014 |
20140331087 | IDENTIFYING AND CORRECTING AN UNDESIRED CONDITION OF A DISPERSED STORAGE NETWORK ACCESS REQUEST - A method begins by a processing module sending a transaction verification request to the set of dispersed storage (DS) units, wherein the transaction verification request includes a transaction number that corresponds to a particular dispersed storage network (DSN) access request. The method continues with the processing module receiving transaction verification responses from at least some of the set of DS units to produce received transaction verification responses. The method continues with the processing module identifying an undesired condition with processing the DSN access request and initiating a corrective remedy for the undesired condition when a DS unit of the set of DS units does not provide a desired transaction verification response. | 11-06-2014 |
20140372799 | System Differential Upgrade Method, Apparatus, and Mobile Terminal - A system differential upgrade method and apparatus, and a mobile terminal are provided. The method includes: obtaining an upgrade script and upgrade data; upgrading a file to be upgraded according to the upgrade script and the upgrade data; generating, according to the file processing command that is being executed currently in the upgrade script, and the file to be upgraded corresponding to the file processing command that is being executed currently, rollback data and a rollback script corresponding to the file to be upgraded; and when the upgrade fails, executing the rollback script according to the rollback data. The apparatus includes an obtaining module, an upgrading module, a generating module, and an executing module. According to the embodiments of the present invention, when an upgrade fails, the rollback script is executed according to the rollback data, which may restore a system to that before the upgrade. | 12-18-2014 |
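The generate-rollback-then-apply pattern in this abstract can be sketched as follows (a hypothetical model in which the "file system" is a dictionary and a failed step is signaled by a `None` payload):

```python
def apply_upgrade(files: dict, upgrade_steps: list) -> bool:
    """Apply each upgrade step; before each one executes, record rollback
    data for the file it touches, so a failure mid-upgrade can restore
    the system to its pre-upgrade state."""
    rollback = []
    try:
        for path, new_content in upgrade_steps:
            rollback.append((path, files.get(path)))  # save the old content
            if new_content is None:
                raise RuntimeError("upgrade step failed")
            files[path] = new_content
    except RuntimeError:
        # Execute the rollback script: undo steps in reverse order.
        for path, old_content in reversed(rollback):
            if old_content is None:
                files.pop(path, None)     # the file did not exist before
            else:
                files[path] = old_content
        return False
    return True

system = {"boot.cfg": "v1"}
upgrade_ok = apply_upgrade(system, [("boot.cfg", "v2"), ("app.bin", None)])
```

Because the rollback data is captured per step as the upgrade runs, a failure at any point leaves enough information to restore exactly the files already modified.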
20150052395 | ANNOTATED ATOMIC WRITE - Techniques are disclosed relating to writing data atomically to one or more recording media. In one embodiment, a request is received to perform an atomic write for a set of data. Responsive to the request, the set of data is written across a plurality of storage units including storing metadata at a dedicated location within at least one of the plurality of storage units. The metadata is usable to determine whether the writing completed successfully. In some embodiments, the request is received from an application that has been assigned an address range of the plurality of storage units. In such an embodiment, the address range is accessible to the application for storing data, and the dedicated location resides outside of the address range. In one embodiment, the metadata specifies an address range where the set of data was written and a sequence number. | 02-19-2015 |
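The metadata check described here can be sketched as follows (a hypothetical model: storage is a dictionary of byte addresses, the dedicated metadata location is a reserved key outside the application's address range, and a checksum stands in for the completeness criterion):

```python
import hashlib

def atomic_write(storage: dict, start: int, data: bytes, seq: int):
    """Write the data bytes, then record metadata (address range, sequence
    number, checksum) at a dedicated location; the metadata is what makes
    completion checkable on recovery."""
    for i, byte in enumerate(data):
        storage[start + i] = byte
    storage["__meta__"] = {
        "start": start,
        "length": len(data),
        "seq": seq,
        "digest": hashlib.sha256(data).hexdigest(),
    }

def write_completed(storage: dict) -> bool:
    """Decide from the stored metadata whether the last atomic write
    finished: every addressed byte must be present and match the checksum."""
    meta = storage.get("__meta__")
    if meta is None:
        return False
    try:
        data = bytes(storage[meta["start"] + i] for i in range(meta["length"]))
    except KeyError:
        return False            # the write was torn: a byte is missing
    return hashlib.sha256(data).hexdigest() == meta["digest"]

units = {}
atomic_write(units, start=0, data=b"payload", seq=7)
intact = write_completed(units)

torn = dict(units)
del torn[3]                     # simulate a crash partway through the write
torn_ok = write_completed(torn)
```

Keeping the metadata at a dedicated location outside the application's assigned address range, as the abstract notes, prevents ordinary application writes from clobbering the completion record.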
20150067397 | METHOD AND APPARATUS FOR RESTORING FAILED USER WORKFLOW INSTANCES FROM DATA STORE - A method and apparatus for a rapid, scalable unified infrastructure system management platform are disclosed, comprising: discovery of compute nodes and network components across data centers, both public and private, for a user; assessment of the type, capability, VLAN, security, and virtualization configuration of the discovered unified infrastructure nodes and components; configuration of nodes and components, covering add, delete, modify, and scale operations; and rapid roll-out of nodes and components across data centers, both public and private. | 03-05-2015 |
20150074458 | Systems and Methods of Event Driven Recovery Management - Systems and methods of event driven recovery management are disclosed. In one embodiment, a method of providing event driven recovery management includes continually copying one or more data blocks that are generated from a computing device, associating at least one event marker with the copies of the one or more data blocks, and allowing access to the copies of the one or more data blocks according to the at least one event marker in order to provide event driven recovery. For purposes of this disclosure, an event marker, a book mark, an application consistency point, and/or a business event are interchangeably used, depending on the context. | 03-12-2015 |
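The marker-based access this abstract describes can be sketched as follows (a hypothetical in-memory model: block copies accumulate in a journal, and an event marker — equivalently a bookmark, application consistency point, or business event — names a position in that stream):

```python
class EventDrivenStore:
    """Continually records copies of changed data blocks; named event
    markers index positions in the copy stream, so recovery can target a
    business event rather than a raw timestamp."""
    def __init__(self):
        self.journal = []       # (block_id, data) copies, in arrival order
        self.markers = {}       # event marker name -> journal position

    def copy_block(self, block_id, data):
        self.journal.append((block_id, data))

    def set_marker(self, name):
        self.markers[name] = len(self.journal)

    def recover(self, marker):
        """Replay all block copies recorded up to the named event marker."""
        state = {}
        for block_id, data in self.journal[:self.markers[marker]]:
            state[block_id] = data
        return state

store = EventDrivenStore()
store.copy_block("blk0", "v1")
store.set_marker("before-upgrade")
store.copy_block("blk0", "v2")
state_at_marker = store.recover("before-upgrade")
```

Recovering to the marker yields the blocks as they stood at the business event, regardless of how many copies arrived afterward.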
20150089286 | EVENT COUNTER CHECKPOINTING AND RESTORING - Event counter checkpointing and restoring is disclosed. In one implementation, a processor includes a first event counter to count events that occur during execution within the processor, event counter checkpoint logic, communicably coupled with the first event counter, to store, prior to a transactional execution of the processor, a value of the first event counter, a second event counter to count events prior to and during the transactional execution, wherein the second event counter is to increment without resetting after the transactional execution is aborted, event count restore logic to restore the first event counter to the stored value after the transactional execution is aborted, and tuning logic to determine, in response to aborting of the transactional execution, a number of the events that occurred during the transactional execution based on the stored value of the first event counter and a value of the second event counter. | 03-26-2015 |
20150100823 | DATA BACKUP METHOD AND INTERFACE CARD - Equipping a first computer with an interface card having an access function of accessing a storage and a communication function of performing a communication via a network. Connecting the interface card equipped in the first computer and a second computer by the network. Causing the interface card to process target data received from the second computer and to cause the target data to be saved in a first storage, which is a storage connected to the interface card, when the second computer transmits the target data to be saved to the interface card. | 04-09-2015 |
20150106651 | SYSTEM AND METHOD FOR PERFORMING AN IN-SERVICE SOFTWARE UPGRADE IN NON-REDUNDANT SYSTEMS - An information handling system is provided. The information handling system includes one or more devices coupled together to route information between the one or more devices and other devices coupled thereto based on routing information stored in the one or more devices. The one or more devices includes a routing processor, one or more line cards coupled to the routing processor, the one or more line cards receiving the routing information from the routing processor for routing data packets to a destination, and a memory coupled to the routing processor. The routing processor is configured to create an active image having a current state of the routing information and create a standby image having the current state of the routing information, wherein the standby image requests the current state of the routing information from the active image using a key that is calculated using a portion of the routing information. | 04-16-2015 |
20150106652 | SYSTEM REPAIR METHOD AND DEVICE, AND STORAGE MEDIUM - A system repair method and device, and a storage medium are provided. The system repair method includes: performing security check on system files and registries in a system; when the detection result is abnormal, judging whether the system files and/or the registries are required to be repaired according to preset system repair rules; and if yes, repairing the system files and/or the registries. The present invention avoids possible abnormal repair in system repair, reduces risks in the system repair, improves security and accuracy of the system repair, and ensures reliability of the system repair. | 04-16-2015 |
20150127982 | DISTRIBUTED COMPUTING BACKUP AND RECOVERY SYSTEM - The distributed computing backup and recovery (DCBR) system and method provide backup and recovery for distributed computing models (e.g., NoSQL). The DCBR system extends the protections from server node-level failure and introduces persistence in time so that the evolving data set may be stored and recovered to a past point in time. The DCBR system, instead of performing backup and recovery for an entire dataset, may be configured to apply to a subset of data. Instead of keeping or recovering snapshots of the entire dataset, which requires the entire cluster, the DCBR system identifies the particular nodes and/or archive files where the dataset resides so that backup or recovery may be done with a much smaller number of nodes. | 05-07-2015 |
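The node-identification step described above can be sketched with a hash-based placement function (a simple, hypothetical stand-in for whatever placement scheme the actual cluster uses — the point is only that placement is deterministic, so the backup needs to contact just the nodes holding the target subset):

```python
import hashlib

def nodes_for_key(key: str, nodes: list, replicas: int = 2) -> list:
    """Deterministically pick which cluster nodes hold a key: hash the key
    to a starting node and take the next `replicas` nodes on the ring."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    start = h % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

def nodes_to_back_up(dataset_keys, nodes, replicas: int = 2) -> set:
    """Union of the nodes holding any key of the target dataset: only these
    nodes need to participate in the partial backup or recovery."""
    needed = set()
    for key in dataset_keys:
        needed.update(nodes_for_key(key, nodes, replicas))
    return needed

cluster = ["node-a", "node-b", "node-c", "node-d"]
touched = nodes_to_back_up(["user:1", "user:2"], cluster)
```

For a small dataset, `touched` is typically a fraction of the cluster, which is what lets the DCBR system avoid quiescing or snapshotting every node.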
20150293818 | METHOD OF PROTECTED RECOVERY OF DATA, COMPUTER PROGRAM PRODUCT AND COMPUTER SYSTEM - A method of protected recovery of data stored in a backup computer system on a source computer system, wherein an access controller is provided that queries access information of a user group to access a recovery process, but prohibits access of the user group to the data stored in the backup computer system and prohibits general access of the user group to the source computer system per se, subject to write access if necessary to rewrite data onto the source computer system. The recovery process can be instigated by a user of the user group if the queried access information matches stored access information of the user group, wherein the instigated recovery process comprises a rewriting of selected data from the backup computer system into the source computer system. | 10-15-2015 |
20150293819 | METHOD AND SYSTEM FOR PROVIDING HIGH AVAILABILITY TO DISTRIBUTED COMPUTER APPLICATIONS - Method, system, apparatus and/or computer program for achieving transparent integration of high-availability services for distributed application programs. Loss-less migration of sub-programs from their respective primary nodes to backup nodes is performed transparently to a client which is connected to the primary node. Migration is performed by high-availability services which are configured for injecting registration codes, registering distributed applications, detecting execution failures, executing from backup nodes in response to failure, and other services. High-availability application services can be utilized by distributed applications having any desired number of sub-programs without the need of modifying or recompiling the application program and without the need of a custom loader. In one example embodiment, a transport driver is responsible for receiving messages, halting and flushing of messages, and for issuing messages directing sub-programs to continue after checkpointing. | 10-15-2015 |
20150301898 | CONDITIONAL SAVING OF INPUT DATA - This document relates to preserving input data. One example includes obtaining a request that a service perform processing on input data to produce an output representation of the input data. This example also includes applying criteria to the request, and preserving the input data responsive to determining that the criteria are met. | 10-22-2015 |
20150301899 | SYSTEMS AND METHODS FOR ON-LINE BACKUP AND DISASTER RECOVERY WITH LOCAL COPY - Systems and methods are disclosed for rapidly restoring a client data set for a computer by storing the client data and one or more patch sets required to revert to one or more versions of the client data on a remote server; storing a local copy of the replicated client data on a local data storage device coupled to the computer; receiving a request to revert to a predetermined version of the client data; using the local copy as a seed, receiving a patch set corresponding to the predetermined version; and updating the local copy using the patch set to generate the predetermined version. | 10-22-2015 |
20150301902 | Systems, Methods, and Computer Program Products for Instant Recovery of Image Level Backups - Systems, methods, and computer program products are provided for instant recovery of a virtual machine (VM) from a compressed image level backup without fully extracting the image level backup file's contents to production storage. The method receives restore parameters and initializes a virtual storage. The method attaches the virtual storage to a hypervisor configured to launch a recovered VM. The method stores virtual disk data changes inflicted by a running operating system (OS), applications, and users in a changes storage. The method provides the ability to migrate the actual VM disk state (taking into account changed disk data blocks accumulated in the changes storage) to production storage without downtime, so as to prevent data loss resulting from the VM running and accessing virtual storage during the recovery. In embodiments, the method receives restore parameters in an interactive interface and delivers the recovery results via an automated message, such as an email message. | 10-22-2015 |
20150301906 | Resolving Failed Mirrored Point-in-Time Copies with Minimum Disruption - When a mirrored point-in-time copy fails, all the data needed to make the source and target of the point-in-time copy consistent is available on secondary volumes at the disaster recovery site. The data for the source and target of the failed point-in-time copy are logically and physically equal at that point in time. This logical relationship can be maintained, and protected against ongoing physical updates to the affected tracks on the source secondary volume, by first reading the affected tracks from the source secondary volume, copying the data to the target secondary volume, and then writing the updated track to the source secondary volume. | 10-22-2015 |
20150301908 | SYSTEM DESIGN METHOD, SYSTEM DESIGN APPARATUS, AND SYSTEM DESIGN PROGRAM - Provided are a system design method, a system design apparatus, and a system design program. A system design apparatus includes a unit for receiving an analysis model which represents a system failure restoration sequence, a unit for identifying, from the received analysis model, a minimum combination of component failures which does not satisfy either a restoration time requisite or a necessary cost requisite, and a unit for outputting the identified minimum combination of component failures. The unit for identifying the minimum combination of component failures further includes a unit for estimating the restoration time of the system, and a unit for estimating the cost required for restoration of the system. | 10-22-2015 |
20150309879 | INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - An information processing apparatus protects data stored in a storage device by saving setting data stored on a first storage device to a second storage device when the first storage device fails, or when an encryption function is enabled or disabled. The process of protecting the data includes, after a reservation for saving setting data is made, saving the setting data, cancelling the reservation, and making a reservation for restoring the setting data. If the setting data is to be restored, it is determined whether the reservation for restoring the setting data has been made, and if the reservation has been made, the setting data is restored to the first storage device. | 10-29-2015 |
20150309883 | Recording Activity of Software Threads in a Concurrent Software Environment - A technique for failure monitoring and recovery of a first application executing on a first virtual machine includes storing machine state information during execution of the first virtual machine at predetermined checkpoints. An error message that includes an application error state at a failure point of the first application is received, by a hypervisor, from the first application. The first virtual machine is stopped in response to the error message. The hypervisor creates a second virtual machine and a second application from the stored machine state information that are copies of the first virtual machine and the first application. The second virtual machine and the second application are configured to execute from a checkpoint preceding the failure point. In response to receipt of a failure interrupt by the second application, one or more recovery processes are initiated in an attempt to avert the failure point. | 10-29-2015 |
20150309884 | RECOVERY OF A TRANSACTION AFTER XA END - Embodiments of the present invention disclose a method for recovery of a two-phase commit transaction. A computer transmits a first transaction identifier to a data store, wherein the first transaction identifier defines a two-phase commit transaction. The computer transmits a prepare command for the first transaction identifier to a first resource manager. The computer determines if a failure and restart occurred within a distributed data processing environment, wherein the failure and restart occurs after the first resource manager receives an end command, but prior to completing execution of the prepare command for the first transaction identifier. Responsive to determining the failure and restart did occur within the distributed data processing environment, the computer retrieves the first transaction identifier from the data store. The computer transmits a rollback command for the retrieved first transaction identifier to the first resource manager. | 10-29-2015 |
20150309886 | FLASH MEMORY CONTROLLER AND DATA STORAGE DEVICE AND FLASH MEMORY CONTROL METHOD - A flash memory control technique with high reliability is provided. A flash memory controller provides a volatile storage area for temporary storage of logical-to-physical address mapping data between a host and a flash memory as well as error detection codes encoded from the logical-to-physical address mapping data. When reading from the volatile storage area, the microcontroller of the flash memory controller is configured to perform an error detection procedure based on the error detection codes. The microcontroller is further configured to restore the logical-to-physical address mapping data in the volatile storage area based on a backup of the logical-to-physical address mapping data. | 10-29-2015 |
20150309887 | Automatic Failure Recovery Using Snapshots and Replicas - In one embodiment, a method of data recovery in a storage system includes, upon failure to fulfill an I/O request for requested data to a primary volume, consulting a change set to determine whether the requested data are current in a snapshot or replica. Further, such an embodiment includes providing the requested data using the snapshot or replica without further accessing the change set, if the requested data are current in the snapshot or replica, or issuing an error or failure status, if the requested data are not current. | 10-29-2015 |
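The change-set consultation this abstract describes can be sketched minimally. The following is an illustrative assumption-laden sketch (region-keyed dictionaries and an in-memory change set stand in for real volumes), not the patented implementation:

```python
# Hypothetical sketch: on a failed primary read, consult the change set to
# decide whether the snapshot still holds a current copy of the region.
class RecoverableVolume:
    def __init__(self, primary, snapshot, changed_regions):
        self.primary = primary               # dict: region -> data (may fail)
        self.snapshot = snapshot             # dict: region -> data at snapshot time
        self.changed = set(changed_regions)  # regions written since the snapshot

    def read(self, region):
        try:
            return self.primary[region]
        except KeyError:                     # primary failed to fulfill the request
            if region not in self.changed:
                # Data unchanged since the snapshot, so the snapshot copy is current.
                return self.snapshot[region]
            raise IOError(f"region {region} not current in snapshot")

vol = RecoverableVolume(primary={}, snapshot={7: b"old"}, changed_regions={3})
print(vol.read(7))  # served from the snapshot without further change-set access
```

Note the design point the abstract emphasizes: once the change set shows the region is current in the snapshot, the snapshot is used directly, with no further consultation of the change set for that request.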
20150309888 | Cooperative Data Recovery In A Storage Stack - Example embodiments respond to input/output (I/O) requests to a storage stack having a hierarchy of layers. In one such embodiment, responsive to an I/O request for data from a higher layer of the stack to a lower layer of the stack in hierarchy order, a first help response is generated at the lower layer and sent to the higher layer to recover the data. In turn, at the higher layer, it is determined whether a recovery mechanism can fulfill the I/O request and, if not, a second help response is generated and sent to a next higher layer in the hierarchy. At the next higher layer, it is determined whether a recovery mechanism can fulfill the I/O request and, if not, a third help response is generated and sent to an even higher layer in the hierarchy. | 10-29-2015 |
20150317207 | Field-Repair System and Method - The present invention discloses a field-repair system and method for three-dimensional mask-programmed memory (3D-MPROM). Unlike a conventional mask-ROM which is fully factory-tested and contains no bad data at shipping, the 3D-MPROM is not fully factory-tested and contains bad data at shipping. Most of the 3D-MPROM data are checked and repaired in the field. | 11-05-2015 |
20150317214 | RESOURCE INTEGRITY DURING PARTIAL BACKOUT OF APPLICATION UPDATES - In response to failure of an application that initiated updates to a group of operational system resources without the updates being successfully committed, available non-committed pending update operations are ignored for any of the group of operational system resources determined to be in a fully functional data indexing and access state. For each physically inconsistent operational system resource that was left in a non-fully functional data indexing and access state as a result of the failure of the application, a portion of available pending updates are performed to change the respective physically inconsistent operational system resource to a partially backed out operational system resource with the fully functional data indexing and access state. Remaining available pending updates are ignored for the respective partially backed out operational system resource after the respective fully functional data indexing and access state is achieved to expedite system restart. | 11-05-2015 |
20150317215 | SYSTEMS AND METHODS FOR HOST IMAGE TRANSFER - Methods and systems for transferring a host image of a first machine to a second machine, such as during disaster recovery or migration, are disclosed. In one example, a first profile of a first machine of a first type, such as a first client machine, is compared to a second profile of a second machine, such as a recovery machine or a second client machine of a second type different from the first type, to which the host image is to be transferred, by a first processing device. The first and second profiles each comprise at least one property of the first type of first machine and the second type of second machine, respectively. At least one property of a host image of the first machine is conformed to at least one corresponding property of the second machine. The conformed host image is provided to the second machine, via a network. The second machine is configured with at least one conformed property of the host image by a second processing device of the second machine. | 11-05-2015 |
20150324256 | RESTORING AN APPLICATION FROM A SYSTEM DUMP FILE - An application is identified that was running at a time of a system crash. A system dump file is received that was created responsive to the system crash. A restoration dataset stored in the system dump file is determined. The application is restored based, at least in part, on the restoration dataset. | 11-12-2015 |
20150331760 | PERFORMANCE DURING PLAYBACK OF LOGGED DATA STORAGE OPERATIONS - Technology is disclosed for improving performance during playback of logged data storage operations. The technology can monitor a log to which data storage operations are written before data is committed to a data storage device or a volume; determine counts of various types of data storage operations; and when the counts exceed a specified threshold, cause the data storage operations to be committed to the data storage device or the volume. Some data storage operations can be coalesced during playback to further improve performance. | 11-19-2015 |
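The threshold-based flush this abstract describes can be sketched as follows. The operation types, thresholds, and the list standing in for the storage volume are assumptions for the example, not the patented design:

```python
# Illustrative sketch: count logged operations by type and commit the pending
# operations to the device once any type's count crosses its threshold.
from collections import Counter

class OpLog:
    def __init__(self, device, thresholds):
        self.device = device          # list standing in for the storage volume
        self.thresholds = thresholds  # e.g. {"write": 3}
        self.pending = []
        self.counts = Counter()

    def append(self, op_type, payload):
        self.pending.append((op_type, payload))
        self.counts[op_type] += 1
        if self.counts[op_type] >= self.thresholds.get(op_type, float("inf")):
            self.flush()

    def flush(self):
        # Commit all pending operations to the device, then reset the counts.
        self.device.extend(self.pending)
        self.pending.clear()
        self.counts.clear()

log = OpLog(device=[], thresholds={"write": 3})
for i in range(3):
    log.append("write", i)
print(len(log.device))  # flushed once the write count reached the threshold
```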
20150347240 | USING AN OBJECT RETAIN BLOCK IN A VIRTUAL MACHINE - A method for using a retain block in application code executing on a virtual machine includes identifying an instruction in application code, the instruction pertaining to an object, determining the instruction is part of a retain block, prior to executing the instruction, determining whether the instruction is to cause the object to be modified, and when the instruction is to cause the object to be modified, storing data indicating a first state of the object in a retain block store and causing the first state of the object to be modified using a second state. Also, the method includes in response to an error occurring during an execution of the instruction, returning the object from the second state to the first state using the stored data. | 12-03-2015 |
20150347242 | Disaster Recovery Validation - A method and system for the backup and recovery of a converged infrastructure computer system are provided with the ability to determine if the backup meets requirements of a disaster recovery plan. The method and system provide backup and recovery of the data and applications including backup and recovery of the configuration and mapping information of the converged infrastructure computer system. The backups are periodically tested to determine if they meet predetermined metrics that are specified in the disaster recovery plan. | 12-03-2015 |
20150350316 | DATA TRANSFER SERVICE - In various embodiments, methods and systems for transferring data using a storage medium are provided. A storage medium may be shipped by a customer to a datacenter such that the data on the storage medium is copied to a storage associated with the datacenter or data in the storage is copied to the storage medium. The datacenter may support a cloud computing infrastructure that provides a storage account to the customer that is associated with the data copied from or copied to the storage medium. The storage medium further corresponds to a data transfer manifest that includes at least in part data mapping between storage service infrastructure and data in the storage medium. It is contemplated that embodiments of the present invention may further be implemented with data transfer service components that support a client component, storage service component, and a data transfer management component. | 12-03-2015 |
20150363272 | COMPUTING SYSTEM WITH ADAPTIVE BACK-UP MECHANISM AND METHOD OF OPERATION THEREOF - A computing system includes: an adaptive back-up controller configured to calculate an adaptive back-up time based on a reserve power source for backing up a volatile memory to a nonvolatile memory; and a processor core, coupled to the adaptive back-up controller, configured to back up at least a portion of the volatile memory to the nonvolatile memory within the adaptive back-up time. | 12-17-2015 |
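A back-of-the-envelope version of the adaptive back-up budget described above can be sketched as follows; the linear power model and all numbers are assumptions for illustration only:

```python
# Hypothetical sketch: derive a back-up time budget from the reserve energy,
# then size the portion of volatile memory that fits in that budget.
def adaptive_backup_time(reserve_joules, backup_power_watts):
    """Time budget (seconds) the reserve energy can sustain the back-up."""
    return reserve_joules / backup_power_watts

def backup_size_within(time_s, write_bandwidth_bytes_per_s):
    """How many bytes of volatile memory fit within that time budget."""
    return int(time_s * write_bandwidth_bytes_per_s)

t = adaptive_backup_time(reserve_joules=120.0, backup_power_watts=6.0)  # 20 s
print(backup_size_within(t, write_bandwidth_bytes_per_s=50_000_000))
```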
20150363277 | CHECKPOINT TRIGGERING IN A COMPUTER SYSTEM - According to an aspect, a method for triggering creation of a checkpoint in a computer system includes executing a task in a processing node of the computer system and determining whether it is time to read a monitor associated with a metric of the task. The monitor is read to determine a value of the metric based on determining that it is time to read the monitor. A threshold for triggering creation of the checkpoint is determined based on the value of the metric. Based on determining that the value of the metric has crossed the threshold, the checkpoint including state data of the task is created to enable restarting execution of the task upon a restart operation. | 12-17-2015 |
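The metric-driven trigger in this abstract can be illustrated with a small loop. The choice of a dirty-state counter as the monitored metric, and the fixed read interval, are assumptions for the sketch:

```python
# Illustrative sketch: periodically read a monitor for a task metric and
# create a checkpoint whenever the metric crosses the threshold.
def run_with_checkpoints(steps, threshold, read_interval=2):
    checkpoints = []
    dirty = 0                           # the monitored metric (assumed)
    for step in range(1, steps + 1):
        dirty += 1                      # the task keeps modifying state
        if step % read_interval == 0:   # time to read the monitor
            if dirty >= threshold:      # metric crossed the threshold
                checkpoints.append({"step": step, "state": dirty})
                dirty = 0               # checkpoint captured the state
    return checkpoints

print(len(run_with_checkpoints(steps=10, threshold=4)))
```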
20150363279 | RESTORATION DETECTING METHOD, RESTORATION DETECTING APPARATUS, AND RESTORATION DETECTING PROGRAM - A restoration detecting method includes receiving, by a processor, from a monitoring target virtual machine, file size information indicating the file size of a specific file, the file size of which cumulatively increases as the virtual machine runs, and detecting, by the processor, restoration of the virtual machine on the basis of the received file size information. | 12-17-2015 |
20150363414 | PROCESSING LARGE XML FILES BY SPLITTING AND HIERARCHICAL ORDERING - A computer processor determines a schema that enables splitting of one or more elements of an XML file. The computer processor determines an XML file as a split candidate, based on one or more attributes of the one or more elements of the XML file. The computer processor splits the XML file at run-time into a plurality of subsets of the XML file, based on the one or more attributes of the one or more elements of the XML file, and the computer processor distributes the plurality of subsets of the XML file to a plurality of computing nodes of a computer processing system. | 12-17-2015 |
20150370579 | MODULAR SPACE VEHICLE BOARDS, CONTROL SOFTWARE, REPROGRAMMING, AND FAILURE RECOVERY - A space vehicle may have a modular board configuration that commonly uses some or all components and a common operating system for at least some of the boards. Each modular board may have its own dedicated processing, and processing loads may be distributed. The space vehicle may be reprogrammable, and may be launched without code that enables all functionality and/or components. Code errors may be detected and the space vehicle may be reset to a working code version to prevent system failure. | 12-24-2015 |
20150370652 | BACK UP AND RECOVERY IN VIRTUAL MACHINE ENVIRONMENTS - Embodiments of the present invention provide efficient and cost-effective systems and methods for backing up and recovering a virtual machine and the application data therein. Embodiments of the present invention can be used to satisfy near-zero recovery point objectives (RPOs) by providing more recovery points for backups in virtual machine environments, while also providing increased granularity for recovery (i.e., single virtual disk, single file, etc.) and maintaining the central management capabilities and backup efficiencies offered by virtual machine-level backups. | 12-24-2015 |
20150370653 | REPLACEMENT OF A CORRUPT DRIVER VARIABLE RECORD - A BIOS storage device including driver variable records, a corruption detection engine and a corruption remediation engine, wherein the corruption detection engine is to evaluate a plurality of driver variable records stored in an area of a BIOS storage device for corruption, and a corruption remediation engine is to replace a corrupt driver variable record with a last known good version of the driver variable record. | 12-24-2015 |
20150370654 | FILE CORRUPTION RECOVERY IN CONCURRENT DATA PROTECTION - An incremental backup system that performs the following (not necessarily in the following order): (i) making a plurality of time-ordered journal entries; (ii) determining that a corruption condition exists; (iii) responsive to a corruption condition, constructing a first incremental mirror data set that reflects a backup data set and all journal entries up to a first corrupted journal entry which is the earliest in time journal entry, of the plurality of journal entries, that is a corrupted journal entry; (iv) responsive to a corruption condition, constructing a second incremental mirror data set that reflects the backup data set and all journal entries up to the first corrupted journal entry; and (v) checking for corruption in the first and second incremental mirror data sets to determine the latest uncorrupted version of the data set. | 12-24-2015 |
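The journal-replay step this abstract describes (apply time-ordered entries to the backup, stopping at the earliest corrupted entry) can be sketched as follows; the `(key, value, ok)` entry format is an assumption for the example:

```python
# Hedged sketch: rebuild an incremental mirror from a backup data set plus
# all journal entries up to, but not including, the first corrupted entry.
def build_mirror(backup, journal):
    """journal: time-ordered list of (key, value, ok) entries."""
    mirror = dict(backup)
    for key, value, ok in journal:
        if not ok:            # earliest corrupted journal entry: stop here
            break
        mirror[key] = value
    return mirror

journal = [("a", 1, True), ("b", 2, True), ("a", 9, False), ("c", 3, True)]
print(build_mirror({}, journal))  # entries at and after the corruption are excluded
```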
20150378832 | PERFORMING A REMOTE POINT-IN-TIME COPY TO A SOURCE AND TARGET STORAGES IN FURTHER MIRROR COPY RELATIONSHIPS - Provided are a computer program product, system, and method for performing a remote point-in-time copy to a source and target storages in further mirror copy relationships. Each of a plurality of source copy relationships is from the source storage to one corresponding source copy storage. Each of a plurality of target copy relationships is from the target storage to one corresponding target copy storage, where in each relationship an indicator indicates whether to use a remote first type copy operation. The first type copy operation is used to copy data from the source storage to the target storage and copy data from the source copy storage to the target copy storage for the determined source and target copy relationships having the indicator set. A second type of copy operations is used for source and target relationships not having the indicator set. | 12-31-2015 |
20150378833 | BACKUP AND NON-STAGED RECOVERY OF VIRTUAL ENVIRONMENTS - Methods for creating a backup of data of a virtual environment to allow non-staged recovery are described. The described method may include receiving data of a virtual environment through one or more data streams for backup. The method also includes generating metadata corresponding to the received data and storing the received data at a first location of a backup storage unit. Further, the method includes storing the generated metadata at a second location of the backup storage unit, where the second location is different from the first location of the backup storage unit. The method further includes mapping at least one predefined file to the stored data to create a mapping table that allows direct access to the stored data for non-staged recovery. | 12-31-2015 |
20150378839 | RECOVERY SYSTEM AND METHOD FOR PERFORMING SITE RECOVERY USING REPLICATED RECOVERY-SPECIFIC METADATA - A recovery system and method for performing site recovery utilizes recovery-specific metadata and files of protected clients at a primary site to recreate the protected clients at a secondary site. The recovery-specific metadata is collected from at least one component at the primary site, and stored with the files of protected clients at the primary site. The recovery-specific metadata and the files of the protected clients are replicated to the secondary site so that the protected clients can be recreated at the secondary site using the replicated information. | 12-31-2015 |
20150378844 | Systems And Methods For Out-Of-Band Backup And Restore of Hardware Profile Information - Systems and methods are provided that may be implemented for out-of-band backup and/or restore of information handling system components. Such out-of-band backup and restore operations may be performed, in one embodiment, to backup and/or restore hardware profile information such as firmware images and corresponding system configuration information. | 12-31-2015 |
20150378846 | METHOD, COMPUTER PROGRAM, AND COMPUTER FOR RESTORING SET OF VARIABLES - A set of variables referred to by unified extensible firmware interface (UEFI) firmware is restored. The UEFI firmware stored in a read-only memory (ROM) is executed first after power-up. The UEFI firmware writes a variable set related to boot into a variable area. As an operating system (OS) also writes a set of variables into the variable area, the boot-related variable set may be altered. The variable set is saved into a prescribed area, such as a universal serial bus (USB) memory key, when the computer boots normally. If alteration of the variable set in the reference area is detected during a boot of the computer, the variable set in the reference area is replaced with the saved variable set. The alteration of the variable set may be detected using a detection flag which is set immediately after a boot is started and reset immediately before the OS is loaded. | 12-31-2015 |
20150378847 | MAINTAINING CONSISTENCY USING REVERSE REPLICATION DURING LIVE MIGRATION - Examples maintain consistency of writes for a plurality of VMs during live migration of the plurality from a source host to a destination host. The disclosure intercepts I/O writes to a migrated VM at the destination host and mirrors the I/O writes back to the source host. This “reverse replication” ensures that the consistency group (CG) of the source host is up to date, and that the source host is safe to fail back to if the migration fails. | 12-31-2015 |
20150378848 | MANAGEMENT COMPUTER AND MANAGEMENT METHOD OF COMPUTER SYSTEM - A management computer stores an operation requirement of a virtual machine and a scheme of a first configuration change executed by a host computer or a storage apparatus. The management computer determines whether a second configuration change, configured so as to be executed automatically in the host computer or the storage apparatus, is executed. If it is determined that the second configuration change is executed, the management computer predicts a performance index value concerning a prescribed performance index for the host computer or the storage apparatus when executing the second configuration change. The management computer determines whether an anticipated effect value of the configuration change scheme is satisfied based on the predicted performance index value, and creates a substitution plan satisfying both the operation requirement and the anticipated effect value of the virtual machine where it is determined that the anticipated effect value is not satisfied. | 12-31-2015 |
20160004602 | COPY-ON-READ PROCESS IN DISASTER RECOVERY - Systems, methods, and computer products for copy-on-read processes in disaster recovery include: making a disaster recovery storage volume available for read access before all data from a corresponding primary storage volume has been copied to a disaster recovery storage volume; maintaining a record of regions of the disaster recovery storage volume; in response to receiving a read request for data at the disaster recovery system: looking up the record of regions of the disaster recovery storage volume to determine available data for the read request; reading any available data from the disaster recovery storage volume; obtaining data unavailable at the disaster recovery storage volume from the corresponding primary storage volume; updating the disaster recovery storage volume with the obtained data; supplying the obtained data to the read request; and updating the record of regions of the disaster recovery storage volume for the regions of the obtained data. | 01-07-2016 |
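The copy-on-read flow enumerated in this abstract can be sketched compactly. Region-keyed dictionaries stand in for the primary and disaster-recovery volumes, which is an assumption of the sketch rather than the patented mechanism:

```python
# Illustrative copy-on-read sketch: serve reads from the DR volume when the
# region record says the data is available; otherwise fetch from the primary,
# update the DR volume, and update the record of regions.
class DRVolume:
    def __init__(self, primary):
        self.primary = primary   # dict: region -> data (primary storage volume)
        self.local = {}          # data already copied to the DR volume
        self.available = set()   # record of regions present on the DR volume

    def read(self, region):
        if region in self.available:
            return self.local[region]   # serve directly from the DR copy
        data = self.primary[region]     # obtain unavailable data from the primary
        self.local[region] = data       # update the DR volume with the obtained data
        self.available.add(region)      # update the record of regions
        return data

dr = DRVolume(primary={0: b"x", 1: b"y"})
dr.read(0)
print(sorted(dr.available))  # region 0 is now cached on the DR volume
```

This makes the DR volume readable before the full background copy completes, which is the point of the abstract's first step.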
20160004606 | METHOD, SYSTEM AND DEVICE FOR VALIDATING REPAIR FILES AND REPAIRING CORRUPT SOFTWARE - A system and method for repairing corrupt software components of a computer system. Corrupt software is detected and repaired utilizing an automated component repair service. Repair files are downloaded from an external storage location and used to repair the corruption. The downloaded files are preferably the smallest amount of data necessary to repair the identified corruption. The process of repairing corrupt files is used in conjunction with a software updating service to resolve problems that occur when corrupt software is updated by allowing a corrupt component to be repaired and then uninstalled such that an updated component can be properly installed. | 01-07-2016 |
20160004607 | INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD - An information processing apparatus includes a storage unit that stores a location and a file name of a change target file which is changed in a prescribed process, and a processor that executes a process including obtaining and saving the change target file by using the location and the file name of the change target file; conducting the installation; detecting a failure that has occurred during the installation; obtaining progress information which represents progress of the installation, and identifying the prescribed process at occurrence of the failure as a failure time process on the basis of the progress information when the failure has been detected; restoring a file changed in the failure time process by using the saved change target file that corresponds to the changed file; and resuming the installation from a point in time at which the failure time process started. | 01-07-2016 |
20160011946 | File Level Recovery Using Virtual Machine Image Level Backup With Selective Compression | 01-14-2016 |
20160019107 | MANAGING A CHECK-POINT BASED HIGH-AVAILABILITY BACKUP VIRTUAL MACHINE - A technique for failure monitoring and recovery of a first application executing on a first virtual machine includes storing machine state information during execution of the first virtual machine at predetermined checkpoints. An error message that includes an application error state at a failure point of the first application is received, by a hypervisor, from the first application. The first virtual machine is stopped in response to the error message. The hypervisor creates a second virtual machine and a second application from the stored machine state information that are copies of the first virtual machine and the first application. The second virtual machine and the second application are configured to execute from a checkpoint preceding the failure point. In response to receipt of a failure interrupt by the second application, one or more recovery processes are initiated in an attempt to avert the failure point. | 01-21-2016 |
20160019116 | APPARATUS AND METHOD FOR RECOVERING AN INFORMATION HANDLING SYSTEM FROM A NON-OPERATIONAL STATE - A method recovers an information handling system (IHS) from a non-operational state. The method includes determining if the non-operational state of the IHS has occurred. In response to determining that the non-operational state of the IHS has occurred, a basic input/output system (BIOS) recovery device is identified as being coupled to an embedded controller. In response to identifying that the BIOS recovery device is coupled to the embedded controller, an IHS type is transmitted to the BIOS recovery device. The BIOS recovery device is signaled to determine if the BIOS recovery device contains a BIOS payload corresponding to the IHS type. In response to determining that the BIOS recovery device contains the BIOS payload corresponding to the IHS type, the BIOS recovery device is triggered to transmit the BIOS payload to the embedded controller. The IHS is triggered to restart using the new BIOS payload. | 01-21-2016 |
20160019121 | DATA TRANSFERS BETWEEN CLUSTER INSTANCES WITH DELAYED LOG FILE FLUSH - Techniques for processing changes in a cluster database system are provided. A first instance in the cluster transfers a data block to a second instance in the cluster before a redo record that stores one or more changes that the first instance made to the data block is durably stored. The first instance also transfers, to the second instance, a block change timestamp that indicates when a redo record for the one or more changes was generated by the first instance. The first instance also separately sends, to the second instance, a last store timestamp that indicates when the last redo record that was durably stored was generated by the first instance. The block change timestamp and the last store timestamp are used by the second instance when creating redo records for changes (made by the second instance) that depend on the redo record generated by the first instance. | 01-21-2016 |
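The timestamp bookkeeping described in this abstract can be sketched as follows. This is a toy model, not the patented implementation: the class and method names are illustrative, and it tracks only the core rule that a receiver's dependent redo must wait until the sender's redo is known to be durable (block change timestamp no later than the sender's last-store timestamp).

```python
class ReceivingInstance:
    """Tracks redo dependencies for blocks received before the sending
    instance's redo record was durably stored (illustrative names)."""

    def __init__(self):
        self.pending = {}  # block_id -> sender's block change timestamp

    def receive_block(self, block_id, block_change_ts, last_store_ts):
        # The sender's redo is durable iff it was generated no later than
        # the last redo record the sender durably stored.
        if block_change_ts > last_store_ts:
            self.pending[block_id] = block_change_ts
        else:
            self.pending.pop(block_id, None)

    def on_last_store_update(self, last_store_ts):
        # A newer durability watermark from the sender clears any
        # dependencies it now satisfies.
        self.pending = {b: ts for b, ts in self.pending.items()
                        if ts > last_store_ts}

    def can_flush_dependent_redo(self, block_id):
        # Our own redo for a dependent change may be exposed only once
        # the sender's redo is durable.
        return block_id not in self.pending
```

In this sketch the separately sent last-store timestamp arrives via `on_last_store_update`, matching the abstract's point that the durability watermark travels independently of the block transfer itself.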
20160019122 | SYSTEM AND METHOD FOR MAINTAINING SERVER DATA INTEGRITY - The System Integrity Guardian can protect any type of object and repairs and restores the system back to its original state of integrity. The Client component is the user interface for administering the System Integrity Guardian environment. An administrator can determine which servers to protect, which objects to protect, and what actions will be taken when an event that breaches integrity occurs. The Monitor Agent component is the watchdog of the System Integrity Guardian that captures and addresses any event that occurs on any object being protected. The Server component includes the server and the Protected Object Central Repository. The authoritative copies are maintained, digital signatures are created and stored, objects are validated, and communication between the three units is performed. | 01-21-2016 |
20160019123 | FAULT TOLERANCE FOR COMPLEX DISTRIBUTED COMPUTING OPERATIONS - A method for enabling a distributed computing system to tolerate system faults during the execution of a client process. The method includes instantiating an execution environment relating to the client process; executing instructions within the execution environment, the instructions causing the execution environment to issue further instructions to the distributed computing system, the further instructions relating to actions to be performed with respect to data stored on the distributed computing system. An object interface proxy receives the further instructions and monitors the received instructions to determine if the execution environment is in a desired save-state condition; and, if so, saves a current state of the execution environment in a data store. | 01-21-2016 |
20160026535 | TECHNIQUES FOR DYNAMICALLY CONTROLLING RESOURCES BASED ON SERVICE LEVEL OBJECTIVES - Various embodiments are generally directed to an apparatus and method for receiving a recovery point objective for a workload, the recovery point objective comprising an amount of time in which information for the workload will be lost if a failure occurs, and determining a service level objective for a replication transfer based on the recovery point objective, the replication transfer to replicate information on a destination node to maintain the recovery point objective. Various embodiments include dynamically controlling one or more resources to replicate the information on the destination node based on the service level objective and communicating information for the replication transfer from the source node to the destination node. | 01-28-2016 |
20160026538 | SYSTEM AND METHOD FOR MANAGING AND PRODUCING A DATASET IMAGE ACROSS MULTIPLE STORAGE SYSTEMS - An application may store data to a dataset comprising a plurality of volumes stored on a plurality of storage systems. The application may request a dataset image of the dataset, the dataset image comprising a volume image of each volume of the dataset. A dataset image manager operates with a plurality of volume image managers in parallel to produce the dataset image, each volume image manager executing on a storage system. The plurality of volume image managers respond by performing requested operations and sending responses to the dataset image manager in parallel. Each volume image manager on a storage system may manage and produce a volume image for each volume of the dataset stored to the storage system. If a volume image for any volume of the dataset fails, or a timeout period expires, a cleanup procedure is performed to delete any successful volume images. | 01-28-2016 |
20160026539 | SYSTEM AND METHOD FOR DETECTING FAILURE OF STORAGE OBJECT IMAGES ON A STORAGE SYSTEM AND INITIATING A CLEANUP PROCEDURE - An application may store data to a dataset comprising a plurality of volumes stored on a plurality of storage systems. The application may request a dataset image of the dataset, the dataset image comprising a volume image of each volume of the dataset. A dataset image manager operates with a plurality of volume image managers in parallel to produce the dataset image, each volume image manager executing on a storage system. The plurality of volume image managers respond by performing requested operations and sending responses to the dataset image manager in parallel. Each volume image manager on a storage system may manage and produce a volume image for each volume of the dataset stored to the storage system. If a volume image for any volume of the dataset fails, or a timeout period expires, a cleanup procedure is performed to delete any successful volume images. | 01-28-2016 |
20160026546 | HARDWARE-ASSISTED APPLICATION CHECKPOINTING AND RESTORING - Technologies for hardware-assisted application checkpointing include a computing device having a processor with hardware checkpoint support. In response to encountering a checkpoint event during execution of an application, the computing device saves the execution state of the application to nonvolatile storage using the hardware checkpoint support. The computing device may also restore the execution state using the hardware checkpoint support. The hardware checkpoint support may save part or all of the virtual memory space of the application in a manner transparent to the executing process. The hardware checkpoint support may be invoked using one or more system hooks such as system calls or processor instructions. The computing device may monitor for checkpoint events using hardware event monitors of the processor, chipset, or other components of the computing device. The computing device may store execution state in a dedicated flash memory cache. Other embodiments are described and claimed. | 01-28-2016 |
20160034359 | METHOD AND SYSTEM FOR PROVIDING AUTOMATED SELF-HEALING VIRTUAL ASSETS - A method and system for performing self-monitoring and self-healing operations from a virtual asset include receiving a first operating policy from an asset management computing environment, according to one embodiment. The method and system includes receiving a library of repairs from the asset management computing environment, according to one embodiment. The method and system includes detecting events, with the virtual asset, at least partially based on operational characteristics of the virtual asset exceeding at least one of the thresholds, according to one embodiment. The method and system includes repairing the virtual asset, with the virtual asset, using the library of repairs to return the virtual asset to the pre-determined state of operation. | 02-04-2016 |
20160041885 | Data Replicating System, Data Replicating Method, Node Device, Management Device and Computer Readable Medium - Each node constituting this data replicating system returns a response to a data operation requesting device upon having written, into a temporary storage device of the node itself, a post-update log of a data record for which an operation requested by a data operation request was executed. Furthermore, when a checkpoint is reached, each node updates a data record storage unit of the node itself on the basis of the post-update log of the data record stored in the temporary storage device of the node itself, writes the post-update log of the data record stored in the temporary storage device of the node itself into an update history storage unit of the node itself, and writes, into a shared storage device shared with other nodes, checkpoint information having information for specifying a latest post-update log written into the update history storage unit. | 02-11-2016 |
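The node behavior described in this abstract — acknowledge after writing a post-update log to temporary storage, then apply, archive, and publish at a checkpoint — can be sketched as a minimal model. Names and data shapes here are assumptions for illustration, not the patented interfaces.

```python
class Node:
    """Toy replication node: fast-path ack from a temporary log, with
    data-store and history updates deferred to checkpoint time."""

    def __init__(self, shared_storage):
        self.temp_logs = []        # temporary storage device
        self.records = {}          # data record storage unit
        self.history = []          # update history storage unit
        self.shared = shared_storage  # storage shared with other nodes

    def handle_request(self, key, value):
        # Write the post-update log to temporary storage, then respond.
        self.temp_logs.append((key, value))
        return "ack"

    def checkpoint(self):
        # Apply buffered post-update logs to the data record store...
        for key, value in self.temp_logs:
            self.records[key] = value
        # ...persist them in the update history...
        self.history.extend(self.temp_logs)
        self.temp_logs = []
        # ...and publish checkpoint information identifying the latest
        # post-update log written to the history.
        self.shared["latest_log_index"] = len(self.history) - 1
```

The point of the design is latency: the requester gets its response after a cheap sequential log write, while the heavier data-store update is batched at the checkpoint.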
20160048351 | MULTI-THREADED TRANSACTION LOG FOR PRIMARY AND RESTORE/INTELLIGENCE - A unified system provides primary storage and in-line analytics-based data protection. Additional data intelligence and analytics gathered on protected data and prior analytics are stored in discovery points. The disclosed system implements multi-threaded log writes across primary and restore nodes with write gathering across file systems; nested directories such as may be used for storing virtual machine files, where every subdirectory has an associated file system for snapshot purposes; and cloning objects on demand with background metadata and data migration. | 02-18-2016 |
20160055062 | Systems and Methods for Maintaining a Virtual Failover Volume of a Target Computing System - Some of the methods provided herein may include periodically revising a mirror of the target computing system, according to a predetermined backup schedule, the mirror being stored on the virtual failover volume resident on an appliance that is operatively associated with the target computing system, by periodically comparing the mirror to a configuration of the target computing system to determine changed data blocks relative to the mirror, storing the changed data blocks as one or more differential files in the virtual failover volume, and incorporating the changed data blocks into the mirror. In some embodiments, the systems and methods may be utilized to resparsify the virtual failover volume. | 02-25-2016 |
20160055066 | FAULT TOLERANCE FOR COMPLEX DISTRIBUTED COMPUTING OPERATIONS - A method for enabling a distributed computing system to tolerate system faults during the execution of a client process. The method includes instantiating an execution environment relating to the client process; executing instructions within the execution environment, the instructions causing the execution environment to issue further instructions to the distributed computing system, the further instructions relating to actions to be performed with respect to data stored on the distributed computing system. An object interface proxy receives the further instructions and monitors the received instructions to determine if the execution environment is in a desired save-state condition; and, if so, saves a current state of the execution environment in a data store. | 02-25-2016 |
20160055067 | DATA TRANSFER AND RECOVERY PROCESS - A backup image generator can create a primary image and periodic delta images of all or part of a primary server. The images can be sent to a network attached storage device and a remote storage server. In the event of a failure of the primary server, the failure can be diagnosed to develop a recovery strategy. Based on the diagnosis, at least one delta image may be applied to a copy of the primary image to generate an updated primary image at either the network attached storage or the remote storage server. The updated primary image may be converted to a virtual server in a physical to virtual conversion at either the network attached storage device or remote storage server and users may be redirected to the virtual server. The updated primary image may also be restored to the primary server in a virtual to physical conversion. As a result, the primary data storage may be timely backed-up, recovered and restored with the possibility of providing server and business continuity in the event of a failure. | 02-25-2016 |
20160062842 | SYSTEM, METHOD AND A NON-TRANSITORY COMPUTER READABLE MEDIUM FOR PROTECTING SNAPSHOTS - A method for protecting snapshots related to a logical unit may include retrieving snapshot blocks that were destaged in a storage system; processing, by the storage system, the snapshot blocks to provide, by an information protection module of the storage system, snapshot redundancy information; and storing the snapshot redundancy information in the storage system. | 03-03-2016 |
20160062848 | METHODS AND APPARATUS FOR DATA RECOVERY FOLLOWING A SERVICE INTERRUPTION AT A CENTRAL PROCESSING STATION - A method for data processing may include receiving a paper instruction at a CS and stamping the paper instruction with a predetermined batch number at the CS. The method may further include transferring the paper instruction to a data element at the CS and constructing an executable electronic data record. The method may include creating a transaction identification number for the record and appending the transaction identification number to the record. The method may include transmitting the record from the RS to a CPS and receiving the record at the CPS. The method may include storing the transaction identification number and the batch identification number of the record in a CPS-table of records and executing the record at the CPS. The method may include transmitting the executed record from the CPS to a DRS, following a discrete lapse of time from the receipt of the record. | 03-03-2016 |
20160062851 | PREVENTING MIGRATION OF A VIRTUAL MACHINE FROM AFFECTING DISASTER RECOVERY OF REPLICA - To prevent a user from initiating potentially dangerous virtual machine migrations, a storage migration engine is configured to be aware of replication properties for a source datastore and a destination datastore. The replication properties are obtained from a storage array configured to provide array-based replication. A recovery manager discovers the replication properties of the datastores stored in the storage array, and assigns custom tags to the datastores indicating the discovered replication properties. When storage migration of a virtual machine is requested, the storage migration engine performs or prevents the storage migration based on the assigned custom tags. | 03-03-2016 |
20160062852 | Transaction Recovery in a Transaction Processing Computer System Employing Multiple Transaction Managers - A technique for transaction recovery by one transaction manager of another transaction manager's transactions in which each transaction manager is adapted to manage two phase commit transactional operations on transactional resources and to record commit or rollback decisions in a transaction recovery log. The recovery transaction manager detects apparent unavailability of the another transaction manager for transaction processing and initiates a transaction recovery process for the another transaction manager's transactions. This process also determines whether any of the transactions of the another transaction manager have all respective resources prepared to commit without there yet being a pending commit decision record in the another transaction manager's recovery log. If so, the recovery transaction manager writes a rollback record indicating an intention to roll back the identified transaction, in the another transaction manager's recovery log provided no commit decision record has been recorded. | 03-03-2016 |
20160070624 | APPLICATION TRANSPARENT CONTINUOUS AVAILABILITY USING SYNCHRONOUS REPLICATION ACROSS DATA STORES IN A FAILOVER CLUSTER - Disclosed herein is a system and method for automatically moving an application from one site to another site in the event of a disaster. Prior to coming back online the application is configured with information to allow it to run on the new site without having to perform the configuration actions after the application has come online. This enables a seamless experience to the user of the application while also reducing the associated downtime for the application. | 03-10-2016 |
20160077921 | VIRTUAL COMPUTER SYSTEM, PRINTER CONTROL SYSTEM, VIRTUAL COMPUTATION METHOD, PRINTER CONTROL METHOD, AND STORAGE MEDIA - A virtual computer system includes a first saving unit that saves one or more snapshots each having recorded therein a state of a virtual machine, the state including an application program installed on the virtual machine, the snapshot being saved as a reference snapshot; an applying unit that applies the reference snapshot to the virtual machine when an execution request for the application program is received; and a second saving unit that saves a state of the virtual machine that executes the application program, the state being saved as a snapshot. | 03-17-2016 |
20160077928 | Parallel Mirrored Copying with Write Consistency - A method, computer program product and/or system for facilitating data access that performs the following steps (not necessarily in the following order): (i) generating a Mirror Write Consistency (MWC) record associated with a data portion stored on a data storage device; (ii) saving a dynamic copy of the MWC record in a manner such that the MWC record is more readily accessible for read and write operations than the data portion stored on the data storage device. At least the generating and saving steps are performed by computer software running on computer hardware. | 03-17-2016 |
20160077930 | SELECTIVELY PERSISTING APPLICATION PROGRAM DATA FROM SYSTEM MEMORY TO NON-VOLATILE DATA STORAGE - Application program data stored in system memory may be selectively persisted. An indication may be provided to an application program that an application data object or a range of application data stored in system memory may be treated as persistent. Data backup may be enabled for the application data object or range of application data in the event of a system failure, copying the application data object or range of application data from system memory to non-volatile data storage. Upon recovery from a system failure, further data backup for the application data object or the range of application data may be disabled. In some embodiments, at least some of the application data object or range of application data may be recovered for the application program to access. Data backup for the application data object or the range of application data may also be re-enabled. | 03-17-2016 |
20160085638 | COMPUTER SYSTEM AND METHOD OF IDENTIFYING A FAILURE - A computer system for increasing the speed of identifying the extent of a failure in a messaging system, provided with: a first computer including a message receiving part, a first log output part, and a first memory part configured to store receiving log data; a second computer including a data store management part configured to manage a data store, a first search part configured to search for a message that meets a given condition from among messages stored in the data store, a second log output part, and a second memory part configured to store data store log data; a third computer including a message sending part, a third log output part, and a third memory part configured to store sending log data; and a fourth computer including a monitoring part, a log collecting part, and a second search part configured to search for lost messages. | 03-24-2016 |
20160103731 | SMART ERROR RECOVERY FOR DATABASE APPLICATIONS - A database server includes logic that is operable to monitor and analyze at least events occurring within an environment of the database server and/or execution errors generated by the database server in order to detect whether a problem condition exists. The database server further includes logic that is operable to send one or more commands to a database driver of a client that is communicatively connected to the database server, the one or more commands specifying one or more actions to be taken by the database driver in response to the existence of the problem condition. The database driver includes logic that is operable to receive the one or more commands from the database server and logic that is operable to cause the one or more commands to be executed. | 04-14-2016 |
20160103739 | O(1) VIRTUAL MACHINE (VM) SNAPSHOT MANAGEMENT - Managing a virtual machine snapshot in O(1) time by initially storing data from a virtual machine executing under a host operating system, to a first host operating system managed data block and creating a first pointer that points to the first host operating system managed data block and associates the virtual machine to the data stored in the first host operating system managed data block. A first value, associated with the first host operating system managed data block, is initialized indicating the number of pointers created to associate the virtual machine to the first host operating system managed data block. Receiving, by the computer host operating system, a request to create a snapshot of the virtual machine creates a second pointer replicating the first pointer, and increments, by the computer host operating system, the first value associated with the first host operating system managed data block. | 04-14-2016 |
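The O(1) mechanism in this abstract — a snapshot replicates a pointer to a host-managed data block and increments the block's reference count, rather than copying the block — can be sketched as a toy model. All names here are illustrative assumptions, not the patented interface.

```python
class HostOS:
    """Toy model of O(1) VM snapshotting via pointer duplication and
    per-block reference counting (illustrative, not the patent's API)."""

    def __init__(self):
        self.blocks = {}     # block_id -> data managed by the host OS
        self.refcount = {}   # block_id -> number of pointers to it
        self.pointers = {}   # vm or snapshot name -> block_id
        self._next_id = 0

    def store_vm_data(self, vm, data):
        # Initial store: one host-managed block, one pointer, refcount 1.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.pointers[vm] = block_id
        self.refcount[block_id] = 1

    def snapshot(self, vm, snap):
        # O(1) snapshot: replicate the pointer and bump the refcount;
        # no data is copied regardless of block size.
        block_id = self.pointers[vm]
        self.pointers[snap] = block_id
        self.refcount[block_id] += 1
```

A copy-on-write step (allocating a fresh block when a shared block is written) would complete the picture, but the abstract's O(1) claim concerns only the snapshot operation shown.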
20160103740 | HANDLING FAILED CLUSTER MEMBERS WHEN REPLICATING A DATABASE BETWEEN CLUSTERS - Data integrity is maintained during failed communications between a member node of a primary cluster and a backup cluster by assigning an assisting member node to run an assisting process that transmits data entered into the member node to the backup cluster. In this way, a replicated database is maintained during a partial communication failure between the primary cluster and the backup cluster. | 04-14-2016 |
20160103741 | TECHNIQUES FOR COMPUTER SYSTEM RECOVERY - Techniques for computer system recovery which remotely restore a default partition to a recent state even when an operating system is functioning abnormally. In an example embodiment, a service center computer establishes a first network connection to a monitored computer system. The service center computer configures the monitored computer system to boot from a bootable image file in the monitored computer system and reboots the monitored computer system into an alternate operating system environment of the bootable image file. The service center computer establishes a second network connection to the monitored computer system to restore a recent backup image of the default partition from a diagnostic partition to the default partition. The service center computer establishes a third network connection to the monitored computer system and reboots the monitored computer system to the default partition. | 04-14-2016 |
20160103742 | BUFFERED CLONED OPERATORS IN A STREAMING APPLICATION - A streams manager clones a portion of a primary flow graph to a virtual machine with a buffer to assure no data is lost if the corresponding portion of the primary flow graph fails. The buffer can be on the input of the cloned portion or on the output of the cloned portion. Cloning a portion of a primary flow graph with a buffer assures no data is lost when the corresponding portion of the primary flow graph fails. When the primary flow graph recovers from the failure, the processing may be switched back to the primary flow graph, which causes the buffer to begin buffering once again. | 04-14-2016 |
20160103743 | METHODS AND APPARATUS FOR RECOVERING ERRORS WITH AN INTER-PROCESSOR COMMUNICATION LINK BETWEEN INDEPENDENTLY OPERABLE PROCESSORS - Methods and apparatus for an inter-processor communication (IPC) link between two (or more) independently operable processors. In one aspect, the IPC protocol is based on a “shared” memory interface for run-time processing (i.e., the independently operable processors each share (either virtually or physically) a common memory interface). In another aspect, the IPC communication link is configured to support a host driven boot protocol used during a boot sequence to establish a basic communication path between the peripheral and the host processors. Various other embodiments described herein include sleep procedures (as defined separately for the host and peripheral processors), and error handling. | 04-14-2016 |
20160103850 | Synchronizing Updates Across Cluster Filesystems - The embodiments described herein relate to synchronization of data in a shared pool of configurable computer resources. One or more consistency points are created in a source filesystem. A first consistency point is compared with a second consistency point to detect a directory change at the source filesystem, which includes identifying at least one difference between the first and second consistency points. A file level change associated with an established directory at a target filesystem is identified responsive to the detection of the directory change. A link is established between the source filesystem and the target filesystem, and the established directory is updated based on the file level change. | 04-14-2016 |
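The comparison step in this abstract — diffing two consistency points to find directory changes, then deriving file-level changes to apply at the target — can be sketched as follows. The representation of a consistency point as a mapping from directory to file versions is an assumption for illustration.

```python
def diff_consistency_points(cp1, cp2):
    """Compare two consistency points (dir -> {file: version}) and return,
    per changed directory, the file-level changes to apply at the target
    filesystem (illustrative data model, not the patented format)."""
    changes = {}
    for d in set(cp1) | set(cp2):
        old, new = cp1.get(d, {}), cp2.get(d, {})
        if old == new:
            continue  # directory unchanged between consistency points
        changes[d] = {
            "added":    sorted(set(new) - set(old)),
            "removed":  sorted(set(old) - set(new)),
            "modified": sorted(f for f in set(old) & set(new)
                               if old[f] != new[f]),
        }
    return changes
```

Only directories that differ between the two consistency points produce work, which is the abstract's efficiency argument: synchronization cost tracks the change set, not the filesystem size.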
20160110258 | DATA BLOCK BASED BACKUP - The present invention relates to a data block based backup method for a data management system. The data management system comprises a file system that controls access by a database application to at least one database container file stored in the data management system. The data management system further comprises a backup client that is connected to a remote backup server, whereby a first version of the database container file is saved in the backup server along with a first inode containing information on data blocks of the first version of the database container file. The method may include creating a change tracking table for at least the database container file and adding an entry in the change tracking table, whereby the entry has an indication of the respective data block in association with information indicating the type of the access. | 04-21-2016 |
20160110267 | CLUSTER FILE SERVER PROXY SERVER FOR BACKUP AND RECOVERY - A remote snapshot is taken of data associated with a node within a cluster of nodes by using a snapshot facility of an operating system. A set of backup data components is recorded. The data is remotely restored by interpreting the remote snapshot with the set of backup data components. | 04-21-2016 |
20160117227 | DATA RECOVERY TECHNIQUE FOR RECOVERING DATA FROM AN OBJECT STORAGE SERVICE - A system and method for recovering data backed up to an object store are provided. In some embodiments, the method includes identifying an address space of a data set to be recovered. A set of data objects stored by an object-based system is identified that corresponds to the address space and a selected recovery point. The identified set of data objects is retrieved, and data contained in the retrieved set of data objects is stored to at least one storage device at a block address determined by the retrieved set of data objects to recreate the address space. In some embodiments, the set of data objects is retrieved by providing an HTTP request and receiving the set of data objects as an HTTP response. In some embodiments, the set of data objects are retrieved based on the data objects being the target of a data transaction. | 04-28-2016 |
20160117228 | Point in Time Database Restore from Storage Snapshots - Archiving a database and point in time recovery of the database. A method includes taking a first snapshot of a database. The first snapshot of the database includes a first snapshot of the data in the data storage and a first snapshot of the log records in the log storage. The method further includes taking a second snapshot of the database. The second snapshot of the database includes a second snapshot of the data in data storage and a second snapshot of the log records. The method further includes restoring the database to a particular point by applying the first snapshot of the data in the data storage to the database, applying the first snapshot of the log records in the log storage to the database and applying a portion of the second snapshot of the log records in the log storage to the database. | 04-28-2016 |
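The restore procedure in this abstract — apply the first data snapshot, replay the first snapshot's log records, then replay a portion of the second snapshot's log records up to the recovery point — can be sketched as below. Representing log records as `(lsn, key, value)` tuples and the database as a mapping are assumptions for illustration.

```python
def restore_to_point(data_snap1, log_snap1, log_snap2, target_lsn):
    """Point-in-time restore from storage snapshots: base data from the
    first snapshot, then log replay up to target_lsn (illustrative model;
    real log records carry physical or logical redo, not key/value pairs)."""
    db = dict(data_snap1)              # apply first snapshot of the data
    for lsn, key, value in log_snap1:  # replay first snapshot's log records
        if lsn <= target_lsn:
            db[key] = value
    for lsn, key, value in log_snap2:  # replay the portion of the second
        if lsn <= target_lsn:          # snapshot's log up to the target
            db[key] = value
    return db
```

Replay is bounded by `target_lsn` in both passes, so records captured in the second log snapshot that postdate the desired recovery point are skipped.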
20160124814 | SYSTEM AND METHOD FOR IMPLEMENTING A BLOCK-BASED BACKUP RESTART - A system and method for block-based restarts are described. A data storage system interfaces with one or more nodes of a network file system on which a volume is provided in order to read data stored on the volume on a block-by-block basis. Backup data sets capable of recreating the data on the volume are generated from the data blocks read from the volume. The system can interface with a backup memory resource and write the backup data sets to the backup memory resource in a sequential order. As the backup data sets are generated and written to the backup memory resource, restart checkpoints for the data set are also regularly generated and stored for use in restarting the backup process in the event of a recoverable failure in the transfer. | 05-05-2016 |
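The restartable backup loop in this abstract — sequential block-by-block writes with regularly stored restart checkpoints — can be sketched as follows. The checkpoint interval and the callback-style interfaces are assumptions for illustration, not the patented design.

```python
def backup_volume(read_block, num_blocks, write_backup, save_checkpoint,
                  start_block=0):
    """Back up a volume block by block in sequential order, persisting a
    restart checkpoint at a fixed interval so a recoverable failure can
    resume from the last checkpoint via start_block (illustrative sketch)."""
    CHECKPOINT_INTERVAL = 64  # assumed; a real system would tune this
    for i in range(start_block, num_blocks):
        write_backup(i, read_block(i))          # generate backup data set
        if (i + 1) % CHECKPOINT_INTERVAL == 0:
            save_checkpoint(i + 1)              # next block after restart
    save_checkpoint(num_blocks)                 # mark the transfer complete
```

On a recoverable failure, the caller re-invokes the function with `start_block` set to the last persisted checkpoint, so at most one interval's worth of blocks is re-read and re-written.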
20160132402 | TEST DEVICE AND METHOD FOR CONTROLLING THE SAME - A test device and a method for controlling the test device are disclosed. After a test is interrupted due to a malfunction of the test device, the test device continuously performs the interrupted testing. The test device for testing a biological material includes: a memory configured to store information which relates to progress of a test; and a controller which, if the test is interrupted due to a malfunction of the test device, is configured to continue performance of the test by using the information which relates to the test progress which is stored in the memory. | 05-12-2016 |
20160147612 | METHOD AND SYSTEM TO AVOID DEADLOCKS DURING A LOG RECOVERY - A method, medium, and system to receive a request to perform a log recovery to restore multiple database services; determine log backup entries corresponding to a target log position for a first database service of the multiple database services; read from a sequential stream device, by the first database service, the log backup entries corresponding to the target log position for the first database service; inform a second database service of the multiple database services that the first database service has concluded executing the log backup entries corresponding to the target log position for the first database service from the sequential stream device; assure that no resources of the sequential stream device are blocked by the first database service; and read log backup entries of the second database service corresponding to a target log position for the second database service from the sequential stream device. | 05-26-2016 |
20160147615 | DATABASE RECOVERY AFTER SYSTEM COPY - A system includes reception, at a target database system, of a request to recover a backup created by a source database system into the target database system, where the request comprises a system identifier of the source database system, determination of a backup tool configuration file associated with the source database system based on the system identifier of the source database system, request of a recovery of the backup into the target database system using the backup tool configuration file, copying of a backup catalog of the source database system into a storage location associated with the target database system, and appending of a system change marker to the copied backup catalog, wherein the system change marker comprises the system identifier of the source database system. | 05-26-2016 |
20160154701 | ENHANCED RESTART OF A CORE DUMPING APPLICATION | 06-02-2016 |
20160154708 | Safe Storing Data for Disaster Recovery | 06-02-2016 |
20160154710 | LIVE ROLLBACK FOR A COMPUTING ENVIRONMENT | 06-02-2016 |
20160154711 | FLASH COPY FOR DISASTER RECOVERY (DR) TESTING | 06-02-2016 |
20160154712 | FLASH COPY FOR DISASTER RECOVERY (DR) TESTING | 06-02-2016 |
20160162349 | Protection Status Determinations for Computing Devices - Systems, methods, and media that provide backup protection statuses for computing devices are provided herein. Some methods may include determining a backup status for a first computing device, assigning a protection status for the first computing device based upon a comparison of the backup status and a compliance schema for the first computing device, and transmitting the protection status to a monitoring device utilized by an end user. | 06-09-2016 |
20160162374 | SECONDARY STORAGE EDITOR - Systems and methods for storage pruning can enable users to delete, edit, or copy backed up data that matches a pattern. Storage pruning can enable fine-grained deletion or copying of matching files from backups stored in secondary storage devices. Systems and methods can also enable editing of metadata associated with backups so that when the backups are restored or browsed, the logical edits to the metadata can then be performed physically on the data to create a custom restore or a custom view. A user may perform operations such as renaming, deleting, modifying flags, and modifying retention policies on backed up items. Although the underlying data in the backup may not change, the view of the backup data when the user browses the backup data can appear to include the user's changes. A restore of the data can cause those changes to be performed on the backup data. | 06-09-2016 |
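The overlay idea in this abstract (logical metadata edits that change the browsed view without touching the underlying backup) can be sketched minimally; the class and method names are assumptions, not the patent's design.

```python
class BackupView:
    """Overlay logical metadata edits (rename, delete) on an immutable
    backup, so browsing reflects the user's changes while the
    underlying backed up data stays untouched.
    (Illustrative sketch; names are assumptions.)"""

    def __init__(self, items):
        self._items = dict(items)  # underlying backup: name -> data
        self._renames = {}         # logical edit: old name -> new name
        self._deleted = set()      # logical edit: names hidden from view

    def rename(self, old, new):
        self._renames[old] = new

    def delete(self, name):
        self._deleted.add(name)

    def browse(self):
        """Return the edited view; the backup itself is unchanged."""
        view = {}
        for name, data in self._items.items():
            if name in self._deleted:
                continue  # logically deleted, data still in the backup
            view[self._renames.get(name, name)] = data
        return view
```

A restore would apply the same recorded edits physically to the restored data.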
20160162376 | EVENT LOGGING AND ERROR RECOVERY - A method, computer program product, and system to control event logging and error recovery in a system including adapters, ports, and channels are described. The method includes storing a recovery threshold for each event type among a plurality of event types and storing a level-specific logging threshold for each event type, implementing event handlers for each of the channels, the ports, and the adapters of the system, and implementing a threshold manager for the events identified by the event handlers based on the level-specific logging threshold and the recovery threshold for each of the respective event types of each of the events. For any identified event corresponding with a given event type, implementing the threshold manager includes considering the recovery threshold and the level-specific logging threshold at every level regardless of a level at which the identified event is identified. | 06-09-2016 |
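The dual-threshold logic in this abstract (a level-specific logging threshold plus a per-event-type recovery threshold, both checked for every identified event) can be sketched as follows. This is a simplified illustration under assumed data shapes; the threshold keys and level names are not from the patent text.

```python
from collections import Counter

class ThresholdManager:
    """Track per-event-type counts and, for each identified event,
    consider both the level-specific logging threshold and the
    event-type recovery threshold.
    (Illustrative sketch; keys and names are assumptions.)"""

    def __init__(self, log_thresholds, recovery_thresholds):
        self._log = log_thresholds        # (level, event_type) -> count
        self._recover = recovery_thresholds  # event_type -> count
        self._counts = Counter()

    def handle(self, level, event_type):
        """Return the actions triggered by this identified event."""
        self._counts[event_type] += 1
        n = self._counts[event_type]
        actions = []
        if n >= self._log.get((level, event_type), float("inf")):
            actions.append("log")      # logging threshold reached
        if n >= self._recover.get(event_type, float("inf")):
            actions.append("recover")  # recovery threshold reached
        return actions
```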
20160170841 | Non-Disruptive Online Storage Device Firmware Updating | 06-16-2016 |
20160179627 | METHOD AND SYSTEM FOR CHECKPOINTING A GLOBAL STATE OF A DISTRIBUTED SYSTEM | 06-23-2016 |
20160179633 | RECOVERY OF LOCAL RESOURCE | 06-23-2016 |
20160196188 | FAILURE RECOVERY OF A TASK STATE IN BATCH-BASED STREAM PROCESSING | 07-07-2016 |
20160196189 | FAILURE MONITORING DEVICE, COMPUTER-READABLE RECORDING MEDIUM, AND FAILURE MONITORING METHOD | 07-07-2016 |
20160203060 | CLIENT DEPLOYMENT WITH DISASTER RECOVERY CONSIDERATIONS | 07-14-2016 |
20160203061 | DELTA REPLICATION OF INDEX FRAGMENTS TO ENHANCE DISASTER RECOVERY | 07-14-2016 |
20160253234 | PROGRAMMABLE LOGIC CONTROLLER | 09-01-2016 |
20160253246 | AUTO-DIDACTED HIERARCHICAL FAILURE RECOVERY FOR REMOTE ACCESS CONTROLLERS | 09-01-2016 |
20160378611 | TECHNOLOGIES FOR DATA CENTER ENVIRONMENT CHECKPOINTING - Technologies for environment checkpointing include an orchestration node communicatively coupled to one or more working computing nodes. The orchestration node is configured to administer an environment checkpointing event by transmitting a checkpoint initialization signal to each of the one or more working computing nodes that have been registered with the orchestration node. Each working computing node is configured to pause and buffer any presently executing applications, save checkpointing data (an execution state of each of the one or more applications) and transmit the checkpointing data to the orchestration node. Other embodiments are described and claimed. | 12-29-2016 |
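The checkpointing flow in this abstract (an orchestration node signals registered working nodes, which pause their applications, save execution state, and transmit it back) can be sketched minimally; the class and method names are illustrative assumptions.

```python
class WorkerNode:
    """Working computing node: on a checkpoint initialization signal it
    pauses and buffers its applications and returns their execution
    state. (Illustrative sketch; names are assumptions.)"""

    def __init__(self, name, apps):
        self.name, self.apps, self.paused = name, apps, False

    def checkpoint(self):
        self.paused = True  # pause and buffer presently executing apps
        # save checkpointing data: an execution state per application
        return {app: f"state-of-{app}" for app in self.apps}

class OrchestrationNode:
    """Registers working nodes and administers a checkpointing event by
    collecting each registered node's checkpointing data."""

    def __init__(self):
        self.registered, self.checkpoints = [], {}

    def register(self, node):
        self.registered.append(node)

    def run_checkpoint(self):
        for node in self.registered:  # checkpoint initialization signal
            self.checkpoints[node.name] = node.checkpoint()
        return self.checkpoints
```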
20160378615 | Tracking Health Status In Software Components - Tracking the health of components in a computer system is disclosed. A health score for software components is determined for each of a plurality of time periods. The computing system determines a problem software component whose health score indicates the unhealthy status at a certain point in time. The computing system determines a set of software components that are linked by dependency relationships to the problem software component. The computing system tracks events at which software components in the set have a health score that went from the healthy status to the unhealthy status. The computing system rolls back in time through the events to locate a software component in the set that was first in time to have its health score go from the healthy status to the unhealthy status. | 12-29-2016 |
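The rollback step in this abstract (scan recorded healthy-to-unhealthy transitions among the dependency-linked components to find the one that turned unhealthy first) reduces to a short sketch; the function name and event shape are assumptions.

```python
def first_unhealthy(events, dependency_set):
    """Given recorded (time, component) healthy-to-unhealthy
    transitions and the set of components linked by dependency to the
    problem component, roll back through the events to find the
    component that was first in time to turn unhealthy.
    (Illustrative sketch; names are assumptions.)"""
    for t, component in sorted(events):  # earliest transition first
        if component in dependency_set:
            return component             # likely root of the problem
    return None
```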
20180024891 | SYSTEM AND METHOD FOR MANAGING AND PRODUCING A DATASET IMAGE ACROSS MULTIPLE STORAGE SYSTEMS | 01-25-2018 |