Entries |
Document | Title | Date |
20080209257 | Modeller For A High-Security System - A modeller for a system for determining a residual error probability. The modeller includes a component modeller, which is adapted to receive an error probability and to model a change of the error probability due to a behaviour of a system component, in order to output a changed error probability as residual error probability. | 08-28-2008 |
20080215909 | APPARATUS, SYSTEM, AND METHOD FOR TRANSACTIONAL PEER RECOVERY IN A DATA SHARING CLUSTERING COMPUTER SYSTEM - The invention provides an apparatus, system, and method for cluster-wide peer recovery in the event of a computer failure. A failure of a first computer is detected and a recovery module is registered as the first computer. In one embodiment, the recovery module is a peer computer. The recovery module retrieves privately held undo log data through the authorized assumption of the failure identity associated with the failed first computer, backs out in-flight transaction updates of the first computer, and frees up data resources locked by the first computer. | 09-04-2008 |
20080229141 | Debugging method - The invention provides a debugging method applicable for an embedded system. The system includes a processor, a main memory and a debugging interface. A debugging program is first provided in the main memory. A debugging interruption is subsequently triggered to cause the processor to read the debugging program from the main memory and execute the debugging program. After execution, an execution result of the debugging program is stored into the main memory. The execution result is read and output via the debugging interface for further analysis. Because the architecture does not require a scan chain of ITR | 09-18-2008 |
20080235532 | REDUCING OVERPOLLING OF DATA IN A DATA PROCESSING SYSTEM - A computer implemented method, apparatus, and computer usable program code for reducing overpolled data in a data processing system is provided. A controller identifies a set of redundant measurements in a cycle. The controller then identifies a number of measurements repeated in the set of redundant measurements. The controller then computes a percentage of redundant polls based on the number of measurements repeated in the set of redundant measurements. The controller then computes a new polling period by reducing an original polling period by the percentage of redundant polls. | 09-25-2008 |
20080244306 | STORAGE SYSTEM AND MANAGEMENT METHOD FOR THE SAME - In a storage system performing remote copy, when a failure occurs in a storage apparatus, optimum redundancy configuration is reestablished promptly. In the storage system performing remote copy, when a storage apparatus detects a failure in its disk drive, a storage apparatus capable of providing a logical unit that can be a replacement for the logical unit affected by the failure in the disk drive is searched for based on storage apparatus performance, and a redundancy configuration is reestablished using a new logical unit the found storage apparatus provides. | 10-02-2008 |
20080288810 | METHOD AND SYSTEM FOR MANAGING RESOURCES DURING SYSTEM INITIALIZATION AND STARTUP - A method for managing a system's computer resources, includes: detecting an error condition in a computer resource; labeling the computer resource as not usable based on the error condition detected; reconfiguring the remaining computer resources to compensate for the detected error condition based on a failure mode policy; and wherein the failure mode policy manages the computer resources by one of: maximizing the amount of the remaining computer resources (mode 1), and maximizing the speed of the remaining computer resources (mode 2). | 11-20-2008 |
20080301486 | Customization conflict detection and resolution - A computer-implemented method is disclosed for managing customization conflicts. The method includes receiving an indication of a conflict. The conflict is indicative of an error created by a customization of a core application. A customization correction is identified as a remedy for the customization conflict. The customization correction is transmitted over a network to a party affiliated with a system affected by the customization conflict. | 12-04-2008 |
20080301487 | VIRTUAL COMPUTER SYSTEM AND CONTROL METHOD THEREOF - When a failure occurs in an LPAR on a physical computer under an SAN environment, a destination LPAR is set in another physical computer to enable migrating of the LPAR and setting change of a security function on the RAID apparatus side is not necessary. When a failure occurs in an LPAR generated on a physical computer under an SAN environment, configuration information including a unique ID (WWN) of the LPAR where the failure occurs is read, a destination LPAR is generated on another physical computer, and the read configuration information of the LPAR is set to the destination LPAR, thereby enabling migrating of the LPAR when the failure occurs, under the control of a management server. | 12-04-2008 |
20080307249 | Digital mixing system with double arrangement for fail safe - A digital mixing system has a console having a display and an operator for transmitting and receiving a control signal, an engine having input channels and output channels for mixing a plurality of audio signals fed from the input channels while exchanging the control signal with the console and feeding the mixed audio signals to the output channels, and peripheral input and output units connected to the input and output channels of the engine, respectively. The console and the engine are located remotely from each other, and a cable connecting therebetween is duplicated for the purpose of fail safe. The engine may be installed in pair. If a main engine fails, a sub engine backs up instantly to continue the mixing operation. The console may be also prepared in pair for the purpose of fail safe. | 12-11-2008 |
20080313489 | FLASH MEMORY-HOSTED LOCAL AND REMOTE OUT-OF-SERVICE PLATFORM MANAGEABILITY - A method, apparatus, and system are disclosed. In one embodiment, the method determines whether one or more manageability conditions are present in a computer system, and then invokes an out-of-service manageability remediation environment stored within a portion of a flash device in the computer system when one or more manageability conditions are present. | 12-18-2008 |
20080313490 | SYSTEM AND ARTICLE OF MANUFACTURE FOR EXECUTING INITIALIZATION CODE TO CONFIGURE CONNECTED DEVICES - Provided are a system and article of manufacture for executing initialization code to configure connected devices. A plurality of segments are provided to configure at least one connected device, wherein each segment includes configuration code to configure the at least one connected device. The segments are executed according to a segment order by executing the configuration code in each segment to perform configuration operations with respect to the at least one connected device. Completion of the segment is indicated in a memory in response to completing execution of the configuration operations for the segment. | 12-18-2008 |
20090024867 | Redundant data path - Disclosed are redundant data path(s) for transmission of graphical data between components in a graphical display system. The redundant data path(s) are used to transmit graphical data by at least two independent means, so that if a failure in one data path occurs, data transmitted via a separate data path can be used for display. The system is particularly advantageous for multiple-serial-module configurations. The redundant data path(s) minimize disruption of data display and make repair and maintenance of the display system more efficient. The invention includes apparatus for graphical display systems, and also includes methods of data transmission for graphical display systems, and methods of maintenance of graphical display systems. | 01-22-2009 |
20090031163 | SPEEDPATH REPAIR IN AN INTEGRATED CIRCUIT - A circuit comprises a first plurality of transistors of a first channel length disposed along a speedpath, the first plurality of transistors providing a first timing performance. The circuit also comprises a second plurality of transistors of a second channel length having an expected equivalent functionality as the first plurality of transistors and disposed in parallel with the first plurality of transistors along the speedpath, wherein the second channel length is different from the first channel length. In addition, the circuit comprises an element configured to selectively replace the first plurality of transistors with the second plurality of transistors in response to a determination that the first timing performance of the first plurality of transistors fails a timing requirement of the speedpath. In one embodiment, the second channel length is a sub-minimal geometry with respect to the first channel length. | 01-29-2009 |
20090031164 | Method for Self-Diagnosing Remote I/O Enclosures with Enhanced FRU Callouts - A method, apparatus, and computer instructions for self-diagnosing remote I/O enclosures with enhanced FRU callouts. When a failure is detected on a RIO drawer, a data processing system uses the bulk power controller to provide an alternate path, rather than using the existing RIO links, to access registers on the I/O drawers. The system logs onto the bulk power controller, which provides a communications path between the data processing system and the RIO drawer. The communications path allows the data processing system to read all of the registers on the I/O drawer. The register information in the I/O drawer is then analyzed to diagnose the I/O failure. Based on the register information, the data processing system identifies a field replacement unit to repair the I/O failure. | 01-29-2009 |
20090031165 | Method for Self-Diagnosing Remote I/O Enclosures with Enhanced FRU Callouts - A method, apparatus, and computer instructions for self-diagnosing remote I/O enclosures with enhanced FRU callouts. When a failure is detected on a RIO drawer, a data processing system uses the bulk power controller to provide an alternate path, rather than using the existing RIO links, to access registers on the I/O drawers. The system logs onto the bulk power controller, which provides a communications path between the data processing system and the RIO drawer. The communications path allows the data processing system to read all of the registers on the I/O drawer. The register information in the I/O drawer is then analyzed to diagnose the I/O failure. Based on the register information, the data processing system identifies a field replacement unit to repair the I/O failure. | 01-29-2009 |
20090044041 | Redundant Data Bus System - A redundant data bus system has two data buses between which at least two failsafe control devices are connected. The two data buses operate with the same data bus protocol at essentially the same transmission frequency, and safety-related control messages are transmitted in parallel via both data buses and processed in the control devices. Each control device performs a separate control task via assigned control software. Each control device has two microcomputers which operate independently of one another and which have software for both the first and the second control tasks. When one control device fails, the control task can also be performed by the other. One data interface is arranged between the two microcomputers, via which result data calculated from the safety-related control messages can be exchanged and compared with one another. Based on such comparison a decision means determines which microcomputer or control device carries out a control task. | 02-12-2009 |
20090044042 | Device Management Method, Analysis System Used for the Device Management Method, Data Structure Used in Management Database, and Maintenance Inspection Support Apparatus Used for the Device Management Method - Either a complete overhaul for replacing with recommended devices the entire number of devices in a large group of managed devices T, or a partial overhaul for repairing or replacing with recommended devices only those managed devices T that are malfunctioning is selectively performed as an initial overhaul. A complete test involving the entire number of the managed devices T is then periodically performed to determine whether the devices are operating normally or have a malfunction. Any devices found to be malfunctioning during any complete test are repaired or replaced with recommended devices. | 02-12-2009 |
20090049329 | REDUCING LIKELIHOOD OF DATA LOSS DURING FAILOVERS IN HIGH-AVAILABILITY SYSTEMS - A method, system, and computer program product for reducing likelihood of data loss during performance of failovers in a high-availability system comprising a primary system and a standby system are provided. The method, system, and computer program product provide for defining a halt duration, periodically determining a halt end time, halting data modifications at the primary system responsive to failure of data replication to the standby system, resuming data modifications at the primary system responsive to a last determined halt end time being reached or data replication to the standby system resuming, and responsive to the primary system failing prior to a previously determined halt end time, determining that a failover to the standby system will not result in data loss on the standby system with respect to the primary system. | 02-19-2009 |
20090049330 | METHOD AND SYSTEM FOR VIRTUAL REMOVAL OF PHYSICAL FIELD REPLACEABLE UNITS - A method of virtually removing field replaceable units (FRUs) from a computer system during concurrent maintenance operations. Firmware within a flexible service processor (FSP) assigns unique resource identification (RID) numbers to each FRU in the computer system. The firmware collects vital product data (VPD) for each FRU and generates a duplicate test shared library, which is stored in a memory directory corresponding to the FSP. When the firmware receives input from a graphical user interface (GUI) that includes at least a first FRU selected for virtual removal from the computer system, the firmware adds the RID number of the selected FRU to the memory directory and recollects VPD. The FSP subsequently ignores any FRUs corresponding to RID numbers stored in the memory directory during operation of the computer system. | 02-19-2009 |
20090049331 | Error propagation control within integrated circuits - A method of selecting where error detection circuits should be placed within an integrated circuit uses simulation of a reference and test design with errors injected into the test design and then fan out analysis performed upon those injected errors to identify error propagation characteristics. Thus, registers at which propagated errors are highly likely to manifest themselves or which protect key architectural state, or which protect state not otherwise protected can be identified and so an efficient deployment of error detection mechanisms achieved. Within an integrated circuit output signals from inactive circuit elements may be subject to isolation gating in dependence upon a detected current state of the integrated circuit. Thus, inactive circuit elements in which soft errors occur have inappropriate output signals gated from reaching the rest of the integrated circuit and thus reducing erroneous operation. | 02-19-2009 |
20090070621 | BROADCAST RECEIVING DEVICE - A broadcast receiving device includes a card slot, a fan, a temperature sensor, a memory component and a control unit. The card slot accepts an IC card. The fan rotates to cool the IC card. The temperature sensor measures a first temperature. The memory component stores correlation information indicating a correlation between the first temperature and a second temperature of the IC card. The control unit acquires the second temperature based on the first temperature and the correlation information. The control unit determines if the second temperature exceeds a predetermined temperature. The control unit switches from a first output mode, in which an audio-video signal is outputted via the IC card, to a second output mode, in which the audio-video signal is outputted by bypassing the IC card, when the control unit determines that the second temperature exceeds the predetermined temperature. | 03-12-2009 |
20090070622 | Multi nodal Computer System and Method for Handling Check Stops in the Multi nodal Computer System - A new multi nodal computer system comprising a number of nodes on which chips of different types reside. The new multi nodal computer system is characterized in that there is one clock chip per node, each clock chip controlling only the chips residing on that node, said chips being appropriate for sending a check stop request to the associated clock chip in case of a malfunction. A new check stop handling method is characterized in that, depending on the source of the check stop request, the clock chip that received the check stop request initiates a system check stop, a node check up, or a chip check stop. | 03-12-2009 |
20090083573 | Method for detecting sources of faults or defective measuring sensors by positive case modeling and partial suppression of equations - A method establishes a global system model equation including model equations, which contain parameters, of individual components that form the global system. According to said method, the parameters of the individual components are detected using sensor values from the sensors that are allocated to the individual components and it is determined whether it is possible to adapt the parameters to the sensor values and to solve the global system model equation. | 03-26-2009 |
20090083574 | METHOD FOR OPERATING A MANAGEMENT SYSTEM OF FUNCTION MODULES - Methods for operating a management system that manages a large number of first function modules and second function modules. An inhibitor module I sets first control statuses to designating blocking when associated events are detected by an event detecting device, and then the management system no longer makes associated first function modules available for execution. The inhibitor module I sets second control statuses to designating executable when associated events are detected by an event detecting device, and then the management system makes associated second function modules available for execution. | 03-26-2009 |
20090094478 | RECOVERY OF APPLICATION FAULTS IN A MIRRORED APPLICATION ENVIRONMENT - Provided are a method, system, and article of manufacture for recovery of application faults in a mirrored application environment. Application events are recorded at a primary system executing an instruction for an application. The recorded events are transferred to a buffer. The recorded events are transferred from the buffer to a secondary system, wherein the secondary system implements processes indicated in the recorded events to execute the instructions indicated in the events. An error is detected at the primary system. A determination is made of a primary order in which the events are executed by processes in the primary system. A determination is made of a modified order of the execution of the events comprising a different order of executing the events than the primary order in response to detecting the error. The secondary system processes execute the instructions indicated in the recorded events according to the modified order. | 04-09-2009 |
20090100288 | FAST SOFTWARE FAULT DETECTION AND NOTIFICATION TO A BACKUP UNIT - A method and system for quickly informing a backup unit that a primary unit has failed. Normally an exception handler is activated when a software failure occurs and network controller chips or the ASIC interface to a signal bus can operate even though there is a software failure. A software failure notification packet is programmed and stored in a location that is not affected by a software system failure. When a software failure occurs, control is shifted to the exception handler. The exception handler sends a pre-established and pre-addressed packet to the network controller card which transmits this packet to the backup unit. Upon receipt of the packet, the backup unit goes into operation. In some alternate embodiments that include multiple line cards in a single unit, the exception handler sends a signal to a backup unit via a signal bus or a data bus. | 04-16-2009 |
20090125752 | Systems And Methods For Managing A Redundant Management Module - Systems and methods for managing a redundant management module are provided. In this regard, a representative system, among others, includes first and second management modules that are configured to manage a computing device; and a programmable logic device that is configured to: instruct the first management module to manage the computing device responsive to detecting that the first management module is ready to manage the computing device, and instruct the second management module to manage the computing device responsive to detecting that the first management module failed to manage the computing device. | 05-14-2009 |
20090132849 | Method and Computer Program for Selecting Circuit Repairs Using Redundant Elements with Consideration of Aging Effects - A method and computer program for selecting circuit repairs using redundant elements with consideration of aging effects provides a mechanism for raising short-term and long-term performance of memory arrays beyond present levels/yields. Available redundant elements are used as replacements for selected elements in the array. The elements for replacement are selected by BOL (beginning-of-life) testing at a selected operating point that maximizes the end-of-life (EOL) yield distribution as among a set of operating points at which post-repair yield requirements are met at beginning-of-life (BOL). The selected operating point is therefore the “best” operating point to improve yield at EOL for a desired range of operating points or maximize the EOL operating range. For a given BOL repair operating point, the yield at EOL is computed. The operating point having the best yield at EOL is selected and testing is performed at that operating point to select repairs. | 05-21-2009 |
20090132850 | ERROR HANDLING SCHEME FOR TIME-CRITICAL PROCESSING ENVIRONMENTS - As a result of detecting an error, command routing logic for device driver logic is reconfigured so that command processing logic of the device driver is not invoked and to return from commands in a manner indicative of successful completion of command processing. | 05-21-2009 |
20090138750 | REDUNDANT 3-WIRE COMMUNICATION SYSTEM AND METHOD - A redundant communication system and method for providing data communication between a first computing node and a second computing node. A transmitter is provided as part of the first computing node. A receiver is provided as part of the second computing node. A first signal line carries a first data signal. The first signal line electrically couples the transmitter with the receiver. A second signal line carries a second data signal redundant to the first signal. The second signal line electrically couples the transmitter with the receiver. The receiver evaluates the first data signal to determine the presence of an error and the second node uses the second data signal if an error is detected in the first data signal. | 05-28-2009 |
20090144579 | Methods and Apparatus for Handling Errors Involving Virtual Machines - A virtual machine monitor (VMM) in a data processing system handles errors involving virtual machines (VMs) in the processing system. For instance, an error manager in the VMM may detect an uncorrectable error involving a component associated with a first VM in the processing system. In response to detection of that error, the error manager may terminate the first VM, while allowing a second VM in the processing system to continue operating. In one embodiment, the error manager automatically determines which VM is affected by the uncorrectable error, in response to detecting the uncorrectable error. The error manager may also automatically spawn a new VM to replace the first VM, if the processing system has sufficient resources to support the new VM. Other embodiments are described and claimed. | 06-04-2009 |
20090150715 | DELIVERY OF STREAMS TO REPAIR ERRORED MEDIA STREAMS IN PERIODS OF INSUFFICIENT RESOURCES - In one embodiment, a method includes ingesting a program stream from a program source on a first channel. The method also includes storing the program stream, and receiving notification from a client of unrecoverable error in a stream received at the client. The unrecoverable error corresponds to at least a portion of the stored program stream. The method also includes distributing the corresponding portion of the stored program stream to the client on a second channel in response to the notification. | 06-11-2009 |
20090158081 | Failover Of Blade Servers In A Data Center - Failover of blade servers in a data center including powering off a failing blade server by a system management server through a blade server management module (‘BSMM’) managing the failing blade server, the failing blade server characterized by a machine type, one or more network addresses, and one or more storage addresses, the addresses being virtual addresses; identifying, by the system management server from a pool of standby blade servers, a replacement blade server, the replacement blade server managed by a BSMM; assigning, by the system management server through the BSMM managing the replacement blade server, the one or more network addresses and the one or more storage addresses of the failing blade server to the replacement blade server, including enabling in the replacement blade server the assigned addresses; and powering on the replacement blade server by the system management server through the BSMM managing the replacement blade server. | 06-18-2009 |
20090177911 | APPARATUS, SYSTEM, AND METHOD TO PREVENT QUEUE STALLING - An apparatus, system, and method are disclosed to prevent queue stalling. The apparatus to prevent queue stalling is provided with a plurality of modules configured to functionally execute the necessary steps of detecting a connection failure on a first logical path, wherein the first logical path is associated with a first entry in a queue, and wherein the first logical path is configured to define a communication path between an entity associated with a first entry in the queue and a queue manager, scanning the queue to identify a second entry associated with a second logical path in response to the connection failure, and advancing the second entry to a position within the queue that is ahead of the first entry. These modules in the described embodiments include a detection module, a scanning module, and an advancing module. | 07-09-2009 |
20090177912 | RECONFIGURABLE CIRCUIT WITH REDUNDANT RECONFIGURABLE CLUSTER(S) - A reconfigurable circuit having redundant reconfigurable clusters is described herein. | 07-09-2009 |
20090193287 | Memory management method, medium, and apparatus based on access time in multi-core system - A memory management method and apparatus based on an access time in a multi-core system. In the memory management method of the multi-core system, it is easy to estimate the execution time of a task to be performed by a processing core and it is possible to secure the same memory access time when a task is migrated between processing cores by setting a memory allocation order according to distances from the processing cores to the memories in correspondence with the processing cores, translating a logical address to be processed by one of the processing cores according to the set memory allocation order into a physical address of one of the memories, and allocating a memory corresponding to the translated physical address to the processing core. | 07-30-2009 |
20090222686 | SELF MAINTAINED COMPUTER SYSTEM UTILIZING ROBOTICS - A self-maintained computer system includes a computer system having a plurality of interconnected computer components and a robot associated with the computer system that is configured to carry a spare computer component and further configured to replace a computer component of the computer system with the spare computer component. The robot automatically replaces an individual computer component when a failure of the individual computer component is detected. | 09-03-2009 |
20090235110 | INPUT/OUTPUT CONTROL METHOD, INFORMATION PROCESSING APPARATUS, COMPUTER READABLE RECORDING MEDIUM - An input/output control method for an information processing apparatus that is connected to an input/output device through first and second paths, monitors an input/output response to an input/output request issued to the input/output device through the first path, and performs a timeout process when the input/output response is not present within a timeout time. The input/output control method includes predicting a timeout time to the input/output request on the basis of statistic information that the information processing apparatus obtains by monitoring the input/output response, detecting an error on the first path when an input/output response to the input/output request is not present within the predicted timeout time and disconnecting the first path when the error on the first path is detected. | 09-17-2009 |
20090265577 | Method of managing paths for an externally-connected storage system and method of detecting a fault site - Provided is a method of controlling a computer system that includes: a computer; a first storage device connected to the computer via a first path and a second path; and a second storage device externally-connected to the first storage system via a third path and connected to the computer via a fourth path, the first storage device providing a first storage area to the computer, the second storage device including a second storage area corresponding to the first storage area, the method including: judging whether or not a fault has occurred in at least one of the first to fourth paths; selecting a path used for access to the first or second storage area; and transmitting the access request for the first or second storage area by using the selected path. Accordingly, in the computer system, an application can be prevented from being stopped despite a fault in a path. | 10-22-2009 |
20090276656 | STORAGE DEVICE AND RECOVERY METHOD - A storage device including a plurality of storage units for storing data dispersively among the storage units, includes: a processor for controlling boot-up of the storage units; and a memory for storing operation history indicative of the sequence of any failure causing any of the storage units to become inoperative, the processor controlling reboot-up of the storage units, when a plurality of the storage units becomes inoperative on account of a plurality of failures, in accordance with a process including: determining the order of the reboot-up of the storage units that is the reverse of the sequence of the failures causing the storage units to become inoperative, in reference to the operation history in the memory; and rebooting the inoperative storage units successively in accordance with the determined order. | 11-05-2009 |
20090287953 | STORAGE SYSTEM - A storage system encrypts plain text from an external device and stores the cryptogram into a disk unit, decrypts stored data in the disk unit and transmits decrypted text to the external device. The plain and decrypted text must be in agreement when seen from the external device. If a failure occurs in the encrypting or decrypting process, the plain and decrypted text disagree. The storage system includes an encryption unit for encrypting first data, a decryption unit for decrypting the encrypted data into second data, and a comparison unit for comparing the first and second data. When the first and second data do not agree, the first data is encrypted by a different encryption unit and the encrypted data is decrypted into third data, whereupon the first and third data are compared. When the first and third data do not agree, a failure report is sent. | 11-19-2009 |
20090300404 | Managing Execution Stability Of An Application Carried Out Using A Plurality Of Pluggable Processing Components - Methods, apparatus, and products are disclosed for managing execution stability of an application carried out using a plurality of pluggable processing components. Managing execution stability of an application includes: receiving, by an application manager, component stability metrics for a particular pluggable processing component; determining, by the application manager, that the particular pluggable processing component is unstable in dependence upon the component stability metrics for the particular pluggable processing component; and notifying, by the application manager, a system administrator that the particular pluggable processing component is unstable. | 12-03-2009 |
20090300405 | BACKUP COORDINATOR FOR DISTRIBUTED TRANSACTIONS - A primary coordinator generates a prepare message for a two-phase commit distributed transaction, the prepare message including an address of a backup coordinator. The primary coordinator maintains a transaction log of the distributed transaction, wherein the transaction log is accessible to both the primary coordinator and the backup coordinator. The prepare message is sent to a plurality of participants. The primary coordinator fails over to the backup coordinator without interrupting the distributed transaction. | 12-03-2009 |
20090300406 | INFORMATION PROCESSING SYSTEM AND INFORMATION PROCESSING DEVICE - An information processing system includes a plurality of server devices including a main server device and a standby server device, and a client device coupled to said server devices via a network. The client device includes a monitor unit to asynchronously monitor an operation state of each of the plurality of server devices, and a display control unit to acquire a content from the main server device and display the content in a display area on a screen once the monitor unit detects an operation state of the main server device is active, and to acquire from the standby server device a content for a process that the standby server device has taken over from the main server device and display the content on the screen once the monitor unit detects an operation state of the standby server device is switched from standby state to active state. | 12-03-2009 |
20090313497 | Failover Enabled Telemetry Systems - The present invention discloses several techniques for providing failover in telemetry systems. The invention allows the continuous and uninterrupted connection between gathering units and a central data collection server, thereby ensuring the proper operation of telemetry systems. | 12-17-2009 |
20090319822 | APPARATUS AND METHOD TO MINIMIZE PERFORMANCE DEGRADATION DURING COMMUNICATION PATH FAILURE IN A DATA PROCESSING SYSTEM - A method to minimize performance degradation during communication path failure in a data processing system, comprising a host computer, a storage controller, and a plurality of physical communication paths in communication with the host computer and the storage controller, where the method establishes a threshold communication path error rate, and determines an (i)th actual communication path error rate for an (i)th physical communication path, wherein that (i)th communication path is one of the plurality of physical communication paths. If the (i)th actual communication path error rate is greater than the threshold communication path error rate, the method discontinues use of the (i)th physical communication path. | 12-24-2009 |
20090319823 | RUN-TIME FAULT RESOLUTION FROM DEVELOPMENT-TIME FAULT AND FAULT RESOLUTION PATH IDENTIFICATION - Embodiments of the present invention address deficiencies of the art in respect to fault handling and provide a method, system and computer program product for run-time fault resolution from development time fault and fault resolution path identification. In an embodiment of the invention, a method for run-time fault resolution from development time fault and fault resolution path identification can be provided. The method can include detecting a recoverable fault condition in a computing system, selecting a fault resolution path from amongst a multiple development time specified fault resolution paths to match the recoverable fault condition, prompting an operator with the selected fault resolution path, and resuming operation of the computing system without restart subsequent to the operator performing the selected resolution fault path. | 12-24-2009 |
20100005335 | MICROPROCESSOR INTERFACE WITH DYNAMIC SEGMENT SPARING AND REPAIR - A processing device, system, method, and design structure for providing a microprocessor interface with dynamic segment sparing and repair. The processing device includes drive-side switching logic including driver multiplexers to select driver data for transmitting on link segments of a bus, and receive-side switching logic including receiver multiplexers to select received data from the link segments of the bus. The bus includes multiple data link segments, a clock link segment, and at least two spare link segments selectable by the drive-side switching logic and the receive-side switching logic to replace one or more of the data link segments and the clock link segment. | 01-07-2010 |
20100017643 | Cluster system and failover method for cluster system - Provided is a failover method for a cluster system for realizing smooth failover of the guest OS's, even when there are many guest OS's, while reducing consumption of computer resources of a server. Smooth failover is realized by preventing competition during failover even when the number of guest OS's is increased. In a cluster configuration in which a slave/master cluster program is operated in a guest OS/host OS, the master cluster program ( | 01-21-2010 |
20100083030 | REPAIRING HIGH-SPEED SERIAL LINKS - A method and system for repairing high speed serial links is provided. The system includes a first electronic component, connected to at least a second electronic component via at least one link. At least one of the first or second electronic components has a link controller. The link controller is configured to repair serial links by detecting a link error and mapping out individual lanes of a link where the link error is detected. The link controller resumes operation, i.e., transmission of data and continues to monitor the lanes for errors. If and when additional link errors occur, the link controller identifies the lanes in which the link error occurs and deactivates those lanes. The deactivated lane(s) cannot be used in further transmissions which, in turn, reduces the occurrence of intermittent link errors. | 04-01-2010 |
20100083031 | METHOD FOR QUEUING MESSAGE AND PROGRAM RECORDING MEDIUM THEREOF - According to an aspect of the embodiment, a message queuing unit of the message processing apparatus stores received messages. A message reception control unit receives a notification of destinations of messages, extracts only the messages for current processes based on a process control table recording current or standby of processes, and transmits the messages to corresponding applications as current processes. On the other hand, the message reception control unit does not transmit the messages to the applications as standby processes. | 04-01-2010 |
20100088539 | System and Method for Providing Fault Tolerant Processing in an Implantable Medical Device - Embodiments herein generally relate to implantable medical devices and, specifically, to a system and method for providing fault tolerant processing in an implantable medical device. In an embodiment a system for providing fault tolerant processing in an implantable medical device is provided. The system can include an implantable medical device comprising a processor and memory store configured to execute a plurality of threads, temporal and spatial constraints assigned to one or more of the threads, and a kernel. The kernel can include a scheduler and a thread monitor configured to monitor execution of threads against the temporal and spatial constraints, and further configured to issue a response upon violation of either of the constraints by one of the plurality of threads. In an embodiment a method for providing fault tolerant processing in an implantable medical device is provided. Other embodiments are also included herein. | 04-08-2010 |
20100095147 | RECONFIGURABLE CIRCUIT WITH REDUNDANT RECONFIGURABLE CLUSTER(S) - Reconfigurable circuits, methods, and systems with reconfigurable interconnect devices, clusters of reconfigurable logic devices, and a programming interface configured to receive configuration data to configure a first combination of the reconfigurable interconnect and logic devices to implement a circuit, and to remap a portion of the received configuration data, corresponding to a defective cluster, from the defective cluster to another non-defective cluster of the plurality of clusters to configure a second combination of the reconfigurable interconnect and logic devices to implement the circuit. | 04-15-2010 |
20100100760 | COMPUTER SYSTEM AND BOOT CONTROL METHOD - When a primary computer is taken over to a secondary computer in a redundancy configuration computer system where booting is performed via a storage area network (SAN), a management server delivers an information collecting/setting program to the secondary computer before the user's operating system of the secondary computer is started. This program assigns a unique ID (World Wide Name), assigned to the fibre channel port of the primary computer, to the fibre channel port of the secondary computer to allow a software image to be taken over from the primary computer to the secondary computer. | 04-22-2010 |
20100122111 | DYNAMIC PHYSICAL AND VIRTUAL MULTIPATH I/O - Embodiments that dynamically manage physical and virtual multipath I/O are contemplated. Various embodiments comprise one or more computing devices, such as one or more servers, having at least two HBAs. At least one of the HBAs may be associated with a virtual I/O server that employs the HBA to transfer data between a plurality of virtual clients and one or more storage devices of a storage area network. The embodiments may monitor the availability of the HBAs, such as monitoring the HBAs for a failure of the HBA or a device coupled to the HBA. Upon detecting the unavailability of one of the HBAs, the embodiments may switch, dynamically, from the I/O path associated with the unavailable HBA to the alternate HBA. | 05-13-2010 |
20100125747 | Hardware Recovery Responsive to Concurrent Maintenance - Disclosed is a computer implemented method, data processing system, and apparatus to respond to detection of a hardware interface error on a system bus, for example, during a concurrent maintenance operation. The service processor may receive an error on the system bus. The error identifies at least one field replaceable unit and may inhibit the suppression of clock signal to the field replaceable unit. The service processor adds an identifier of the field replaceable unit to an eligible Field Replaceable Unit (FRU) list. The service processor recursively adds at least one field replaceable unit that the field replaceable unit depends upon. The service processor suppresses the clock signal to the field replaceable unit. The service processor inhibits tagging the field replaceable unit as unusable for next initial program load. | 05-20-2010 |
20100131793 | SMALL STORE SYSTEM - Systems and methods of use are disclosed that default routing of an ID, read by an ID reader as part of a purchase transaction in a retail store, to a first computer system (MCS) instead of the POS computer system for the retail store. The first computer system processes the ID, and the POS computer system receives the results of the processing in the form of IDs recognizable by the POS computer system and for which the POS computer system has associated costs. | 05-27-2010 |
20100146325 | Systems and methods for correcting software errors - Systems and methods consistent with the invention may include receiving an indication that a software error was detected during operation of the application program, generating an error message based on the software error, the error message including an error signature, comparing the error signature with information stored in a patch library database to identify a corresponding correction patch, and correcting, when the corresponding correction patch is identified, the software error by applying the corresponding correction patch. | 06-10-2010 |
20100162030 | METHOD AND APPARATUS FOR INITIATING CORRECTIVE ACTION FOR AN ELECTRONIC TERMINAL - A method and device are provided for initiating corrective actions for a terminal, such as an ATM. A method of initiating corrective actions for a terminal comprises, monitoring a fault status of a first component, detecting a fault status of the first component with a first trigger plug-in, activating a first action plug-in based upon the detected fault status of the first component, and recycling the first component. | 06-24-2010 |
20100174938 | Method for Operating an Industrial Automation System Comprising a Plurality of Networked Computer Units, and Industrial Automation System - In an automation system comprising a plurality of networked computer units, functions of the automation system are provided by services of the computer units in which the services are configured and activated using system configuration data and service configuration data. The system configuration data comprise information for assigning services to providing computer units and for assigning dependencies between services. The system configuration data are accepted and checked by a first service of a control and monitoring unit of the automation system and are forwarded to target computer units. The system configuration data are checked by second services provided by the target computer units and are used to provide resources needed to activate local services. The service configuration data are transmitted to the target computer units following system configuration. A local service is activated by a target computer unit assigned to the service using the service configuration data. | 07-08-2010 |
20100185893 | Topology Collection Method and Dual Control Board Device For A Stacking System - The invention provides a topology collection method and dual control board device applicable to a stacking system comprising dual control board devices. A master control board of a dual control board device advertises through a stack port the topology information of the member device in which the master control board resides, including information about the master control board and, if a slave control board is present, information about the slave control board; and stores the topology information or updates the existing topology information upon receiving the topology information of the stacking system through the stack port, and backs up the stored topology information of the stacking system to the slave control board after the slave control board is inserted. This invention is applicable for collecting the topology information of a stacking system comprising distributed dual control board devices. | 07-22-2010 |
20100205478 | RESOURCE INTEGRITY DURING PARTIAL BACKOUT OF APPLICATION UPDATES - At least one physically inconsistent system resource is identified in response to a failure of an application, where the physically inconsistent system resource was left in a physically inconsistent state as a result of the failure of the application. Available backout operations for any system resources updated by the failed application other than the physically inconsistent system resource are ignored. An automated partial backout of the physically inconsistent system resource is performed. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract. | 08-12-2010 |
20100205479 | INFORMATION SYSTEM, DATA TRANSFER METHOD AND DATA PROTECTION METHOD - Availability of an information system including a storage system that performs remote copy between two or more storage apparatuses and a host computer using such storage system is improved. A third storage apparatus including a third volume is coupled to a first storage apparatus, a fourth storage apparatus including a fourth volume is coupled to a second storage apparatus, the first and third storage apparatuses perform remote copy of copying data stored in a first volume to the third volume, the first and second storage apparatuses perform remote copy of copying data stored in the first volume to a second volume, and the third and fourth storage apparatuses perform remote copy of copying data stored in the third volume to the fourth volume. | 08-12-2010 |
20100218032 | REDUNDANT SYSTEM, CONTROL APPARATUS, AND CONTROL METHOD - A redundant system includes a redundant apparatus and a control unit that controls power supplied to the redundant apparatus. The redundant apparatus includes a state management unit that manages an operational state of the redundant apparatus, and a response unit that returns the operational state to the control unit. The control unit includes a first requesting unit that requests a redundant apparatus that operates as an operation system for the operational state information, a first determination unit that determines whether the response to the request is returned within a predetermined time, a second determination unit that determines whether the operational state is normal if the response is returned within the predetermined time, and a shutdown unit that shuts down the power supply to the redundant apparatus, if the second determination unit determines that the operational state is not normal. | 08-26-2010 |
20100251004 | VIRTUAL MACHINE SNAPSHOTTING AND DAMAGE CONTAINMENT - Some embodiments provide a system that manages the execution of a virtual machine. During operation, the system takes a series of snapshots of the virtual machine during execution of the virtual machine. If an abnormal operation of the virtual machine is detected, the system spawns a set of snapshot instances from one of the series of snapshots, wherein each of the snapshot instances is executed with one of a set of limitations. Next, the system determines a source of the abnormal operation using a snapshot instance from the snapshot instances that does not exhibit the abnormal operation. Finally, the system updates a state of the virtual machine using the snapshot instance. | 09-30-2010 |
20100251005 | MULTIPROCESSOR SYSTEM AND FAILURE RECOVERING SYSTEM - A multiprocessor system includes a plurality of nodes, each of which includes a plurality of processors, a plurality of memories, and first and second node controllers. Unique identifiers are assigned to all the components. Each of the first and second node controllers includes: each of first and second request control sections configured to determine the identifier of a transmission destination of a request based on a memory address of an access destination of the request; each of first and second registers configured to hold in the first request control section, the identifier of the transmission destination of the request; a first routing table configured to specify one of the first request control section and the second request control section as an output destination of the request based on the identifier held by the first register, the identifier held by the second register, the identifier of the transmission destination of the request, when receiving the request, and a second routing table configured to specify a signal line for the identifier of the transmission destination of the request based on the identifier of the transmission destination which is determined by the first request control section or the second request control section, to transmit the request. | 09-30-2010 |
20100251006 | SYSTEM AND METHOD FOR FAILOVER OF GUEST OPERATING SYSTEMS IN A VIRTUAL MACHINE ENVIRONMENT - A system and method provides for failover of guest operating systems in a virtual machine environment. During initialization of a computer executing a virtual machine operating system, a first guest operating system allocates a first memory region within a first domain and notifies a second guest operating system operating in a second domain of the allocated first memory region. Similarly, the second guest operating system allocates a second region of memory within the second domain and notifies the first operating system of the allocated second memory region. In the event of a software failure affecting one of the guest operating systems, the surviving guest operating system assumes the identity of the failed operating system and utilizes data stored within the shared memory region to replay to storage devices to render them consistent. | 09-30-2010 |
20100268983 | Computer System and Method of Control thereof - A computer system is described having a plurality of hardware resources, a plurality of virtual partitions having allocated thereto some of those of hardware resources or parts thereof, said virtual partitions having an operating system loaded thereon, a partition monitoring application layer, which is capable of determining whether one or more of the partitions has failed, wherein said partition monitoring application layer also includes at least one hardware resource diagnostic function which is capable of interrogating at least one of the hardware resources allocated to a partition after failure of said partition, and a hardware resource reallocation function which is triggered when the hardware diagnostic function determines that one or more particular hardware resources associated with a failed partition is healthy, and which reallocates that healthy resource to an alternate healthy partition. A method of reallocating such hardware resources is also disclosed. | 10-21-2010 |
20100306571 | MULTIPLE MEDIA ACCESS CONTROL (MAC) ADDRESSES - A method for providing multiple media access control (MAC) addresses in a device of a master/slave system may include providing a first MAC address in a MAC address storage of the device. The method may also include providing a second MAC address in a multicast table entry of a multicast hash filter of the device. | 12-02-2010 |
20100318834 | Method and device for avionic reconfiguration - The invention in particular has as an object a method and a device for reconfiguration of an avionic system comprising at least two computers and a software application, in an aircraft, each of the said computers being adapted for running the software application, the aircraft further comprising a module for detection of failure of at least one of the said computers as well as a loading module making it possible to load the software application into each of the computers. After an information item relating to the state of one of the computers has been received from the detection module, a failure of one of the computers is detected according to the information item received. A configuration according to which at least one application run by the faulty computer is run by another of the computers then is determined. The system then is reconfigured according to the determined configuration, the reconfiguration comprising the transmission of a request to the loading module to load the application run by the faulty computer into the other computer. | 12-16-2010 |
20100325471 | HIGH AVAILABILITY SUPPORT FOR VIRTUAL MACHINES - A computer implemented method, a tangible computer storage medium, and a data processing system provide high availability support for virtual machines in a logical partitioned platform. A monitoring system detects a failure in the virtual machine. Partition management firmware then restarts the virtual machine in a consistency failover image node utilizing a consistency failover image. If a subsequent failure of the virtual machine is detected within a predetermined time, partition management firmware restarts the virtual machine in a boot failover image node utilizing a boot failover image. | 12-23-2010 |
20100325472 | Autonomous System State Tolerance Adjustment for Autonomous Management Systems - In general, the techniques of this invention are directed to determining whether a component failure in a distributed computing system is genuine. In particular, embodiments of this invention analyze monitoring data from other application nodes in a distributed computing system to determine whether the component failure is genuine. If the component failure is not genuine, the embodiments may adjust a fault tolerance parameter that caused the component failure to be perceived. | 12-23-2010 |
20110035618 | AUTOMATED TRANSITION TO A RECOVERY KERNEL VIA FIRMWARE-ASSISTED-DUMP FLOWS PROVIDING AUTOMATED OPERATING SYSTEM DIAGNOSIS AND REPAIR - A method (and structure) of operating an operating system (OS) on a computer. When a failure of the OS is detected, the computer automatically performs a diagnosis of the OS failure. The computer also attempts to automatically repair/recover the failed OS, based on the diagnosis, without requiring a reboot. | 02-10-2011 |
20110035619 | INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING APPARATUS CONTROL METHOD - An information processing apparatus includes an execution determination unit and a control unit. The execution determination unit determines whether a series of processes including multiple processes is executable at an execution time of the series of processes. The control unit selectively provides at least one recovery device for substituting for the series of processes when it is determined that the series of processes is not executable. | 02-10-2011 |
20110041001 | AUTOMATIC SYSTEM FOR POWER AND DATA REDUNDANCY IN A WIRED DATA TELECOMMUNICATIONS NETORK - Redundancy of data and/or Inline Power in a wired data telecommunications network from a pair of power sourcing equipment (PSE) devices via an automatic selection device is provided by providing redundant signaling to/from each of the pair of PSE devices, and coupling a port of one PSE device and a redundant port of the second PSE device to respective first and second interfaces of a port of the selection device. The selection device initially selects one of the two PSE devices and communicates data and/or Inline Power to a third interface of the selection device. A powered device (PD) coupled to that third interface communicates data and/or Inline Power with the selected one of the first and second PSE device through the selection device. Upon detection of a condition, such as a failure condition, the selection device may select the other of the two interfaces. | 02-17-2011 |
20110060938 | COMPUTER INTERLOCKING SYSTEM AND CODE BIT LEVEL REDUNDANCY METHOD THEREFOR - A code bit level redundancy method for a computer interlocking system is provided. The method includes: (1) controlling the output in parallel, and (2) sharing the collected information. | 03-10-2011 |
20110078488 | HARDWARE RESOURCE ARBITER FOR LOGICAL PARTITIONS - A computer implemented method, data processing system, and apparatus for hardware resource arbitration in a data processing environment having a plurality of logical partitions. A hypervisor receives a request for a hardware resource from a first logical partition, wherein the request corresponds to an operation. The hypervisor determines the hardware resource is free from contention by a second logical partition. The hypervisor writes the hardware resource to a hardware resource pool data structure, as associated with the first logical partition, in response to a determination the hardware resource is free. The hypervisor presents the hardware resource to the first logical partition. The hypervisor determines that the operation is complete. The hypervisor releases the hardware resource from a hardware resource pool, responsive to the determination that the operation is complete. | 03-31-2011 |
20110078489 | METHOD, APPARATUS, AND SYSTEM FOR MAINTAINING STATUS OF BOOTSTRAP PEER - A method for maintaining the status of a bootstrap peer includes: selecting a bootstrap peer; obtaining the status information of the bootstrap peer; updating a local bootstrap peer list according to the status information of the bootstrap peer. An apparatus and system for maintaining the status of a bootstrap peer are also disclosed. The bootstrap peer list is updated according to the status information of the selected bootstrap peer, which ensures the validity of the bootstrap peer list on the bootstrap server so that the information in the bootstrap peer list obtained by a joining peer is valid. This improves the success rate of joining the overlay network by the joining peer, shortens the joining process time of the joining peer, and implements load balancing between the bootstrap peers. | 03-31-2011 |
20110107136 | Fault Surveillance and Automatic Fail-Over Processing in Broker-Based Messaging Systems and Methods - An exemplary method includes attempting, by a message broker subsystem, to deliver one or more messages intended for a recipient software application to the recipient software application during a predetermined fault interval, determining, by the message broker subsystem, that the recipient software application is in a fault state after failing to deliver the one or more messages to the recipient software application during the predetermined fault interval, and automatically performing, by the message broker subsystem, a fail-over process on one or more other messages intended for the recipient software application in response to the determination that the recipient software application is in the fault state. Corresponding methods and systems are also disclosed. | 05-05-2011 |
20110138219 | HANDLING ERRORS IN A DATA PROCESSING SYSTEM - A method of managing errors in a data processing system may involve at least one computer system. Each computer system may include a processor that executes an operating system, firmware, and system memory storing instructions for the operating system. A firmware error handler resident in the firmware may identify an error occurring in the computer system. The firmware error handler may determine whether the operating system is required to take an action in response to the error. If the operating system is not required to take an action in response to the error, the firmware error handler may create an error log accessible to the operating system appropriate to cause the operating system to take no action. | 06-09-2011 |
20110145627 | IMPROVING DATA AVAILABILITY DURING FAILURE DETECTION AND RECOVERY PROCESSING IN A SHARED RESOURCE SYSTEM - A system and method for managing shared resources is disclosed. The system includes a primary coherency processing unit which processes lock requests from a plurality of data processing hosts, the primary coherency processing unit also storing a first current lock state information for the plurality of data processing hosts, the first current lock state information including a plurality of locks held by the plurality of data processing hosts. The system further includes a standby coherency processing unit storing fewer locks than the primary coherency processing unit, the locks stored by the standby coherency processing unit being a subset of locks included in the first current lock state information, the standby coherency unit configured to perform a plurality of activities of the primary coherency processing unit using the subset of locks in response to a failure of the primary coherency processing unit. | 06-16-2011 |
20110154097 | FIELD REPLACEABLE UNIT FAILURE DETERMINATION - A system and method for fault management in a computer-based system are disclosed herein. A system includes a plurality of field replaceable units (“FRUs”) and fault management logic. The fault management logic is configured to collect error information from a plurality of components of the system. The logic stores, for each component identified as a possible cause of a detected fault, a record assigning one of two different component failure probability indications. The logic identifies a single of the plurality of FRUs that has failed based on the stored probability indications. | 06-23-2011 |
20110154098 | DCAS HEADEND SYSTEM AND METHOD FOR PROCESSING ERROR OF SECURE MICRO CLIENT SOFTWARE - A Downloadable Conditional Access System (DCAS) headend system and method for processing an error of Secure Micro (SM) Client Software are provided to prevent further transmission of SM Client Software where an error occurred, and to prevent unnecessary traffic due to repeat reinstallation of SM Client Software between an Authentication Proxy (AP) server and a terminal, by changing policy information regarding a transmission of SM Client Software associated with error information, when a number of terminals that transmit the error information exceeds a reference value as a result of analyzing result information regarding a reception and an installation of the SM Client Software received from a terminal corresponding to a DCAS headend system. | 06-23-2011 |
20110167293 | NON-DISRUPTIVE I/O ADAPTER DIAGNOSTIC TESTING - A primary I/O adapter and a redundant I/O adapter of a data processing system are assigned to support access to a system resource. While the primary I/O adapter is in service and the redundant I/O adapter is not in service in providing access to the system resource, a fail over command is issued to remove the primary I/O adapter from service and place the redundant I/O adapter in service in supporting access to the system resource. While the redundant I/O adapter is in service and the primary I/O adapter is not in service in providing access to the system resource, diagnostic testing on the primary I/O adapter is performed. In response to the diagnostic testing revealing no fault in the primary I/O adapter, a fail back command is issued to restore the primary I/O adapter to service and to remove the redundant I/O adapter from service. | 07-07-2011 |
20110231696 | Method and System for Cluster Resource Management in a Virtualized Computing Environment - Methods and systems for cluster resource management in virtualized computing environments are described. VM spares are used to reserve (or help discover or otherwise obtain) a set of computing resources for a VM. While VM spares may be used for a variety of scenarios, particular uses of VM spares include using spares to ensure resource availability for requests to power on VMs as well as for discovering, obtaining, and defragmenting the resources and VMs on a cluster, e.g., in response to requests to reserve resources for a VM or to respond to a notification of a failure for a given VM. | 09-22-2011 |
20110231697 | SYSTEMS AND METHODS FOR IMPROVING RELIABILITY AND AVAILABILITY OF AN INFORMATION HANDLING SYSTEM - In one aspect, a method for improving reliability and availability of an information handling system is disclosed. Operational data associated with an operating margin may be captured. A threshold specified by a pre-defined profile may be identified. The pre-defined profile may be useable in adjusting the operating margin. The captured operational data may be compared to the pre-defined threshold. A parameter specified by the pre-defined profile may be identified. The operation of a component of the information handling system may be modified based, at least in part, on the identified parameter specified by the pre-defined profile. The modification may result in adjusting the operating margin. | 09-22-2011 |
20110231698 | BLOCK BASED VSS TECHNOLOGY IN WORKLOAD MIGRATION AND DISASTER RECOVERY IN COMPUTING SYSTEM ENVIRONMENT - Methods and apparatus involve migrating workloads and disaster recovery. A snapshot is taken of a source volume using a volume shadow service. Depending on whether a user seeks a migration or disaster recovery action, blocks of data read from the snapshot are transferred to a target volume in various amounts. The amounts of transfer include all of the blocks, only changed blocks between the volumes, or only blocks incrementally changed since a last transfer operation. Users make indications for transfer on a computing device storing and consuming data on the volumes and optionally do so in the context of Novell's Platespin® products. Other features contemplate kernel drivers to monitor the blocks of the volumes, as well as techniques for comparing them. Still other features involve computing systems, volume devices, such as readers, writers and filters, and computer program products, to name a few. | 09-22-2011 |
20110239037 | System And Method For Providing Indexing With High Availability In A Network Based Suite of Services - A suite of network-based services, such as the services corresponding to Microsoft® SharePoint™, are provided to users with high availability. The suite of network-based services may include browser-based collaboration functions, process management functions, index and search functions, document-management functions, and/or other functions. In particular, the indexing service associated with the suite of network-based services may be provided with high availability. | 09-29-2011 |
20110239038 | MANAGEMENT APPARATUS, MANAGEMENT METHOD, AND PROGRAM - When a fault occurs in a guest machine | 09-29-2011 |
20110246813 | REPURPOSABLE RECOVERY ENVIRONMENT - A reconfiguration manager is operable to reconfigure a repurposable recovery environment between a recovery environment for a production environment and a second environment different from the recovery environment. A storage system in the repurposable recovery environment periodically saves production information from the production environment while the repurposable recovery environment is operating as the second environment. The production information in the storage system is used to reconfigure the repurposable recovery environment from the second environment to the recovery environment. | 10-06-2011 |
20110271138 | SYSTEM AND METHOD FOR HANDLING SYSTEM FAILURE - A system and a method for handling a system failure are disclosed. The method is adapted for an information handling system having a basic input and output system and a micro-controller. The method includes the following steps: sending, via the micro-controller, a signal; checking, via the micro-controller, whether an acknowledgement is received from the basic input and output system responsive to the signal; and scanning, via the micro-controller, a type of a system failure in response to the acknowledgement being not received. | 11-03-2011 |
20110276821 | METHOD AND SYSTEM FOR MIGRATING DATA FROM MULTIPLE SOURCES - An approach is provided for migrating data. Data is received from a plurality of source systems. The received data is processed for conversion to a target system. A failure condition associated with the processing is detected. An action is selectively initiated from a point of failure corresponding to the detected failure condition. The action includes either retrying the processing, aborting the processing, initiating simulation of the process, forcing completion of the processing, or a combination thereof. | 11-10-2011 |
20110296230 | MAINTAINING A COMMUNICATION PATH FROM A HOST TO A STORAGE SUBSYSTEM IN A NETWORK - Provided are a method, system, computer storage device, and storage area network for maintaining a communication path from a host to a storage subsystem in a network. A storage subsystem controls data transfer and access to storage devices in a network, wherein the storage subsystem is coupled to a switch and the switch is coupled to a host in the network. A topological storage is coupled to the host, the switch and the storage subsystem, for storing a topological coupling relationship between the host and the switch and a topological coupling relationship between the switch and the storage subsystem. In response to determining a failed path between the storage subsystem and the switch coupled to the storage subsystem, the storage subsystem determines a first port on the storage subsystem in the failed path. The storage subsystem determines from the topology storage the topological coupling relationship between the host and the switch and the topological coupling relationship between the switch and the storage subsystem. The storage subsystem redirects, based on the topological coupling relationships, a message sent to the first port of the storage subsystem to an operational second port in the storage subsystem coupled to the switch. | 12-01-2011 |
20110307734 | BIOLOGICALLY INSPIRED HARDWARE CELL ARCHITECTURE - Disclosed is a system comprising: a reconfigurable hardware platform; a plurality of hardware units defined as cells adapted to be programmed to provide self-organization and self-maintenance of the system by means of implementing a program expressed in a programming language defined as DNA language, where each cell is adapted to communicate with one or more other cells in the system, and where the system further comprises a converter program adapted to convert keywords from the DNA language to a binary DNA code; where the self-organization comprises that the DNA code is transmitted to one or more of the cells, and each of the one or more cells is adapted to determine its function in the system; where if a fault occurs in a first cell and the first cell ceases to perform its function, self-maintenance is performed in that the system transmits information to the cells that the first cell has ceased to perform its function, and then the self-organization is performed again in order to provide that a second cell undertakes the function of the first cell. | 12-15-2011 |
20120042195 | MANAGING OPERATING SYSTEM DEPLOYMENT FAILURE - A method for managing operating system deployment failure includes, with an operating system deployment server, running an operating system deployment process that comprises running a progressive hardware discovery process of a target machine to which an operating system is deployed, the discovery process to capture inventory information related to the target machine, monitoring the operating system deployment to detect failure in a pre-operating system environment running on the target machine for a predefined period of time, and executing a remediation action in response to generation of a failure code during the period of time, the remediation action related to a Basic Input Output System (BIOS) of the target machine. | 02-16-2012 |
20120047392 | DISASTER RECOVERY REPLICATION THROTTLING IN DEDUPLICATION SYSTEMS - Various embodiments for disaster recovery (DR) replication throttling in a computing environment by a processor device are provided. Communication is arrested between a source data entity and a replicated data entity at a location declared in a DR mode. The DR mode is negotiated to a central replication management component as a DR mode entry event. The DR mode entry event is distributed, by the central replication management component, to each member in a shared group. The DR mode is enforced using at least one replication policy. | 02-23-2012 |
20120047393 | DYNAMICALLY REASSIGNING A CONNECTED NODE TO A BLOCK OF COMPUTE NODES FOR RE-LAUNCHING A FAILED JOB - Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job. | 02-23-2012 |
20120060047 | COMBINATION OF AN ELECTRIC ROTARY MACHINE AND OF AN ELECTRIC CONTROL UNIT IN AN AUTOMOBILE - A system including at least one electric rotary machine and an integrated control circuit and an electronic control unit, the system being on board an automobile. The integrated control circuit of the system includes a RAM connected to the electronic control unit via a data communication link, and the electronic control unit includes a rewritable memory. The system further includes a configuration data permanent storage of the system in the rewritable memory as well as an upload of the configuration data into the RAM during a configuration phase of the system. The system herein enables the integrated control circuit of the electric rotary machine to be standardized by virtue of the fact that the configuration data are no longer written in a read-only memory but reside in a RAM of this circuit. | 03-08-2012 |
20120066541 | CONTROLLED AUTOMATIC HEALING OF DATA-CENTER SERVICES - Subject matter described herein is directed to reallocating an application component from a faulty data-center resource to a non-faulty data-center resource. Background monitors identify data-center resources that are faulty and schedule migration of application components from the faulty data-center resources to non-faulty data-center resources. Migration is carried out in an automatic manner that allows an application to remain available. Thresholds are in place to control a rate of migration, as well as to detect when resource failure might be resulting from data-center-wide processes or from an application failure. | 03-15-2012 |
20120072765 | JOB MIGRATION IN RESPONSE TO LOSS OR DEGRADATION OF A SEMI-REDUNDANT COMPONENT - A computer program product and method of managing the workload in a computer system having one or more semi-redundant hardware components are provided. The method comprises detecting loss or degradation of the level of performance of one or more of the semi-redundant hardware components, identifying hardware components that are affected by the loss or degradation of the one or more semi-redundant components, migrating a critical job from an affected hardware component to an unaffected hardware component, and performing less-critical jobs on an affected hardware component. Loss or degradation of the semi-redundant component reduces the capacity of affected hardware components in the computer system without entirely disabling the computer system. Jobs identified as being critical are run on hardware components having the most capacity and reliability, while allowing less-critical jobs to make use of the remaining capacity of affected hardware components. Optionally, the semi-redundant hardware component may be selected from a memory module, CPU core, Ethernet port, power supply, fan, disk drive, and an input output port. | 03-22-2012 |
20120137163 | MULTI-CORE SYSTEM, METHOD OF CONTROLLING MULTI-CORE SYSTEM, AND MULTIPROCESSOR - A multi-core system | 05-31-2012 |
20120144228 | Apparatus and Method to Read Information from an Information Storage Medium - A method to read information from an information storage medium using a read channel, where that read channel includes a data cache, where the method generates an analog waveform comprising the information, provides that analog waveform to a read channel, generates a digital signal from that analog waveform using one or more first operating parameters, corrects that digital signal at an actual error correction rate, and determines if the actual error correction rate is greater than an error correction rate threshold. If the actual error correction rate exceeds the error correction rate threshold, then the method captures the digital signal, stores that captured data in a data cache, reads that digital signal from the cache, generates one or more second operating parameters, and provides those one or more second operating parameters to the read channel. | 06-07-2012 |
20120159232 | FAILURE RECOVERY METHOD FOR INFORMATION PROCESSING SERVICE AND VIRTUAL MACHINE IMAGE GENERATION APPARATUS - An information processing service is allowed to be immediately recovered from failure without clustering information apparatuses that provide an information processing service. A failure recovery method for an information processing service provided by an information apparatus, the method includes: preparing a virtual machine image generation apparatus which generates a virtual machine image, and a virtual machine execution apparatus which runs a virtual machine based on the virtual machine image; generating and storing, by the virtual machine image generation apparatus, the virtual machine image based on system data and hardware configuration information which enable implementation of the information processing service at a time of normal operation of the information processing service; and running, by the virtual machine execution apparatus, a virtual machine which provides a function of the information processing service based on the virtual machine image when a failure occurs in the information processing service. | 06-21-2012 |
20120173918 | COMMUNICATIONS PATH STATUS DETECTION SYSTEM - Virtual network interface selection manager in a client-server system, in which client and server are connectable through plural alternate networks. System includes plural interfaces connectable to server through the plural networks, current interface indicator identifying a current interface through which data is transmitted to and/or received from the server, and prioritized listing of plural interfaces ranked in a descending order. Event detector detects occurrence of an event including time-out condition; successful interface test; and change in the plurality of interfaces, and a tester tests each plural interface in prioritized listing in a ranked order to test whether server is reachable. Marker identifies which of plural interfaces successfully pass the test, and switch switches from current interface to a higher priority interface when the marker identifies a higher priority interface as having successfully passed the test, whereby current interface indicator will identify the higher priority interface as current interface. | 07-05-2012 |
20120216069 | Data Transfer and Recovery Process - A backup image generator can create a primary image and periodic delta images of all or part of a primary server. The images can be sent to a network attached storage device and a remote storage server. In the event of a failure of the primary server, the failure can be diagnosed to develop a recovery strategy. Based on the diagnosis, at least one delta image may be applied to a copy of the primary image to generate an updated primary image at either the network attached storage or the remote storage server. The updated primary image may be converted to a virtual server in a physical to virtual conversion at either the network attached storage device or remote storage server and users may be redirected to the virtual server. The updated primary image may also be restored to the primary server in a virtual to physical conversion. As a result, the primary data storage may be timely backed-up, recovered and restored with the possibility of providing server and business continuity in the event of a failure. | 08-23-2012 |
20120216070 | METHOD AND APPARATUS FOR REALIZING APPLICATION HIGH AVAILABILITY - A method, apparatus, and computer program product for realizing application high availability are provided. The application is installed on both a first node and a second node, the first node being used as an active node, and the second node being used as a passive node. The method includes: monitoring access operations to files by an application during its execution on the active node; replicating the monitored updates to the file by the application from the active node to a storage device accessible to the passive node if the application performs updates to a file during the access operations; sniffing the execution of the application on the active node; and switching the active node to the second node and initiating the application on the second node in response to sniffing a failure in the execution of the application on the active node. | 08-23-2012 |
20120221885 | MONITORING DEVICE, MONITORING SYSTEM AND MONITORING METHOD - A monitoring device including: a receiving unit configured to receive a malfunction notice of a data processing device, the data processing device being connected to the monitoring device which monitors running condition through a network; a malfunction device identification unit configured to identify a data processing device that is malfunctioning based on the received malfunction notice; a data obtaining unit configured to obtain running data and device data of the data processing device that is malfunctioning and another data processing device; and a malfunction cause identification unit configured to identify a cause of the malfunction, based on the obtained running data and the obtained device data. | 08-30-2012 |
20120233491 | MAINTAINING A COMMUNICATION PATH FROM A HOST TO A STORAGE SUBSYSTEM IN A NETWORK - Provided are a method, system, computer storage device, and storage area network for maintaining a communication path from a host to a storage subsystem in a network. A storage subsystem controls data transfer and access to storage devices in a network including a switch and a host. A topological storage stores a topological coupling relationship between the host and the switch and a topological coupling relationship between the switch and the storage subsystem. In response to determining a failed path, the storage subsystem determines a first port on the storage subsystem in the failed path. The storage subsystem determines from the topology storage the topological coupling relationships between the host and the switch and the switch and the storage subsystem. The storage subsystem redirects, based on the topological coupling relationships, a message sent to the first port of the storage subsystem to an operational second port in the storage subsystem. | 09-13-2012 |
20120246508 | METHOD AND SYSTEM FOR CONTINUOUSLY PROVIDING A HIGH PRECISION SYSTEM CLOCK - A method is presented for continuously providing a high precision system clock associated with a processing core, wherein the system clock includes a host clock register that is incremented via a high precision oscillator, the method includes: providing a firmware clock register, incrementing the firmware clock register based on the host clock register being incremented, monitoring for failures of the host clock register, and during a failure of the host clock register continuously incrementing the firmware clock register by means of timing signals of the processing core, and upon receipt of a request to provide a clock value, providing the content of the host clock register if no failure was detected, or if failure was detected, providing the content of the firmware clock register. | 09-27-2012 |
20120246509 | GLOBAL DETECTION OF RESOURCE LEAKS IN A MULTI-NODE COMPUTER SYSTEM - A process is disclosed for identifying and recovering from resource leaks on compute nodes of a parallel computing system. A resource monitor stores information about system resources available on a compute node in a clean state. After the compute node runs a job, the resource monitor compares the current resource availability to the clean state. If a resource leak is found, the resource monitor contacts a global resource manager to remove the resource leak. | 09-27-2012 |
20120260122 | Video conferencing with multipoint conferencing units and multimedia transformation units - In one embodiment, a method includes receiving at a multimedia transformation unit, media streams from a plurality of endpoints, transmitting audio components of the media streams to a multipoint conferencing unit, receiving an identifier from the multipoint conferencing unit identifying one of the media streams as an active speaker stream, processing at the multimedia transformation unit, a video component of the active speaker stream, and transmitting the active speaker stream to one or more of the endpoints without transmitting the video component to the multipoint conferencing unit. An apparatus is also disclosed. | 10-11-2012 |
20120272090 | SYSTEM RECOVERY USING EXTERNAL COMMUNICATION DEVICE - A method for computer system recovery is presented. In one embodiment, the method includes establishing a connection, via an interface, to a computer system to support the system recovery of the computer system. The method includes executing an emulation application as a recovery agent. The method includes retrieving, based on identifiers associated with the computer system, remote data via another interface. The method further includes performing the system recovery by using at least a part of the remote data. | 10-25-2012 |
20120272091 | PARTIAL FAULT PROCESSING METHOD IN COMPUTER SYSTEM - As regards a hardware fault which has occurred in a computer, a hypervisor notifies an LPAR which can continue execution, of a fault occurrence as a hardware fault for which execution can be continued. Upon receiving the notice, the LPAR notifies the hypervisor that it has executed processing to cope with a fault. The hypervisor provides an interface for acquiring the notice situation. It is made possible to register and acquire a situation of coping with a hardware fault allowing continuation of execution through the interface, and it is made possible to make a decision as to the situation of coping with a fault in the computers as a whole. | 10-25-2012 |
20120297236 | HIGH AVAILABILITY SYSTEM ALLOWING CONDITIONALLY RESERVED COMPUTING RESOURCE USE AND RECLAMATION UPON A FAILOVER - In one embodiment, a method attempts, by a computing device, to determine a placement of a set of virtual machines on available hosts upon failure of a host. The placement considers the set of virtual machines as being not powered on any of the available hosts. The method further determines, by the computing device, a placed list of virtual machines in the set of virtual machines as a recommendation to power on to the available hosts. The determination of the placed list of virtual machines is used to determine a power off list of virtual machines in the set of virtual machines to power off, wherein virtual machines in the power off list of virtual machines are currently powered on available hosts but were considered to be powered off to determine the placement. | 11-22-2012 |
20120303997 | Flexible Bus Architecture for Monitoring and Control of Battery Pack - A method for diagnosing a control system for a stacked battery. The control system comprises a plurality of processors, a plurality of controllers, and a monitoring unit (control unit). The method comprises sending diagnostic information from the central unit to a top processor of the plurality of processors, transmitting return information from the top processor of the plurality of processors to the central unit, comparing the diagnostic information sent from the central unit with the return information received by the central unit, and indicating a communication problem if the diagnostic information sent from the central unit is different from the return information received by the central unit. The steps are repeated by eliminating the top processor from a previous cycle and assigning a new top processor if there is no problem with the reconfigurable communication system. | 11-29-2012 |
20120317437 | Ranking Service Units to Provide and Protect Highly Available Services Using the Nway Redundancy Model - Presented are methods and apparatus for protecting a plurality of High Availability (HA) Service Instances (SIs) with a plurality of Service Units (SUs) with an Nway redundancy model. Any of the SUs associated with the Nway redundancy model can simultaneously be assigned an active HA state for some of the SIs and a standby HA state for other SIs. However, only one SU can have the active state for any given SI. The Nway redundancy model is configured prior to runtime operation. | 12-13-2012 |
20120317438 | METHOD AND SYSTEM FOR PROVIDING IMMUNITY TO COMPUTERS - A method and system for providing immunity to a computer system wherein the system includes an immunity module, a recovery module, a maintenance module, an assessment module, and a decision module, wherein the immunity module, the recovery module, the maintenance module and the assessment module are each linked to the decision module. The maintenance module monitors the system for errors and sends an error alert message to the assessment module, which determines the severity of the error and the type of package required to fix the error. The assessment module sends a request regarding the type of package required to fix the error to the recovery module. The recovery module sends the package required to fix the error to the maintenance module, which fixes the error in the system. | 12-13-2012 |
20120324272 | OPTICAL COMMUNICATION SYSTEM, INTERFACE BOARD AND CONTROL METHOD PERFORMED IN INTERFACE BOARD - An embodiment of the invention is an optical communication system including: a plurality of interface boards which transmit and receive optical signals to and from interface boards facing the plurality of interface boards; and a monitoring control device which monitors states of the plurality of interface boards. A first interface board of the plurality of interface boards includes: a replacement unit capable of monitoring the states of the plurality of interface boards on behalf of the monitoring control device and independently receiving supply of power; and a control unit configured to start the replacement unit in a case where a fault occurs in the monitoring control device and stop or halt the replacement unit in a case where there is no fault in the monitoring control device. | 12-20-2012 |
20130007502 | REPURPOSING DATA LANE AS CLOCK LANE BY MIGRATING TO REDUCED SPEED LINK OPERATION - Methods and apparatus relating to repurposing a data lane as a clock lane by migrating to reduced speed link operation are described. In one embodiment, speed of a link is reduced upon detection of failure on a clock lane of the link and one of a plurality of data lanes of a link is repurposed as a replacement clock lane. Other embodiments are also disclosed and claimed. | 01-03-2013 |
20130013954 | Detecting Browser Failure - Embodiments are configured to improve the stability of a Web browser by identifying plug-in modules that cause failures. Data in memory at the time of a failure is analyzed, and a failure signature is generated. The failure signature is compared to a database of known failure signatures so that the source of the failure may be identified. If a plug-in module to a Web browser is identified as the source of a failure, options are presented to the user who may update the plug-in module with code that does not produce a failure or disable the plug-in module altogether. | 01-10-2013 |
20130061086 | FAULT-TOLERANT SYSTEM, SERVER, AND FAULT-TOLERATING METHOD - To provide a fault-tolerant system requiring only one new server when the number of jobs to be processed concurrently exceeds the number of jobs processable by the current servers and requiring no standby servers. Servers | 03-07-2013 |
20130080821 | PROACTIVELY REMOVING CHANNEL PATHS IN ERROR FROM A VARIABLE SCOPE OF I/O DEVICES - A channel path error correction system includes a processor with one or more channels and a switch operatively coupled to the one or more channels of the processor. The system also includes an I/O device including one or more ports, the I/O device being operatively coupled to the switch by the one or more ports; a plurality of control units. Each control unit includes at least one of the channels and at least one of the ports and a memory operable for storing information relating to detected channel path errors associated with each of the plurality of control units. | 03-28-2013 |
20130080822 | PROACTIVELY REMOVING CHANNEL PATHS IN ERROR FROM A VARIABLE SCOPE OF I/O DEVICES - A method includes detecting a channel path error event on an identified channel path; recording channel path error data associated with the detected channel path error event; identifying a scope of the channel path error associated with the identified channel path; determining if the identified channel path is a defective channel path based on the scope of the channel path error; and removing the defective channel path from one or more devices. | 03-28-2013 |
20130086411 | HARDWARE CONSUMPTION ARCHITECTURE - Various exemplary embodiments relate to a method and related network node including one or more of the following: identifying a hardware failure of a failed component of a plurality of hardware components; determining a set of agent devices currently configured to utilize the failed component; reconfiguring an agent device to utilize a working component of the plurality of hardware components. Various exemplary embodiments additionally or alternatively relate to a method and related network node including one or more of the following: projecting a failure date for the hardware module; determining whether the projected failure date is acceptable based on a target replacement date for the hardware module; if the projected failure date is not acceptable: selecting a parameter adjustment for a hardware component, wherein the parameter adjustment is selected to move the projected failure date closer to the target replacement date, and applying the parameter adjustment to the hardware component. | 04-04-2013 |
20130086412 | CIPHER-CONTROLLING METHOD, NETWORK SYSTEM AND TERMINAL FOR SUPPORTING THE SAME, AND METHOD OF OPERATING TERMINAL - Disclosed is a cipher control method that supports maintaining a cipher mode between a network system and a terminal. The method of controlling an encryption includes: attempting a connection for operating a communication channel between a terminal and a network system; providing cipher information about a cipher algorithm operation of the terminal to the network system; determining, by the network system, whether the terminal is a problematic terminal operating an abnormal cipher algorithm; and, when the terminal is determined to be operating abnormally, instructing, by the network system, the terminal to perform a communication channel operation based on a normally operable cipher algorithm. | 04-04-2013 |
20130091376 | SELF-REPAIRING DATABASE SYSTEM - A method, system, and computer program product include generating a database copy from a database of a primary virtual machine (VM), provisioning a standby VM with the database copy, detecting a failure associated with the database, and promoting the standby VM to replace the primary VM. | 04-11-2013 |
20130097455 | METHOD AND SYSTEM FOR IMPLEMENTING INTERCONNECTION FAULT TOLERANCE BETWEEN CPU - In a system for implementing interconnection fault tolerance between CPUs, a first CPU and a second CPU implement interconnection through a first CPU interconnect device and a second CPU interconnect device. The system adds a data channel between a first SerDes interface of the first CPU interconnect device and a second SerDes interface of the second CPU interconnect device, and transmits link connection state information and a link control signal through the added data channel. The system monitors a link state of any one link in a CPU interconnection system, transmits the link state through the added data channel, and recovers any one of the connection links upon determining that any one of the first connection link, the second connection link and the third connection link is faulty. | 04-18-2013 |
20130103974 | Firmware Management In A Computing System - Managing firmware in a computing system storing a plurality of different firmware images for the same firmware includes: calculating, for each firmware image in dependence upon a plurality of predefined factors, a preference score; responsive to a failure of a particular firmware image, selecting a firmware image having a highest preference score; and failing over to the selected firmware image. | 04-25-2013 |
20130145203 | DYNAMICALLY CONFIGUREABLE PLACEMENT ENGINE - A stream application may allocate processing elements to one or more compute nodes (or hosts) to achieve a desired optimization goal. Each optimization mode may define processing element selection criteria and/or host selection criteria. When allocating a processing element to a host, a scheduler may place each processing element individually. Accordingly, the scheduler may use the processing element selection criteria for selecting which processing element in the stream application to allocate next. The scheduler may then determine, based on one or more constraints, which host the processing element can be placed on. If the scheduler determines that multiple hosts are suitable candidates for the processing element, it may use the host selection criteria to pick one of the candidate hosts that further optimize the stream application to meet the desired goal. | 06-06-2013 |
20130145204 | SUPERVISING AND RECOVERING SOFTWARE COMPONENTS ASSOCIATED WITH MEDICAL DIAGNOSTICS INSTRUMENTS - A system for applying a recovery mechanism to a network of medical diagnostics instruments is provided herein. The system includes the following: a plurality of medical diagnostics instruments, each associated with a network connected component; a plurality of communication modules, each associated with a corresponding one of the plurality of network connected components, wherein each one of the plurality of communication modules is arranged to report on malfunctioning components that are network connected with the corresponding component, and a recovery module, configured to: (i) obtain reports from the communication modules; (ii) re-establish the malfunctioning components; and (iii) notify all communication modules of the re-establishment of the malfunctioning components, wherein the communication modules are further configured to re-establish connection between the corresponding components and the re-established components. | 06-06-2013 |
20130166942 | UNFUSING A FAILING PART OF AN OPERATOR GRAPH - Techniques for managing a fused processing element are described. Embodiments receive streaming data to be processed by a plurality of processing elements. Additionally, an operator graph of the plurality of processing elements is established. The operator graph defines at least one execution path and wherein at least one of the processing elements of the operator graph is configured to receive data from at least one upstream processing element and transmit data to at least one downstream processing element. Embodiments detect an error condition has been satisfied at a first one of the plurality of processing elements, wherein the first processing element contains a plurality of fused operators. At least one of the plurality of fused operators is selected for removal from the first processing element. Embodiments then remove the selected at least one fused operator from the first processing element. | 06-27-2013 |
20130173952 | ELECTRONIC DEVICE AND METHOD FOR LOADING FIRMWARE - An electronic device includes an internal storage module, a baseboard management controller (BMC) and a port. The internal storage module stores a first firmware and a boot application. The port connects to an external storage for storing a second firmware which is a backup of the first firmware. After the electronic device is powered on, the BMC runs the boot application to load the first firmware from the internal storage module. If the first firmware fails to load, the BMC copies the second firmware from the external storage to the internal storage module to replace the first firmware. | 07-04-2013 |
20130198557 | DATA TRANSFER AND RECOVERY - A backup image generator can create a primary image and periodic delta images of all or part of a primary server. The images can be sent to a network attached storage device and one or more remote storage servers. In the event of a failure of the primary server, an updated primary image may be used to provide an up-to-date version of the primary system at a backup or other system. As a result, the primary data storage may be timely backed-up, recovered and restored with the possibility of providing server and business continuity in the event of a failure. | 08-01-2013 |
20130219211 | Elastic Cloud-Driven Task Execution - A method, an apparatus and an article of manufacture for cloud-driven application execution. The method includes determining a plurality of attributes of a failed application, wherein the plurality of attributes comprises at least one policy context attribute and at least one context attribute, correlating each of the plurality of attributes to at least one alternative asset, wherein the at least one alternative asset is a part of an environment on which the failed application can be executed, using the plurality of attributes correlated to the at least one alternative asset to identify an alternative asset set of alternative assets, wherein the alternative asset set is capable of enabling an alternative environment on which to execute the failed application, and provisioning the alternative assets in the alternative asset set from at least one cloud network to create the alternative environment on which the failed application is executed. | 08-22-2013 |
20130227334 | Progressive Network Recovery - Technologies are generally described for systems and methods effective to schedule repair (e.g., allocate repair resources, determine a repair sequence, etc.) of a system affected by a large-scale failure caused by a natural disaster, malicious attack, faulty components, or the like. In an example, the system can generate a schedule that indicates amounts of repair resources allocated for repair of specific components of a disrupted system as well as a time or sequence in which the components are to be repaired. The schedule, in some instances, can operate to maximize an amount of restoration, at each stage of a recovery process, relative to the characteristic of the disrupted system. For example, with a communications network as the disrupted system, the schedule can maximize the amount of total traffic flow capacity recovered after respective steps of the recovery process. | 08-29-2013 |
20130246838 | DISCOVERING BOOT ORDER SEQUENCE OF SERVERS BELONGING TO AN APPLICATION - A survey tool for use in a Recover to Cloud (R2C) replication service environment that determines configuration information automatically (such as through SNMP messaging or custom APIs) and stores it in a survey database. A Virtual Data Center (VDC) representation is then instantiated from the survey database, with the VDC being a virtual replica of the production environment including dormant Virtual Machine (VM) definition files, applications, storage requirements, VLANs, firewalls, and the like. The survey tool determines the order in which the replicas are brought on line to ensure orderly recovery, determining the order in which each machine makes requests for connections to other machines. | 09-19-2013 |
20130254586 | FAULT RECOVERY - Various embodiments are described herein with regard to performing a selective fault recovery for an electronic device having a plurality of subsystems in which one of the subsystems has a fault. The selective fault recovery techniques described herein allow a user to use non-faulty subsystems of the electronic device while selective fault recovery is being conducted on the subsystem having the fault. | 09-26-2013 |
20130262912 | MANAGING HARDWARE CONFIGURATION OF A COMPUTER NODE - A computer node includes an integrated management module, a field-programmable gate array, and a plurality of individual hardware devices. The integrated management module receives a user identification and identifies an associated hardware configuration, wherein the hardware configuration identifies hardware devices to be powered off. The integrated management module may instruct the field-programmable gate array to use switches to power off the identified hardware devices without powering off other hardware devices. Optionally, a default hardware configuration may be implemented in the absence | 10-03-2013 |
20130262913 | INFORMATION PROCESSING APPARATUS, SYSTEM TIME SYNCHRONIZATION METHOD AND COMPUTER READABLE MEDIUM - The time in the chipset of backup resources is synchronized easily at the system time. | 10-03-2013 |
20130268798 | Microprocessor System Having Fault-Tolerant Architecture - The invention relates to a microprocessor system for executing software modules, at least some of which are security critical, within the scope of controlling functions or tasks assigned to the software modules, comprising an intrinsically safe microprocessor module having at least two microprocessor cores. At least one further intrinsically safe microprocessor module having at least two microprocessor cores is provided. At least two microprocessor modules are connected via a bus system, at least two software modules are provided which execute functions, at least some of which overlap, the software modules having at least partially overlapping functions are distributed on a microprocessor module or on at least two microprocessor modules, and means for comparing or arbitrating events generated with the software modules for the identical functions are provided in order to detect software or hardware faults. | 10-10-2013 |
20130275801 | RECONFIGURABLE RECOVERY MODES IN HIGH AVAILABILITY PROCESSORS - A computer program product for performing error recovery is configured to perform a method that includes creating, by a processor, a recovery checkpoint. The processor is dynamically switched into a non-recoverable processing mode of operation based on creating the software recovery checkpoint. The non-recoverable processing mode of operation is a mode in which a subset of hardware error recovery resources are powered-down or re-purposed for instruction processing. It is determined, during the non-recoverable processing mode of operation, that a new software recovery checkpoint is required. Based on the determining that a new software recovery checkpoint is required, the processor is dynamically switched into a recoverable processing mode of operation. The recoverable processing mode of operation is a mode in which hardware error recovery resources, including at least one of the hardware error recovery resources in the subset, are purposed for hardware error recovery operations. | 10-17-2013 |
20130283091 | METHODS, APPARATUS, AND SYSTEMS FOR ELECTRONIC DEVICE RECOVERY - Methods, apparatus, and systems for electronic device recovery are disclosed. An example method includes determining that a software request received from a computing device includes an indication of a repair mode of an electronic device, determining a characteristic of the electronic device, determining software to be provided to the electronic device based on the characteristic, and in response to determining that the software request includes the indication of the repair mode, transmitting location information for the software to the computing device. | 10-24-2013 |
20130283092 | DATA CONSISTENCY BETWEEN VIRTUAL MACHINES - Data consistency between a primary virtual machine and a recovery virtual machine may employ a resync engine to detect differences in data blocks stored on both virtual machines. For example, the resync engine may calculate a signature (e.g., hash value) for a primary data block and a corresponding signature for a recovery data block, and compare the signature and the corresponding signature to identify a difference between the primary data block and the recovery data block. In some instances, by identifying a difference between the primary data block and the recovery data block, a data block (e.g., primary data block or recovery data block) may be identified to be transferred from a virtual machine to another virtual machine. | 10-24-2013 |
20130290770 | Match Server for a Financial Exchange Having Fault Tolerant Operation - Fault tolerant operation is disclosed for a primary match server of a financial exchange using an active copy-cat instance, a.k.a. backup match server, that mirrors operations in the primary match server, but only after those operations have successfully completed in the primary match server. Fault tolerant logic monitors inputs and outputs of the primary match server and gates those inputs to the backup match server once a given input has been processed. The outputs of the backup match server are then compared with the outputs of the primary match server to ensure correct operation. The disclosed embodiments further relate to fault tolerant failover mechanism allowing the backup match server to take over for the primary match server in a fault situation wherein the primary and backup match servers are loosely coupled, i.e. they need not be aware that they are operating in a fault tolerant environment. | 10-31-2013 |
20130290771 | COMPUTER SYSTEM - A computer system has a plurality of computer nodes, and each computer node has a plurality of virtual computers and a control base unit controlling the virtual computers. Each virtual computer constitutes a multiplexing group with another virtual computer operating on another computer node different from its own computer node, with one operating as the master and the other as the slave. The control base unit controls whether each virtual computer is operating as either the master or the slave, and monitors the respective states of each virtual computer. The control base unit, when it has detected in its own node a failure of the virtual computer operating as the master virtual computer, makes a decision whether to also switch the other virtual computers operating on its own computer node from master virtual computers to slave virtual computers with the virtual computer in which the failure occurred. | 10-31-2013 |
20130326260 | Automated Disaster Recovery System and Method - Methods and systems for recovering a host image of a client machine to a recovery machine comprise comparing a profile of a client machine of a first type to be recovered to a profile of a recovery machine of a second type different from the first type, to which the client machine is to be recovered, by a first processing device. The first and second profiles each comprise at least one property of the first type of client machine and the second type of recovery machine, respectively. At least one property of a host image of the client machine is conformed to at least one corresponding property of the recovery machine. The conformed host image is provided to the recovery machine, via a network. The recovery machine is configured with at least one conformed property of the host image by a second processing device of the recovery machine. | 12-05-2013 |
20140006842 | STORAGE SYSTEM AND METHOD FOR CONTROLLING STORAGE SYSTEM | 01-02-2014 |
20140019796 | Supervisor System Resuming Control - Embodiments herein relate to a computing device ( | 01-16-2014 |
20140040656 | SYSTEM AND METHOD FOR MANAGING CLOUD DEPLOYMENT CONFIGURATION OF AN APPLICATION - A system is provided to manage cloud deployment configuration of a computing application. The system comprises a request detector, a retrieving module, a manager loader, a configuration change request detector, and a configuration module. The request detector may be configured to detect a request to install a manager agent on an instance of a virtual machine executing a computing application within a virtualization service. The retrieving module may be configured to obtain a manager agent object for loading the manager agent, and install the manager agent on the instance. The manager loader may be configured to invoke the manager agent to collect metrics for the computing application. The configuration change request detector may be configured to receive an instruction to alter cloud deployment configuration of the computing application. The configuration module may be configured to automatically alter the cloud deployment configuration of the computing application in response to the instruction. | 02-06-2014 |
20140108852 | PROCESSING MAIN CAUSE ERRORS AND SYMPATHETIC ERRORS IN DEVICES IN A SYSTEM - Provided are a computer program product, system, and method for processing main cause errors and sympathetic errors in devices in a system. Error data for the devices in the system are analyzed to determine a main cause error for one of the devices that cause at least one sympathetic error in the system. A main cause event object for the determined main cause error and at least one sympathetic event object for the determined at least one sympathetic error resulting from the main cause error are generated. A determination is made from the at least one sympathetic event object of at least one sympathetic event action to perform. The determined at least one sympathetic event action is performed to recover from the at least one sympathetic error represented by the at least one sympathetic event object providing the at least one sympathetic event action. | 04-17-2014 |
20140108853 | PROCESSING MAIN CAUSE ERRORS AND SYMPATHETIC ERRORS IN DEVICES IN A SYSTEM - Provided are a computer program product, system, and method for processing main cause errors and sympathetic errors in devices in a system. Error data for the devices in the system are analyzed to determine a main cause error for one of the devices that cause at least one sympathetic error in the system. A main cause event object for the determined main cause error and at least one sympathetic event object for the determined at least one sympathetic error resulting from the main cause error are generated. A determination is made from the at least one sympathetic event object of at least one sympathetic event action to perform. The determined at least one sympathetic event action is performed to recover from the at least one sympathetic error represented by the at least one sympathetic event object providing the at least one sympathetic event action. | 04-17-2014 |
20140201563 | Match Server for a Financial Exchange Having Fault Tolerant Operation - Fault tolerant operation is disclosed for a primary match server of a financial exchange using an active copy-cat instance, a.k.a. backup match server, that mirrors operations in the primary match server, but only after those operations have successfully completed in the primary match server. Fault tolerant logic monitors inputs and outputs of the primary match server and gates those inputs to the backup match server once a given input has been processed. The outputs of the backup match server are then compared with the outputs of the primary match server to ensure correct operation. The disclosed embodiments further relate to fault tolerant failover mechanism allowing the backup match server to take over for the primary match server in a fault situation wherein the primary and backup match servers are loosely coupled, i.e. they need not be aware that they are operating in a fault tolerant environment. As such, the primary match server need not be specifically designed or programmed to interact with the fault tolerant mechanisms. Instead, the primary match server need only be designed to adhere to specific basic operating guidelines and shut itself down when it cannot do so. By externally controlling the ability of the primary match server to successfully adhere to its operating guidelines, the fault tolerant mechanisms of the disclosed embodiments can recognize error conditions and easily failover from the primary match server to the backup match server. | 07-17-2014 |
20140258770 | INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING APPARATUS, AND COMPUTER PROGRAM PRODUCT - The system includes: an extracting part extracting one or more devices where the same phenomenon as occurred in a target device had occurred; an index-value calculating part acquiring device information of the target device and calculating an index value thereof, and acquiring pieces of device information of the devices and calculating index values thereof; a first-similarity calculating part calculating a first similarity between the index values of the target device and each of the devices; a second-similarity calculating part acquiring environment information of the target device and pieces of environment information of the devices, and calculating a second similarity between the environment information of the target device and that of each of the devices; and a presuming part determining one or more reference devices based on the similarities, and presuming a replacement part of the target device based on replacement parts that the reference devices used for elimination of the phenomenon. | 09-11-2014 |
20140281662 | DYNAMICALLY ADAPTIVE BIT-LEVELING FOR DATA INTERFACES - A circuit and method for implementing an adaptive bit-leveling function in an integrated circuit interface is disclosed. During a calibration operation, a pre-loaded data bit pattern is continuously sent from a sending device and is continuously read from an external bus by a receiving device. A programmable delay line both advances and delays each individual data bit relative to a sampling point in time, and delay counts relative to a reference point in time are recorded for different sampled data bit values, enabling a delay to be determined that best samples a data bit at its midpoint. During the advancing and delaying of a data bit, jitter on the data bit signal may cause an ambiguity in the determination of the midpoint, and solutions are disclosed for detecting jitter and for resolving a midpoint for sampling a data bit even in the presence of the jitter. | 09-18-2014 |
20140281663 | RE-FORMING AN APPLICATION CONTROL TREE WITHOUT TERMINATING THE APPLICATION - A reconnection system re-forms a control tree for an application that is executed in parallel without terminating execution of the application. The reconnection system detects when a node of a control tree has failed and directs the nodes that have not failed to reconnect to effect the re-forming of the control tree without the failed node and without terminating the application. Upon being directed to reconnect, a node identifies new child nodes that are to be its child nodes in the re-formed control tree. The node maintains the existing connection with each of its current child nodes that is also a new child node, terminates the existing connection with each of its current child nodes that is not also a new child node, establishes a new connection with any new child node that is not a current child node, and directs each new child node to reconnect. | 09-18-2014 |
20140281664 | METHOD AND SYSTEM FOR DETERMINING DEVICE CONFIGURATION SETTINGS - A method and system for determining and updating configuration settings on a device are provided herein. In some embodiments, a method for updating configuration settings on a device may include detecting an error condition produced by executing an app on the device, collecting information associated with the error condition, the app and the device responsive to the detected error condition, sending a request for new configuration settings, wherein the request includes the collected information, receiving one or more new configuration settings in response to the request, and updating one or more configuration settings of at least one of the device or the app using the new configuration settings received. | 09-18-2014 |
20140281665 | AUTOMATED PATCH GENERATION - A computer-implemented method, computer program product, and computing system is provided for generating software patches. In an implementation, a method may include receiving an indication of a software product and a product level of the software product. An indication of a specific defect associated with the software product and the product level may be received. A defect change-set associated with a correction of the specific defect may be identified. An overlapping change-set may be determined based on, at least in part, a source control history associated with the software product. The overlapping change set may occur between the product level and the defect change-set in the source control history and may implicate at least one common with the defect change-set. A software patch correcting the specific defect may be generated based on the defect change-set and the overlapping change-set. | 09-18-2014 |
20140281666 | METHODS FOR DYNAMICALLY ADAPTIVE BIT-LEVELING BY INCREMENTAL SAMPLING, JITTER DETECTION, AND EXCEPTION HANDLING - A circuit and method for implementing an adaptive bit-leveling function in an integrated circuit interface is disclosed. During a calibration operation, a pre-loaded data bit pattern is continuously sent from a sending device and is continuously read from an external bus by a receiving device. A programmable delay line both advances and delays each individual data bit relative to a sampling point in time, and delay counts relative to a reference point in time are recorded for different sampled data bit values, enabling a delay to be determined that best samples a data bit at its midpoint. During the advancing and delaying of a data bit, jitter on the data bit signal may cause an ambiguity in the determination of the midpoint, and solutions are disclosed for detecting jitter and for resolving a midpoint for sampling a data bit even in the presence of the jitter. | 09-18-2014 |
20140298077 | Scalable Relational Database Replication - A relational database replication system includes a client, at least one primary database, a plurality of secondary databases and replication agents which coordinate database transactions. The system provides a high level of performance, reliability, and scalability with an end result of efficient and accurate duplication of transactions between the primary and secondary databases. In one implementation, the client transmits sets of database update statements to the primary database and primary agent in parallel; the primary agent replicates the statements to at least one secondary agent. A transaction prepare and commit process is coordinated between the primary database and the primary agent, which in turn coordinates with the at least one secondary agent. Databases can be partitioned into individual smaller databases, called shards, and the system can operate in a linearly scalable manner, adding clients, databases and replication agents without requiring central coordination or components that cause bottlenecks. | 10-02-2014 |
20140317435 | ELECTRONIC DEVICE AND METHOD FOR TESTING CAPACITORS OF MOTHERBOARD OF ELECTRONIC DEVICE - In a method for testing the locations and identities of capacitors of a motherboard of an electronic device, register addresses of the capacitors of the motherboard are detected and a notification is generated if all the detected register addresses are correct. Each capacitor having an incorrect register address is reconfigured with a correct register address. The configuration report is generated recording the reconfiguration of the capacitors, and the electronic device is then powered off. | 10-23-2014 |
20140317436 | PROCESSING APPARATUS, METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM - A processing apparatus includes a first memory, a second memory, a capacitor, and a processor coupled to the first memory and the second memory. The processor is configured to cause power feeding from the capacitor, and execute a first processing to cause the first memory to hold data, after the power feeding is caused from the capacitor, cause a battery to start power feeding in at least one of a case where the power feeding from an external power source is not started after being halted and an output voltage of the capacitor has fallen below a first value, and a case where the power feeding from the external power source is not started after being halted and a first time period has elapsed, and execute a second processing to write the data from the first memory into the second memory during the power feeding from the battery. | 10-23-2014 |
20140325255 | FIELD CONTROL DEVICES HAVING PRE-DEFINED ERROR-STATES AND RELATED METHODS - Control apparatus having pre-defined error-states and related methods are described. An example non-transitory computer-readable medium disclosed herein comprises instructions that, when executed, cause a machine to analyze, via a controller coupled to a field device, a communication from a control system remotely located from the controller, the control system to operate the field control device during a non-error condition; detect an error condition while the field control device is communicatively coupled to and receives the communication from the control system; and override the communication between the control system and the controller to operate the field control device based on a pre-defined error-state instruction stored in the controller when the error condition is detected to cause the field control device to move to a first position for a first amount of time and subsequently move the field control device to a second position for a second amount of time, the first position being different than the second position. | 10-30-2014 |
20140331076 | ENCODER WITH ACCURACY CORRECTION FUNCTION - An encoder ( | 11-06-2014 |
20140380085 | MACHINE CHECK ARCHITECTURE EXECUTION ENVIRONMENT FOR NON-MICROCODED PROCESSOR - A technology for implementing a method for a machine check architecture environment. A method of the disclosure includes obtaining an occurrence of an error. The occurrence of the error causes a non-microcoded processing device to enter an error monitoring state. The method further processes the error using a dedicated memory portion for the error monitoring state while the non-microcoded processing device is in the error monitoring state. The error monitoring state is dedicated to error processing. The method further determines information associated with the error. The information associated with the error is in a predefined format. | 12-25-2014 |
20140380086 | LOAD DRIVING DEVICE - A load driving device including a converter, an output circuit and a timer circuit is provided. The converter receives a communication frame including a data signal through a serial communication and performs parallel conversion on the data signal to output an instruction signal instructing the output circuit to transition to a first state when the data signal includes first serial data and output an instruction signal instructing the output circuit to transition to a second state when the data signal includes second serial data. The timer circuit measures a duration time during which the converter receives the first serial data. When the measured duration time reaches an abnormality determination time, the timer circuit forces the output circuit to transition to the second state. | 12-25-2014 |
20150026506 | ERROR DETECTING APPARATUS FOR GATE DRIVER, DISPLAY APPARATUS HAVING THE SAME AND METHOD OF DETECTING ERROR OF GATE DRIVER - Exemplary embodiments of the present invention relate to an error detecting apparatus for a gate driver that improves the reliability of a display apparatus, a display apparatus having the error detecting apparatus, and a method of detecting an error of the gate driver using the error detecting apparatus. An exemplary embodiment discloses an error detecting apparatus including an error detecting part configured to receive a gate signal of a gate driver and determine whether a status of the gate driver is a normal status or an error status based on the gate signal, a memory configured to store the status of the gate driver, and a signal outputting part configured to selectively output a clock signal and an error signal based on the status of the gate driver stored in the memory. | 01-22-2015 |
20150089270 | USER-DIRECTED DIAGNOSTICS AND AUTO-CORRECTION - A method, system, and computer program product for performing user-initiated logging and auto-correction in hardware/software systems. Embodiments commence upon identifying a set of test points and respective instrumentation components, then determining logging capabilities of the instrumentation components. The nature and extent of the capabilities and configuration of the components aid in generating labels to describe the various logging capabilities. The labels are then used in a user interface so as to obtain user-configurable settings which are also used in determining auto-correction actions. A measurement taken at a test point may result in detection of an occurrence of a certain condition, and auto-correction steps can be taken by retrieving a rulebase comprising a set of conditions corresponding to one or more measurements, and corrective actions corresponding to the one or more conditions. Detection of a condition can automatically invoke any number of processes to apply a corrective action and/or emit a recommendation. | 03-26-2015 |
20150100816 | ANTICIPATORY PROTECTION OF CRITICAL JOBS IN A COMPUTING SYSTEM - Anticipatory protection of critical jobs in a computing system, including: identifying, by a system management module, a problem computing component in the computing system; identifying, by the system management module, all proximate computing components in the computing system, wherein each proximate computing component is within a predetermined physical proximity of the problem computing component; determining, by the system management module, whether the proximate computing components are executing one or more critical jobs; and responsive to determining that the proximate computing components are executing one or more critical jobs, migrating, by the system management module, the one or more critical jobs to distant computing components in the computing system, wherein each distant computing component is not within the predetermined physical proximity of the problem computing component. | 04-09-2015 |
20150100817 | Anticipatory Protection Of Critical Jobs In A Computing System - Anticipatory protection of critical jobs in a computing system, including: identifying, by a system management module, a problem computing component in the computing system; identifying, by the system management module, all proximate computing components in the computing system, wherein each proximate computing component is within a predetermined physical proximity of the problem computing component; determining, by the system management module, whether the proximate computing components are executing one or more critical jobs; and responsive to determining that the proximate computing components are executing one or more critical jobs, migrating, by the system management module, the one or more critical jobs to distant computing components in the computing system, wherein each distant computing component is not within the predetermined physical proximity of the problem computing component. | 04-09-2015 |
20150113311 | STORAGE CONTROL APPARATUS, STORAGE APPARATUS, INFORMATION PROCESSING SYSTEM, AND STORAGE CONTROL METHOD THEREFOR - A storage control apparatus includes an uncorrectable error generation flag management section configured to manage an uncorrectable error generation flag in a memory configured to store a first error detection and correction code corresponding to a first data unit, and a second error detection and correction code corresponding to a second data unit including first data units, the uncorrectable error generation flag representing whether or not an uncorrectable error with the first code has occurred, the uncorrectable error generation flag being managed for each second data unit, a controller configured to prohibit access to the second data unit representing that the uncorrectable error has occurred when a command for the access with data change is issued, and a correction section configured to use the second code to correct the second data unit when the second data unit representing that the uncorrectable error has occurred is restored. | 04-23-2015 |
20150121121 | APPARATUS AND METHOD TO RECOVER A DATA SIGNAL - Embodiments of the present invention disclose an apparatus and method for recovering a data signal in a digital transmission. A computer processor receives a data signal from a data signal input wire. The computer processor receives an external clock signal. The computer processor samples a binary bit of the data signal multiple times per clock cycle. The computer processor determines, for each sampling group, a sample and a quality measurement. The computer processor stores, for each sampling group, the sample and the quality measurement into a set of memory elements. The computer processor stores the sample from each sampling group into a first and a second delay chain. The computer processor determines a current sampling point. The computer processor transmits output corresponding to a content of the current sampling point to a data signal output wire. | 04-30-2015 |
20150331763 | HOST SWAP HYPERVISOR THAT PROVIDES HIGH AVAILABILITY FOR A HOST OF VIRTUAL MACHINES - A host swap hypervisor provides a high availability hypervisor for virtual machines on a physical host computer during a failure of a primary hypervisor on the physical host computer. The host swap hypervisor resides on the physical host computer that runs the primary hypervisor, and monitors failure indicators of the primary hypervisor. When the failure indicators exceed a threshold, the host swap hypervisor is then autonomically swapped to become the primary hypervisor on the physical host computer. The original primary hypervisor may then be re-initialized as the new host swap hypervisor. | 11-19-2015 |
20150363258 | DEVICE AND SYSTEM INCLUDING ADAPTIVE REPAIR CIRCUIT - A device, system, and/or method includes an internal circuit configured to perform at least one function, an input-output terminal set and a repair circuit. The input-output terminal set includes a plurality of normal input-output terminals connected to an external device via a plurality of normal signal paths and at least one repair input-output terminal selectively connected to the external device via at least one repair signal path. The repair circuit repairs at least one failed signal path included in the normal signal paths based on a mode signal and fail information signal, where the mode signal represents whether to use the repair signal path and the fail information signal represents fail information on the normal signal paths. Using the repair circuit, various systems adopting different repair schemes may be repaired and cost of designing and manufacturing the various systems may be reduced. | 12-17-2015 |
20160004600 | DATA PROCESSING DEVICE - A data processing device includes a digital data processing unit and a control unit. The digital data processing unit includes a computing unit that computes digital data and a power source management unit that transceives commands with the computing unit and manages a power supply to the computing unit. The control unit controls a user interface unit that provides a user interface function. The control unit diagnoses an operation of the digital data processing unit by monitoring a transfer state of the commands between the computing unit and the power source management unit. When determining an abnormality occurrence in the operation of the digital data processing unit, the control unit resets all parts of the digital data processing unit or a part of the digital data processing unit without interrupting an operation of the user interface unit. | 01-07-2016 |
20160026522 | Method for Fault Handling in a Distributed IT Environment - An improved method provides fault handling in a distributed IT environment. The distributed IT environment executes a workflow application interacting with at least one application by using interface information about the at least one application. In response to receiving a first instance of a fault response, a fault handler performs a first lookup of a fault handling policy corresponding to the fault response within a fault handling descriptions catalogue. The fault handler loads a first one or more fault handling descriptions that are pointed to by the fault handling policy in order to continue execution of the workflow application. After a second instance of the fault response, the fault handler performs a second lookup of the fault handling policy, which now points to a second one or more fault handling descriptions that are loaded in order to continue execution of the workflow application. | 01-28-2016 |
20160034339 | Error Recovery Within Integrated Circuit - An integrated circuit includes one or more portions that have error detection and error correction circuits and are operated with operating parameters giving a finite non-zero error rate, as well as one or more portions formed and operated to provide a zero error rate. | 02-04-2016 |
20160378601 | ADAPTIVE RECOVERY FOR SCM-ENABLED DATABASES - A system includes determination of a plurality of secondary data structures of a database to be rebuilt, determination, for each of the plurality of secondary data structures, of a current ranking based on a pre-crash workload, a crash-time workload, a post-crash workload, and a rebuild time of the secondary data structure, determination to rebuild one of the plurality of secondary data structures based on the determined rankings, and rebuilding of the one of the plurality of secondary data structures in a dynamic random access memory based on primary data of the database stored in non-volatile random access memory. | 12-29-2016 |
20160378610 | FAILURE RECOVERY OF DISTRIBUTED CONTROL OF POWER AND THERMAL MANAGEMENT - Component power consumption is collected from each of a plurality of controllers of a node having a plurality of components. The component power consumption is provided to each of the plurality of controllers. A power differential is determined as a difference between a power cap for an apparatus and a total power consumption for the apparatus based, at least in part, on the component power consumption. A proportion of the total power consumption corresponding to the at least one component associated with the at least one component controller is determined. A local power budget is computed for the at least one component based, at least in part, on the power differential and the proportion of the total power consumption corresponding to the at least one component. A failure associated with the at least one component controller or the at least one component is determined. | 12-29-2016 |
20160378623 | HIGH PERFORMANCE PERSISTENT MEMORY - Embodiments are generally directed to high capacity energy backed memory with off device storage. A memory device includes a circuit board; multiple memory chips that are installed on the circuit board; a controller to provide for backing up contents of the memory chips when a power loss condition is detected; a connection to a backup energy source; and a connection to a backup data storage that is separate from the memory device. | 12-29-2016 |
20190146547 | SEMICONDUCTOR INTEGRATED CIRCUIT DEVICE | 05-16-2019 |