Hot swapping (i.e., while network is up)

Subclass of:

714 - Error detection/correction and fault detection/recovery

714100000 - DATA PROCESSING SYSTEM ERROR OR FAULT HANDLING

714001000 - Reliability and availability

714002000 - Fault recovery

714003000 - By masking or reconfiguration

714400100 - Of network

714400110 - Backup or standby (e.g., failover, etc.)

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Class / Patent application number	Description	Number of patent applications / Date published
714400120	Hot swapping (i.e., while network is up)	74
20110022885	Methods and Equipment for Fault Tolerant IP Service - An Internet Protocol (IP) terminal, comprises communication means for communicating via an IP network, a processor and memory. The memory contains an operating software for the IP terminal and the processor is configured to execute the operating software. The operating software comprises a normal mode logic for implementing a normal mode operation and a restricted mode logic for implementing a restricted mode operation. The normal mode logic comprises program code for initiating a call of a first type under control of instructions from one or more dedicated servers. The restricted mode logic comprises program code for collecting connection information of other IP terminals and for initiating a call of a second type without instructions from the one or more dedicated servers.	01-27-2011
20110083038	COMMUNICATION SYSTEM AND A METHOD AND CALL PROCESSOR FOR USE IN THE SYSTEM - A communication system (	04-07-2011
20110087918	DISASTER RECOVERY - File system disaster recovery techniques provide automated monitoring, failure detection and multi-step failover from a primary designated target to one of a designated group of secondary designated targets. Secondary designated targets may be prioritized so that failover occurs in a prescribed sequence. Replication of information between the primary designated target and the secondary designated targets allows failover in a manner that maximizes continuity of operation. In addition, user-specified actions may be initiated on failure detection and/or on failover operations and/or on failback operations.	04-14-2011
20110093740	Distributed Intelligent Virtual Server - An intelligent distributed virtual server for providing distributed services to a plurality of clients, including one or more server units, each server unit storing data and providing service for accessing by one or more clients; one or more switches and routers for connecting the clients to the server units and to provide a communication link; and a distribution control station connected to the clients and the server units via the switches and routers, wherein the distribution control station receives a request for a service from a client, and selectively establishes a data link between that client and a server unit, which stores the requested data and provides service, such that the server unit provides the data stream to the client via the communication link, independent of other server units. Therefore, it provides distributed computing across an intranet or the Internet and it provides scalability and all intelligent services such as fault handling, security and others.	04-21-2011
20110173493	CLUSTER AVAILABILITY MANAGEMENT - A first logical partition in a first processing complex of a server cluster is operated in an active mode and a second logical partition in the processing complex is operated in a standby mode. Upon detection of a failure in a second processing complex of the server cluster, the standby mode logical partition in the first processing complex is activated to an active mode. In one embodiment, partition resources are transferred from an active mode logical partition to the logical partition activated from standby mode. Other embodiments are described and claimed.	07-14-2011
20110191626	Fault-tolerant network management system - The fault-tolerant network management system is a hierarchical system having two Manager-of-Managers (MoM) that are implemented at the highest layer in an active-passive mode. A middle layer includes Mid-Level Managers (MLMs), which are used to manage agents disposed throughout different areas of the network at the lowest layer. The MLMs relieve the MoM from dealing with individual agents, and hence enhance the scalability of the whole Network Management System. MLMs are configured to work in pairs, where each pair includes two MLMs working in an active-active mode. The MoMs and MLMs have the capability of backing each other up in the case of a failure.	08-04-2011
20110246816	CONFIGURING A SYSTEM TO COLLECT AND AGGREGATE DATASETS - Methods for configuring a system to collect and aggregate datasets are disclosed. One embodiment includes identifying a data source in the system from which a dataset is to be collected, configuring a machine in the system that generates the dataset to be collected, to send the dataset to the data source, identifying an arrival location where the dataset that is collected is to be aggregated or written, and/or configuring an agent node by specifying a source for the agent node as the data source in the system and specifying a sink for the agent node as the arrival location.	10-06-2011
20120030505	Method and Apparatus for Calendaring Reminders - An electronic calendar includes such features as recurring reminders, dividing unpredictable work loads into equal pieces, template free parsing, a reminders scheduling algorithm to reduce spikes, dynamic delivery and recovery algorithms, methods for splitting the work load between controllers and workers and for monitoring progress, all within the context of a calendar architecture for a large enterprise.	02-02-2012
20120036394	DATA RECOVERY METHOD, DATA NODE, AND DISTRIBUTED FILE SYSTEM - A data recovery method includes: by a first data node, obtaining a notification that a second data node fails; and storing specified data to a third data node, recording information of the specified data stored in the third data node in backup information stored in the first data node, and providing a metadata node and other data nodes storing the specified data with the information of the specified data stored in the third data node, where the specified data is data stored in the first and second data nodes. A data recovery method, two data nodes, and a distributed file system are also provided. In embodiments of the present invention, the data recovery is mainly performed among the data nodes, and the metadata node does not need to perform a lot of operations. Therefore, the load of the metadata node is reduced.	02-09-2012
20120042197	METHOD FOR RESOURCE INFORMATION BACKUP OPERATION BASED ON PEER TO PEER NETWORK AND PEER TO PEER NETWORK THEREOF. - The present invention provides a method for resource information backup operation based on peer to peer network, comprising: an initiating node sending an out-of-domain backup node determining request to a connecting node in backup domain, and said out-of-domain backup node determining request including the resource global identifier of said resource information to be backed up, and the connecting node in said backup domain and said host node in which the resource information is saved have different domain identifiers; the connecting node in said backup domain determining an out-of-domain backup node according to information of said resource global identifier and out-of-domain backup rules, and sending routing information of said out-of-domain backup node to said initiating node; and said initiating node sending an out-of-domain backup operation request to said out-of-domain backup node according to said routing information, and said out-of-domain backup node implementing corresponding processing according to said out-of-domain backup operation request.	02-16-2012
20120060050	DISASTER RECOVERY - File system disaster recovery techniques provide automated monitoring, failure detection and multi-step failover from a primary designated target to one of a designated group of secondary designated targets. Secondary designated targets may be prioritized so that failover occurs in a prescribed sequence. Replication of information between the primary designated target and the secondary designated targets allows failover in a manner that maximizes continuity of operation. In addition, user-specified actions may be initiated on failure detection and/or on failover operations and/or on failback operations.	03-08-2012
20120066544	REMOTE MONITORING APPARATUS, WIND TURBINE GENERATOR SYSTEM, AND METHOD OF CONTROLLING REMOTE MONITORING APPARATUS - A SCADA system includes a main switching hub and a backup switching hub that relay transmission data between a wind turbine generator and terminals provided in another SCADA system and client terminals, and a network switch for performing switching between the main switching hub and the backup switching hub for relaying transmission data between the wind turbine generator and the terminals. A backup remote I/O connected to the backup switching hub causes the network switch to perform switching based on a switching command from a SCADA terminal input via the backup switching hub. This serves to solve a data transmission problem caused by a problem in a switching hub on the wind turbine generator side from a remote location.	03-15-2012
20120117417	Systems and Methods of High Availability Cluster Environment Failover Protection - A transparent high-availability solution utilizing virtualization technology is presented. A cluster environment and management thereof is implemented through an automated installation and setup procedure resulting in a cluster acting as a single system. The cluster is set up in an isolated virtual machine on each of a number of physical nodes of the system. Customer applications are run within separate application virtual machines on one physical node at a time and are run independently and unaware of their configuration as part of a high-availability cluster. Upon detection of a failure, traffic is rerouted through a redundant node and the application virtual machines are migrated from the failing node to another node using live migration techniques.	05-10-2012
20120144232	Generation of Standby Images of Applications - Embodiments that generate checkpoint images of an application for use as warm standby are contemplated. The embodiments may monitor accesses of external references by threads. An external reference may comprise a connection or use of services of an entity that is external to the set of processes that constitute the application, to which a process of the application attempts to connect by means of a socket or inter-process communication (IPC). Various embodiments comprise two or more computing devices, such as two or more servers. One of the computing devices may generate a checkpoint image of an application at a suitable point in time during initialization, when the state of the application is not yet dependent on interactions with external references. The second computing device may preload the checkpoint image for the application and activate the checkpoint image when needed, following the specific resource management rules of the distributed subsystem.	06-07-2012
20120159236	HOLISTIC TASK SCHEDULING FOR DISTRIBUTED COMPUTING - Embodiments of the invention include a method for fault tolerance management of worker nodes during map/reduce computing in a computing cluster. The method includes subdividing a computational problem into a set of sub-problems, mapping a selection of the sub-problems in the set to respective nodes in the cluster, directing processing of the sub-problems in the respective nodes, and collecting results from completion of processing of the sub-problems. During a first early temporal portion of processing the computational problem, failed nodes are detected and the sub-problems currently being processed by the failed nodes are re-processed. Conversely, during a second later temporal portion of processing the computational problem, sub-problems in nodes not yet completely processed are replicated into other nodes, processing of the replicated sub-problems directed, and the results from completion of processing of sub-problems collected. Finally, duplicate results are removed and remaining results reduced into a result set for the problem.	06-21-2012
20120166866	Fault-Tolerance And Fault-Containment Models For Zoning Clustered Application Silos Into Continuous Availability And High Availability Zones In Clustered Systems During Recovery And Maintenance - A cluster recovery and maintenance technique for a server cluster having plural nodes implementing a server tier in a client-server computing architecture. A first group of N active nodes each run a software stack comprising a cluster management tier and a cluster application tier that actively provides services on behalf of one or more client applications running in a client application tier on the clients. A second group of M spare nodes each run a software stack comprising a cluster management tier and a cluster application tier that does not actively provide client application services. First and second zones in the cluster are determined in response to an active node membership change involving one or more active nodes departing from or being added to the first group as a result of an active node failing or becoming unreachable or as a result of a maintenance operation involving an active node.	06-28-2012
20120210160	METHOD OF DYNAMIC ALLOCATION ON A STATICALLY ALLOCATED AND EMBEDDED SOFTWARE ARCHITECTURE - A method of dynamically allocating a task or a signal on a statically allocated and embedded software architecture of a vehicle includes identifying a faulty component. The faulty component may include a software component, a hardware component or a signal or communications link between components. Once the faulty component is identified, any tasks performed by or signals associated with the faulty component are identified, and the tasks performed by or the signals associated with the faulty component are re-allocated to an embedded standby component so that performance of the re-allocated task and/or signal for future system operations is performed by the standby component.	08-16-2012
20120311377	REPLAYING JOBS AT A SECONDARY LOCATION OF A SERVICE - Jobs submitted to a primary location of a service within a period of time before and/or after a fail-over event are determined and are resubmitted to a secondary location of the service. For example, jobs that are submitted fifteen minutes before the fail-over event and jobs that are submitted to the primary network before the fail-over to the second location is completed are resubmitted at the secondary location. After the fail-over event occurs, the jobs are updated with the secondary network that is taking the place of the primary location of the service. A mapping of job input parameters (e.g. identifiers and/or secrets) from the primary location to the secondary location is used by the jobs when they are resubmitted to the secondary location. Each job determines what changes are to be made to the job request based on the job being resubmitted.	12-06-2012
20120324273	DATA ROUTING FOR POWER OUTAGE MANAGEMENT - In one embodiment, a particular node in a computer network, that is, one receiving electrical power from a grid source, may determine routing metrics to a plurality of neighbor nodes of the particular node in the computer network. In addition, the node also determines power grid connectivity of the plurality of neighbor nodes. Traffic may be routed from the particular node to one or more select neighbor nodes having preferred routing metrics, until a power outage condition at the particular node is detected, at which time the traffic (e.g., last gasp messages) may be routed from the particular node to one or more select neighbor nodes having diverse power grid connectivity from the particular node. In this manner, traffic may be routed via a device that is not also experiencing the power outage condition.	12-20-2012
20130007506	MANAGING RECOVERY VIRTUAL MACHINES IN CLUSTERED ENVIRONMENT - Techniques involving replication of virtual machines in a clustered environment are described. One representative technique includes receiving a replication request to replicate a primary virtual machine. A clustering broker is configured to act on the replication request on behalf of a cluster of recovery nodes, by at least placing a replicated virtual machine corresponding to the source virtual machine on a recovery node and facilitating tracking of the migration of the replicated virtual machine within the cluster. The clustering broker returns an address of the recovery node that has been placed or found through tracking for the particular virtual machine.	01-03-2013
20130013956	REDUCING IMPACT OF A REPAIR ACTION IN A SWITCH FABRIC - Techniques are disclosed for reducing impact of a repair action in a switch fabric. In one embodiment, a server system is provided that includes a first interposer card that operatively connects one or more server cards to a midplane. The first interposer card may include a switch module that switches network traffic for the one or more server cards. The first interposer card may be hot-swappable from the midplane, and the one or more server cards may be hot-swappable from the first interposer card.	01-10-2013
20130013957	REDUCING IMPACT OF A SWITCH FAILURE IN A SWITCH FABRIC VIA SWITCH CARDS - Techniques are disclosed for reducing impact of a switch failure in a switch fabric. In one embodiment, a server system is provided that includes a midplane, one or more server cards and one or more switch cards. The midplane may include a fabric interconnect for a switch fabric. The one or more server cards may be coupled with the midplane, where each server card is hot-swappable from the midplane. The one or more switch cards may also be coupled with the midplane, where each switch card is also hot-swappable from the midplane. Each switch card includes one or more switch modules, and each switch module is configured to switch network traffic for at least one server card.	01-10-2013
20130067269	OBJECT BASED STORAGE SYSTEM AND METHOD OF OPERATING THEREOF - A method and a storage system for managing logical objects, wherein the storage system includes a plurality of control servers and the method includes: (i) defining a plurality of object pools and associating each logical object, hosted in the storage system, with one of the plurality of object pools; (ii) configuring each control server to have a primary responsibility over at least two of the object pools, such that each object pool is controlled by one primary control server, configured to handle requests directed to logical objects associated with the object pool; and (iii) in response to a failure of one of the plurality of control servers, configuring each operational server of the plurality of control servers to take over primary responsibility for at least one object pool, originally defined under the primary responsibility of the failed control server.	03-14-2013
20130159763	Dynamic Allocation of Network Security Credentials For Alert Notification Recipients - Methods, apparatuses, and computer program products for dynamic allocation of network security credentials for alert notification recipients are provided. Embodiments include receiving from a managed system, by an alert management system, an alert indicating one of a failure in the managed system and a pending failure in the managed system; selecting, by the alert management system, a remote device from a plurality of remote devices registered for remote access with the alert management system; preapproving, by the alert management system, network security clearance of the selected remote device, the network security clearance for remote access to the management system via a virtual private network (VPN) interface; and transmitting to the selected remote device, by the alert management system, an alert notification that includes an internet address corresponding to the VPN interface.	06-20-2013
20130179723	DUAL-CHANNEL HOT STANDBY SYSTEM AND METHOD FOR CARRYING OUT DUAL-CHANNEL HOT STANDBY - A dual-channel hot standby system and a method for carrying out dual-channel hot standby, the system comprises a hot standby status management layer including two hot standby management units, an application processing layer including two application processors, and a data communication layer including two communicators; the hot standby status management layer is used for controlling the setting and switching between an active status and a standby status of the two application processors, monitoring the working status of the data communication layer, and carrying out synchronization of the control cycles for the two channels of the system; wherein one of the hot standby management units controls one of the application processors, and together constitute a channel of the system therewith; the data communication layer is used for receiving data from outside, and forwarding the data to the application processing layer. The present invention avoids the occurrence of “dual-channel-active” or “dual-channel-standby” status; ensures synchronization of the control cycles of two channels; reduces the time of the system for responding to breakdowns; meets the real-time requirements; enhances the reliability and availability of the system; and ensures a seamless switching between active and standby statuses.	07-11-2013
20130227340	FAULT TOLERANT ROUTING IN A NON-HOT-STANDBY CONFIGURATION OF A NETWORK ROUTING SYSTEM - Methods and systems for facilitating fault tolerance in a non-hot-standby configuration of a network routing system are provided. According to one embodiment, a failover method is provided. One or more processing engines of a network routing system are configured to function as active processing engines, each of which having one or more software contexts. A control blade is configured to monitor the active processing engines. One or more of the processing engines are identified to function as non-hot-standby processing engines, each of which having no pre-created software contexts corresponding to the software contexts of the active processing engines. The control blade monitors the active processing engines. Responsive to detecting a fault associated with an active processing engine, the active processing engine is dynamically replaced with a non-hot-standby processing engine by creating one or more replacement software contexts within the non-hot-standby processing engine corresponding to those of the active processing engine.	08-29-2013
20130346790	NON-DISRUPTIVE CONTROLLER REPLACEMENT IN NETWORK STORAGE SYSTEMS - A network-based storage system includes multiple storage devices and system controllers. Each storage device in multiple aggregates of storage devices can include ownership portion(s) that are configured to indicate a system controller to which it belongs. First and second system controllers can form an HA pair, and can be in communication with each other, the storage devices, and a separate host server. A first system controller controls an aggregate of storage devices and can facilitate an automated hotswap replacement of a second system controller that controls another aggregate of storage devices with a separate third system controller that subsequently controls the other aggregate of storage devices. The first system controller can take over control of the second aggregate of storage devices during the automated hotswap replacement of the second system controller, and can exchange system identifiers and ownership portion information with the separate third system controller automatically during the hotswap.	12-26-2013
20140095925	CLIENT FOR CONTROLLING AUTOMATIC FAILOVER FROM A PRIMARY TO A STANDBY SERVER - A primary server and a standby server operating as a redundant server pair are connected to a common network, and the operational state of each is monitored by a first and a second client function, each of which runs on a device connected to the common network. Each of the client functions operates to notify the standby server in the event that the primary server ceases to be operational. The standby server determines whether the primary server is operational based upon notification received from both of the first and second client functions.	04-03-2014
20140173336	CASCADING FAILOVER OF BLADE SERVERS IN A DATA CENTER - Cascading failover of blade servers in a data center that includes transferring by a system management server a data processing workload from a failing blade server to an initial replacement blade server, with the data processing workload characterized by data processing resource requirements and the initial replacement blade server having data processing resources that do not match the data processing resource requirements; and transferring the data processing workload from the initial replacement blade server to a subsequent replacement blade server, where the subsequent replacement blade server has data processing resources that better match the data processing resource requirements than do the data processing resources of the initial replacement blade server, including transferring the workload to the subsequent replacement blade server only if the data processing cost of the transfer of the workload to the subsequent replacement blade is less than the value of a transfer cost threshold.	06-19-2014
20140317441	MANAGEMENT SYSTEM FOR MANAGING COMPUTER SYSTEM, METHOD FOR MANAGING COMPUTER SYSTEM, AND STORAGE MEDIUM - An exemplary management system stores computer performance management information, I/O performance management information regarding communication by I/O adapters, and priority management information associating a priority of a failover destination pair candidate including failover destination computer and destination I/O adapter candidates with a relationship between a failover source computer and destination computer candidate and with an I/O performance relationship between failover source and destination I/O adapter candidates. The management system determines failover destination pair candidates, each including failover destination computer and I/O adapter candidates. The management system determines a priority of each failover destination pair candidate with reference to the computer and I/O performance management information, and the priority management information based on a performance relationship between the active computer and the failover destination computer candidate and on a performance relationship between the active I/O adapter and the failover destination I/O adapter candidate.	10-23-2014
20140359343	Method, Apparatus and System for Switching Over Virtual Application Two-Node Cluster in Cloud Environment - A method for switching over a virtual application two-node cluster in a cloud environment, including: sending an association state of a shared EBS volume to the standby virtual machine; receiving a request for removing an association between the active virtual machine and the shared EBS volume; removing the association between the active virtual machine and the shared EBS volume; receiving a request for associating the shared EBS volume sent by the standby virtual machine; and associating the standby virtual machine with the shared EBS volume. A brain-split problem can be completely solved using the method and an apparatus disclosed in the embodiments of the present invention. In addition, dependence on a reference active node is no longer required, which can simplify deployment of an application two-node cluster and improve reliability of the application two-node cluster.	12-04-2014
20150331761	HOST SWAP HYPERVISOR THAT PROVIDES HIGH AVAILABILITY FOR A HOST OF VIRTUAL MACHINES - A host swap hypervisor provides a high availability hypervisor for virtual machines on a physical host computer during a failure of a primary hypervisor on the physical host computer. The host swap hypervisor resides on the physical host computer that runs the primary hypervisor, and monitors failure indicators of the primary hypervisor. When the failure indicators exceed a threshold, the host swap hypervisor is then autonomically swapped to become the primary hypervisor on the physical host computer. The original primary hypervisor may then be re-initialized as the new host swap hypervisor.	11-19-2015
20150347246	AUTOMATIC-FAULT-HANDLING CACHE SYSTEM, FAULT-HANDLING PROCESSING METHOD FOR CACHE SERVER, AND CACHE MANAGER - The relationship between cache servers and backup cache servers is dynamically managed, and when a fault has arisen, a second cache server that is close in terms of distance to a PBR router that is forwarding traffic to a first cache server at which the fault has arisen is used as a backup cache server. Also, a module or a device having functionality as a cache manager and a cache agent is prepared, and with the trigger being the detection of a fault in the first cache server, the cache agent automatically alters the traffic forwarding destination of the PBR router, which is forwarding traffic to the first cache server at which the fault has arisen, to be the second cache server that is close in terms of distance to the PBR router.	12-03-2015
20150350374	COMMUNICATION METHOD AND NODE DEVICE - A selection device is selected from a first group in a first cluster. The node devices of the first group can communicate with a node device in a second cluster adjacent to the first cluster. The selection device performs transmission and reception of a report frame that reports an identifier of a node device included in the first group. The selection device selects from the first group a first relay device that relays a relay frame used for a communication between a node device in the first cluster and a node device in the second cluster. The first relay device determines a node device adjacent to the first relay device to be a second relay device that relays the relay frame to a node device in the second cluster.	12-03-2015
20150363282	RESILIENCY DIRECTOR - Systems and methods of orchestrating recoveries of virtual machines protected by a data management system from a primary system to a secondary system, such that performing the recoveries depends on relationships between the virtual machines. First data indicative of a recovery plan associated with a failover of at least one group of virtual machines is received. The recovery plan includes an application group with data indicative of a hierarchical relationship between the virtual machines wherein each of the virtual machines is associated with an order based on the second data. A plurality of sequences is created in the application group to designate an order of executing a plurality of recoveries for each of the virtual machines. A first recovery is executed in parallel for each of the virtual machines associated with a first sequence and a subsequent recovery is executed in parallel for each of a subsequent set of sequences.	12-17-2015
20160004609	FAULT TOLERANT COMMUNICATIONS - Apparatuses, systems and methods are disclosed for tolerating fault in a communications grid. Specifically, various techniques and systems are provided for detecting a fault or failure by a node in a network of computer nodes in a communications grid, adjusting the grid to avoid grid failure, and taking action based on the failure. In an example, a system may include receiving grid status information at a backup control node, the grid status information including a project status, storing the grid status information within the backup control node, receiving a failure communication including an indication that a primary control node has failed, designating the backup control node as a new primary control node, receiving updated grid status information based on the indication that the primary control node has failed, and transmitting a set of instructions based on the updated grid status information.	01-07-2016
20160011949	PREVENTING REMOVAL OF HOT-SWAPPABLE COMPONENTS	01-14-2016
20160020943	Automotive neural network - Network node modules within a vehicle are arranged to form a reconfigurable automotive neural network. Each network node module includes one or more subsystems for performing one or more operations and a local processing module for communicating with the one or more subsystems. A management system enables traffic from the one or more subsystems of a particular network node module to be re-routed to an external processing module upon failure of the local processing module of that particular network node module.	01-21-2016
20160020965	METHOD AND APPARATUS FOR DYNAMIC MONITORING CONDITION CONTROL - Example implementations described herein are directed to predicting the target elements that could potentially be affected by operations and incidents for one or more computer systems involving a server, a network and a storage system, by using topology information and redundant technology information. Example implementations described herein are further directed to changing the monitoring condition of the elements for some period of time and correlating elements, events, and monitored data to help the administrator analyze the impact of the event.	01-21-2016
20160034361	DISTRIBUTED EVENT CORRELATION SYSTEM - According to an example, a master node is to divide an event field in events into partitions including ordered contiguous blocks of values for the event field. Each partition may be assigned to a pair of cluster nodes. A partition map is determined from the partitions and may identify for each partition, the block of the event field values for the partition, a primary cluster node, and a failover cluster node for the primary cluster node.	02-04-2016
20160043938	DATA PROCESSING LOCK SIGNAL TRANSMISSION - In accordance with one aspect of the present description, a node of the distributed computing system has multiple communication paths to a data processing resource lock which controls access to shared resources, for example. In this manner, at least one redundant communication path is provided between a node and a data processing resource lock to facilitate reliable transmission of data processing resource lock signals between the node and the data processing resource lock. Other features and aspects may be realized, depending upon the particular application.	02-11-2016
20160048434	CONTROL AND DATA TRANSMISSION SYSTEM, PROCESS DEVICE, AND METHOD FOR REDUNDANT PROCESS CONTROL WITH DECENTRALIZED REDUNDANCY - There is provided a control and data transmission system, comprising at least one control device which, in normal operation, is connected by means of a communication network to at least one process device designed as an input and/or output device, wherein the process device comprises an evaluation unit designed to detect a failure in the control system, an emergency control program which can be parameterized and which is stored in a memory of the process device, and a runtime system designed to execute the emergency control program, and wherein the process device is designed to switch to emergency operation in response to a failure in the control system detected by the evaluation unit, in which emergency operation the process device executes the emergency control program. The invention further provides a process device for use in such a control and data transmission system and a method for redundant process control.	02-18-2016
20160062855	VIRTUAL APPLICATION DELIVERY CHASSIS SYSTEM - A method for electing a master blade in a virtual application distribution chassis (VADC), includes: sending by each blade a VADC message to each of the other blades; determining by each blade that the VADC message was not received from the master blade within a predetermined period of time; in response, sending a master claim message including a blade priority by each blade to the other blades; determining by each blade whether any of the blade priorities obtained from the received master claim messages is higher than the blade priority of the receiving blade; in response to determining that none of the blade priorities obtained is higher, setting a status of a given receiving blade to a new master blade; and sending by the given receiving blade a second VADC message to the other blades indicating the status of the new master blade of the given receiving blade.	03-03-2016
20160062856	TECHNIQUES FOR MAINTAINING COMMUNICATIONS SESSIONS AMONG NODES IN A STORAGE CLUSTER SYSTEM - Various embodiments are generally directed to techniques for preparing to respond to failures in performing a data access command to modify client device data in a storage cluster system. An apparatus may include a processor component of a first node coupled to a first storage device; an access component to perform a command on the first storage device; a replication component to exchange a replica of the command with the second node via a communications session formed between the first and second nodes to enable at least a partially parallel performance of the command by the first and second nodes; and a multipath component to change a state of the communications session from inactive to active to enable the exchange of the replica based on an indication of a failure within a third node that precludes performance of the command by the third node. Other embodiments are described and claimed.	03-03-2016
20160070627	BACKUP MANAGEMENT CONTROL IN A SERVER SYSTEM - A server rack includes a rack management controller (RMC) configured to manage a first function and a backplane including a backplane controller (BPC). The BPC is configured to monitor the RMC, determine that the RMC is unavailable, and manage the first function, in response to determining that the RMC is unavailable.	03-10-2016
20160077935	VIRTUAL MACHINE NETWORK LOSS DETECTION AND RECOVERY FOR HIGH AVAILABILITY - Exemplary methods, apparatuses, and systems determine that a first physical network interface controller of a first host computer has lost a client traffic network connection. At least one data compute node running on the first host computer has client traffic transmitted via the client traffic network connection. In response to the loss of the client traffic network connection, one or more host computers each having a physical network interface controller with a functioning network connection for the client traffic are identified. Further in response to the loss of the client traffic network connection, the data compute node is moved to one of the identified host computers. The first host computer utilizes a second physical network interface controller to move the data compute node.	03-17-2016
20160085643	MANAGING CONTINGENCY CAPACITY OF POOLED RESOURCES IN MULTIPLE AVAILABILITY ZONES - A network-based services provider may reserve and provision primary resource instance capacity for a given service (e.g., enough compute instances, storage instances, or other virtual resource instances to implement the service) in one or more availability zones, and may designate contingency resource instance capacity for the service in another availability zone (without provisioning or reserving the contingency instances for the exclusive use of the service). For example, the service provider may provision resource instance(s) for a database engine head node in one availability zone and designate resource instance capacity for another database engine head node in another availability zone without instantiating the other database engine head node. While the service operates as expected using the primary resource instance capacity, the contingency resource capacity may be leased to other entities on a spot market. Leases for contingency instance capacity may be revoked when needed for the given service (e.g., during failover).	03-24-2016
20160085647	SYSTEM AND METHOD FOR HANDLING MULTI-NODE FAILURES IN A DISASTER RECOVERY CLUSTER - A system and method for handling multi-node failures in a disaster recovery cluster is provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates and/or volumes into a consistent state.	03-24-2016
20160092321	RECORDING MEDIUM STORING CONTROL PROGRAM, CONTROL DEVICE AND CONTROL METHOD - A non-transitory computer-readable recording medium stores therein a control program. The control program is executed by a control device that controls an access point conducting a communication by using a first identifier. The control device identifies an access point that becomes a target of disaster setting, in which a communication is conducted by using a second identifier different from the first identifier, on the basis of disaster information obtained from a providing source of information. The control device outputs, to a user interface, information for confirming whether or not the disaster setting is to be applied. Further, the control device sends an instruction to apply the disaster setting to the access point when a request to apply the disaster setting has been obtained.	03-31-2016
20160092323	MULTI-PARTITION NETWORKING DEVICE AND METHOD THEREFOR - A multi-partition networking device comprising a primary partition running on a first set of hardware resources and a secondary partition running on a further set of hardware resources. The multi-partition networking device is arranged to operate in a first operating state, whereby the first set of hardware resources are in an active state and the primary partition is arranged to process network traffic, and the further set of hardware resources are in a standby state. The multi-partition networking device is further arranged to transition to a second operating state upon detection of a suspicious condition within the primary partition, whereby the further set of hardware resources are transitioned from a standby state to an active state, and to transition to a third operating state upon detection of a failure condition within the primary partition, whereby processing of network traffic is transferred to the secondary partition.	03-31-2016
20160103744	SYSTEM AND METHOD FOR SELECTIVELY UTILIZING MEMORY AVAILABLE IN A REDUNDANT HOST IN A CLUSTER FOR VIRTUAL MACHINES - Techniques for selectively utilizing memory available in a redundant host system of a cluster are described. In one embodiment, a cluster of host systems, with at least one redundant host system, with each host system having a plurality of virtual machines with associated virtual machine (VM) reservation memory is provided. A portion of a data store is used to store a base file, the base file accessed by all the plurality of virtual machines. A portion of the memory available in the redundant host system is assigned as spare VM reservation memory. A copy of the base file is selectively stored in the spare VM reservation memory for access by all the plurality of virtual machines.	04-14-2016
20160110271	RECOVERY AND FAULT-TOLERANCE UNDER COMPUTATIONAL INDETERMINISM - A method for promoting fault tolerance and recovery in a computing system including at least one processing node includes promoting availability and recovery of a first processing node, by, at the first processing node, generating first spawn using a spawner that has been assigned a first generation-indicator so that its spawn inherits the first generation indicator, beginning a checkpoint interval to generate nodal recovery information, suspending the spawner from generating spawn, assigning, to the spawner, a second generation-indicator that differs from the first one, resuming the spawner, so that it generates second spawn that inherits the second generation-indicator, controlling an extent to which the second spawn writes to memory, and after committing nodal recovery information acquired during the checkpoint to durable storage, releasing control over the extent to which the second spawn can write to memory.	04-21-2016
20160117230	HIGH AVAILABILITY SCHEDULER FOR SCHEDULING SEARCHES OF TIME STAMPED EVENTS - A high availability scheduler of tasks in a cluster of server devices is provided. A server device of the cluster of server devices enters a leader state based upon the results of a consensus election process in which the server device participates with others of the cluster of server devices. Upon entering the leader state, the server device schedules one or more tasks by assigning each of the one or more tasks to a device, wherein the one or more tasks involve initiating a search of time stamped events.	04-28-2016
20160124817	FAULT TOLERANT LISTENER REGISTRATION IN THE PRESENCE OF NODE CRASHES IN A DATA GRID - A processing device performs operations comprising receiving, from a listener of a second node in a data grid system, a filter defined by search criteria of a search query. The operations can include determining, at the first node, that a third node in the data grid has crashed. The operations can further include iterating over backup data of the third node that is stored at a memory of the first node to determine the backup data that matches the filter. The operations can further include communicating, to the listener, the backup data that matches the filter.	05-05-2016
20160124818	DISTRIBUTED FAILOVER FOR MULTI-TENANT SERVER FARMS - A failover manager may be configured to determine a plurality of tenants executable on a server of a plurality of servers, each tenant being a virtual machine executable on the server in communication with at least one corresponding user. The failover manager may include a replicated tenant placement selector configured to dispatch a first replicated tenant for a first tenant of the plurality of tenants to a first standby server of the plurality of servers, and configured to dispatch a second replicated tenant for a second tenant of the plurality of tenants to a second standby server of the plurality of servers. The failover manager also may include a replicated tenant loader configured to activate, based on a failure of the server, the first replicated tenant on the first standby server to replace the first tenant, and the second replicated tenant on the second standby server to replace the second tenant.	05-05-2016
20160124819	NETWORK CONTROLLER FAILOVER REQUEST TO REDUCE NETWORK OUTAGES - A system is described that includes a first network controller and a second network controller. The first controller operates as a master controller and the second controller operates as a standby controller for a set of access points. Using a set of VRRP advertisements between the first and second controllers, the second controller may (1) determine that the first controller has failed independent of any determination by the access points and (2) send a failover request to the access points. The failover request may cause the access points to use previously established tunnels between the second controller and each of the access points. By transmitting a failover request message from the second controller to the access points upon the detection by the second controller that the first controller has failed and independent of any determination by the access points, the system reduces network access downtime for the access points.	05-05-2016
20160132404	METHOD AND APPARATUS FOR REDUNDANCY IN AN ATM USING HOT SWAP HARDWARE UNDERLYING A VIRTUAL MACHINE - A method and apparatus for providing redundancy in an Automatic Teller Machine (ATM) is provided. Application software may be run on top of a virtual environment such as a virtual machine and/or a virtual disk environment. Should a software component fail, the virtual environment will “crash” but the ATM hardware and operating system will remain intact. If the software is fatally flawed (e.g., due to a faulty “upgrade”), the older version may be “rolled back” from a previously stored virtual environment.	05-12-2016
20160132405	METHOD AND APPARATUS FOR REDUNDANCY IN AN ATM USING HOT SWAP HARDWARE UNDERLYING A VIRTUAL MACHINE - A method and apparatus for providing redundancy in an Automatic Teller Machine (ATM) is provided. Application software may be run on top of a virtual environment such as a virtual machine and/or a virtual disk environment. Should a software component fail, the virtual environment will “crash” but the ATM hardware and operating system will remain intact. If the software is fatally flawed (e.g., due to a faulty “upgrade”), the older version may be “rolled back” from a previously stored virtual environment.	05-12-2016
20160140001	ADAPTIVE DATACENTER TOPOLOGY FOR DISTRIBUTED FRAMEWORKS JOB CONTROL THROUGH NETWORK AWARENESS - Systems, methods, and computer program products to perform an operation comprising receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job, and selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that the mapper node is not affected by an error, and a resource utilization score for the mapper node does not exceed a utilization threshold.	05-19-2016
20160149773	MULTI-PARTITION NETWORKING DEVICE - A device is described for operating a multi-partition networking system, the device comprising hardware resources for the operation of a primary partition for performing tasks, a primary buffer for holding packets for processing within a partition of the multi-partition system and a reserve buffer. The device is arranged to allocate the primary buffer for use by the primary partition and allocate the reserve buffer for use by the primary partition when at least a suspicious condition is detected in the primary partition. A method of operating a multi-partition networking system is also described.	05-26-2016
20160154713	MANAGING SERVICE AVAILABILITY IN A MEGA VIRTUAL MACHINE	06-02-2016
20160154715	ACCESS POINT CONTROLLER FAILOVER SYSTEM	06-02-2016
20160162378	DISASTER RECOVERY SERVICE - A customer may use a disaster recovery service to generate a disaster recovery scenario in order to make certain resources available to the customer in the event of a data region failure. The customer may specify a recovery point objective, a recovery time objective and a recovery data region for the scenario. Accordingly, the disaster recovery service may coordinate with one or more other services provided by the computing resource service provider to reproduce the customer resources and other resources necessary to support the customer resources. These reproduced resources may be transferred to the recovery data region based at least in part on the parameters specified by the customer. In the event of a data region failure, the disaster recovery service may update the domain name system to resolve any customer requests for the customer resources to the recovery data region.	06-09-2016
20160170848	METHODS, SYSTEMS, AND COMPUTER READABLE STORAGE DEVICES FOR MANAGING FAULTS IN A VIRTUAL MACHINE NETWORK	06-16-2016
20160179635	SYSTEM AND METHOD FOR PERFORMING EFFICIENT FAILOVER AND VIRTUAL MACHINE (VM) MIGRATION IN VIRTUAL DESKTOP INFRASTRUCTURE (VDI)	06-23-2016
20160179636	CLUSTER CREATION AND MANAGEMENT FOR WORKLOAD RECOVERY	06-23-2016
20160179638	SWITCH FAILURE RECOVERY SYSTEM	06-23-2016
20160179642	REPLICATED DATABASE DISTRIBUTION FOR WORKLOAD BALANCING AFTER CLUSTER RECONFIGURATION	06-23-2016
20160188426	SCALABLE DISTRIBUTED DATA STORE - Described is a framework that manages a clustered, distributed NoSQL data store across multiple server nodes. The framework may include daemons running on every server node, providing auto-sharding and unified data service such that user data can be stored and retrieved consistently from any node. The framework may further provide capabilities such as automatic fail-over and dynamic capacity scaling.	06-30-2016
20160203064	PROACTIVE RESOURCE RESERVATION FOR PROTECTING VIRTUAL MACHINES	07-14-2016
20160253249	FAILOVER MECHANISM IN A DISTRIBUTED COMPUTING SYSTEM	09-01-2016
20160253250	Banded Allocation of Device Address Ranges in Distributed Parity Schemes	09-01-2016
20190146888	HANDLING MIGRATION IN A VIRTUALIZATION ENVIRONMENT	05-16-2019
20190146894	PROCESSING A HEALTH CONDITION MESSAGE ON A HEALTH CONDITION TO DETERMINE WHETHER TO PERFORM A SWAP OPERATION	05-16-2019

