Entries |
Document | Title | Date |
20110016350 | METHOD FOR COMMANDING AND PERFORMING NETWORK ENTRY - A method for commanding and performing network entry is disclosed. The method for commanding network entry using a non-periodic message in a Base Station (BS) includes broadcasting a first pattern indicating ready for restart one or more times to a Mobile Station (MS), when the BS determines to restart due to a serious error of the BS, and performing a restart procedure, and broadcasting a second pattern indicating network entry to the MS, upon completion of the restart procedure. | 01-20-2011 |
20110022884 | DEFENSE COMMUNICATION MODE FOR AN APPARATUS ABLE TO COMMUNICATE BY MEANS OF VARIOUS COMMUNICATION SERVICES - An appliance communicates via a communication network via various communication services available for transmitting data via said communication network, said appliance comprising means of: detecting an anomaly in a communication that is established with said appliance via one of said communication services, implementing a defense communication mode, wherein the communications to be established with said appliance via a communication service for which a detection has occurred are inhibited, the communications to be established via another communication service being allowed. | 01-27-2011 |
20110029806 | METHOD, DEVICE AND COMMUNICATION SYSTEM TO AVOID LOOPS IN AN ETHERNET RING SYSTEM WITH AN UNDERLAYING 802.3ad NETWORK - A method is provided to be run in a network, the network comprising several network elements that are connected via a ring, wherein one of the network elements is a ring master comprising a primary port and a secondary port. The method comprises the steps of (i) a failure is detected by the ring master; and (ii) the ring master checks for a second message and based on the content of the second message unblocks the secondary port. Also an associated device as well as a communication system comprising such device are provided. | 02-03-2011 |
20110047407 | POWER AND DATA REDUNDANCY IN A SINGLE WIRING CLOSET - Redundancy of data and/or Inline Power in a wired data telecommunications network from a first network device and a second network device configured as power sourcing equipment (PSE) devices and coupled together and to a third network device (such as a PD) via a Y device is provided by providing redundant signaling to/from each of the pair of network devices, and coupling a port of each of the network devices to the Y device and from there to a third port where a third network device such as a PD may be coupled. Because the Y device is essentially passive, communications paths between the PSE devices and the PD are provided for negotiating master/slave status and other status and related information among the respective network devices. Dynamic impedance matching is provided to handle situations where not all devices are plugged in and as a communications technique among the devices. | 02-24-2011 |
20110083034 | RESTORING DATA TO A DISTRIBUTED STORAGE NODE - A method is disclosed for operating a data storage system having one or more network interfaces and a plurality of data storage nodes configured to provide redundant storage locations. The method includes storing a set of node partitions on a given storage node of the plurality of data storage nodes. The method also includes, following a recovery by the given storage node from a malfunction, making a determination for a node partition in the set whether the node partition is current or noncurrent, and processing the node partition according to the determination. | 04-07-2011 |
20110083035 | CELL DEPENDENT MULTI-GROUP HYBRID AUTOMATIC REPEAT REQUEST METHOD FOR MULTICAST IN WIRELESS NETWORKS - A method and apparatus are described including determining address using an access point address and a multicast group address, transmitting a recovery request message to a recovery server to request recovery data using the address and receiving the recovery data from the recovery server. Also described are a method and apparatus including receiving a registration message, transmitting a reply to the registration message, receiving a recovery request message, transmitting recovery data responsive to the recovery request message and transmitting a message to a recovery multicast group to determine status of the recovery multicast group. | 04-07-2011 |
20110099413 | SYSTEM AND METHOD FOR LOCOMOTIVE INTER-CONSIST EQUIPMENT SPARING AND REDUNDANCY - In a system and method for communicating data in a locomotive consist or other vehicle consist (comprising at least first and second linked vehicles), a first electronic component in the first vehicle of the vehicle consist is monitored to determine if the component is in (or enters) a failure state. In the failure state, the first electronic component is unable to perform a designated function. Upon determining the failure state, data is transmitted from the first vehicle to a second electronic component on the second vehicle, over a communication channel linking the first vehicle and the second vehicle. The second electronic component is operated based on the transmitted data, with the second electronic component performing the designated function that the first electronic component is unable to perform. | 04-28-2011 |
20110113278 | TUNNEL MANAGEMENT METHOD, TUNNEL MANAGEMENT APPARATUS, AND COMMUNICATIONS SYSTEM - The present invention relates to communications technologies and discloses a tunnel management method, a tunnel management apparatus, and a communications system so that a node that causes failure of a tunnel management request can be determined. According to the present invention, a response returned by a tunnel management node to an initiating node includes not only a cause value of tunnel management request failure but also information of the node that causes failure of the tunnel management request, so that the initiating node can find the node that causes failure of the tunnel management request and determine the error checking direction. The present invention is applicable to network devices in a communications network. | 05-12-2011 |
20110154099 | METHOD AND SYSTEM FOR MASKING DEFECTS WITHIN A NETWORK - A method and system for masking defects within a network are disclosed. In accordance with an embodiment of the present invention, a method for masking defects within a network comprises detecting by a service entity defects within a network. The method further comprises determining a number of detected defects associated with a network component included in the network. The method further comprises generating by the network component a summary alarm if the number of detected defects within the network is greater than a first threshold. | 06-23-2011 |
20110154100 | APPARATUS AND METHOD OF PERFORMING ERROR RECOVERING PROCESS IN ASYMMETRIC CLUSTERING FILE SYSTEM - The present invention relates to an apparatus and method of performing an error recovery process in an asymmetric clustering file system that has higher efficiency in data recovery when a data server in an asymmetric clustering file system fails than a method of processing the data recovery in the metadata server. The present invention includes receiving a chunk list requiring recovery by a data server included in the other data server groups than a data server group including a failed data server among a plurality of data server groups, requesting chunk data necessary for recovering an erroneous chunk from a data server in the other data server groups to the other data servers than the failed data server in the data server group, and recovering the erroneous chunk based on the chunk data by the data server in the other data server group. | 06-23-2011 |
20110173487 | METHOD AND APPARATUS FOR SEAMLESS MANAGEMENT FOR DISASTER RECOVERY - A method, apparatus, article of manufacture, and system are presented for establishing redundant computer resources. According to one embodiment, in a system including a plurality of processor devices and a plurality of storage devices, the processor devices, the storage devices and the management server being connected via a network, the method comprises storing device information relating to the processor devices and the storage devices and topology information relating to topology of the network, identifying at least one primary computer resource, selecting at least one secondary computer resource suitable to serve as a redundant resource corresponding to the at least one primary computer resource based on the device information and the topology information, and assigning the at least one secondary computer resource as a redundant resource corresponding to the at least one primary computer resource. | 07-14-2011 |
20110173488 | NON-VOLATILE MEMORY FOR CHECKPOINT STORAGE - A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card. | 07-14-2011 |
20110239039 | Cloud computing enabled robust initialization and recovery of IT services - A system and a method for provisioning of Information Technology (IT) services to a plurality of computers is provided. The system includes a network and transport device and local IT resources. The network and transport device has internet connectivity via a controlled switching interface. One or more of the computers are coupled to the network and transport device via the controlled switching interface. The local IT resources are also coupled to the one or more computers and include data storage and processing capability for providing IT services to the computers including server-based applications for utilization and operation by the computers. In addition, the local IT resources include a network and transport virtual machine generated as a virtual machine equivalent of the network and transport device and coupled to the controlled switching interface of the network and transport device for communication with the network and transport device. | 09-29-2011 |
20110246814 | Facilitating Persistence Of Routing States - In certain embodiments, replicating data elements includes calculating a key value for a data element. The key value is calculated from at least a part of content of the first data element. K computing elements are automatically selected from X computing element nodes according to the key value and a mapping schema. K is greater than 2 and less than X. The computing element nodes each include computer-readable memory embodied within one or more routers. K replications of the data element are automatically written to the computer-readable memory of the K computing element nodes. | 10-06-2011 |
20110252270 | UPDATING A LIST OF QUORUM DISKS - A node in a server cluster is designated as a quorum disk. The node stores a list of other nodes in the server cluster also designated as quorum disks. The node can replace the first list with a second and more recent list of quorum disks only if the second list is updated on at least a simple majority of quorum disks on the first list. | 10-13-2011 |
20110252271 | Monitoring of Highly Available Virtual Machines - A host controller is coupled to host computers that host virtual machines. At least one of the virtual machines is a highly available virtual machine. The host controller detects a change in system resources and identifies a highly available virtual machine that failed before the change occurs. The host controller re-runs the highly available virtual machine upon detection of the change of the system resources. | 10-13-2011 |
20110258481 | Deploying A Virtual Machine For Disaster Recovery In A Cloud Computing Environment - Deploying a virtual machine in a cloud computing environment, the cloud computing environment including one or more virtual machines (‘VMs’), the VMs being modules of automated computing machinery installed upon cloud computers disposed within data centers, the cloud computing environment also including a cloud operating system and data center administration servers operably coupled to the VMs, including deploying, by the cloud operating system in a local data center, a VM, including flagging the VM for disaster recovery; storing, by the cloud operating system in computer memory in a remote data center, a copy of the flagged VM; and configuring, by the cloud operating system, the remote data center to replace data processing operations of the flagged VM in the local data center with data processing operations of the copy in the remote data center when the flagged VM in the local data center is lost through disaster. | 10-20-2011 |
20110258482 | Memory Management and Recovery for Datacenters - A system including a plurality of servers, a client, and a metadata server is described herein. The servers each store tracts of data, a plurality of the tracts comprising a byte sequence and being distributed among the plurality of servers. To locate the tracts, the metadata server generates a table that is used by the client to identify servers associated with the tracts, enabling the client to provide requests to the servers. The metadata server also enables recovery in the event of a server failure. Further, the servers construct tables of tract identifiers and locations to use in responding to the client requests. | 10-20-2011 |
20110289342 | METHOD FOR THE FILE SYSTEM OF FIGURE 7 FOR THE CLUSTER - In general, an appliance that simplifies the creation of a cluster in a computing environment has a fairly straightforward user interface that abstracts out many of the complexities of the typical configuration processes, thereby significantly simplifying the deployment process. By using such appliance, system administrators can deploy an almost turn-key cluster and have the confidence of knowing that the cluster is well tuned for the application/environment that it supports. In addition, the present disclosure allows for configurations and integrations of specialty engines, such as Q processors or J processors, into the cluster. The disclosure provides systems and methods for configuring a cluster, managing a cluster, managing an MQ in a cluster, a user interface for configuring and managing the cluster, an architecture for using specialty engines in a cluster configuration, and interconnect between cluster components, and a file system for use in a cluster. | 11-24-2011 |
20110289343 | Managing the Cluster - In general, an appliance that simplifies the creation of a cluster in a computing environment has a fairly straightforward user interface that abstracts out many of the complexities of the typical configuration processes, thereby significantly simplifying the deployment process. By using such appliance, system administrators can deploy an almost turn-key cluster and have the confidence of knowing that the cluster is well tuned for the application/environment that it supports. In addition, the present disclosure allows for configurations and integrations of specialty engines, such as Q processors or J processors, into the cluster. The disclosure provides systems and methods for configuring a cluster, managing a cluster, managing an MQ in a cluster, a user interface for configuring and managing the cluster, an architecture for using specialty engines in a cluster configuration, and interconnect between cluster components, and a file system for use in a cluster. | 11-24-2011 |
20110320858 | MONITORING SOFTWARE THREAD EXECUTION - The invention is directed to monitoring execution of software threads, particularly by detecting a lockup or stall in execution of a software thread and initiating a remedial action in response. Advantageously, some embodiments of the invention automatically detect a lockup or stall in execution of a software thread by periodically sampling information corresponding to the thread, and, in accordance with a determination made using the information, initiate an attempt to recover from such a condition in execution without the need for manual intervention. | 12-29-2011 |
20120017111 | KERNEL SWAPPING SYSTEMS AND METHODS FOR RECOVERING A NETWORK DEVICE - In certain embodiments, a method is disclosed for recovering a failed client device in a network. The method includes booting a failed one of a plurality of client devices in the network with a generic image having a generic kernel usable with each of the plurality of client devices. The method further includes downloading, using said generic kernel, from at least one backup server an abbreviated kernel uniquely associated with the failed client device, the abbreviated kernel comprising substantially less data than an original kernel of the failed client device immediately prior to failure of the failed client device, the abbreviated kernel comprising a boot kernel image and at least one device driver. The method includes swapping the abbreviated kernel with the generic kernel; restoring, using said abbreviated kernel, remaining backup data from the at least one backup server to the failed client device; and rebooting the failed client device. | 01-19-2012 |
20120030502 | TENANT RESCUE FOR SOFTWARE CHANGE PROCESSES IN MULTI-TENANT ARCHITECTURES - A multi-tenant system can be switched to a downtime state to implement a transition from a current state to a target state of a core software platform. During a second phase of the transition an error associated with tenant-specific content of a first customer tenant of the plurality of customer tenants of the multi-tenant system can be identified. The second phase can be suspended for the first customer tenant while continuing the second phase for a remainder of the plurality of customer tenants for which an error has not been identified. After a scheduled duration of the downtime state, the multi-tenant system can be reactivated such that the multi-tenant system includes the remainder of the plurality of customer tenants with the transition implemented and the first customer tenant either with the transition implemented if the error has been corrected or without the transition implemented if the error has not been corrected. | 02-02-2012 |
20120096303 | DETECTING AND RECOVERING FROM PROCESS FAILURES - A service is used to process files. The processing of the files is performed by worker services that are assigned to process a portion of the files. Each worker service that is processing a portion of the files is assigned a unique identifier. Using the identifier information, the set of worker services currently active are monitored along with the work assigned to each process. When a worker server determines that a worker service has failed, the work assigned to the failed worker service can be automatically determined and a new worker service can be started to process that work. Any new worker service that is started is assigned a unique identifier, so the work assigned to it can be similarly tracked. | 04-19-2012 |
20120151245 | IN-FLIGHT BLOCK MAP FOR A CLUSTERED REDIRECT-ON-WRITE FILESYSTEM - A cluster server manages allocation of free blocks to cluster clients performing writes in a clustered file system. The cluster server manages free block allocation with a free block map and an in-flight block map. The free block map is a data structure or hardware structure with data that indicates blocks or extents of the clustered file system that can be allocated to a client for the client to write data. The in-flight block map is a data structure or hardware structure with data that indicates blocks that have been allocated to clients, but remain in-flight. A block remains in-flight until the clustered file system metadata has been updated to reflect a write performed to that block by a client. After a consistency snapshot of the metadata is published to the storage resources, the data at the block will be visible to other nodes of the cluster. | 06-14-2012 |
20120151246 | COMMUNICATION INTERFACE APPARATUS, TRANSMISSION CONTROL METHOD, AND CONNECTION SHUTDOWN CONTROL METHOD - A communication interface apparatus | 06-14-2012 |
20120198268 | RE-ESTABLISHING PUSH NOTIFICATION CHANNELS VIA USER IDENTIFIERS - Embodiments enable recovery of push notification channels via session information associated with user identifiers. A proxy service creates session information describing push notification channels (e.g., subscriptions) for a user and associates the session information with a user identifier. The session information is stored in a cloud service or other storage area separate from the proxy service. After failure of a user computing device or the proxy service, the session information is obtained via the user identifiers and the push notification channels are re-created with the session information. In some embodiments, the proxy service enables delivery of the same notification to multiple computing devices associated with the user identifier. | 08-02-2012 |
20120226931 | SERVER, A METHOD, A SYSTEM AND A PROGRAM THEREOF - A server includes a monitor unit configured to monitor a failure of one or more user networks and a recovery support unit configured to support a recovery of the failure of one or more user networks. When detecting an alarm showing occurrence of the failure, the server identifies a user network in which the failure occurs based on both line information for each user network and the alarm, and notifies the alarm to the plurality of management terminals of the identified user network. | 09-06-2012 |
20120233492 | Transmitting network information using link or port aggregation protocols - In one embodiment, a method includes receiving, at a network device, a packet from a component in a virtual network device, the packet transmitted across a link aggregation bundle connecting the virtual network device to the network device and indicating if the component is a master component in the virtual network device, and determining if an error exists in operation of the component as the master component or a slave component. An apparatus for assigning services to physical links in an aggregated link bundle is also disclosed. | 09-13-2012 |
20120233493 | PORTABLE DEVICE AND BACKUP METHOD THEREOF - An embodiment of the invention provides a backup method for a portable device to back up a first data to a backup server. The backup method includes steps of determining whether the backup server can be accessed; when the backup server can be accessed, establishing a first data transmission path that the first data would be backed up to the backup server via a third party, a second data transmission path that the first data would be backed up to the backup server via a router, and a third data transmission path that the first data would be directly backed up to the backup server; selecting one data transmission path among the first, second and third data transmission paths; and backing up the first data via the selected data transmission path. | 09-13-2012 |
20120233494 | STORAGE ARRAY NETWORK PATH IMPACT ANALYSIS SERVER FOR PATH SELECTION IN A HOST-BASED I/O MULTI-PATH SYSTEM - Systems and methods are provided for selecting a path for an I/O in a storage area network. In one embodiment, a method comprises receiving path configuration information for paths associated with a host device connected to the storage area network, a listing of components within the storage area network, and a notification of a component failure within the storage area network. The method may also comprise correlating the received path configuration information, the received listing of components, and the received notification of component failure to determine one or more paths associated with the host device affected by the component failure. The method may further comprise transmitting to the host device an alert for the one or more affected paths. | 09-13-2012 |
20120239965 | Method and Device for Link Protection in Virtual Private Local Area Network - The present invention discloses a method and device for a link protection in a virtual private local area network, which relates to the network data communication technology. The method of the present invention includes: in a networking process of a VPLS network, a link protection device establishing a main tunnel and a standby tunnel of MPLS TE for a link, and creating a VPLS forwarding table to deal with the information of the established MPLS TE main tunnel and standby tunnel; and when receiving a VPLS message, the link protection device searching the information of the MPLS TE main tunnel of the VPLS message according to a way of accessing the VPLS network of the VPLS message and the VPLS forwarding table, and if the found MPLS TE main tunnel is invalid, then transmitting the received VPLS message by adopting the standby tunnel of the MPLS TE main tunnel. | 09-20-2012 |
20120254651 | ERROR HANDLING IN A PASSIVE OPTICAL NETWORK - When data containing an unrecoverable error is received, instead of discarding the data, a lower protocol layer delivers the data to a higher protocol layer along with an indication that the data contains an error. The higher protocol layer parses the data to recover portions that are not affected by the error. Additionally, the higher protocol layer can choose to accept the data if certain acceptance criteria are met. | 10-04-2012 |
20120254652 | FAULT DETECTION AND RECOVERY AS A SERVICE - The monitoring by a monitoring node of a process performed by a monitored node is often devised as a tightly coupled interaction, but such coupling may reduce the re-use of monitoring resources and processes and increase the administrative complexity of the monitoring scenario. Instead, fault detection and recovery may be designed as a non-proprietary service, wherein a set of monitored nodes, together performing a set of processes, may register for monitoring by a set of monitoring nodes. In the event of a failure of a process, or of an entire monitored node, the monitoring nodes may collaborate to initiate a restart of the processes on the same or a substitute monitored node (possibly in the state last reported by the respective processes). Additionally, failure of a monitoring node may be detected, and all monitored nodes assigned to the failed monitoring node may be reassigned to a substitute monitoring node. | 10-04-2012 |
20120260123 | DECOUPLED APPLICATION PROGRAM-OPERATING SYSTEM COMPUTING ARCHITECTURE - A method of application program-operating system decoupling includes performing, through an application program configured to execute on a client machine, a system call to a first operating system executing on a server machine over an interconnect configured to couple the server machine to the client machine. The method also includes serving the application program configured to execute on the client machine through the first operating system executing on the server machine in accordance with the system call. | 10-11-2012 |
20120284555 | OPTIMIZING DISASTER RECOVERY SYSTEMS DURING TAKEOVER OPERATIONS - Exemplary method, system, and computer program product embodiments for optimizing disaster recovery systems during takeover operations are provided. In one embodiment, by way of example only, a flag is set in a replication grid manager to identify replication grid members to consult in a reconciliation process for resolving intersecting and non-intersecting data amongst the disaster recovery systems for a takeover operation. The replication grid members are consulted for the takeover operation to accommodate a coordination of an ownership synchronization process for cartridges not distributed on-time to the replication grid members. Additional system and computer program product embodiments are disclosed and provide related advantages. | 11-08-2012 |
20120290868 | ASSIGNING A DISPERSED STORAGE NETWORK ADDRESS RANGE IN A MAINTENANCE FREE STORAGE CONTAINER - A method begins by a dispersed storage (DS) processing module determining storage device failure information for a plurality of storage devices within a maintenance free storage container, wherein the maintenance free storage container allows for multiple storage devices of the plurality of storage devices to be in a failure mode without replacement and wherein the storage device failure information indicates storage devices of the plurality of storage devices that are in the failure mode. The method continues with the DS processing module maintaining a dynamic container address space of the maintenance free storage container based on the storage device failure information. The method continues with the DS processing module managing mapping of container addresses of the dynamic container address space to dispersed storage network (DSN) addresses of an assigned DSN address range. | 11-15-2012 |
20120297237 | DATA ACCESS LAYER - An improved data access layer (DAL) architecture enables database connection pooling or multiplexing across machine boundaries. Drivers installed at web servers communicate with servers in a DAL. The DAL servers present a virtual database to the web servers, and the DAL servers in turn open connections to a set of physical databases. DAL servers are able to recycle connections that are no longer needed, or to move available connections from one DAL server to another, so as to provide improved efficiency in connection management, burst management, and peak load management. Scalability is thereby improved, and more efficient use of system resources is facilitated. | 11-22-2012 |
20130007503 | CONTINUOUS WORKLOAD AVAILABILITY BETWEEN SITES AT UNLIMITED DISTANCES - Continuous workload availability between sites at unlimited distances, which includes receiving a unit of work data. Once the unit of work data has been received the workload that the unit of work data is directed to is determined, and a primary site of a plurality of sites to process the unit of work is chosen. If the processing of the unit of work data is successful, then one of one or more processing systems of the primary site are selected to process the unit of work data, and the unit of work data is replicated to at least one other site. The primary site is separated from each of the plurality of sites by a distance greater than a metropolitan area network (MAN) and operations occur within a customer acceptability window. | 01-03-2013 |
20130031402 | MODIFYING DISPERSED STORAGE NETWORK EVENT RECORDS - A method begins by a dispersed storage (DS) processing module identifying a performance anomaly within a dispersed storage network (DSN). The method continues with the DS processing module identifying a set of collections of records corresponding to the performance anomaly, wherein one of the set of collections of records includes an event record including information regarding an event, a first record including information regarding a dispersed storage (DS) processing module processing an event request to produce a plurality of sub-event requests, and a plurality of records including information regarding a set of DS units processing the plurality of sub-event requests. The method continues with the DS processing module determining whether a reliable significance indication of the performance anomaly is determinable, and when the reliable significance indication of the performance anomaly is not determinable, modifying data collection criteria for one or more of the set of collections of records. | 01-31-2013 |
20130036322 | HARDWARE FAILURE MITIGATION - Various exemplary embodiments relate to a method and related network node including one or more of the following: detecting, by a resource allocation device, a failure of server hardware; identifying a first agent device that is configured to utilize the server hardware; and taking at least one action to effect a reconfiguration of the first agent device in response to the server hardware failure. Various embodiments additionally include one or more of the following: identifying a second agent device that is configured to utilize the server hardware; and taking at least one action to effect a reconfiguration of the second agent device in response to the server hardware failure. Various embodiments additionally include one or more of the following: receiving, by the resource allocation device from a second agent device, an indication of the failure of server hardware, wherein the second agent device is different from the first agent device. | 02-07-2013 |
20130055009 | System and Method for Providing Reliable Storage - A system and method for providing reliable storage are provided. A method for initiator operations includes storing information associated with an access attempt in a store, and accessing a storage system responsive to the access attempt, wherein the storage system includes a first storage node and a second storage node arranged in a sequential loop, and where the first storage node is accessed by an initiator. The method also includes determining if the access attempt completed successfully, deleting the information from the store if the access attempt completed successfully, and indicating an error if the access attempt did not complete successfully. | 02-28-2013 |
20130067266 | FAULT-BASED UNIT REPLACEMENT - A method and system are disclosed for determining the end of life condition for a replaceable unit, which may be associated with a multifunction device, and have a useful life expectancy based upon a number of operations. A counter associated with the multifunction device maintains a running count of operations for the unit. Once the number of performed operations of the replaceable unit reaches a given percentage of the expected life of that unit, an analysis is performed on a running history of the fault codes. Once the number of fault codes attributable to any given replaceable unit meets or exceeds a predetermined level, such as a percentage of the running history, a notification is generated indicating a need to replace the replaceable unit. | 03-14-2013 |
20130073892 | METHODS AND APPARATUS FOR REMEDIATION EXECUTION - Disclosed herein are methods, systems, and articles associated with remediation execution. In embodiments, a set of policy test failures may be selected for remediation. The set of policy test failures may be associated with a computer network with a number of nodes. For each failure within the set of policy test failures, a remediation script may be obtained to remediate a corresponding policy test failure. The remediation scripts may be selectively provided to nodes that are affected by policy test failures, for execution by the nodes. A remediation script result for each remediation script executed may be received. Based upon the remediation script results, it may be determined whether or not execution of the remediation scripts was successful. | 03-21-2013 |
20130073893 | METHODS AND APPARATUS FOR REMEDIATION WORKFLOW - Disclosed herein are methods, systems, and articles associated with remediation workflow. A method may include determining one or more test failures related to a policy test within a computer network, and reviewing the one or more test failures. The method may further include, based upon a result of the reviewing, creating a remediation work order that includes at least one of the one or more test failures. Each test failure within the remediation work order may be approved or denied. For each test failure that is approved for remediation, a remediation process may be executed. | 03-21-2013 |
20130080823 | SYSTEM AND METHOD FOR DISASTER RECOVERY - A disaster recovery system can include a plurality of resources arranged in a cloud computing environment. Each of the resources can be assignable to function within the cloud computing environment as part of one or more media systems. A content intake service can be programmed to control delivery of an incoming media asset to the cloud computing environment. A monitoring and recovery process can be programmed to monitor a primary media system to which the incoming media asset is being provided and, in response to detecting a disaster recovery condition, the monitoring and recovery process can intelligently manage selected resources of the plurality of resources based on the incoming media asset being delivered to the primary media system. | 03-28-2013 |
20130086413 | FAST I/O FAILURE DETECTION AND CLUSTER WIDE FAILOVER - A method for fast I/O path failure detection and cluster wide failover. The method includes accessing a distributed computer system having a cluster including a plurality of nodes, and experiencing an I/O path failure for a storage device. An I/O failure message is generated in response to the I/O path failure. A cluster wide I/O failure message is broadcast to the plurality of nodes that designates a faulted controller. Upon receiving I/O failure responses from the plurality of nodes, an I/O queue message is broadcast to the nodes to cause the nodes to queue I/O through the faulted controller and switch to an alternate controller. Upon receiving I/O queue responses from the plurality of nodes, an I/O failover commit message is broadcast to the nodes to cause the nodes to commit to a failure and un-queue their I/O. | 04-04-2013 |
20130111258 | SIDEBAND ERROR SIGNALING | 05-02-2013 |
20130124909 | SELECTIVE MESSAGE LOSS HANDLING IN A CLUSTER OF REPLICATED SERVERS - A computer-implemented method, a computerized system and a product for providing a cluster of replicated servers. The method performed by a computerized server in a cluster of servers, wherein the cluster of servers are executing replicated instances of an application, wherein the replicated instances are configured to perform the same processing of the same input, comprising: detecting a message loss in the server; selectively determining a responsive action to the message loss; and notifying the cluster of servers of the responsive action determined by the server, whereby other servers of the cluster of servers are able to mimic operation of the server by simulating the responsive action. | 05-16-2013 |
20130124910 | SYSTEM AND METHOD FOR SIGNALING DYNAMIC RECONFIGURATION EVENTS IN A MIDDLEWARE MACHINE ENVIRONMENT - A system and method can provide fault tolerance in a middleware machine environment. A subnet manager can determine whether there is a path record change when a fault occurs in the middleware machine environment. Furthermore, the subnet manager can signal a dynamic reconfiguration event to at least one host in the middleware machine environment. The at least one host can send a message to the subnet manager to query for a latest path record. Then, the subnet manager can provide a latest path record to the at least one host. | 05-16-2013 |
20130145205 | Visual Outage Management Wizard Plug-In - Described herein are systems and methods related to a plug-in for a visual tool as a real-time knowledge transfer agent between network outages. One embodiment relates to a method comprising retrieving current outage data related to a current operation of a network, retrieving historical outage data related to a prior operation of the network, correlating the current outage data with the historical outage data, and constructing a resolution process plan for repairing a current outage based on correlations between the current outage data and the historical outage data. | 06-06-2013 |
20130173953 | METHOD AND APPARATUS FOR RESTORING A CONNECTION THROUGH A PROVIDER NETWORK UPON REQUEST - A method and related apparatus are provided for restoring a connection through a provider network (PN). In particular, a connection is established along a path (P | 07-04-2013 |
20130212422 | Method And Apparatus For Rapid Disaster Recovery Preparation In A Cloud Network - Various embodiments provide a method and apparatus of providing a rapid disaster recovery preparation in cloud networks that proactively detects disaster events and rapidly allocates cloud resources. Rapid disaster recovery preparation may shorten the recovery time objective (RTO) by proactively growing capacity on the recovery application(s)/resource(s) before the surge of recovery traffic hits the recovery application(s)/resource(s). Furthermore, rapid disaster recovery preparation may shorten RTO by growing capacity more rapidly than during “normal operation” where the capacity is increased by modest growth after the load has exceeded a utilization threshold for a period of time. | 08-15-2013 |
20130232374 | REMOTELY SERVICING AND DIAGNOSING ELECTRONIC DEVICES - Remotely servicing and diagnosing a client device, including: establishing a persistent two-way connection between a server and the client device using a messaging and presence protocol; reading and analyzing statistics and settings of the client device when the persistent two-way connection has been established; detecting any problem with the client device from reading and analyzing statistics and settings of the client device; and addressing and fixing the problem with the client device. Keywords include persistent connection, and remote servicing and diagnosing. | 09-05-2013 |
20130232375 | TRANSITIONAL REPLACEMENT OF OPERATIONS PERFORMED BY A CENTRAL HUB - A central hub is coupled to a plurality of computational devices. The central hub stores a data structure that grants locks for accessing common data stored at the central hub, wherein the common data is shared by the plurality of computational devices. Each computational device maintains locally those locks that are held by the computational device in the data structure stored at the central hub. In response to a failure of the data structure stored at the central hub, a selected computational device of the plurality of computational devices is determined to be a manager system. Other computational devices besides the manager system communicate to the manager system all locks held by the other computational devices in the data structure stored at the central hub. The data structure and the common data are generated and stored at the manager system. Transactions are performed with respect to the data structure stored at the manager system, until the data structure stored at the central hub is operational. | 09-05-2013 |
20130283093 | DETECTING THE HEALTH OF AN OPERATING SYSTEM IN VIRTUALIZED AND NON-VIRTUALIZED ENVIRONMENTS - A remote management controller is provided for use in conjunction with a managed host computer. The remote management controller exposes a virtual network interface controller, such as a driverless virtual USB network interface controller, to the managed host computer. Through the in-band connection provided by the virtual network interface controller, the remote management controller can send a command to the host operating system or one or more guest operating systems executing in a virtualized environment. If no reply is received to the command, the remote management controller takes corrective action to restore the operation of the host operating system or the non-responsive guest operating systems. | 10-24-2013 |
20130290772 | SEQUENCE INDICATOR FOR COMMAND COMMUNICATED TO A SEQUENTIAL ACCESS STORAGE DEVICE - A command is communicated by a computer and received by a sequential storage access device. The command includes a sequence indicator. The sequential storage access device uses the sequence indicator, in a communication path failure recovery operation, to at least determine whether a command has been confirmed by the device driver as being processed by the sequential access storage device. | 10-31-2013 |
20130297965 | COMMUNICATIONS DEVICE - A diagnostic processor ( | 11-07-2013 |
20130305082 | RECOVERING INFORMATION - A first memory device receives session information associated with a session between a first network device and a user device. The first memory device outputs the session information associated with the session information. A second memory device receives the session information, associated with the session, from the first memory device. The second memory device receives a communication from the first memory device that the first network device is not functioning. The second memory device sends session information to a second network device, based on receiving the communication from the first network device that the first network device is not functioning, the second network device taking over the session from the first network device. | 11-14-2013 |
20130305083 | CLOUD SERVICE RECOVERY TIME PREDICTION SYSTEM, METHOD AND PROGRAM - A recovery schedule storing means ( | 11-14-2013 |
20130346787 | Tenant Rescue for Software Change Processes in Multi-Tenant Architectures - A multi-tenant system can be switched to a downtime state to implement a transition from a current state to a target state of a core software platform. During a second phase of the transition an error associated with tenant-specific content of a first customer tenant of the plurality of customer tenants of the multi-tenant system can be identified. The second phase can be suspended for the first customer tenant while continuing the second phase for a remainder of the plurality of customer tenants for which an error has not been identified. After a scheduled duration of the downtime state, the multi-tenant system can be reactivated such that the multi-tenant system includes the remainder of the plurality of customer tenants with the transition implemented and the first customer tenant either with the transition implemented if the error has been corrected or without the transition implemented if the error has not been corrected. | 12-26-2013 |
20140006843 | METHOD AND APPARATUS FOR MANAGING CONNECTION PATH FAILURE BETWEEN DATA CENTERS FOR CLOUD COMPUTING | 01-02-2014 |
20140006844 | METHOD AND SYSTEM FOR AUTOMATICALLY DETECTING AND RESOLVING INFRASTRUCTURE FAULTS IN CLOUD INFRASTRUCTURE | 01-02-2014 |
20140006845 | METHOD AND SYSTEM FOR MANAGING A COMMUNICATION NETWORK | 01-02-2014 |
20140013152 | SYSTEM FOR INJECTING PROTOCOL SPECIFIC ERRORS DURING THE CERTIFICATION OF COMPONENTS IN A STORAGE AREA NETWORK - An apparatus comprising an initiator circuit and a target circuit. The initiator circuit may be configured to (i) communicate with a network through a first interface and (ii) generate testing sequences to be sent to the network. The target circuit may be configured to (i) receive the testing sequences from the network through a second network interface and (ii) respond to the testing sequences. | 01-09-2014 |
20140019797 | RESOURCE MANAGEMENT IN EPHEMERAL ENVIRONMENTS - A system executes a method that includes monitoring the systems that provide the provisioning of computing infrastructure resources to provide services for services executing on the resources, to monitor provisioning changes on the infrastructure resources. It further uses the information gathered by the monitoring of the provisioning resource(s) to provide a tag, which denotes the current assignment of each resource with a service name and an instance of the service associated with the resource identifier and a time stamp, and storing this in a relational database, as a part of the record of the state, or performance data for that resource during that time interval. | 01-16-2014 |
20140025984 | AUTOMATING INFRASTRUCTURE WORKFLOWS AS ATOMIC TRANSACTIONS - Information Technology (IT) system configuration is managed using a set of defined flows with atomic execution properties. The instructions to execute a change to one or more infrastructure elements (a “forward transaction”) are maintained with instructions and/or information needed to execute a corresponding “reverse” transaction that is responsible for returning the element(s) to a pre-transaction action state in the event of a configuration failure or other request originating at a high level flow. | 01-23-2014 |
20140032958 | CLUSTERED FILESYSTEMS FOR MIX OF TRUSTED AND UNTRUSTED NODES - A cluster of computer system nodes share direct read/write access to storage devices via a storage area network using a cluster filesystem. At least one trusted metadata server assigns a mandatory access control label as an extended attribute of each filesystem object regardless of whether required by a client node accessing the filesystem object. The mandatory access control label indicates the sensitivity and integrity of the filesystem object and is used by the trusted metadata server(s) to control access to the filesystem object by all client nodes. | 01-30-2014 |
20140040657 | Method for Transmitting Messages in a Redundantly Operable Industrial Communication Network and Communication Device for the Redundantly Operable Industrial Communication Network - A method for transmitting messages in a redundantly operable communication network includes a first subnetwork with a tree topology and a second subnetwork, wherein messages are transmitted in the first subnetwork in accordance with a spanning tree protocol, communication devices associated with network nodes of the first subnetwork interchange messages containing topology information with one another in order to form a tree topology, messages are transmitted in the second subnetwork in accordance with a parallel or ring redundancy protocol, and a virtual network node which is connected to all network nodes of the second subnetwork via a respective virtual connection which is uninterruptable by an error is configured as the root network node of the first subnetwork. | 02-06-2014 |
20140053013 | HANDLING INTERMITTENT RECURRING ERRORS IN A NETWORK - Embodiments relate to a computer for transmitting data in a network. The computer includes at least one data transmission port configured to be connected to at least one storage device via a plurality of paths of a network. The computer further includes a processor configured to detect recurring intermittent errors in one or more paths of the plurality of paths and to disable access to the one or more paths based on detecting the recurring intermittent errors. | 02-20-2014 |
20140059375 | RECOVERY SYSTEM AND METHOD FOR RECREATING A STATE OF A DATACENTER - Embodiments include a recovery system, a computer-readable storage medium, and a method of recreating a state of a datacenter. The embodiments include a plurality of program modules that is executable by a processor to gather metadata from a first datacenter that includes at least one virtual machine (VM), wherein the metadata includes data representative of a virtual infrastructure of the first datacenter. The program modules are also executable by the processor to recreate a state of the first datacenter within a second datacenter using the metadata upon a determination that a failure occurred within the first datacenter, and to recreate the VM within the second datacenter. | 02-27-2014 |
20140075239 | FAILURE HANDLING IN THE EXECUTION FLOW OF PROVISIONING OPERATIONS IN A CLOUD ENVIRONMENT - A method for handling failures in the execution flow of provisioning operations is disclosed. The method may comprise receiving, by a cloud infrastructure system, an error from the execution flow of provisioning a service from a plurality of services provided by the cloud infrastructure system, the cloud infrastructure system comprising one or more computing devices. Additionally, the method may further comprise determining, by a computing device from the one or more computing devices, a specific service associated with the error and determining an error classification type associated with the error based on the specific service. Subsequently, the method may further comprise performing, by the computing device, a corrective action based on the specific service and the error type. | 03-13-2014 |
20140189418 | EXPANDER TO CONTROL MULTIPATHS IN A STORAGE NETWORK - An identification request is received from a host device. A virtual address that identifies at least the first device and a second device is determined. The first device and the second device are coupled together to form a plurality of redundant paths between the host device and a target device. A plurality of target ports associated with the target device are determined. A single virtual target port address is assigned to the plurality of target ports associated with the target device. A first of the plurality of redundant paths between the host device and the target device is designated as an active path. The first of the plurality of redundant paths between the host device and the target device is associated with a first of the plurality of target ports. The virtual address and the virtual target port address are transmitted to the host device. | 07-03-2014 |
20140208152 | RELAY NODE, CONTROL METHOD OF RELAY NODE AND NETWORK SYSTEM - A first network device includes: a forwarding controller configured to forward received data; and a fault detector configured to detect occurrence of a failure in a remote second relay node. The forwarding controller includes: a forwarding unit configured to forward the received data; and a modifier configured to modify the received data for detection of the occurrence of a failure in the second relay node. The modifier includes (i) a flag option marker configured to attach a flag to data; (ii) a sequence adding unit configured to add a protocol specific number to a sequence number; and (iii) a sequence subtracting unit configured to subtract the protocol specific number from an acknowledgement number. The fault detector detects the occurrence of a failure in the remote second relay node, based on at least one of the flag and the acknowledgement number. | 07-24-2014 |
20140281667 | Fault Detection Method, Gateway, User Equipment, and Communications System - Embodiments of the present invention provide a fault detection method, including discovering that a fault occurs in a DNS server or a service server related to a UE; performing, by a gateway, fault detection on the DNS server or the service server; and, after the fault is rectified, instructing the UE to establish a connection to the DNS server or the service server. Correspondingly, the embodiments of the present invention further provide a gateway, a UE, and a communications system, thereby avoiding frequent air interface release and connections, and frequent bearer deactivation and activation, which reduces the signaling overhead of the system, and enhances stability of a mobile network. | 09-18-2014 |
20140298078 | SYNCHRONOUS MIRRORING OF NVLog TO MULTIPLE DESTINATIONS (ARCHITECTURE LEVEL) - Systems and methods herein are operable to simultaneously mirror data to a plurality of mirror partner nodes. In embodiments, a mirror client may be unaware of the number of mirror partner nodes and/or the location of the plurality of mirror partner nodes, and issue a single mirror command requesting initiation of a mirror operation. An interconnect layer may receive the single mirror command and split the mirror command into a plurality of mirror instances, one for each mirror node partner, wherein the mirror instances may be simultaneously launched. After the plurality of mirror operations has begun, the interconnect layer may manage completion reports indicating the completion status of respective mirror operations, and send a single return to the mirror client indicating whether the mirror command succeeded. | 10-02-2014 |
20140310554 | SYSTEM AND METHOD FOR GRAPH BASED K-REDUNDANT RESILIENCY FOR IT CLOUD - A method for enabling resiliency for cloud computing systems is described. The method includes modifying a topology graph of a network architecture by mapping process flows onto the topology graph. A resiliency graph is created based on the modified topology graph. The method includes modifying the resiliency graph by translating at least one SLA into the resiliency graph. Overlaps and dependencies in the modified resiliency graph are identified. Apparatus and computer readable instructions are also described. | 10-16-2014 |
20140331077 | NODE FAILURE MANAGEMENT - A method and computer-readable storage media are provided for managing resources of a first node. The method may include detecting a failure in a first node. The first node may include one or more cores and supporting resources. The method may further include determining that one or more cores in the first node survived the failure. The method may further include determining that any supporting resources survived the failure. The method may also include reconfiguring a second node to add the surviving supporting resources of the first node using a communication interface between the first and second nodes if the determinations found a surviving core and surviving supporting resource in the first node. | 11-06-2014 |
20140337661 | COOPERATIVE STORAGE SYSTEM UTILIZING DISPERSED STORAGE - A user device sends a data file, which has already been divided into multiple different data portions, to be stored in a dispersed storage network (DSN). The DSN includes multiple dispersed storage (DS) processing units, which in turn are each used to manage multiple DSN memories. Portion affiliation information, which can be stored in a data structure maintained by a DS managing unit, is used to determine which particular DS processing units will handle storage and retrieval of particular data portions. The data portions are encoded by the assigned DS processing unit into at least a write threshold number of encoded slices, and stored in the DSN memories in accordance with the portion affiliation information. | 11-13-2014 |
20140372788 | HYPERVISOR REMEDIAL ACTION FOR A VIRTUAL MACHINE IN RESPONSE TO AN ERROR MESSAGE FROM THE VIRTUAL MACHINE - Exemplary methods, apparatuses, and systems include a hypervisor receiving an error message from an agent within a first virtual machine run by the hypervisor. In response to the error message, the hypervisor determines and initiates a corrective action for the hypervisor to take in response to the error message. An exemplary corrective action includes initiating a reset of the first virtual machine or a reset of a second virtual machine. | 12-18-2014 |
20150082077 | TRACKING PACKETS THROUGH A CLOUD COMPUTING ENVIRONMENT - A device, of a cloud computing environment, receives an instruction to create a virtual packet tracker from a user device associated with a user, and implements the virtual packet tracker in the device based on the instruction. The virtual packet tracker: receives a packet that includes a unique value used to track the packet in a portion of the cloud computing environment associated with the user; provides the packet for routing through the portion; receives an indication that the packet is dropped at a particular resource of the portion; determines whether a problem causing the packet to be dropped can be corrected; and processes the problem based on whether the problem can be corrected. The problem is corrected when it is determined that the problem can be corrected. Information associated with the packet is transmitted to the user device when it is determined that the problem cannot be corrected. | 03-19-2015 |
20150089271 | MANAGEMENT DEVICE, DATA ACQUISITION METHOD, AND RECORDING MEDIUM - A management device includes a processor that executes a process. The process includes: saving a conversion table when an information processing apparatus that performs a memory access by the conversion table, in which an active absolute address that is used by the processor to specify data is associated with an active physical address that indicates a storage area in a memory that stores therein the data, has failed; creating a second conversion table in which a standby absolute address that is different from the active absolute address is associated with the active physical address used at the time of a failure and a standby physical address that is different from the active physical address used at the time of the failure is associated with the active absolute address; setting the second conversion table; and acquiring the data from the storage area that is indicated by the physical address. | 03-26-2015 |
20150089272 | MAINTAINING HIGH AVAILABILITY OF A GROUP OF VIRTUAL MACHINES USING HEARTBEAT MESSAGES - Embodiments maintain high availability of software application instances in a fault domain. Subordinate hosts are monitored by a master host. The subordinate hosts publish heartbeats via a network and datastores. Based at least in part on the published heartbeats, the master host determines the status of each subordinate host, distinguishing between subordinate hosts that are entirely inoperative and subordinate hosts that are operative but partitioned (e.g., unreachable via the network). The master host may restart software application instances, such as virtual machines, that are executed by inoperative subordinate hosts or that cease executing on partitioned subordinate hosts. | 03-26-2015 |
20150149812 | Self-Debugging Router Platform - Exemplary methods for network debugging include a control plane of a first network device generating and injecting debug traffic into a data plane of the first network device such that the debug traffic appears to the data plane as if it originated from an external network device. The methods include the data plane transmitting the debug traffic to a network. In one embodiment, the control plane collects debug information of the debug traffic as it is processed by the data plane and the network. In one embodiment, the first network device is configured to exchange debug information of the debug traffic with a second network device, and to provide the debug information to an operator. | 05-28-2015 |
20150347221 | Fractional Reserve High Availability Using Cloud Command Interception - An approach is provided to provide a high availability (HA) cloud environment. In the approach, an active cloud environment is established in one cloud computing environment using a primary set of resources and a passive cloud environment is established in another cloud computing environment, with the passive cloud environment using fewer resources than are used by the active cloud environment. A workload is serviced by the active cloud environment. While servicing the workload, cloud commands are processed that alter the primary set of resources and the commands are stored in a queue. When a failure of the active cloud environment occurs, the workload is serviced by the passive cloud environment in the second cloud computing environment and the cloud commands stored in the queue are used to alter the resources used by the passive cloud environment. | 12-03-2015 |
20150363281 | METHOD AND SYSTEM FOR AUTOMATICALLY DETECTING AND RESOLVING INFRASTRUCTURE FAULTS IN CLOUD INFRASTRUCTURE - Systems and methods are provided for any party in a cloud ecosystem (cloud providers of such resources, the intermediate management software for such resources, and the end user of such resources) to detect and resolve faulty resources synchronously or asynchronously, before said faults adversely affect the users' workloads. The system requests a service or set of one or more resources within a cloud, automatically checking the infrastructure for various faults that would cause it to be non-functional, including pre-defined and user-defined checks, and resolving them before including the infrastructure in the working service cluster of resources. The system presents an API to the user that returns only functional, production-quality resources that are not in a faulty state. An API that tests and resolves bad infrastructure can be registered during the request or a preceding/subsequent API call, removing the need for the end-user to deal with various types of infrastructure faults. | 12-17-2015 |
20150378851 | MONITORING METHOD, MONITORING DEVICE, AND INFORMATION PROCESSING SYSTEM - A monitoring method is executed by a monitoring device that monitors communication between an information processing device, from among a plurality of information processing devices, and a switching device that is coupled to a peripheral device that includes at least one of an input device and an output device. The monitoring method includes storing, in a memory, information on a recovery method for each process of the communication; detecting the communication between the information processing device and the switching device; determining whether a failure has occurred in the detected communication by analyzing the detected communication for each of the processes; and, when it is determined that the failure has occurred in the detected communication, executing restoration processing of recovering the detected communication, based on the information stored in the memory on the recovery method corresponding to a failed process among the processes. | 12-31-2015 |
20160021188 | Generic Network Trace with Distributed Parallel Processing and Smart Caching - A method and system of distributed parallel processing. There are a plurality of distributed parallel processing units (DPPUs). Each DPPU is configured to receive data related to a condition of the network. The type of data received by each DPPU is disparate for each DPPU. Each DPPU analyzes its data. Upon determining that a predetermined condition is met or a predetermined threshold is exceeded, the disparate data is transformed into a common format using an appropriate driver of the configuration module. The common format data is sent to a storage device of a first DPPU of the plurality of DPPUs. | 01-21-2016 |
20160034362 | Proactive Failure Recovery Model for Distributed Computing - This disclosure generally describes methods and systems, including computer-implemented methods, computer-program products, and computer systems, for providing a proactive failure recovery model for distributed computing. One computer-implemented method includes building a virtual tree-like computing structure of a plurality of computing nodes and, for each computing node of the virtual tree-like computing structure, executing, by a hardware processor, a node failure prediction model to calculate a mean time between failures (MTBF) associated with the computing node, determining whether to perform a checkpoint of the computing node based on a comparison between the calculated MTBF and maximum and minimum thresholds, migrating a process from the computing node to a different computing node acting as a recovery node, and resuming execution of the process on the different computing node. | 02-04-2016 |
20160048433 | SYSTEM, APPARATUS, AND METHOD TO DYNAMICALLY CHANGE SYSTEM RECOVERIES BASED ON SYSTEM LOAD - A method for dynamically changing system recovery actions based on system load. The method includes measuring a value of a workload characteristic of a computer system over a period of time, detecting an error in the computer system, determining a workload level of the computer system, and selecting a set of error recovery actions in response to the system workload analysis module determining the workload level of the computer system. A workload characteristic defines a type of work performed by the computer system. A workload level can be based on user-defined parameters or a measurement of the value of one or more workload characteristics. | 02-18-2016 |
20160077933 | Scalable Data Storage Pools - Scalable data storage techniques are described. In one or more implementations, data is obtained by one or more computing devices that describes fault domains in a storage hierarchy and available storage resources in a data storage pool. Operational characteristics are ascertained, by the one or more computing devices, of devices associated with the available storage resources within one or more levels of the storage hierarchy. Distribution of metadata is assigned by the one or more computing devices to one or more particular data storage devices within the data storage pool based on the described fault domains and the ascertained operational characteristics of devices within one or more levels of the storage hierarchy. | 03-17-2016 |