Patent application number | Description | Published |
20100106695 | SCALABLE BLOB STORAGE INTEGRATED WITH SCALABLE STRUCTURED STORAGE - Embodiments of the present invention relate to systems, methods and computer storage media for facilitating the structured storage of binary large objects (Blobs) to be accessed by an application program being executed by a computing device. Generally, the structured storage of Blobs includes a primary structured storage index for indexing Blobs, a secondary hash index that is integrated into the structured storage system, a Blob log stream, and a Blob data stream for storing blocks that include the Blob data. In an embodiment, a block is created and written to a Blob store along with a block list. The block list facilitates the locating of one or more blocks that store the Blob data. In this embodiment, a primary structured storage index and a secondary hash index are updated to facilitate efficient access of the Blob in a structured storage system. | 04-29-2010 |
20100106734 | BLOB MANIPULATION IN AN INTEGRATED STRUCTURED STORAGE SYSTEM - Embodiments of the present invention relate to systems, methods and computer storage media for facilitating the structured storage of binary large objects (Blobs) to be accessed by an application program being executed by a computing device. Generally, the manipulation of Blobs in a structured storage system includes receiving a request for a Blob, which may be located by way of a Blob pointer. The Blob pointer allows for the data, such as properties, of the Blob to be identified and located. Expired properties are garbage collected as a manipulation of the Blob data within a structured storage system. In an embodiment, the Blob is identified by a key that is utilized within a primary structured index to locate the requested Blob. In another embodiment, the requested Blob is located utilizing a secondary hash index. In an additional embodiment, the Blob is located utilizing a file table. | 04-29-2010 |
20100106934 | PARTITION MANAGEMENT IN A PARTITIONED, SCALABLE, AND AVAILABLE STRUCTURED STORAGE - Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information. | 04-29-2010 |
20120303576 | SYNCHRONOUS REPLICATION IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to synchronously replicating data in a distributed computing environment. To achieve synchronous replication both an eventual consistency approach and a strong consistency approach are contemplated. Received data may be written to a log of a primary data store for eventual committal. The data may then be annotated with a record, such as a unique identifier, which facilitates the replay of the data at a secondary data store. Upon receiving an acknowledgment that the secondary data store has written the data to a log, the primary data store may commit the data and communicate an acknowledgment of success back to the client. In a strong consistency approach, the primary data store may wait to send an acknowledgment of success to the client until it receives an acknowledgment that the secondary has not only written, but also committed, the data. | 11-29-2012 |
20120303577 | ASYNCHRONOUS REPLICATION IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to asynchronously replicating data in a distributed computing environment. To achieve asynchronous replication, data received at a primary data store may be annotated with information, such as an identifier of the data. The annotated data may then be communicated to a secondary data store, which may then write the data and annotated information to one or more logs for eventual replay and committal at the secondary data store. The primary data store may communicate an acknowledgment of success in committing the data at the primary data store as well as of success in writing the data to the secondary data store. Additional embodiments may include committing the data at the secondary data store in response to receiving an instruction that authorizes committal of data through an identifier. | 11-29-2012 |
20120303578 | Versioned And Hierarchical Data Structures And Distributed Transactions - Presented herein are methods of replicating versioned and hierarchical data structures, as well as data structures representing complex transactions. Due to interdependencies between data entities and a lack of guaranteed message ordering, simple replication methods employed for simple data types cannot be used. Operations on data structures exhibit dependencies between the messages making up the operations. This strategy can be extended to various types of complex transactions by considering certain messages to depend on other messages or on the existence of other entries at the data store. Regardless of origin, these dependencies can be enforced by suspending the processing of messages with unsatisfied dependencies until all of their dependencies have been met. Alternately, transactions can be committed immediately, creating entities that include versioned identifiers for each of their dependencies. These entities can then be garbage collected if the parent objects are not subsequently created. | 11-29-2012 |
20120303581 | REPLICATION PROCESSES IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to systems, methods, and computer storage media for replicating data in a distributed computing environment utilizing a combination of replication methodologies. A full-object replication may be utilized to replicate a full state of an object from a primary data store to a secondary data store. A checkpoint created after initiating the full-object replication may be parsed to identify changes to the object that have been entered since initiating the full-object replication. This replication process is referred to as a delta-checkpoint replication methodology. Additionally, in an embodiment, a log-based replication methodology may be utilized. The log-based replication may communicate data from a log of the primary data store to the secondary data store. It is also contemplated in an exemplary embodiment that when the log-based replication fails to maintain a throughput threshold, one of the other replication methodologies may be initiated, at least temporarily. | 11-29-2012 |
20120303593 | Geo-Verification And Repair - Presented herein are methods of continuously verifying data and repairing errors introduced during replication. In a particular embodiment, a primary data store sends out information sufficient to create a checkpoint together with a checksum for the data being verified at that checkpoint. At the secondary data store, a checkpoint is created in accordance with the checkpointing information, and a checksum is calculated over the indicated data at the created checkpoint. If the calculated checksum disagrees with the received checksum, additional checksums are calculated over subranges of the indicated data and compared with corresponding checksums over the data at the primary data store. The checksums at the primary data store may be requested from the primary data store or calculated locally based on the received overall checksum. Once an erroneous entry is identified, it can then be re-replicated from the primary data store to restore data consistency. | 11-29-2012 |
20120303791 | LOAD BALANCING WHEN REPLICATING ACCOUNT DATA - Embodiments of the present invention relate to invoking and managing load-balancing operation(s) applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the load-balancing operation(s) are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are located in geographically distinct areas and are equipped to replicate the storage account's data therebetween. The load-balancing operation(s) include splitting partitions into child partitions upon detecting an increased workload as a result of active replication, merging partitions to form parent partitions upon detecting a reduction in workload as a result of decreased processing-related resource consumption, or offloading partitions based on resource consumption. A service within a partition layer of the storage stamps is responsible for determining when to invoke these load-balancing operation(s). | 11-29-2012 |
20120303912 | STORAGE ACCOUNT MIGRATION BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing migration operations applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the migration operations are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are equipped to replicate the storage account's data therebetween upon initiating a migration. Upon substantial completion of a bootstrapping phase of replication, one migration operation includes designating the secondary storage stamp as a new primary storage stamp such that the destination partitions commence processing client requests, sending resultant transactions to the source partitions, and providing read and write access thereto. Another migration operation includes designating the primary storage stamp as a new secondary storage stamp such that the source partitions commence replaying the transactions. | 11-29-2012 |
20120303999 | IMPLEMENTING FAILOVER PROCESSES BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing a failover of a storage account between partitions within a distributed computing environment, where each partition represents a key range of data for the storage account. The partitions affected by the failover include source partitions hosted on a primary storage stamp and destination partitions hosted on a secondary storage stamp, where the storage account's data is being actively replicated from the primary to the secondary storage stamp. Upon receiving a manual or automatic indication to perform the failover, configuring the source partitions to independently perform flush-send operations (e.g., distributing pending messages as a group) and then configuring the destination partitions to independently perform flush-replay operations (e.g., aggressively replaying currently pending transactions). Upon completing the flush-replay operations, designating the secondary storage stamp as a new primary storage stamp such that live traffic is directed to the new primary storage stamp. | 11-29-2012 |
20130311521 | BLOB MANIPULATION IN AN INTEGRATED STRUCTURED STORAGE SYSTEM - Embodiments of the present invention relate to systems, methods and computer storage media for facilitating the structured storage of binary large objects (Blobs) to be accessed by an application program being executed by a computing device. Generally, the manipulation of Blobs in a structured storage system includes receiving a request for a Blob, which may be located by way of a Blob pointer. The Blob pointer allows for the data, such as properties, of the Blob to be identified and located. Expired properties are garbage collected as a manipulation of the Blob data within a structured storage system. In an embodiment, the Blob is identified by a key that is utilized within a primary structured index to locate the requested Blob. In another embodiment, the requested Blob is located utilizing a secondary hash index. In an additional embodiment, the Blob is located utilizing a file table. | 11-21-2013 |
20140258499 | LOAD BALANCING WHEN REPLICATING ACCOUNT DATA - Embodiments of the present invention relate to invoking and managing load-balancing operation(s) applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the load-balancing operation(s) are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are located in geographically distinct areas and are equipped to replicate the storage account's data therebetween. The load-balancing operation(s) include splitting partitions into child partitions upon detecting an increased workload as a result of active replication, merging partitions to form parent partitions upon detecting a reduction in workload as a result of decreased processing-related resource consumption, or offloading partitions based on resource consumption. A service within a partition layer of the storage stamps is responsible for determining when to invoke these load-balancing operation(s). | 09-11-2014 |
20140289554 | IMPLEMENTING FAILOVER PROCESSES BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing a failover of a storage account between partitions within a distributed computing environment, where each partition represents a key range of data for the storage account. The partitions affected by the failover include source partitions hosted on a primary storage stamp and destination partitions hosted on a secondary storage stamp, where the storage account's data is being actively replicated from the primary to the secondary storage stamp. Upon receiving a manual or automatic indication to perform the failover, configuring the source partitions to independently perform flush-send operations (e.g., distributing pending messages as a group) and then configuring the destination partitions to independently perform flush-replay operations (e.g., aggressively replaying currently pending transactions). Upon completing the flush-replay operations, designating the secondary storage stamp as a new primary storage stamp such that live traffic is directed to the new primary storage stamp. | 09-25-2014 |
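
The subrange comparison described in the Geo-Verification And Repair entry (20120303593) can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory dictionaries standing in for the primary and secondary data stores, the SHA-256 checksum, the halving strategy for choosing subranges, and the names `range_checksum`, `find_mismatches`, and `repair` are all assumptions made for illustration. In a real deployment the primary's checksums would be requested over the network rather than computed from a local copy.

```python
import hashlib
from typing import Dict, List


def range_checksum(store: Dict[str, bytes], keys: List[str]) -> str:
    """Checksum over the given keys of a store (hypothetical helper)."""
    h = hashlib.sha256()
    for k in keys:
        h.update(k.encode("utf-8"))
        if k in store:
            # Marker byte distinguishes a present value from a missing key.
            h.update(b"\x01" + store[k])
        else:
            h.update(b"\x00")
    return h.hexdigest()


def find_mismatches(primary: Dict[str, bytes],
                    secondary: Dict[str, bytes],
                    keys: List[str]) -> List[str]:
    """Narrow a checksum disagreement down to individual keys.

    If the checksums over `keys` agree, nothing in the range is reported.
    Otherwise the range is split in half (an assumed strategy) and each
    subrange is checked recursively.
    """
    if not keys:
        return []
    if range_checksum(primary, keys) == range_checksum(secondary, keys):
        return []
    if len(keys) == 1:
        return list(keys)
    mid = len(keys) // 2
    return (find_mismatches(primary, secondary, keys[:mid])
            + find_mismatches(primary, secondary, keys[mid:]))


def repair(primary: Dict[str, bytes], secondary: Dict[str, bytes]) -> List[str]:
    """Re-replicate every erroneous entry from the primary to the secondary."""
    keys = sorted(primary)
    bad = find_mismatches(primary, secondary, keys)
    for k in bad:
        secondary[k] = primary[k]
    return bad


if __name__ == "__main__":
    primary = {"a": b"1", "b": b"2", "c": b"3", "d": b"4"}
    secondary = dict(primary)
    secondary["c"] = b"corrupted"
    print(repair(primary, secondary))
    print(secondary == primary)
```

When the two stores agree, only one checksum per store is computed over the whole range, so the per-key comparison cost is paid only along the path to a disagreeing entry.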