Patent application number | Description | Published |
20090216774 | VIRTUALIZATION OF METADATA FOR FILE OPTIMIZATION - Mechanisms are provided for optimizing files while allowing application servers access to metadata associated with preoptimized versions of the files. During file optimization involving compression and/or compaction, file metadata changes. In order to allow file optimization in a manner transparent to application servers, the metadata associated with preoptimized versions of the files is maintained in a metadata database as well as in an optimized version of the files themselves. | 08-27-2009 |
20090216788 | MULTIPLE FILE COMPACTION FOR NETWORK ATTACHED STORAGE - Mechanisms are provided for optimizing multiple files in an efficient format that allows maintenance of the original namespace. Multiple files and associated metadata are written to a suitcase file. The suitcase file includes index information for accessing compressed data associated with compacted files. A hardlink to the suitcase file includes an index number used to access the appropriate index information. A simulated link to a particular file maintains the name of the particular file prior to compaction. | 08-27-2009 |
20100094813 | REPRESENTING AND STORING AN OPTIMIZED FILE SYSTEM USING A SYSTEM OF SYMLINKS, HARDLINKS AND FILE ARCHIVES - A data de-duplication system is used with network attached storage and serves to reduce data duplication and file storage costs. Techniques utilizing both symlinks and hardlinks ensure efficient file deletion and data cleanup and avoid data loss in the event of crashes. | 04-15-2010 |
20110071989 | FILE AWARE BLOCK LEVEL DEDUPLICATION - A system provides file aware block level deduplication in a system having multiple clients connected to a storage subsystem over a network such as an Internet Protocol (IP) network. The system includes client components and storage subsystem components. Client components include a walker that traverses the namespace looking for files that meet the criteria for optimization, a file system daemon that rehydrates the files, and a filter driver that watches all operations going to the file system. Storage subsystem components include an optimizer resident on the nodes of the storage subsystem. The optimizer can use idle processor cycles to perform optimization. Sub-file compression can be performed at the storage subsystem. | 03-24-2011 |
20110125722 | METHODS AND APPARATUS FOR EFFICIENT COMPRESSION AND DEDUPLICATION - Mechanisms are provided for performing efficient compression and deduplication of data segments. Compression algorithms are learning algorithms that perform better when data segments are large. Deduplication algorithms, however, perform better when data segments are small, as more duplicate small segments are likely to exist. As an optimizer is processing and storing data segments, the optimizer applies the same compression context to compress multiple individual deduplicated data segments as though they are one segment. By compressing deduplicated data segments together within the same context, data reduction can be improved for both deduplication and compression. Mechanisms are applied to compensate for possible performance degradation. | 05-26-2011 |
20110270809 | HEAT INDICES FOR FILE SYSTEMS AND BLOCK STORAGE - Techniques and mechanisms are provided to allow for selective optimization, including deduplication and/or compression, of portions of files and data blocks. Data access is monitored to generate a heat index for identifying sections of files and volumes that are frequently and infrequently accessed. These frequently used portions may be left non-optimized to reduce or eliminate optimization I/O overhead. Infrequently accessed portions can be more aggressively optimized. | 11-03-2011 |
20110270810 | METHODS AND APPARATUS FOR ACTIVE OPTIMIZATION OF DATA - Techniques and mechanisms are provided to support live file optimization. Active I/O access to an optimization target is monitored during optimization. Active files need not be taken offline or made unavailable to an application during optimization and retain the ability to support file operations such as read, write, unlink, and truncate while an optimization engine performs deduplication and/or compression on active file ranges. | 11-03-2011 |
20120084270 | STORAGE OPTIMIZATION MANAGER - Techniques and mechanisms provide a storage optimization manager. Data may be optimized and maintained on various nodes in a cluster. Particular nodes may be overburdened while other nodes remain relatively unused. Techniques are provided to efficiently optimize data onto nodes to enhance operational efficiency. Data access requests for optimized data are monitored and managed to allow for intelligent maintenance of optimized data. | 04-05-2012 |
20120084527 | DATA BLOCK MIGRATION - Techniques and mechanisms are provided for migrating data blocks around a cluster during node addition and node deletion. Migration requires no downtime, as a newly added node is immediately operational while the data blocks are being moved. Blockmap files and deduplication dictionaries need not be updated. | 04-05-2012 |
20120246127 | VIRTUALIZATION OF METADATA FOR FILE OPTIMIZATION - Mechanisms are provided for optimizing files while allowing application servers access to metadata associated with preoptimized versions of the files. During file optimization involving compression and/or compaction, file metadata changes. In order to allow file optimization in a manner transparent to application servers, the metadata associated with preoptimized versions of the files is maintained in a metadata database as well as in an optimized version of the files themselves. | 09-27-2012 |
20130138607 | RESYNCHRONIZATION OF REPLICATED DATA - Mechanisms are provided for efficient resynchronization of replicated data. A hash value is generated for a chunk of data replicated from a source node to a target node. The chunk of data may be a file deduplicated and compressed at both a source node and a target node. A current sequence number is determined, and a tuple of sequence number and hash is maintained for the chunk of data at both the source node and the target node. Sequence numbers are modified whenever the data is modified. The current sequence numbers, together with the sequence number and hash values stored in the tuples at the source node and the target node, may be compared at a later point in time to determine whether the data is still synchronized or requires resynchronization. | 05-30-2013 |
20130246372 | METHODS AND APPARATUS FOR EFFICIENT COMPRESSION AND DEDUPLICATION - Mechanisms are provided for performing efficient compression and deduplication of data segments. Compression algorithms are learning algorithms that perform better when data segments are large. Deduplication algorithms, however, perform better when data segments are small, as more duplicate small segments are likely to exist. As an optimizer is processing and storing data segments, the optimizer applies the same compression context to compress multiple individual deduplicated data segments as though they are one segment. By compressing deduplicated data segments together within the same context, data reduction can be improved for both deduplication and compression. Mechanisms are applied to compensate for possible performance degradation. | 09-19-2013 |
20130297572 | FILE AWARE BLOCK LEVEL DEDUPLICATION - A system provides file aware block level deduplication in a system having multiple clients connected to a storage subsystem over a network such as an Internet Protocol (IP) network. The system includes client components and storage subsystem components. Client components include a walker that traverses the namespace looking for files that meet the criteria for optimization, a file system daemon that rehydrates the files, and a filter driver that watches all operations going to the file system. Storage subsystem components include an optimizer resident on the nodes of the storage subsystem. The optimizer can use idle processor cycles to perform optimization. Sub-file compression can be performed at the storage subsystem. | 11-07-2013 |
20140095455 | HEAT INDICES FOR FILE SYSTEMS AND BLOCK STORAGE - Techniques and mechanisms are provided to allow for selective optimization, including deduplication and/or compression, of portions of files and data blocks. Data access is monitored to generate a heat index for identifying sections of files and volumes that are frequently and infrequently accessed. These frequently used portions may be left non-optimized to reduce or eliminate optimization I/O overhead. Infrequently accessed portions can be more aggressively optimized. | 04-03-2014 |
20140195748 | EFFICIENT REPLICA CLEANUP DURING RESYNCHRONIZATION - Mechanisms are provided for efficient replica cleanup during resynchronization. According to various embodiments, a plurality of deleted data segment ranges on a first storage node may be identified. The first storage node may be configured to store a plurality of data segments. Each of the plurality of data segments may have associated therewith a respective identifier. Each of the data segment ranges may designate one or more data segments that have been deleted from the first storage node. The plurality of deleted data segment ranges may be transmitted to a second storage node configured to mirror the plurality of data segments stored on the first storage node. The plurality of deleted data segment ranges may be capable of being used to identify one or more data segments to delete from the second storage node. | 07-10-2014 |
20140214760 | SYNCHRONIZED STORAGE SYSTEM OPERATION - Techniques and mechanisms described herein facilitate the performance of duplicate data block instruction identification. According to various embodiments, a data block update operation message may be received at a communications interface in a secondary storage node. The secondary storage node may be configured to store secondary data mirroring primary data stored on a primary storage node. The primary data and the secondary data may each include a respective plurality of data blocks. The data block update operation message may include a data block update instruction for updating a designated one of the plurality of secondary storage node data blocks. The data block update operation message may include a primary storage node data block sequence number designating an update operation status. When it is determined that the data block update instruction is not a duplicate, the data block update instruction may be performed. | 07-31-2014 |
20150019515 | HEAT INDICES FOR FILE SYSTEMS AND BLOCK STORAGE - Techniques and mechanisms are provided to allow for selective optimization, including deduplication and/or compression, of portions of files and data blocks. Data access is monitored to generate a heat index for identifying sections of files and volumes that are frequently and infrequently accessed. These frequently used portions may be left non-optimized to reduce or eliminate optimization I/O overhead. Infrequently accessed portions can be more aggressively optimized. | 01-15-2015 |
20150032978 | TRANSFERRING DIFFERENCES BETWEEN CHUNKS DURING REPLICATION - Techniques and mechanisms described herein facilitate the replication of data between storage nodes. According to various embodiments, a request to provide a data chunk to a target storage node may be received at a source data storage node. A reference data chunk may be identified based on fingerprint information associated with the requested data chunk. The reference data chunk may be stored on the target storage node. The reference data chunk and the requested data chunk may each include a first data portion. Data chunk reconstruction information may be transmitted from the source data storage node to the target data storage node. The data chunk reconstruction information may identify the reference data chunk. The data chunk reconstruction information may include data difference information for constructing the requested data chunk at the target data storage node based on the reference data chunk. | 01-29-2015 |
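
The last entry above (20150032978) describes sending reconstruction information instead of a whole chunk: the source names a reference chunk that the target already holds and supplies only the differences. The following is a minimal sketch of that idea, not the method claimed in the application. The function names, the dictionary layout, the use of SHA-256 as the fingerprint, and the use of Python's `difflib.SequenceMatcher` to compute differences are all assumptions made for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def fingerprint(chunk: bytes) -> str:
    """Hypothetical fingerprint: SHA-256 hex digest of the chunk (an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def make_reconstruction_info(reference: bytes, requested: bytes) -> dict:
    """Build, on the source node, the data needed to rebuild `requested`
    from `reference`. The layout of the returned dict is hypothetical."""
    ops = []
    matcher = SequenceMatcher(None, reference, requested, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared portion: refer to bytes the target already stores.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Differing portion: ship the literal bytes from the requested chunk.
            ops.append(("data", requested[j1:j2]))
        # "delete": bytes present only in the reference are simply skipped.
    return {"reference_id": fingerprint(reference), "ops": ops}


def reconstruct(info: dict, target_store: dict) -> bytes:
    """Rebuild the requested chunk on the target node from its local
    reference chunk plus the transmitted difference operations."""
    reference = target_store[info["reference_id"]]
    out = bytearray()
    for op in info["ops"]:
        if op[0] == "copy":
            _, start, end = op
            out += reference[start:end]
        else:
            out += op[1]
    return bytes(out)
```

In use, the target keeps its chunks in a mapping keyed by fingerprint, the source calls `make_reconstruction_info(reference, requested)`, and the target calls `reconstruct(info, target_store)` to obtain the requested chunk. Only the literal `data` operations carry payload bytes, so the transfer shrinks as the two chunks share more content.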