Entries |
Document | Title | Date |
20100082556 | SYSTEM AND METHOD FOR META-DATA DRIVEN, SEMI-AUTOMATED GENERATION OF WEB SERVICES BASED ON EXISTING APPLICATIONS - Techniques for reusing logic implemented in an existing software application such that the logic can be exposed as a Web service or in any other service-oriented context. In one set of embodiments, a design-time technique is provided that comprises, inter alia, receiving program code for an existing software application, generating metadata based on the program code, and customizing the metadata to align with an intended Web service. Artifacts for the Web service are then generated based on the customized metadata. In another set of embodiments, a run-time technique is provided that comprises, inter alia, receiving a payload representing an invocation of a Web service operation of the generated Web service, processing the payload, and, based on the processing, causing the existing software application to execute an operation in response to the invocation of the Web service operation. | 04-01-2010 |
20100106692 | CIRCUIT FOR COMPRESSING DATA AND A PROCESSOR EMPLOYING SAME - The present application addresses a fundamental problem in the design of computing systems, that of minimising the cost of memory access. This is a fundamental limitation on the design of computer systems because, regardless of the memory technology or manner of connection to the processor, there is a maximum limitation on how much data can be transferred between processor and memory in a given time. This is the available memory bandwidth, and the limitation of compute power by available memory bandwidth is often referred to as the memory wall. The solution provided creates a map of a data structure to be compressed, the map representing the locations of non-trivial data values in the structure (e.g. non-zero values), and deletes the trivial data values from the structure to provide a compressed structure. | 04-29-2010 |
20100114843 | Index Compression In Databases - Systems, methods and computer program products for compression of database indexes are described herein. A system embodiment includes a sequence determiner to scan a database index and to determine a start of a range and end of a range of consistently changing values in one or more index pages of said database index and an index updater to update said database index based on a sequence determined by said sequence determiner, while suspending writing of one or more values that lie within start of said range and end of said range of values. A method embodiment includes scanning an index, determining a pattern of changing values in one or more index pages of said index and selectively updating said index based on said determining step to minimize index insertions. The method embodiment further includes determining a start of a range of values and an end of said range of values in an index page, setting appropriate bits to identify said start of range of values and end of range of values, determining if an entry to be inserted can be appended at the end of said range of values, and compressing said index by suspending writing of one or more values that occur between said start of range of values and said end of range of values. | 05-06-2010 |
20100114844 | METHOD AND SYSTEM FOR DATA MASHUP IN COMMENTING LAYER - Disclosed is a method and system for receiving an instruction for data mashup in a commenting layer, generating a list of fields in the commenting layer available for the data mashup, receiving a user selection for a first field from the list of fields, determining a data provider to provide a first data of the first field, obtaining the first data from the data provider, determining an aggregation function for the first data and a second data of a second field in the commenting layer and generating an aggregated data in the commenting layer by performing a data mashup operation on the first field and the second field based on the aggregation function. | 05-06-2010 |
20100114845 | PRIME INDEXING AND/OR OTHER RELATED OPERATIONS - Embodiments of prime indexing and/or other related operations are disclosed. | 05-06-2010 |
20100114846 | OPTIMIZING MEDIA PLAYER MEMORY DURING RENDERING - Optimizing operation of a media player during rendering of media files. The invention includes authoring software to create a data structure and to populate the created data structure with obtained metadata. The invention also includes rendering software to retrieve the metadata from the data structure and to identify media files to render. In one embodiment, the invention is operable as part of a compressed media format having a set of small files containing metadata, menus, and playlists in a compiled binary format designed for playback on feature-rich personal computer media players as well as low cost media players. | 05-06-2010 |
20100121826 | DATA COLLECTION SYSTEM, DATA COLLECTION METHOD AND DATA COLLECTION PROGRAM - It is an object to provide a data collection system that is configured to reduce a communication amount, etc. at the time when data are collected from a plurality of devices, so as to reduce a communication amount attended by the collection of data without increasing processing loads imposed on devices. A symbol classifying unit of a data relay device classifies received data that have been already compressed. A data recompressing unit replaces codes contained in the classified already compressed data with other codes, so as to recompress the already compressed data. A symbol set clustering unit sends a transfer destination renewal device a communication speed at the time when the recompressed data are transferred to other devices, a processing speed at the recompressing time, etc. The transfer destination renewal device generates transfer destination information on the basis of the communication speed, the processing speed, etc. | 05-13-2010 |
20100131475 | COMPUTER PRODUCT, INFORMATION RETRIEVING APPARATUS, AND INFORMATION RETRIEVAL METHOD - A recording medium stores therein an information retrieval program that causes a computer to execute generating a Huffman tree based on an XML tag written in an XML file and an appearance frequency of character data exclusive of the XML tag; compressing the XML file using the Huffman tree; receiving a retrieval condition that includes a retrieval keyword and type information concerning the retrieval keyword; setting a decompression start flag for a compression code that is for an XML start tag related to the type information, the decompression start flag instructing commencement of decompression of a compression code string subsequent to the XML start tag; detecting, in the compressed XML file, the compression code for which the decompression start flag has been set; and decompressing, when the compression code for which the decompression start flag has been set is detected, the compression code string, using the Huffman tree. | 05-27-2010 |
20100131476 | COMPUTER PRODUCT, INFORMATION RETRIEVAL METHOD, AND INFORMATION RETRIEVAL APPARATUS - A computer-readable recording medium stores therein an information retrieval program that causes a computer to execute a retrieval process in which files to be retrieved are narrowed down by using a bit string for each character in the files to find characters making up a retrieval keyword to retrieve a keyword identical to or related to the retrieval keyword in the files to be retrieved. The bit strings indicate the presence of the characters in the files. The information retrieval program causes the computer to execute extracting, from among the bit strings, a bit string of an arbitrary character; and compressing the extracted bit string, by using a special Huffman tree having leaves of plural types of symbol strings covering patterns represented by a predetermined number of bits and a special symbol string having a number of bits greater than the predetermined number of bits. | 05-27-2010 |
20100153349 | Continuous, automated database-table partitioning and database-schema evolution - Embodiments of the present invention are directed to methods and computational subsystems employed in database-management systems that continuously partition relational-database tables in order to ameliorate database-management-system execution bottlenecks and inefficiencies. Certain embodiments of the present invention employ the creation and instantiation of templates in order to continuously partition a database, while other embodiments of the present invention provide high-level-interface support for on-going relational-database-table partitioning. | 06-17-2010 |
20100161567 | Compressed data page with uncompressed data fields - Systems, methods, and other embodiments associated with a compressed data page that includes uncompressed data fields are described. One example method includes compressing user records and storing them on a compressed data page and then storing one or more uncompressed data fields on the compressed data page such that the uncompressed data fields can be updated in place without uncompressing the compressed data page. | 06-24-2010 |
20100161568 | Data Compression by Multi-Order Differencing - Embodiments of the present invention enable compression and decompression of data. Applications of the present invention are its use in embodiments of systems for compression and decompression of GPS long-term Ephemeris (LTE) data, although the present invention is not limited to such applications. In embodiments, the LTE data may be grouped into a set of data values associated with a parameter. In embodiments, a data set may be compressed by using a multi-order differencing scheme. In such a scheme, a set of the differences between values may be compressed because the differences have smaller magnitudes than the values. In embodiments, a multi-order differencing scheme determines how many levels (orders) of differencing may be applied to an original data set before it is compressed. In embodiments, the original data may be recovered from a compressed data set based on the type of multi-order differencing scheme used to generate the compressed data. | 06-24-2010 |
20100217753 | Multi-stage quantization method and device - The invention discloses a multi-stage quantization method, which includes the following steps: obtaining a reference codebook according to a previous stage codebook; obtaining a current stage codebook according to the reference codebook and a scaling factor; and quantizing an input vector by using the current stage codebook. The invention also discloses a multi-stage quantization device. With the invention, the current stage codebook may be obtained according to the previous stage codebook, by using the correlation between the current stage codebook and the previous stage codebook. As a result, it does not require an independent codebook space for the current stage codebook, which saves the storage space and improves the resource usage efficiency. | 08-26-2010 |
20100223236 | DYNAMIC AGGREGATION OF CONTENT BASED ON A FALLBACK DEFINITION - A processing device and a method dynamically aggregate desired content. A user may specify a fallback definition for a pivot point via a user interface. The fallback definition may include at least a first and a second desired value for the pivot point. Only assets, including content having metadata with values corresponding to provided desired values for one or more pivot points, may be listed in a created list of assets. The list of assets may be sorted to provide a sorted list of assets, which may be ordered such that assets having content associated with a first desired value of the fallback definition appear before assets having content associated with a second desired value of the fallback definition. Each respective asset listed in the sorted list of assets may be added to a set of aggregated content when an equivalent asset is not already included in the set. | 09-02-2010 |
20100223237 | LOSSLESS DATA COMPRESSION AND REAL-TIME DECOMPRESSION - A method, information processing system, and computer program storage product store data in an information processing system. Uncompressed data is received and the uncompressed data is divided into a series of vectors. A sequence of profitable bitmask patterns is identified for the vectors that maximizes compression efficiency while minimizing decompression penalty. Matching patterns are created using multiple bit masks based on a set of maximum values of the frequency distribution of the vectors. A dictionary is built based upon the set of maximum values in the frequency distribution and a bit mask savings, which is a number of bits reduced using each of the multiple bit masks. Each of the vectors is compressed using the dictionary and the matching patterns having high bit mask savings. The compressed vectors are stored into memory. Also, an efficient placement is developed to enable parallel decompression of the compressed codes. | 09-02-2010 |
20100223238 | DOWNLOADING MAP DATA THROUGH A COMMUNICATION NETWORK - Various methods and systems are disclosed for downloading geographical map data representative of at least one map image of a geographic area. In response to a request for map data, a subset of geographical map data is provided. The subset of geographical map data may be arranged into one or more minimal-sorted groups. The minimal-sorted groups may be used to restore the map data represented by the subset. | 09-02-2010 |
20100228703 | REDUCING MEMORY REQUIRED FOR PREDICTION BY PARTIAL MATCHING MODELS - Some embodiments of a method and an apparatus to reduce memory required for prediction by partial matching (PPM) models usable in data compression have been presented. In one embodiment, statistics of received data are accumulated in a tree of dynamic tree-type data structures. The data is compressed based on the statistics. The tree of dynamic tree-type data structures may be stored in a computer-readable storage medium. | 09-09-2010 |
20100241619 | BACKUP APPARATUS WITH HIGHER SECURITY AND LOWER NETWORK BANDWIDTH CONSUMPTION - A system for more secure, more efficient, more widely applicable backup, retention, and retrieval of data. An apparatus comprising improved means for de-duplication of data and securely storing data remotely with efficient retention and recovery. A method comprising disassembling data objects, efficiently de-duplicating, securely storing and retrieving backups in shared servers on a public network, and controlling retention. | 09-23-2010 |
20100262588 | METHOD OF ACCESSING A MEMORY - A method of accessing a memory is implemented on a memory system having a main memory and a secondary memory. The main memory includes a compressed data area that has a SWAP file zone predefined therein. When the main memory is insufficient for data buffering, the predefined SWAP file zone is provided for data swapping. A newly defined SWAP file zone is dynamically created in the compressed data area whenever the last defined SWAP file zone is insufficient for data swapping, and the secondary memory is substituted for data swapping only when the compressed data area has insufficient capacity. | 10-14-2010 |
20100268694 | SYSTEM AND METHOD FOR SHARING WEB APPLICATIONS - A system and a method for sharing web pages. In some embodiments, the following operations are performed at a client computer system having one or more processors that execute one or more programs stored in memory of the client computer system. A representation of a web page that is displayed in a window of a web browser in a user interface of the client computer system is generated. The representation of the web page is partitioned into a plurality of tiles based on a document object model of the web page. For each tile in the plurality of tiles, it is determined whether the tile has changed relative to a previous version of the tile. In response to determining that the tile has changed, the tile that has changed is sent to a server. | 10-21-2010 |
20100268695 | SYSTEMS AND METHODS ASSOCIATED WITH HYBRID PAGED ENCODING AND DECODING - According to some embodiments, a system, method, means, and/or computer program code are provided to facilitate compression of information. In some cases, uncompressed data may be divided into a plurality of portions. A first data density value may be determined for a first portion, and a second data density value may be determined for a second portion. Based on the first data density value, the first portion may be encoded using a first encoding technique. Similarly, the second portion may be encoded using a second encoding technique based on the second data density value. A compressed representation of the uncompressed data may then be stored in accordance with results of said encodings of the first and second portions. | 10-21-2010 |
20100274772 | COMPRESSED DATA OBJECTS REFERENCED VIA ADDRESS REFERENCES AND COMPRESSION REFERENCES - A computing device maintains a mapping of a virtual storage to a physical storage. The mapping includes address references from data included in the virtual storage to one or more compressed data objects included in the physical storage. At least one of the one or more compressed data objects has been compressed at least in part by replacing portions of an uncompressed data object with compression references to matching portions of previously generated compressed data objects. | 10-28-2010 |
20100274773 | NEARSTORE COMPRESSION OF DATA IN A STORAGE SYSTEM - A storage server is configured to receive a request to store a data block from a client. The request to store the data block is serviced by the storage server by compressing the data block into a compression group which includes a number of compressed data blocks. The storage server stores the compression group in a non-volatile memory and flushes the compression group from the non-volatile memory to a physical storage device in response to reaching a consistency point. By compressing data to be stored in system memory of a storage server, the amount of data that can be processed during a given time period by a data storage system is increased. Furthermore, an increase in performance can be achieved at a lower cost, since the cost of additional physical system memory modules can be avoided. | 10-28-2010 |
20100281004 | STORING COMPRESSION UNITS IN RELATIONAL TABLES - A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user. | 11-04-2010 |
20100287143 | Relational Database Page-Level Schema Transformations - Methods, devices and systems which facilitate the conversion of database objects from one schema version (e.g., an earlier version) to another schema version (e.g., a newer version) without requiring the objects be unloaded and reloaded are described. In general, data object conversion applies to both table space objects and index space objects. The described transformation techniques may be used to convert any object whose schema changes occur at the page-level. | 11-11-2010 |
20100287144 | COMPRESSION SCHEME FOR IMPROVING CACHE BEHAVIOR IN DATABASE SYSTEMS - A scheme for accessing an index structure using a reference minimum bounding shape is disclosed. In one example embodiment, a reference minimum bounding shape that encloses two or more minimum bounding shapes may be identified from an index structure stored in memory. Each of the two or more minimum bounding shapes may correspond to a data object associated with a corresponding leaf node of the index structure. In one example embodiment, the index structure may be accessed using the reference minimum bounding shape. In one example embodiment, at least one minimum bounding shape of the two or more minimum bounding shapes may be represented in a relative representation calculated relative to the reference minimum bounding shape. Also disclosed are a method, a system and a non-transitory computer-readable storage medium for accomplishing the same scheme as described above. | 11-11-2010 |
20100293153 | METHOD AND DEVICE FOR COMPRESSING TABLE BASED ON FINITE AUTOMATA, METHOD AND DEVICE FOR MATCHING TABLE - A method for compressing a table based on finite automata (FA) includes analyzing transferring characteristics of all states in an original two-dimensional structure table and combining continual states with unified transferring characteristics in the original two-dimensional structure table. A method for matching a table based on FA, a device for compressing a table, and a device for matching a table are also provided. | 11-18-2010 |
20100299316 | BLOCK COMPRESSION OF TABLES WITH REPEATED VALUES - Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks. | 11-25-2010 |
20100318498 | METHODS AND APPARATUS FOR ORGANIZING DATA IN A DATABASE - Disclosed are methods and apparatus for organizing data in a database in a set-oriented manner. Data is organized by linking data in the form of key-value pairs stored in storage media of the database to corresponding key-value pair identifiers. A set having a corresponding set identifier is then associated with one or more of the key-value pair identifiers where the set includes the stored key-value pairs linked to the key-value pair identifiers. | 12-16-2010 |
20100318499 | DECLARATIVE FRAMEWORK FOR DEDUPLICATION - A system, framework, and algorithms for data deduplication are described. A declarative language, such as a Datalog-type logic language, is provided. Programs in the language describe data to be deduplicated and soft and hard constraints that must/should be satisfied by data deduplicated according to the program. To execute the programs, algorithms for performing graph clustering are described. | 12-16-2010 |
20100318500 | BACKUP AND ARCHIVAL OF SELECTED ITEMS AS A COMPOSITE OBJECT - An archive of items, which are computing data accessed by a user, is created at a semantic object level. The object archiving may group seemingly disparate items as a composite object, which may then be stored to enable retrieval by the user at a later point in time. The composite object may include metadata from the various items to enable identifying the composite object, providing retrieval capabilities (e.g., search, etc.), and so forth. In some aspects, an archiving process may extract item data from an item that is accessed by a computing device. Next, the item may be selected by a schema for inclusion in a composite object when the item data meets criteria specified in the schema. The composite object(s) may then be stored in an object store as an archive (backup). | 12-16-2010 |
20100325094 | Data Compression For Reducing Storage Requirements in a Database System - A system, method, and computer program product for reducing data storage requirements in a database system are described herein. An embodiment includes identifying at least one data candidate of fixed length data type in at least one row of database data for compression based upon a predetermined threshold level and a boundary of compression, providing at least one bit within the at least one row for an identified data candidate according to the boundary of compression, and storing the at least one row as compressed data in the database system. For compression based on a row boundary, the identified data candidates for compression include fixed length columns having lengths that do not fall below the predetermined threshold level in a row of data and the at least one bit comprises a bitmap for a length of the identified data candidates following compression. For compression based on a page boundary, the identified data candidates for compression include redundant byte string data in a page of data, the redundant byte string data including matching data across columns having lengths that do not exceed the predetermined threshold level. | 12-23-2010 |
20100325095 | PERMUTING RECORDS IN A DATABASE FOR LEAK DETECTION AND TRACING - A method comprises receiving, by a processor, a copy of a database containing records, each record having a plurality of attributes. The method also comprises determining, by the processor, whether a first attribute in each record results in a predetermined value in modulo P when hashed with a key and determining, by the processor, whether a second attribute in each record results in the predetermined value in modulo P when hashed with a key. For a first record whose first attribute results in the predetermined value in modulo P when hashed with a key and a second record whose second attribute also results in the predetermined value in modulo P when hashed with a key, the method further comprises swapping by the processor the second attributes between the first and second records. | 12-23-2010 |
20100325096 | MINIMIZING SYSTEM RESOURCES USED TO DECOMPRESS READ-ONLY COMPRESSED ANALYTIC DATA IN A RELATIONAL DATABASE TABLE - A method, computer program product and system for minimizing system resources used to decompress read-only compressed analytic data in a relational database table. An i-code list associated with a relational database table is converted into a programming language. The programming language is compiled into object code and stored in a module in the user's system. The object code is called with a pointer designating the particular row in the database containing the compressed data to be decompressed. The compressed data designated by the pointer is decompressed upon execution of the object code. By having the source code for decompressing the compressed data stored as object code in the user's system, the interpretation step (as used in the i-code method) is avoided, thereby reducing the number of machine cycles used to decompress the compressed data. As a result, query programs will be able to access large amounts of data more quickly. | 12-23-2010 |
20100332462 | MINIMIZING SYSTEM RESOURCES USED TO DECOMPRESS READ-ONLY COMPRESSED ANALYTIC DATA IN A RELATIONAL DATABASE TABLE - A method, computer program product and system for minimizing system resources used to decompress read-only compressed analytic data in a relational database table. An i-code list associated with a relational database table is converted into a programming language. The programming language is compiled into object code and stored in a module in the user's system. The object code is called with a pointer designating the particular row in the database containing the compressed data to be decompressed. The compressed data designated by the pointer is decompressed upon execution of the object code. By having the source code for decompressing the compressed data stored as object code in the user's system, the interpretation step (as used in the i-code method) is avoided, thereby reducing the number of machine cycles used to decompress the compressed data. As a result, query programs will be able to access large amounts of data more quickly. | 12-30-2010 |
20110016096 | OPTIMAL SEQUENTIAL (DE)COMPRESSION OF DIGITAL DATA - Methods and apparatus involve an original data stream arranged as a plurality of symbols. Of those symbols, all possible tuples are identified and the highest or most frequently occurring tuple is determined. A new symbol is created and substituted for each instance of the highest occurring tuple, which results in a new data stream. The new data stream is encoded and its size determined. Also, a size of a dictionary carrying all the original and new symbols is determined. The encoding size, the size of the dictionary and the sizes of any other attendant overhead are compared to a size of the original data to see if compression has occurred, and by how much. Upon reaching pre-defined objectives, compression ceases. Decompression occurs oppositely. Other features include resolving ties between equally occurring tuples, path weighted Huffman coding, storing files, decoding structures, and computing arrangements and program products, to name a few. | 01-20-2011 |
20110016097 | FAST APPROXIMATION TO OPTIMAL COMPRESSION OF DIGITAL DATA - A “fast approximation” of compression of current data involves using information obtained from an earlier compression of similar data. It overcomes the iterative process of discovering a unique set of optimal symbols. Representatively, a dictionary of symbols corresponding to original data from an earlier compressed file is extracted. Original bits are then obtained from the symbols. Sequences of the original bits are identified in the current data of a current file under consideration. A new bit stream for the current file is created from the original bits and according to the symbols they represent. Every occurrence of the symbols is counted in the new bit stream and a path-weighted Huffman tree is created from the counted occurrences. A coding from the Huffman tree ensues, along with an end-of-file marker. The latter is stored in a new compression file, including the dictionary earlier extracted from the earlier compressed file. | 01-20-2011 |
20110016098 | GROUPING AND DIFFERENTIATING VOLUMES OF FILES - Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. The former provides better spectrums because it is more “informationally” valid than the latter. | 01-20-2011 |
20110029492 | System and Method for Implementing a Reliable Persistent Random Access Compressed Data Stream - System and method for implementing a reliable persistent random access compressed data stream is described. In one embodiment, the system comprises a computer-implemented journaled file system that includes a first file for storing a series of independently compressed blocks of a data stream; a second file for storing a series of indexes corresponding to the compressed blocks, wherein each one of the indexes comprises a byte offset into the first file of the corresponding compressed block; and a third file for storing a chunk of data from the data stream before it is compressed and written to the first file. The system further comprises a writer module for writing uncompressed data to the third file and writing indexes to the second file and a compressor module for compressing a chunk of data from the third file and writing it to the end of the first file. | 02-03-2011 |
20110040735 | SYSTEM AND METHOD FOR COMPRESSING FILES - A system and method for compressing files obtains a file to be compressed and divides the file into different sections. The system and method further compresses each section with an image compression algorithm or a text compression algorithm according to a type of each section, and connects all compressed sections to obtain a compressed file. | 02-17-2011 |
20110055174 | STORAGE SYSTEM DATA COMPRESSION ENHANCEMENT - Data segments are logically organized in clusters in a data repository of a data storage system. Each cluster contains compressed data segments and data common to the compression of the segments, such as a dictionary. In association with a write request, it is determined in which of the clusters the data segment would be compressed most efficiently, and the data segment is stored in that data cluster. | 03-03-2011 |
20110071990 | Fast History Based Compression in a Pipelined Architecture - A novel and useful system and method of fast history compression in a pipelined architecture with both speculation and low-penalty misprediction recovery. The method of the present invention speculates that a current input byte does not continue an earlier string, but either starts a new string or represents a literal (no match). As previous bytes are checked if they start a string, the method of the present invention detects if speculation for the current byte is correct. If speculation is not correct, then various methods of recovery are employed, depending on the repeating string length. | 03-24-2011 |
20110071991 | SYSTEMS AND METHODS FOR GEOMETRIC DATA COMPRESSION AND ENCRYPTION - Systems, methods, and physical computer-readable storage media for performing geometric data compression and geometric data decompression and/or geometric data encryption and geometric data decryption. A virtual geometric compression object is generated within a computer system by defining a plurality of discrete elements arranged in a geometric shape and assigning one or more data bit values to each of the plurality of discrete elements. The virtual geometric compression object is used by the computer system to compress sequences of uncompressed data bits into compression definitions. A compression definition defines a path through the virtual geometric compression object corresponding to a sequence of uncompressed data bits. In a reverse manner, for data decompression, at least a portion of a virtual geometric compression object is generated and a compression definition is used to extract a corresponding sequence of uncompressed data bits from the portion of the virtual geometric compression object. | 03-24-2011 |
20110082842 | DATA COMPRESSION ALGORITHM SELECTION AND TIERING - A data storage subsystem having a plurality of data compression engines configured to compress data, each having a different compression algorithm. A data handling system is configured to select at least one sample of data; operate a plurality of the data compression engines to compress the selected sample(s); determine the compression ratios of the operated data compression engines with respect to the selected sample(s); and select the data compression engine having the greatest compression ratio with respect to the selected sample(s), to compress the data. Further, the data compression engines may be in tiers from low to high in accordance with expected latency to compress data and to uncompress compressed data; and a data compression engine is selected from a tier that is inverse to the present rate of access. | 04-07-2011 |
20110082843 | DATABASE SYSTEM, METHOD OF MANAGING DATABASE, DATABASE STRUCTURE, AND COMPUTER PROGRAM - The database system includes: a storage unit that stores a database including an entity data group and a plurality of identifier tables having only a plurality of fixed-length data; and a data processing unit that receives a query and performs data processing on the database on the basis of the received query. Each of the identifier tables includes at least one tuple that is defined in a row direction and at least one attribute field that is defined in a column direction and includes a plurality of data identifiers uniquely indicating the plurality of entity data as the fixed-length data. The database includes a link table that connects the tuples between the identifier tables, in addition to the plurality of identifier tables. The data processing unit performs the data processing using the link table and the identifier tables. | 04-07-2011 |
20110087640 | Data Compression and Storage Techniques - Provided are systems and methods for use in data archiving. In one arrangement, compression techniques are provided wherein an earlier version of a data set (e.g., file folder, etc) is utilized as a dictionary of a compression engine to compress a subsequent version of the data set. This compression identifies changes between data sets and allows for storing these differences without duplicating many common portions of the data sets. For a given version of a data set, new information is stored along with metadata used to reconstruct the version from each individual segment saved at different points in time. In this regard, the earlier data set and one or more references to stored segments of a subsequent data set may be utilized to reconstruct the subsequent data set. | 04-14-2011 |
20110099155 | FAST BATCH LOADING AND INCREMENTAL LOADING OF DATA INTO A DATABASE - Embodiments of the present invention provide for batch and incremental loading of data into a database. In the present invention, the loader infrastructure utilizes machine code database instructions and hardware acceleration to parallelize the load operations with the I/O operations. A large, hardware accelerator memory is used as staging cache for the load process. The load process also comprises an index profiling phase that enables balanced partitioning of the created indexes to allow for pipelined load. The online incremental loading process may also be performed while serving queries. | 04-28-2011 |
20110119240 | METHOD AND SYSTEM FOR GENERATING A BIDIRECTIONAL DELTA FILE - The present invention relates to a system and method of generating an encoded bidirectional delta file to be used for reconstructing target and source files by decoding said bidirectional delta file, each of said target and source files comprising one or more substantially identical substrings, wherein each of said substrings is encoded within said bidirectional delta file by using a single pointer. | 05-19-2011 |
20110125722 | METHODS AND APPARATUS FOR EFFICIENT COMPRESSION AND DEDUPLICATION - Mechanisms are provided for performing efficient compression and deduplication of data segments. Compression algorithms are learning algorithms that perform better when data segments are large. Deduplication algorithms, however, perform better when data segments are small, as more duplicate small segments are likely to exist. As an optimizer is processing and storing data segments, the optimizer applies the same compression context to compress multiple individual deduplicated data segments as though they are one segment. By compressing deduplicated data segments together within the same context, data reduction can be improved for both deduplication and compression. Mechanisms are applied to compensate for possible performance degradation. | 05-26-2011 |
20110137875 | INCREMENTAL MATERIALIZED VIEW REFRESH WITH ENHANCED DML COMPRESSION - An incremental refresh of a materialized view may be simplified, and therefore made more cost efficient, by reducing the number of DML operations being merged with the materialized view during the incremental refresh. Specifically, subsequences of sequences of data manipulation language operations that have been recorded for a particular row of a base table may be inspected to determine whether the subsequences conform to particular patterns of data manipulation language operator types. If a subsequence conforms to one of the particular patterns, the subsequence may be replaced with a single substitute: either a single data manipulation language operation, or null. Refresh operations that are generated based on the simplified sequences of data manipulation language operations are more simple, and therefore, less costly to perform. | 06-09-2011 |
20110153577 | Query Processing System and Method for Use with Tokenspace Repository - A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document. | 06-23-2011 |
20110173163 | OPTIMIZING MEDIA PLAYER MEMORY DURING RENDERING - Optimizing operation of a media player during rendering of media files. The invention includes authoring software to create a data structure and to populate the created data structure with obtained metadata. The invention also includes rendering software to retrieve the metadata from the data structure and to identify media files to render. In one embodiment, the invention is operable as part of a compressed media format having a set of small files containing metadata, menus, and playlists in a compiled binary format designed for playback on feature-rich personal computer media players as well as low cost media players. | 07-14-2011 |
20110173164 | STORING TABLES IN A DATABASE SYSTEM - A method for processing data contained in tables in a relational database includes joining a first table and a second table into a joined table, and determining metadata for at least one column of a table of the following tables: the first table, the second table, and the joined table. The metadata is used for processing data in the at least one column of the table, and for processing data in at least one column of at least one other table of the following tables: the first table, the second table, and the joined table. | 07-14-2011 |
20110173165 | MANAGEMENT OF PERFORMANCE DATA - A method of handling performance data comprising a set of events is described. An event record for each event is stored as a set of blocks, each block containing one or more attributes of the event. The storage space occupied by each event record is then reduced in discrete steps, each step including a reduction process that reduces the size of one of the set of blocks. This enables the provision of intermediate records between events and counters so that new event records contain complete details of their event, older event records contain less information, and even older event records may contain only high-level (counter) information. | 07-14-2011 |
20110173166 | GENERATING AND MERGING KEYS FOR GROUPING AND DIFFERENTIATING VOLUMES OF FILES - Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary with or without the use of tree mechanisms that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. Method two provides better spectrums because it is more “informationally” valid than is method one. | 07-14-2011 |
20110173167 | METHOD AND APPARATUS FOR WINDOWING IN ENTROPY ENCODING - The present invention provides efficient window partitioning algorithms for entropy-encoding. The present invention enhances compression performance of entropy encoding based on the approach of modeling a dataset with the frequencies of its n-grams. The present invention may then employ approximation algorithms to compute good partitions in time O(s*log s) and O(s) respectively, for any data segment S with length s. | 07-14-2011 |
20110184922 | METHOD FOR FINDING FREQUENT ITEMSETS OVER LONG TRANSACTION DATA STREAMS - Provided is a method for finding frequent itemsets over long transaction data streams. The method for finding frequent itemsets from data streams includes: (a) generating a plurality of projection transactions by projecting generated transactions; (b) mining each of the plurality of projection transactions by using a plurality of first layer prefix trees; (c) compressing the frequent itemsets generated at the first layer prefix tree to generate compressed itemsets; and (d) merging the generated compressed itemsets and mining the merged compressed itemsets by using a second layer prefix tree. Therefore, the present invention can effectively find frequent itemsets in the long transaction data stream environment. | 07-28-2011 |
20110196849 | METHOD AND APPARATUS FOR COMPRESSING AND DECOMPRESSING DATA RECORDS - A data compression method is provided according to an embodiment of the invention. The data compression method comprises receiving a first data record and at least a second data record. The first data record is compared to the second data record. The second data record is compressed as a difference between the first data record and the second data record. | 08-11-2011 |
20110202509 | EFFICIENT EXTRACTION AND COMPRESSION OF DATA - A device for dynamically extracting and compressing information for a streaming media asset is provided. One embodiment of the device provides a computing device comprising a processor and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are executable to provide to a requesting computing device dynamically compressed information for a streaming media asset, the dynamically compressed information derived from an information file comprising variable data elements arranged in one or more data fields according to a well-known structure. For example, the instructions are executable to receive from the requesting computing device a request for the compressed information, extract the variable data elements from the information file, compress the variable data elements to form compressed data elements, and send to the requesting computing device a compressed file comprising the compressed data elements. | 08-18-2011 |
20110202510 | SUPPORTING MULTIPLE DIFFERENT APPLICATIONS HAVING DIFFERENT DATA NEEDS USING A VOXEL DATABASE - A system can include a voxel database and the set of applications. The voxel database can include a set of voxel indexed records, wherein the voxel database manages a volumetric storage space corresponding to a real-world volumetric space, where units of real-world volumetric space and data specific to these units map to voxels and attributes of voxel indexed records. Each of the applications can include a user interface that renders a volumetric simulation space that corresponds to the volumetric storage space. Geospatial data for the simulation space can include visual attributes used to render a graphical user interface representation of the simulation space, where these visual attributes are acquired from the voxel database. The applications can have different geospatial formatting and content needs from each other, yet the content needs of each of the applications can be supported by the voxel database. | 08-18-2011 |
20110218974 | SYSTEMS AND METHODS FOR COMPRESSING FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Methods and systems for creating, reading, and writing compressed files in a computer system comprising a file system coupled with a storage medium and at least one application program interface (API) configured to communicate with the file system by means of file access-related requests are provided. The file access-related requests are intercepted in order to provide at least one of the following: a) to derive and compress data corresponding to the intercepted file access request and to facilitate storing the compressed data at the storage medium as a compressed file; b) to facilitate restoring at least part of compressed data corresponding to the intercepted file request and communicating the resulting data through the API. The compressed files comprise a plurality of compressed units. One or more corresponding compressed units may be read and/or updated with no need of restoring the entire file whilst maintaining a de-fragmented structure of the compressed file. | 09-08-2011 |
20110218975 | METHOD AND SYSTEM FOR COMPRESSION OF FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Methods and systems for creating, reading, and writing compressed data for use with a block mode access storage. The compressed data are packed into a plurality of compressed units and stored in a storage logical unit (LU). One or more corresponding compressed units may be read and/or updated with no need of restoring the entire storage logical unit while maintaining a de-fragmented structure of the LU. | 09-08-2011 |
20110218976 | METHOD AND SYSTEM FOR COMPRESSION OF FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Systems and methods for creating, reading, and writing compressed files for use with a file access storage. The compressed data of a raw file are packed into a plurality of compressed units and stored as compressed files. One or more corresponding compressed units may be read and/or updated with no need for restoring the entire file while maintaining a de-fragmented structure of the compressed file. | 09-08-2011 |
20110218977 | SYSTEMS AND METHODS FOR COMPRESSION OF DATA FOR BLOCK MODE ACCESS STORAGE - Systems and methods for creating, reading, and writing compressed data for use with a block mode access storage. The compressed data are packed into a plurality of compressed units and stored in a storage logical unit (LU). One or more corresponding compressed units may be read and/or updated with no need of restoring the entire storage logical unit while maintaining a de-fragmented structure of the LU. | 09-08-2011 |
20110225131 | STORAGE SYSTEM, STORAGE CONTROLLER AND DATA COMPRESSION METHOD - Provided are a storage system, a storage controller and a data compression method that will not deteriorate the performance of a storage system by adjusting the compression processing time required for compressing data. This storage controller includes a statistical information management unit for managing a throughput value decided based on statistical information of previously compressed file data as a system performance value, a mode setting unit for setting a compression mode of either a system performance priority mode that terminates compression of the file data and gives priority to system performance or a compression priority mode that gives priority to compression of the file data based on a comparison between a system performance target value preset as a target value and the system performance value, and a compression controller for compressing the file data based on the compression mode set with the mode setting unit. | 09-15-2011 |
20110231376 | METHOD FOR PROCESSING DATA IN TREE FORM AND DEVICE FOR PROCESSING DATA - The data processing method reversibly processes data information input to a data processing device by a processing unit including a data volume reducing unit reducing a data volume of the data information, and a developing unit reconstructing data information reduced in the data volume reducing unit. The processing unit is structured by overlaying processing layers formed of a plurality of cells. The data volume reducing unit performs unit processing on each of the plurality of cells having the data information. The unit processing performs identification processing by a weight according to equivalence and distance of data from a cell group adjacent to the cells, and reduces the cells by each of the processing layers in an order from a lower layer to an upper layer of the processing layers until a data position existing on a time axis of the cells stops, thereby reducing the data volume. | 09-22-2011 |
20110231377 | METHOD OF MANAGING STORAGE AND RETRIEVAL OF DATA OBJECTS - A technique for managing storage of a data object in a storage device involves receiving the data object (A) to store in the storage device, where the data object has an indicator bit pattern (P). Successive compression data transformations are applied to data object A to obtain respective corresponding compressed data objects, and one of these compressed data objects is selected, such that the selected compressed data object (C) has the shortest length with respect to the remaining compressed data objects. Compression information (I) is then associated with the compression data transformation used to generate data object C, and a threshold value T is calculated at least partly from the length of compression information I. If length (C)+T ≥ length (A), then the indicator bit pattern of data object A is reset and the data object A is written to the storage device. If length (C)+T | 09-22-2011 |
20110238635 | Combining Hash-Based Duplication with Sub-Block Differencing to Deduplicate Data - In one embodiment, a method includes accessing data; partitioning the data into sub-blocks; determining whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks; if the first one of the sub-blocks is identical to another one of the sub-blocks, applying by the one or more computer systems hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and, if the first one of the sub-blocks is similar to another one of the sub-blocks, applying by the one or more computer systems sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks. | 09-29-2011 |
20110238636 | DATA CONVERSION DEVICE, DATA CONVERSION METHOD, AND PROGRAM - There is realized a data conversion device that performs generation of a hash value with improved analysis resistance and a high degree of safety. There are provided a stirring processing section performing a data stirring process on input data; and a compression processing section performing a data compression process on input data including data segments which are divisions of message data, the message data being a target of a data conversion. Part of multi-stage compression subsections is configured to perform a data compression process based on both of output of the stirring processing section and the data segments in the message data. There is provided such a configuration that the stirring process is executed at least on fixed timing of a compression processing round of plural rounds and thus, there is realized a data conversion device that performs generation of a hash value with improved analysis resistance and a high degree of safety. | 09-29-2011 |
20110246432 | ACCESSING DATA IN COLUMN STORE DATABASE BASED ON HARDWARE COMPATIBLE DATA STRUCTURES - Embodiments of the present invention provide one or more hardware-friendly data structures that enable efficient hardware acceleration of database operations. In particular, the present invention employs a column-store format for the database. In the database, column-groups are stored with implicit row ids (RIDs) and a RID-to-primary key column having both column-store and row-store benefits via column hopping and a heap structure for adding new data. Fixed-width column compression allows for easy hardware database processing directly on the compressed data. A global database virtual address space is utilized that allows for arithmetic derivation of any physical address of the data regardless of its location. A word compression dictionary with token compare and sort index is also provided to allow for efficient hardware-based searching of text. A tuple reconstruction process is provided as well that allows hardware to reconstruct a row by stitching together data from multiple column groups. | 10-06-2011 |
20110252007 | METHOD OF STORING DATA IN STORAGE MEDIA, DATA STORAGE DEVICE USING THE SAME, AND SYSTEM INCLUDING THE SAME - A method of storing data in a storage media includes compressing raw data based on a physical storage unit of the storage media and storing the compressed data in the storage media. The physical storage unit of the storage media storing the compressed data includes an update region into which update data may be written. | 10-13-2011 |
20110252008 | Intelligent Data Storage and Processing Using FPGA Devices - Methods and apparatuses for processing data are disclosed, including methods and apparatuses that leverage a reconfigurable logic device to offload decompression and search operations from a processor to thereby enable high speed data searches within data that has been stored in a compressed format. | 10-13-2011 |
20110264632 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining in raw form each data chunk that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method. | 10-27-2011 |
20110264633 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining in raw form each data chunk that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method. | 10-27-2011 |
20110264634 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 10-27-2011 |
20110276545 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 11-10-2011 |
20110276546 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 11-10-2011 |
20110276547 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 11-10-2011 |
20110276548 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 11-10-2011 |
20110282849 | System and Method for Data Compression Using Compression Hardware - A system and method for data compression using compression hardware is disclosed. In accordance with the method, a data set in a data stream is received. The data set includes a set of data descriptor fields. The data set is portioned into one or more data subsets using the set of data descriptor fields. One or more tabular slices and an index are generated for at least one of the data subsets using the set of data descriptor fields. The one or more tabular slices are identified by the index. The one or more tabular slices are compressed into a compressed data block by a data compression scheme using a hardware compressor. A compression data file is generated in a database. The compression data file has a header that stores information about the data compression scheme. The compressed data block is stored in the compression data file. | 11-17-2011 |
20110295817 | Technique For Compressing XML Indexes - A method and apparatus for XXX is provided. | 12-01-2011 |
20110295818 | METHOD AND SYSTEM FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks corresponding to the logical data object are processed to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object. The processed data chunks are sequentially accommodated in accordance with an order of chunks received while keeping the association with the respective segments. | 12-01-2011 |
20110295819 | METHOD AND SYSTEM FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Various embodiments for transforming a logical data object for storage in a storage device operable with at least one storage protocol are provided. In one such embodiment, the logical data object is divided into one or more segments, with each segment characterized by respective start and end offsets. One or more obtained variable size data chunks corresponding to the logical data object are processed to obtain processed data chunks, wherein at least one of the processed data chunks comprises transformed data resulting from the processing. Each of the variable size data chunks is associated with a respective segment of the logical data object. The processed data chunks are sequentially accommodated in accordance with an order of chunks received while keeping the association with the respective segments. | 12-01-2011 |
20110295820 | BLOCK-BASED DIFFERENCING ALGORITHM - A system and method for a block based differencing algorithm which includes the ability to limit memory requirements regardless of source file sizes by splitting the source file into optimally sized blocks. The invention allows the blocks to be processed in any order allowing in-place operation. Further, the present invention allows a second stage compressor to match the compressor blocks to those used by the differencing algorithm to optimize compressor and decompressor performance. | 12-01-2011 |
20110313980 | COMPRESSION OF TABLES BASED ON OCCURRENCE OF VALUES - Methods and apparatus, including computer program products, for compression of tables based on occurrence of values. In general, a number representing an amount of occurrences of a frequently occurring value in a group of adjacent rows of a column is generated, a vector representing whether the frequently occurring value exists in a row of the column is generated, and the number and the vector are stored to enable searches of the data represented by the number and the vector. The vector may omit a portion representing the group of adjacent rows. The values may be dictionary-based compression values representing business data such as business objects. The compression may be performed in-memory, in parallel, to improve memory utilization, network bandwidth consumption, and processing performance. | 12-22-2011 |
20110320417 | DATABASE COMPRESSION - Apparatus, systems, and methods may operate to receive a set of ordered user-selected compression rules as a compression rule set comprising at least one compression threshold condition, to create or transform a database object with rows to be selectively compressed according to the compression rules in the compression rule set (providing a transformed object), and to publish at least a portion of the transformed object to one of a storage medium or a display screen. Other apparatus, systems, and methods are disclosed. | 12-29-2011 |
20110320418 | DATABASE COMPRESSION ANALYZER - Apparatus, systems, and methods may operate to receive requests to execute a plurality of compression and/or decompression mechanisms on one or more database objects; to execute each of the compression and/or decompression mechanisms, on a sampled basis, on the database objects; to determine comparative performance characteristics associated with each of the compression and/or decompression mechanisms; and to record at least some of the performance characteristics and/or derivative characteristics derived from the performance characteristics in a performance summary table. The table may be published to a storage medium or a display screen. Other apparatus, systems, and methods are disclosed. | 12-29-2011 |
20120005172 | INFORMATION SEARCHING APPARATUS, INFORMATION MANAGING APPARATUS, INFORMATION SEARCHING METHOD, INFORMATION MANAGING METHOD, AND COMPUTER PRODUCT - A computer-readable recording medium stores therein an information searching program that causes a computer having access to archives including a compressed file group of compressed files that are to be searched and that have described therein character strings, to execute: sorting the compressed files in descending order of access frequency of the compressed files; combining the compressed files in descending order of access frequency after the sorting at the sorting such that a storage capacity of a cache area for a storage area that stores therein the compressed file group is not exceeded by a combined size of the compressed files combined; and writing, from the storage area into the cache area, the compressed files combined at the combining, the compressed files combined being written prior to a search of the compressed files combined. | 01-05-2012 |
20120016847 | File Management System And Method - A file management system ( | 01-19-2012 |
20120023073 | Efficient Indexing of Documents with Similar Content - A set of documents may be stored and indexed as a compressed sequence of tokens. A set of documents is grouped into clusters. Sequences of tokens representing the clusters of documents are encoded to elide some repeating instances of tokens. A compressed sequence of tokens is generated from the compressed cluster sequences of tokens. Queries on the compressed sequence are performed by identifying cluster sequences within the compressed sequence that are likely to have documents that satisfy the query and then identifying, within these identified clusters, the documents that actually satisfy the query. | 01-26-2012 |
20120041930 | METHOD FOR OPTIMIZING THE STORAGE OF CALIBRATION DATA IN AN AUTOMOBILE ELECTRONIC CONTROL UNIT - A method for storing, in the rewritable memory of an automobile electronic control unit, calibration data functionally equivalent to a set of various models (M | 02-16-2012 |
20120041931 | SYSTEMS AND METHODS FOR DATA COMPRESSION AND DECOMPRESSION - By way of example only, in various embodiments, the present system and method are designed to reduce the size of data on a computer through compression, to improve hash, message digest, and checksum technology and their application to information and data storage, to improve uniqueness by using mutual exclusion in hash and checksum tests, to improve checksum tests providing better computer security, to create an XML compression format and to move binary compression formats to XML or other markup language, to utilize variable length hashes and message digests, and to create an XML based checksum that can be used to verify the integrity of files. | 02-16-2012 |
20120047113 | MULTIPLE-SOURCE DATA COMPRESSION - One embodiment of the present invention is directed to a method for compressing data generated by multiple data sources. The method includes steps of partitioning data generated by the multiple data sources into data partitions, the data included in each data partition containing inter-data-source redundancies and, for each data partition, compressing the data in the data partition to remove the inter-data-source redundancies. | 02-23-2012 |
20120059804 | Data compression and decompression using relative and absolute delta values - A data compressor is disclosed for receiving a data stream comprising a plurality of data items and for outputting a compressed data stream, said data compressor comprising: a data input for receiving said data stream; a delta value calculator for generating a compressed delta value, said delta value calculator being configured to receive said plurality of data items from said data input and being configured for at least some of said received data items to access said data store to determine if a related data item to said received data item is stored in said data store and: in response to said related data item being stored, to retrieve said related data item from said data store and to calculate a delta value from said received data item and said related data item and to output said delta value; and in response to said related data item not being stored in said data store to calculate a delta value from said received data item and a predetermined value and to output said delta value; a data store for storing said plurality of data items received at said data input; said data compressor further comprising: a data store controller for controlling the storage of said plurality of data items in said data store, said data store controller being configured to access said data store in response to receipt of a data item at said data input and to determine if a storage location is allocated to said data item and: if so to store said data item in said allocated storage location; and if not to allocate a storage location to said data item and to evict and discard any data stored in said allocated storage location and to store said data item in said allocated storage location. A data decompressor for decompressing the compressed data stream is also disclosed. | 03-08-2012 |
20120066188 | RECORDING / REPRODUCING METHOD AND RECORDING / REPRODUCING DEVICE - A recording/reproducing method includes: reading a first uncompressed data from a first recording medium; reading the first uncompressed data from a buffer memory at a speed higher than a normal reproduction speed and compressing the read first uncompressed data to generate a compressed data; and recording the generated compressed data in a second recording medium; determining whether a predefined unit volume of compressed data is recorded in the second recording medium; and when it is determined that the predefined unit volume of compressed data is not yet recorded in the second recording medium, generating a second uncompressed data by reading the compressed data from the second recording medium at a speed higher than the normal reproduction speed and decompressing the read compressed data when it is determined that the predefined unit volume of compressed data is already recorded in the second recording medium. | 03-15-2012 |
20120078859 | SYSTEMS AND METHODS TO UPDATE A CONTENT STORE ASSOCIATED WITH A SEARCH INDEX - Some aspects include determination of second document identifiers added to a search index. The search index associates each of a plurality of words with at least one of a plurality of first document identifiers. For each of the second document identifiers, metadata of a document identified by the second document identifier is added to a content store storing metadata of each document identified by the plurality of first document identifiers. | 03-29-2012 |
20120078860 | ALGORITHMIC COMPRESSION VIA USER-DEFINED FUNCTIONS - A method, apparatus, and article of manufacture for accessing data in a computer system. One or more user-defined functions (UDFs) implementing a desired compression or decompression algorithm are created, wherein the UDFs are associated with one or more columns of a table when the table is created or altered, in order to perform compression or decompression of data stored in the associated columns, such that the data is compressed by the UDF implementing the desired compression algorithm when the data is inserted or updated in the table, and the data is decompressed by the UDF implementing the desired decompression algorithm when the data is retrieved from the table. | 03-29-2012 |
20120078861 | METHOD FOR PROCESSING A DIGITAL FILE NOTABLY OF THE IMAGE, VIDEO AND/OR AUDIO TYPE - A method for processing a digital file of the image, video and/or audio type which comprises a phase for putting into line, per color layer and/or per audio channel, digital data of any audio, image and video file, a compression phase using an algorithm in which each compressed value VC | 03-29-2012 |
20120084271 | REPRESENTING AND MANIPULATING RDF DATA IN A RELATIONAL DATABASE MANAGEMENT SYSTEM - Techniques for generating hash values for instances of distinct data values. In the techniques, each distinct data value is mapped to hash value generation information which describes how to generate a unique hash value for instances of the distinct data value. The hash value generation information for a distinct data value is then used to generate the hash value for an instance of the distinct data value. The hash value generation information may indicate whether a collision has occurred in generating the hash values for instances of the distinct data values and if so, how the collision is to be resolved. The techniques are employed to normalize RDF triples by generating the UIDs employed in the normalization from the triples' lexical values. | 04-05-2012 |
20120089579 | COMPRESSION PIPELINE FOR STORING DATA IN A STORAGE CLOUD - A cloud storage appliance separates a point-in-time copy of a storage system into payload data chunks and metadata data chunks. The cloud storage appliance identifies a plurality of payload data chunks that have not been saved to a storage cloud. The cloud storage appliance compresses the plurality of payload data chunks. The cloud storage appliance groups the plurality of compressed payload data chunks into one or more cloud files, wherein each of the one or more cloud files is formatted for storage on the storage cloud. The cloud storage appliance then sends the one or more cloud files to the storage cloud. | 04-12-2012 |
20120102005 | FILE MANAGEMENT METHOD AND COMPUTER SYSTEM - To inhibit deterioration in the I/O performance of a file even if the file includes an area that is frequently accessed. | 04-26-2012 |
20120109908 | ACTIVE MEMORY EXPANSION AND RDBMS META DATA AND TOOLING - Techniques are described for estimating and managing memory compression for query processing. Embodiments of the invention may generally include receiving a query to be executed, ascertaining indicatory data about the retrieved data, and selectively compressing a portion of the data in memory according to the indicatory data. In one embodiment, the amount of compression performed during each query execution is recorded and outputted to assist in adjusting the selective compression process. | 05-03-2012 |
20120109909 | Random Access Data Compression - Methods, program products, and systems implementing random access data compression are disclosed. Data can be stored in a data structure in compressed or non-compressed form. The data structure can include a header block, one or more data blocks, and one or more index blocks. Each data block can include data compressed using different compression technology. The header block can include searchable references to the data blocks, which can be located in the data structure after the header block. The searchable references permit non-sequential access to the data blocks. The data blocks can be organized independent of a file system structure. The header block can additionally include references to the one or more index blocks, which can expand the references in the header block. | 05-03-2012 |
20120109910 | EFFICIENT COLUMN BASED DATA ENCODING FOR LARGE-SCALE DATA STORAGE - The subject disclosure relates to column based data encoding where raw data to be compressed is organized by columns, and then, as first and second layers of reduction of the data size, dictionary encoding and/or value encoding are applied to the data as organized by columns, to create integer sequences that correspond to the columns. Next, a hybrid greedy run length encoding and bit packing compression algorithm further compacts the data according to an analysis of bit savings. Synergy of the hybrid data reduction techniques in concert with the column-based organization, coupled with gains in scanning and querying efficiency owing to the representation of the compact data, results in substantially improved data compression at a fraction of the cost of conventional systems. | 05-03-2012 |
20120109911 | Compression Of XML Data - Methods of compressing XML source data include identifying each element type of the XML source data, generating a representation of element names for each identified element type, and generating a representation of data content for each instance of each element type separate from the representation of element names of the element types. | 05-03-2012 |
20120117038 | LAZY OPERATIONS ON HIERARCHICAL COMPRESSED DATA STRUCTURE FOR TABULAR DATA - A highly flexible and extensible structure is provided for physically storing tabular data. The structure, referred to as a compression unit, may be used to physically store tabular data that logically resides in any type of table-like structure. Techniques are employed to avoid changing tabular data within existing compression units. Deleting tabular data within compression units is avoided by merely tracking deletion requests, without actually deleting the data. Inserting new tabular data into existing compression units is avoided by storing the new data external to the compression units. If the number of deletions exceeds a threshold, and/or the number of new inserts exceeds a threshold, new compression units may be generated. When new compression units are generated, the previously-existing compression units may be discarded to reclaim storage, or retained to allow reconstruction of prior states of the tabular data. | 05-10-2012 |
20120124016 | ACTIVE MEMORY EXPANSION IN A DATABASE ENVIRONMENT TO QUERY NEEDED/UNEEDED RESULTS - Techniques are described for estimating and managing memory compression for request processing. Embodiments of the invention may generally include receiving a request for data, determining if the requested data contains any compressed data, and sending the requesting entity only the uncompressed data. A separate embodiment generally includes receiving a request for data, determining if the requested data contains any compressed data, gathering uncompression criteria about the requested data, and using the uncompression criteria to selectively determine what portion of the compressed data to uncompress. | 05-17-2012 |
20120124017 | COMPRESSION METHOD, DECOMPRESSION METHOD, COMPRESSION UNIT, DECOMPRESSION UNIT AND COMPRESSED DOCUMENT - A structured document having at least one informational unit with at least one character is divided, according to a first base type, into sections of a second base type. The sections are compressed according to specified compression instructions for the second base type to achieve an increased rate of compression. The informational elements may be expressed in an XML language. The compression method and corresponding compression unit, decompression method and decompression unit can be applied in the area of initialization of end devices, such as in systems engineering or in the IT consumer industry. | 05-17-2012 |
20120124018 | METHOD, PROGRAM, AND SYSTEM FOR PROCESSING OBJECT IN COMPUTER - A method, an article of manufacture, and system for heapifying an object. The method includes: storing, in a working set, a first address of a certain object in a stack frame, copying the certain object into the heap area and holding a second address of the certain object in the heap area, following each stack frame to find a pointer pointing to the first address stored in the working set, converting the address that the pointer points to into the second address, proceeding to a next stack frame, where the address conversion includes storing an address of another object in the working set if the converted address is stored as a value of a field of the other object in the stack frame, and terminating the process in response to a lack of pointers found in the stack frame to point to the addresses stored in the working set. | 05-17-2012 |
20120124019 | COMPRESSION OF TABLES BASED ON OCCURRENCE OF VALUES - Methods and apparatus, including computer program products, for compression of tables based on occurrence of values. In general, a number representing an amount of occurrences of a frequently occurring value in a group of adjacent rows of a column is generated, a vector representing whether the frequently occurring value exists in a row of the column is generated, and the number and the vector are stored to enable searches of the data represented by the number and the vector. The vector may omit a portion representing the group of adjacent rows. The values may be dictionary-based compression values representing business data such as business objects. The compression may be performed in-memory, in parallel, to improve memory utilization, network bandwidth consumption, and processing performance. | 05-17-2012 |
20120130963 | USER DEFINED FUNCTION DATABASE PROCESSING - Apparatus, systems, and methods may operate to retrieve multiple rows of a database in response to receiving a request to execute an aggregate user defined function (UDF) over the multiple rows, to sort each of the multiple rows into common groups, grouping together individual ones of the multiple rows that share one of the common groups, and to send UDF execution requests to apply the aggregate UDF to aggregate buffers of the common groups to produce an aggregate result, so that one of the UDF execution requests and one context switch are used to process each of the aggregate buffers used within one of the groups to provide at least one intermediate result that can be processed to form the aggregate result. Other apparatus, systems, and methods are disclosed. | 05-24-2012 |
20120130964 | FAST ALGORITHM FOR MINING HIGH UTILITY ITEMSETS - The present invention discloses a fast algorithm for mining high utility itemsets, wherein some transaction data and item utilities are recorded in a tree structure. The method to construct a tree structure is recording on a node the item utilities appearing from the root node to the node. Some techniques are used to reduce the mining space, whereby the fast algorithm can directly generate high utility itemsets from the tree structure without generating any candidates. The fast algorithm of the present invention is more efficient than the existing highest-efficiency algorithm. The present invention further proposes a compression method to effectively save memory space. | 05-24-2012 |
20120130965 | DATA COMPRESSION METHOD - Disclosed herein is a data compression method for improving a compression rate when compressing computer data by employing both a method of generating a character string dictionary and storing indexes and a method of storing compression codes corresponding to character strings. Accordingly, a compression rate and a decompression speed increase. | 05-24-2012 |
20120143833 | STRUCTURE OF HIERARCHICAL COMPRESSED DATA STRUCTURE FOR TABULAR DATA - A highly flexible and extensible structure is provided for physically storing tabular data. The structure, referred to as a compression unit, may be used to store tabular data that logically resides in any type of table-like structure. According to one embodiment, compression units are recursive. Thus, a compression unit may have a “parent” compression unit to which it belongs, and may have one or more “child” compression units that belong to it. In one embodiment, compression units include metadata that indicates how the tabular data is stored within them. The metadata for a compression unit may indicate, for example, whether the data is stored in row-major or column-major format, the order of the columns within the compression unit (which may differ from the logical order of the columns dictated by the definition of their logical container), a compression technique for the compression unit, the child compression units (if any), etc. | 06-07-2012 |
20120143834 | DATA SUMMARY SYSTEM, METHOD FOR SUMMARIZING DATA, AND RECORDING MEDIUM - Each time sequential data is generated by a data generation source ( | 06-07-2012 |
20120150828 | METHOD AND APPARATUS FOR DECODING ENCODED STRUCTURED DATA FROM A BIT-STREAM - A method is disclosed for decoding encoded structured data from a bit-stream comprising a plurality of encoded data units. The method has the steps of: obtaining unit information comprising positions of the encoded data units within the bit-stream; retrieving the encoded data units from the bit-stream based on the unit information; creating decoding tasks for decoding the retrieved encoded data units; assigning the created decoding tasks to cores of a multi-core decoder, based on estimated decoding costs of the encoded data units; and running the tasks on their assigned cores to decode the encoded data units in parallel. It is applied to the decoding of XML documents in the EXI format. | 06-14-2012 |
20120158675 | Partial Recall of Deduplicated Files - The subject disclosure is directed towards changing a file from a fully deduplicated state to a partially deduplicated state in which some of the file data is deduplicated in a chunk store, and some is recalled into the file, that is, in the file's storage volume. A partial recall mechanism such as in a file system filter tracks (e.g., via a bitmap in a file reparse point) whether file data is maintained in the chunk store or has been recalled to the file. Data is recalled from the chunk store as needed, and committed (e.g., flushed) to the file. Also described is efficiently returning the file to a fully deduplicated state by using the tracking information to determine which parts of the file are already deduplicated into the chunk store so as to avoid their further deduplication processing. | 06-21-2012 |
20120158676 | ENABLING RANDOM ACCESS WITHIN OBJECTS IN ZIP ARCHIVES - Objects stored in a zip archive may be extracted in random-access fashion (without involving other objects stored in the zip archive) using the addresses of the objects stored in the central directory of the zip archive. However, zip archives often provide insufficient information to enable random access to the data within an object. This capability may be provided by segmenting the object into sections of a section size, and including in the zip archive a block table specifying, for respective sections, the block size of the corresponding block. A zip archive extractor may achieve random access to the object by using the block table to compute the addresses of blocks comprising the selected portion and extracting only those blocks. Backwards compatibility of the zip archive with other zip archive extractors may be preserved by including the block table within a zip extension of the central directory of the zip archive. | 06-21-2012 |
20120158677 | SYSTEMS AND METHODS FOR STREAMING COMPRESSED FILES VIA A NON-VOLATILE MEMORY - This can relate to streaming compressed files via a non-volatile memory (“NVM”) of a media player. In particular, the NVM can stream compressed media files. The NVM can include an NVM controller and an NVM die storing the compressed media file. The NVM controller can read the compressed media file from the NVM die, decompress the media file, and send the decompressed media file to a digital-to-analog converter (“DAC”) for conversion to analog format. Since the decompression can be performed by the NVM itself, an application processor may be significantly removed from the media playback process. In some embodiments, it may only be necessary for the application processor to issue an initial read request and/or receive a completion confirmation from the NVM. This can result in significant power savings for the media player and can free the application processor for performing other functions of the media player. | 06-21-2012 |
20120166404 | REAL-TIME TEXT INDEXING - Systems, methods, and other embodiments associated with real-time text indexing are described. One example method includes receiving a document for indexing in a search system that includes a mature index and indexing the received document in a staging index. The staging index may be stored in direct access memory associated with query processing that does not degrade query performance even when postings become fragmented. The staging index and the mature text index are accessed to process queries on the search system. The example method may also include periodically merging the staging index into the mature index based on query feedback. | 06-28-2012 |
20120173496 | NUMERIC, DECIMAL AND DATE FIELD COMPRESSION - A method, apparatus, and article of manufacture for accessing data in a computer system. Compression and decompression functions are associated with a column of the table, in order to perform compression of decimal, numeric or date data stored in the column when the data is inserted or updated in the table, and in order to perform decompression of the data stored in the column when the data is retrieved from the table. The compression function compresses and stores the data in a fixed-length compressed field in the column without a length value, and the fixed-length compressed field has a size that is determined by a range of values for the data stored in the fixed-length compressed field. The decompression function retrieves and decompresses the data from the fixed-length compressed field. | 07-05-2012 |
20120185447 | Systems and Methods for Providing Increased Scalability in Deduplication Storage Systems - A computer-implemented method for providing increased scalability in deduplication storage systems may include ( | 07-19-2012 |
20120185448 | CONTENT BASED FILE CHUNKING - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transferring electronic data. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of identifying a data item to be chunked; determining the type of the data item; determining whether the type of the data item is one of a specified one or more types; if it is determined that the type of the data item is not one of the specified one or more types, performing a first chunking of the data item; and if it is determined that the type of the data item is one of the specified one or more types, performing a second chunking of the data item that is based on the particular content portions of the data item. | 07-19-2012 |
20120203746 | SYSTEM AND METHOD FOR ADAPTIVELY COLLECTING PERFORMANCE AND EVENT INFORMATION - Selective compression of data, wherein it is determined which of a number of compression algorithms do not incur an overhead that exceeds available resources. Then, one of the determined algorithms is selected to maximize compression. | 08-09-2012 |
20120226671 | EFFICIENT SIZE OPTIMIZATION OF VISUAL INFORMATION OR AUDITORY INFORMATION - A file, including visual information or auditory information may be uploaded to a processing device. Respective portions of content of the file may be identified for compressing and saving at respective bit rates. A number of component files may be created, compressed and saved, at the respective bit rates, based on the identified respective portions of content of the file. A network page, including a reference to the uploaded file, may be created. The reference to the uploaded file, in the network page, may be replaced with references to the compressed, saved component files and the network page may be saved. A processing device of a user may request the network page and the compressed, saved component files. A reasonable facsimile of the file may be reproduced based on an aggregate of the compressed, saved component files. | 09-06-2012 |
20120246128 | DATABASE PROCESSING DEVICE, DATABASE PROCESSING METHOD, AND RECORDING MEDIUM - The database system of the present invention decides a fragment length responding to a unit of a data process of a parallel arithmetic unit, and stores tuple data containing variable-length data into a fragment and metadata of the fragment into a fragment header, respectively, in a column store database. The database system refers to the metadata when executing a process for data stored in the column store database, decides the fragments to be assigned to each thread that is executed by the parallel arithmetic unit, assigns the fragments to each thread based upon the decided content, and causes each thread to execute a parallel arithmetic operation. | 09-27-2012 |
20120246129 | EFFICIENT STORAGE AND RETRIEVAL FOR LARGE NUMBER OF DATA OBJECTS - A data object management scheme for storing a large plurality of small data objects (e.g., image files) in a small number of large object stack files for storage in secondary storage (e.g., hard disks). By storing many individual data objects in a single object stack file, the number of files stored in the secondary storage is reduced by several orders of magnitude, from the billions or millions to the hundreds or so. Index data for each object stack file is generated and stored in primary storage to allow efficient and prompt access to the data objects. Requests to store or retrieve the data objects are made using HTTP messages including file identifiers that identify the files storing the data objects and keys identifying the data objects. A file server stores or retrieves the data object from secondary storage of a file server without converting the requests to NFS or POSIX commands. | 09-27-2012 |
20120254133 | METHOD FOR BINARY PERSISTENCE IN A SYSTEM PROVIDING OFFERS TO SUBSCRIBERS - A computerized method and system for binary persistence in a system providing offerings to subscribers of a service provider are provided. The method includes receiving a plurality of objects respective of offerings made to a subscriber of a service provider; serializing the plurality of objects beginning at an origin to generate a binary record; and storing the binary record in a binary field of an entry in a database, the entry being respective of the subscriber, wherein retrieval of the offerings made to the subscriber requires merely extraction of the binary record from the binary field and performing at least a partial deserialization thereon. | 10-04-2012 |
20120259822 | METHOD FOR COMPRESSING IDENTIFIERS - The invention relates to a method for compressing identifiers of program code elements in a portable data carrier, to a method for calling compressed identifiers, to a portable data carrier, and to a semiconductor chip having a memory area for storing the compressed identifiers. | 10-11-2012 |
20120265737 | ADAPTIVE COMPRESSION - Technology for adaptive compression is described (“the technology”). The technology may identify two or more partitions of a data stream; optionally pre-process data in each partition; create one or more evaluation functions to evaluate a suitability for compression of the data in each partition using a set of potential compression methods; process the created one or more evaluation functions; choose a subset of the set of potential compression methods for each segment at least partly by analyzing the evaluation functions; select a compression method for each segment based on a compression ratio of compressing the sequence of used compression methods and a compression rate of the data; compress the data in each partition using the selected compression method for the partition; and compress a subsequence that indicates which compression method is used for each segment. | 10-18-2012 |
20120265738 | SEMANTIC COMPRESSION - Technology for semantic compression is disclosed. In various embodiments, the technology receives data that represents one or more physical attributes sensed by one or more sensors; employs at least one pattern or statistical feature to identify a first region and a second region in the received data; computes a first utility and a first relevant feature for the first region, and a second utility and a second relevant feature for the second region; and identifies based on at least the first utility and the second utility a first compression method to apply to the first region and a second compression method to apply to the second region wherein the first and the second compression methods have different compression rates, different feature preservation characteristics, or both. | 10-18-2012 |
20120265739 | DATA COLLECTION SYSTEM, DATA COLLECTION METHOD AND DATA COLLECTION PROGRAM - It is an object to provide a data collection system that is configured to reduce a communication amount, etc. at the time when data are collected from devices, so as to reduce a communication amount attended by the collection of data without increasing processing loads imposed on devices. The data collecting device comprises a code operating means for deriving a frequency of the symbol for each symbol corresponding to the code being contained in the already compressed data based upon the data analysis result being contained in the received already compressed data, and a code operation developing means for adding the frequency of the description format, out of the frequencies obtained by the code operating means, to the frequency of the basic symbol corresponding to the above description format, and adding the frequency of the derivative symbol to the frequency of each basic symbol constituting the derivative symbol. | 10-18-2012 |
20120265740 | METHOD AND SYSTEM FOR COMPRESSION OF FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Methods and systems for creating, reading, and writing compressed data for use with a block mode access storage. The compressed data are packed into a plurality of compressed units and stored in a storage logical unit (LU). One or more corresponding compressed units may be read and/or updated with no need of restoring the entire storage logical unit while maintaining a de-fragmented structure of the LU. | 10-18-2012 |
20120271802 | FORWARD COMPATIBILITY GUARANTEED DATA COMPRESSION AND DECOMPRESSION METHOD AND APPARATUS THEREOF - A forward compatibility guaranteed data compression and decompression method and apparatus are provided. The compressed data decompression apparatus includes a compressed file parsing unit which parses a compressed file comprising compressed data, a header including information on the compressed data and an extension field to extract the compressed data, and an original file generating unit which decompresses the compressed data to generate an original file. The extension field includes one or more extension field units and an extension field terminating code indicating an end of a region capable of including the extension field units, each of the header and the extension field units starts with a data identification code having the same number of bytes, and the extension field unit further includes its own length data separated by a predetermined number of bytes from its own data identification code. If the data identification code of the extension field unit is not defined while the parsing is carried out based on the data identification code, the compressed file parsing unit skips the extension field unit using the length data without processing the extension field unit. | 10-25-2012 |
20120278291 | AVOIDING THREE-VALUED LOGIC IN PREDICATES ON DICTIONARY-ENCODED DATA - According to one embodiment of the present invention, a method for dictionary encoding data without using three-valued logic is provided. According to one embodiment of the invention, a method includes encoding data in a database table using a dictionary, wherein the data includes values representing NULLs. A query having a predicate is received and the predicate is evaluated on the encoded data, whereby the predicate is evaluated on both the encoded data and on the encoded NULLs. | 11-01-2012 |
20120284239 | METHOD AND APPARATUS FOR OPTIMIZING DATA STORAGE - Embodiments of the invention relate to evaluation and storage of data in a computer system configured with a shared pool of resources. A multi-level adaptive compression technique is employed to minimize the cost of data storage based upon the type of data being stored and their access pattern. The costs of data storage include capacity, bandwidth, and compute cycles. Data is transformed local to a client in communication with the shared pool, local to the shared pool, or as a combination with a partial transformation local to the client and a partial transformation local to the shared pool. | 11-08-2012 |
20120284240 | MANAGING STORAGE OF INDIVIDUALLY ACCESSIBLE DATA UNITS - A method for managing data includes receiving individually accessible data units, each identified by a key value; storing a plurality of blocks of data, each of at least some of the blocks being generated by combining a plurality of the data units; and providing an index that includes an entry for each of the blocks. One or more of the entries enable location, based on a provided key value, of a block that includes data units corresponding to a range of key values that includes the provided key value. | 11-08-2012 |
20120284241 | SYSTEMS AND METHODS FOR COMPRESSING FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Methods and systems for creating, reading, and writing compressed files in a computer system comprising a file system coupled with a storage medium and at least one application program interface (API) configured to communicate with the file system by means of file access-related requests are provided. The file access-related requests are intercepted in order to provide at least one of the following: a) to derive and compress data corresponding to the intercepted file access request and to facilitate storing the compressed data at the storage medium as a compressed file; b) to facilitate restoring at least part of compressed data corresponding to the intercepted file request and communicating the resulting data through the API. The compressed files comprise a plurality of compressed units. One or more corresponding compressed units may be read and/or updated with no need of restoring the entire file while maintaining a de-fragmented structure of the compressed file. | 11-08-2012 |
20120284242 | SYSTEMS AND METHODS FOR COMPRESSING FILES FOR STORAGE AND OPERATION ON COMPRESSED FILES - Methods and systems for creating, reading, and writing compressed files in a computer system comprising a file system coupled with a storage medium and at least one application program interface (API) configured to communicate with the file system by means of file access-related requests are provided. The file access-related requests are intercepted in order to provide at least one of the following: a) to derive and compress data corresponding to the intercepted file access request and to facilitate storing the compressed data at the storage medium as a compressed file; b) to facilitate restoring at least part of compressed data corresponding to the intercepted file request and communicating the resulting data through the API. The compressed files comprise a plurality of compressed units. One or more corresponding compressed units may be read and/or updated with no need of restoring the entire file while maintaining a de-fragmented structure of the compressed file. | 11-08-2012 |
20120296881 | Index Compression in a Database System - A method for compressing index pages in a database system is provided. The database system includes a table, and the table includes table columns. The method includes: providing an index associated with the table, wherein the index is stored on at least one index page of the database system, and wherein the index comprises index columns related to a part of the table columns; providing a first sequence of the index columns; providing a second sequence of the index columns; arranging the index columns stored on the at least one index page according to the second sequence; performing a prefix compression on entries of the at least one index page; and accessing the index using the first sequence of the index columns. | 11-22-2012 |
20120296882 | METHOD AND APPARATUS FOR SPLITTING MEDIA FILES - A method and an apparatus for splitting media files are provided. The method and apparatus enable a mobile device to automatically segment a media file into split files. The mobile device may download a large media file whose size exceeds the file size limit imposed by the file system. The method includes detecting a download event for a specified file, comparing the size of an event related file corresponding to the specified file for which a download event occurs with a file size limit imposed by a file system of a mobile device, identifying, when the size of the event related file is greater than or equal to the file size limit, a type of the event related file, splitting, when the event related file is a media file, the event related file on the basis of the file size limit, and storing the split files according to setting information. | 11-22-2012 |
20120296883 | Techniques For Automatic Data Placement With Compression And Columnar Storage - For automatic data placement of database data, a plurality of access-tracking data is maintained. The plurality of access-tracking data respectively corresponds to a plurality of data rows that are managed by a database server. While the database server is executing normally, it is automatically determined whether a data row, which is stored in first one or more data blocks, has been recently accessed based on the access-tracking data that corresponds to that data row. After determining that the data row has been recently accessed, the data row is automatically moved from the first one or more data blocks to one or more hot data blocks that are designated for storing those data rows, from the plurality of data rows, that have been recently accessed. | 11-22-2012 |
20120303596 | POSITION INVARIANT COMPRESSION OF FILES WITHIN A MULTI-LEVEL COMPRESSION SCHEME - An aggregated file is generated, by storing a plurality of initially provided files in a sequence. A computational device executes a first set of compression operations on each of the plurality of initially provided files to generate a plurality of compressed files that replace the plurality of initially provided files, wherein starting locations of the plurality of compressed files and the plurality of initially provided files are identical, and wherein predetermined bit patterns are stored in empty spaces that follow each of the plurality of compressed files. The computational device sends the aggregated file to a linear storage device configured to perform a second set of compression operations on the aggregated file. | 11-29-2012 |
20120303597 | System and Method for Storing Data Streams in a Distributed Environment - Systems and methods for storing and retrieving data elements transmitted via data streams received from distributed devices connected via a network. The received data elements may be stored in block stores on the distributed devices. The stored data-elements may be allocated to data-blocks of a block-store that have assigned block-identifiers and further allocated to events of the data-blocks that have assigned token-names. Stream-schema of the received data-streams may comprise a list of token-names and an index-definition for each corresponding data-stream. Indices may be generated for the event-allocated data-elements. A query may be executed in order to retrieve data-elements of the received data-streams based on the indices. | 11-29-2012 |
20120303598 | REAL-TIME ADAPTIVE BINNING - In one embodiment, a set of boundaries may be obtained, where the set of boundaries includes boundaries for each of one or more bins. The boundaries for each of the one or more bins may include a lower boundary and an upper boundary, wherein the set of boundaries of the one or more bins together defines a contiguous range of data values capable of being stored in the one or more bins. A data value may be obtained. The data value may be added to one of the one or more bins according to the boundaries of the one or more bins. It may be determined whether to modify the set of boundaries. The set of boundaries may be adjusted according to a result of the determining step. | 11-29-2012 |
20120310903 | Method and System for Efficiently Replicating Data in Non-Relational Databases - A method replicates data between instances of a distributed database. The method identifies at least two instances of the database at distinct geographic locations. The method tracks changes to the database by storing deltas. Each delta has a row identifier that identifies the piece of data modified, a sequence identifier that specifies the order in which the deltas are applied to the data, and an instance identifier that specifies where the delta was created. The method determines which deltas to send using an egress map that specifies which combinations of row identifier and sequence identifier have been acknowledged as received at other instances. The method builds a transmission matrix that identifies deltas that have not yet been acknowledged as received. The method then transmits deltas identified in the transmission matrix. After receiving acknowledgement that transmitted deltas have been incorporated into databases at other instances, the method updates the egress map. | 12-06-2012 |
20120323867 | SYSTEMS AND METHODS FOR QUERYING COLUMN ORIENTED DATABASES - Systems and methods for accessing data stored in a data array, mapping the data using a bitmap index, and processing data queries by determining positions of query attributes in the bitmap index and locating values corresponding to the positions in the data array are described herein. | 12-20-2012 |
20120330908 | SYSTEM AND METHOD FOR INVESTIGATING LARGE AMOUNTS OF DATA - A data analysis system is proposed for providing fine-grained low latency access to high volume input data from possibly multiple heterogeneous input data sources. The input data is parsed, optionally transformed, indexed, and stored in a horizontally-scalable key-value data repository where it may be accessed using low latency searches. The input data may be compressed into blocks before being stored to minimize storage requirements. The results of searches present input data in its original form. The input data may include access logs, call data records (CDRs), e-mail messages, etc. The system allows a data analyst to efficiently identify information of interest in a very large dynamic data set up to multiple petabytes in size. Once information of interest has been identified, that subset of the large data set can be imported into a dedicated or specialized data analysis system for an additional in-depth investigation and contextual analysis. | 12-27-2012 |
20120330909 | System and Method for Storing Data Streams in a Distributed Environment - Systems, methods and computer readable media for storing data elements transmitted via data streams received from distributed devices connected via a network. The received data elements may be stored in block stores on the distributed devices. The stored data elements may be allocated to data blocks of a block store that have assigned block identifiers and further allocated to events of the data blocks. The received plurality of data streams may have the same stream schema, and indices may be generated based on the order of the event allocated data elements. Stream schema of the received data streams may comprise a list of token names. Token names may be assigned to the event allocated data elements. Indices may be generated for the event allocated data elements based on the stream schema. | 12-27-2012 |
20120330910 | BLOCK-BASED DIFFERENCING ALGORITHM - A system and method for a block based differencing algorithm which includes the ability to limit memory requirements regardless of source file sizes by splitting the source file into optimally sized blocks. The invention allows the blocks to be processed in any order allowing in-place operation. Further, the present invention allows a second stage compressor to match the compressor blocks to those used by the differencing algorithm to optimize compressor and decompressor performance. | 12-27-2012 |
20130006948 | COMPRESSION-AWARE DATA STORAGE TIERING - A method, including assigning, to each tier in a storage system comprising multiple tiers, a respective range of priority scores, and calculating a compression ratio for a file stored on one of the multiple tiers. Using the compression ratio, a priority score is calculated for the file, and the file is migrated to the tier whose assigned range of priority scores includes the calculated priority score. | 01-03-2013 |
20130013574 | Block Entropy Encoding for Word Compression - A computer-implemented method, computer-readable media, and a computerized system to compress words are provided. The computerized system includes a compression engine that compresses a list of words. The compression engine generates a symbol list from the list of words, decomposes the words using the symbol list and a cost function, and encodes the decomposed words. The words may be from a search index. The compression engine may be utilized to reduce the size of the search index and improve efficiency. | 01-10-2013 |
20130018856 | COMPRESSION OF BITMAPS AND VALUES - The present invention relates to compression of values and bitmaps, and methods thereof. Such methods are configured for operating on a computer system having a word length architecture of length WL and are based on the observation that not all the bits used for the run-length counter—i.e., the fill length field (FL) inhere—are often used, since runs are seldom so long. Contrarily to other compression schemes (e.g., WAH), said methods may assign the unused bits to one or more position list fields (PL, PL | 01-17-2013 |
20130018857 | SYSTEM AND METHOD FOR FILE SYSTEM LEVEL COMPRESSION USING COMPRESSION GROUP DESCRIPTORS - A system and method for transparently compressing file system data using compression group descriptors is provided. When data contained within a compression group can be compressed beyond a predefined threshold value, a compression group descriptor is included in the compression group that signifies that the data for the group of level 0 blocks is compressed into a lesser number of physical data blocks. When performing a read operation, the file system first determines the appropriate compression group that contains the desired data and determines whether the compression group has been compressed. If so, the file system decompresses the data in the compression group before returning the decompressed data. If the magic value is not in the first pointer position, then the data within the compression group was previously stored in an uncompressed format, and the data may be returned without performing a decompression operation. | 01-17-2013 |
20130024432 | METHOD AND SYSTEM FOR STORING DATA IN COMPLIANCE WITH A COMPRESSION HANDLING INSTRUCTION - A method for storing data in a storage system. In one embodiment, implementation of a method for storing data in compliance with a compression handling instruction includes: at a storage controller, receiving an object for storage within a data storage, wherein the object is in an original state; determining whether a compression handling instruction is received in association with the object; and executing the compression handling instruction when storing the object. | 01-24-2013 |
20130024433 | REAL-TIME COMPRESSION OF TABULAR DATA - Exemplary method, system, and computer program product embodiments for real-time column compression of data are provided. In one embodiment, by way of example only, a data structure is estimated for an initially unknown structured data. The estimated data structure is placed in a stream. A columnar compression operation is applied to the stream to generate an achieved compression ratio. The stream is compressed. Feedback of the achieved compression ratio is analyzed from the stream to determine if an optimal one of the columnar compression operations has been applied. If the optimal one of the columnar compression operations has been applied, the actual data structure of the initially unknown structured data is determined. | 01-24-2013 |
20130024434 | STREAM COMPRESSION AND DECOMPRESSION - A method for compressing a sequence of records, each record comprising a sequence of fields, comprises steps of buffering a record in a line of a matrix, reordering the lines of the matrix according to locality sensitive hash values of the buffered records such that records with similar contents in corresponding fields are placed in proximity, and consolidating fields in columns of the matrix into a block of codes. In this, consolidating yields codes of one of a first type comprising a sequence of individual fields and a second type comprising a sequence of fields with at least one repetition. The second type of code comprises a presence field indicating repeated fields and an iteration field indicating a number of respective repetitions. Decompression of the records from the block codes compressed above is also described. | 01-24-2013 |
20130031063 | COMPRESSION OF DATA PARTITIONED INTO CLUSTERS - The invention notably relates to a computer-implemented method for compressing data. The data is partitioned into clusters of pieces of data resulting from K-means clustering. Each cluster has a centroid. The method comprises applying (S | 01-31-2013 |
20130031064 | Compressing Massive Relational Data - A relational dependency transform is introduced as a way to exploit information redundancy in conditioning data in a relational database for better compressibility. An optimum relational dependency transform of the relational database is first computed. Fields of the relational database are then sorted topologically based on a weighted, directed graph having nodes representing predictor and predictee fields. For each predictee field in the topological order, a transformed field is then computed via the relationship between predictor and predictee in the optimum relational dependency transform. | 01-31-2013 |
20130031065 | MULTI-LEVEL COMPRESSED LOOK-UP TABLES FORMED BY LOGICAL OPERATIONS TO COMPRESS SELECTED INDEX BITS - A lookup is performed using multiple levels of compressed stride tables in a multi-bit Trie structure. An input lookup key is divided into several strides including a current stride of S bits. A valid entry in a current stride table is located by compressing the S bits to form a compressed index of D bits into the current stride table. A compression function logically combines the S bits to generate the D compressed index bits. An entry in a prior-level table points to the current stride table and has a field indicating which compression function and mask to use. Compression functions can include XOR, shifts, rotates, and multi-bit averaging. Rather than store all 2 | 01-31-2013 |
20130036101 | Compression Analyzer - Techniques are described herein for automatically selecting the compression techniques to be used on tabular data. A compression analyzer gives users high-level control over the selection process without requiring the user to know details about the specific compression techniques that are available to the compression analyzer. Users are able to specify, for a given set of data, a “balance point” along the spectrum between “maximum performance” and “maximum compression”. The point thus selected is used by the compression analyzer in a variety of ways. For example, in one embodiment, the compression analyzer uses the user-specified balance point to determine which of the available compression techniques qualify as “candidate techniques” for the given set of data. The compression analyzer selects the compression technique to use on a set of data by actually testing the candidate compression techniques against samples from the set of data. After testing the candidate compression techniques against the samples, the resulting compression ratios are compared. The compression technique to use on the set of data is then selected based, in part, on the compression ratios achieved during the compression tests performed on the sample data. | 02-07-2013 |
20130054543 | Inverted Order Encoding in Lossless Compression - A method of compressing an electronic file is provided. The method comprises reading a first electronic file in reverse order sequence from bottom to top, while reading the first file, identifying patterns in a content of the first file and while reading the first file, building a dictionary comprising a plurality of entries, each entry defining an association of a code to one of the patterns identified in the content of the first file. The method further comprises, while reading the first file, building a second electronic file that is a compressed version of the first file, wherein the second electronic file comprises a compressed content portion and a dictionary portion, wherein the compressed content portion comprises codes from the dictionary and wherein the dictionary portion comprises the dictionary. | 02-28-2013 |
20130054544 | Content Aware Chunking for Achieving an Improved Chunk Size Distribution - The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros). | 02-28-2013 |
20130054545 | MANAGING DEREFERENCED CHUNKS IN A DEDUPLICATION SYSTEM - A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes a reference count for each chunk indicating a number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one object in the storage space. | 02-28-2013 |
20130054546 | HARDWARE-BASED ARRAY COMPRESSION - Technologies are generally described herein for compressing an array using hardware-based compression and performing various instructions on the compressed array. Some example technologies may receive an instruction adapted to access an address in an array. The technologies may determine whether the address is compressible. If the address is compressible, then the technologies may determine a compressed address of a compressed array based on the address. The compressed array may represent a compressed layout of the array where a reduced size of each compressed element in the compressed array is smaller than an original size of each element in the array. The technologies may access the compressed array at the compressed address in accordance with the instruction. | 02-28-2013 |
20130054547 | System and Method of Font Compression Using Selectable Entropy Encoding - A request for a font file including a first font table and a second font table is received. A first entropy encoder is selected, based on characteristics of the first font table, from among a plurality of entropy encoders. A second entropy encoder is selected, based on characteristics of the second font table, from among the plurality of entropy encoders. The first entropy encoder is applied to the first font table. The second entropy encoder is applied to the second font table. Compressed data corresponding to the first and second font tables are combined to generate a compressed font file. The compressed font file is transmitted. | 02-28-2013 |
20130060740 | DATA MANAGING METHOD, APPARATUS, AND RECORDING MEDIUM OF PROGRAM, AND SEARCHING METHOD, APPARATUS, AND MEDIUM OF PROGRAM - A data management apparatus includes a storage device; and a processor that executes a procedure, the procedure including selecting a data group, each data in the data group including one of a plurality of tags, among a plurality of data, compressing the data group into a compressed data group, and storing the compressed data group in the storage device, the stored compressed data group being associated with tagging information which indicates that each data of the data group includes the certain tag. | 03-07-2013 |
20130073530 | Block Compression of Tables With Repeated Values - Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks. | 03-21-2013 |
20130080410 | ACTIVE MEMORY EXPANSION IN A DATABASE ENVIRONMENT TO QUERY NEEDED/UNEEDED RESULTS - Techniques are described for estimating and managing memory compression for request processing. Embodiments of the invention may generally include receiving a request for data, determining if the requested data contains any compressed data, and sending the requesting entity only the uncompressed data. A separate embodiment generally includes receiving a request for data, determining if the requested data contains any compressed data, gathering uncompression criteria about the requested data, and using the uncompression criteria to selectively determine what portion of the compressed data to uncompress. | 03-28-2013 |
20130086011 | Associative Memory Visual Evaluation Tool - A method, apparatus, and non-transitory computer readable storage medium for validating content is provided. Data is parsed into at least a first group of data and a second group of data according to a plurality of types of content present in the data. The data is ingested into an associative memory. The associative memory forms a plurality of associations among the data. The associative memory is configured to be queried based on at least one relationship selected from a group consisting of direct relationships and indirect relationships among the data. The associative memory comprises a content-addressable structure, the content-addressable structure comprising a memory organization in which the data is configured to be accessed by the content as opposed to being configured to be accessed by addresses for the data. The first group of data and the second group of data are communicated in a graphical representation. | 04-04-2013 |
20130086012 | Methods, Systems and Computer Program Products for Providing a Distributed Associative Memory Base - Systems, methods and computer program products are provided for a distributed associative memory base. Such methods may include providing a distributed memory base that includes a network of networks of associative memory networks. The memory base may include a network of associative memory networks, a respective associative memory network comprising associations among a respective observer entity and a plurality of observed entities that are observed by the respective observer entity. Ones of the associative memory networks are physically and/or logically independent from other ones of the associative memory networks. Methods include imagining associations from the associative memory base using a plurality of streaming queues that correspond to ones of a plurality of rows of ones of the associative memory networks. | 04-04-2013 |
20130091105 | System for organizing and fast searching of massive amounts of data - A system to collect and analyze performance metric data recorded in time-series measurements, converted into Unicode, and arranged into a special data structure. The performance metric data is collected by one or more probes running on machines about which data is being collected. The performance metric data is also organized into a special data structure. The data structure at the server where analysis is done has a directory for every day of performance metric data collected with a subdirectory for every resource type. Each subdirectory contains text files of performance metric data values measured for attributes in a group of attributes to which said text file is dedicated. Each attribute has its own section and the performance metric data values are recorded in time series as Unicode hex numbers as a comma delimited list. Analysis of the performance metric data is done using regular expressions. | 04-11-2013 |
20130097126 | USING AN INVERTED INDEX TO PRODUCE AN ANSWER TO A QUERY - In response to a query having a search term, an inverted index that is defined on a set of attributes of a database structure is accessed, where the inverted index associates values of the set of attributes with corresponding references to rows of the database structure. It is determined whether any of the attributes in the set is in the search term. In response to determining that any of the attributes in the set is in the search term, the inverted index is used to produce an answer to the query. | 04-18-2013 |
20130097127 | Method and System for Database Storage Management - Embodiments of the present invention relate to run-length encoded sequences and supporting efficient offset-based updates of values while allowing fast lookups. In an embodiment of the present invention, an indexing scheme is disclosed, herein called count indexes, that supports O(log n) offset-based updates and lookups on a run-length sequence with n runs. In an embodiment, count indexes of the present invention support O(log n) updates on bitmapped sequences of size n. Embodiments of the present invention can be generalized to apply to block-oriented storage systems. | 04-18-2013 |
20130097128 | TIME-SERIES DATA DIAGNOSING/COMPRESSING METHOD - An allowable error used for compressing time-series data can be set without knowledge of equipment. It is possible to prevent the data from being lost not only in the event of an abnormality, but also during a period in which an evidence of a predicted abnormality is detected. In addition, it is also possible to verify the properness of a set allowable error. Thus, the amount of data gathered and stored in a memory can be reduced without losing information required for detection of an evidence of a predicted abnormality occurring in the equipment of interest. As a result, it is possible to provide a time-series data diagnosing/compressing method capable of gathering time-series data with a high degree of efficiency and a data gathering/storing apparatus adopting the method. That is to say, in accordance with the present invention, there is provided a predicted-failure-evidence diagnosing section not depending on the equipment and not requiring knowledge of the equipment and, on the basis of a result of a predicted-abnormality-evidence diagnosis carried out by this section on time-series data gathered from the equipment, an allowable error used for compressing the gathered data can be set and managed in order to compress the data if the result of the diagnosis is normal or restrict the compression of the data during a period in which an evidence of a predicted abnormality is detected. Thus, the amount of data stored in a memory can be reduced. | 04-18-2013 |
20130097129 | DYNAMIC DATA TRANSFORMATIONS FOR NETWORK TRANSMISSIONS - A method of dynamically performing data transformations on information that is transmitted between a user device and a web service may include receiving interface code from the web service, receiving an input from the user device that identifies a data type, and a data transformation to be applied to data instances matching the data type. The method may also include causing a definition file to be stored with the data type, the data transformation, and a resource locator. The method may additionally include, in a second communication session, intercepting a transmission, accessing the definition file using the resource locator, determining whether the data instance matches the data type, causing the data transformation to be performed on the data instance to generate transformed data, and inserting the transformed data into the transmission if the data instance matches the data type. | 04-18-2013 |
20130103655 | MULTI-LEVEL DATABASE COMPRESSION - Embodiments of the invention relate to a multi-level database compression technique to compress table data objects stored in pages. A compact dictionary structure is encoded that represents frequent values of data at any level of granularity. More than one level of compression is provided, wherein input to a finer level of granularity is an output of a coarser level of granularity. Based upon the encoded dictionary structure, a compression technique is applied to a stored page to compress each row on the page. Similarly, a de-compression technique may be applied to decompress the compressed data, utilizing the same dictionary structures at each level of granularity. | 04-25-2013 |
20130103656 | EVENT IDENTIFICATION - A method of identifying an event associated with consumption of a utility comprising the steps of: generating a utility consumption profile from utility consumption data, the utility consumption data comprising a plurality of utility consumption values measured at a corresponding plurality of measurement points; wherein generating the utility consumption profile comprises the step of determining a gradient of rate of change of utility consumption between consecutive measurement points; identifying measurement points at which a change in gradient exceeds a predetermined threshold; and storing in the utility consumption profile the utility consumption value at measurement points where the threshold is exceeded; and identifying an event within the utility consumption profile that matches the profile of an event stored in a database of utility consumption profiles. | 04-25-2013 |
20130103657 | TIME-SERIES DATA MANAGEMENT DEVICE, SYSTEM, METHOD, AND PROGRAM - Disclosed is a time-series management device capable of filtering time-series data having a possibility of matching a specified search pattern and reading in the data from a storage device when performing a time-series analysis. A data accumulation unit ( | 04-25-2013 |
20130117242 | Adaptive Data Suppression - Systems, methods and devices for adaptively suppressing data items based on one or more dynamic characteristics of the data items are disclosed. Adaptive data suppression of an operational data item may be accomplished by monitoring the operational data item for one or more dynamic characteristics required by a data aging rule associated with the operational data item, wherein at least one of the database and operational data item are stored in memory, detecting the one or more dynamic characteristics required by the data aging rule, recording the one or more detected dynamic characteristics and assessing whether the one or more detected dynamic characteristics satisfy the data aging rule. If a data aging rule is satisfied, the operational data item may be suppressed to persistent data storage. Related systems, methods, and articles of manufacture are also described. | 05-09-2013 |
20130117243 | COMPRESSION AND STORAGE OF COMPUTER AIDED DESIGN DATA - The size of lightweight JT data files containing CAD data is reduced by employing lossy compression where acceptable for portions of the CAD data, such as 3D geometry data. Compression for the remaining portions can be augmented by exploiting common repeated structures for some portions, such as precise Brep data, and compressing separate but similar data, such as all metadata for a given part and all scene graph data, together as a single block. The compressed data is then written in separate, uniquely identified data segments indexed in a table of contents, allowing quick access to any data segment for streaming. | 05-09-2013 |
20130124488 | METHOD AND SYSTEM FOR MANAGING AND QUERYING LARGE GRAPHS - A method, system and computer program product for managing and querying a graph. The method includes the steps of: receiving a graph; partitioning the graph into homogeneous blocks; compressing the homogeneous blocks; and storing the compressed homogeneous blocks in files where at least one of the steps is carried out using a computer device. | 05-16-2013 |
20130124489 | COMPRESSING A MULTIVARIATE DATASET - A method, computer program product and system for compressing a multivariate dataset. A dataset is selected that includes a plurality of variates. A first compression method is applied to the values of a first variate of the dataset. A second compression method is applied to the values of a second variate of the dataset, where the second compression method is arranged to compress the second variate values relative to the variation of the corresponding first variate values. | 05-16-2013 |
20130132352 | EFFICIENT FINE-GRAINED AUDITING FOR COMPLEX DATABASE QUERIES - The present application provides for techniques for implementing data auditing embodiments that determine whether a query into a database is or has referenced forbidden data within the database. Various techniques are given for efficiently finding all tuples in a database referenced by a given query. A set of sensitive data is determined within a database and the set of sensitive data is employed to define a forbidden view within the database. Data within the database may be annotated to provide efficient identification of data access by query. Incoming queries may be analyzed and modified to propagate annotations for analyzing what data is or was accessed. | 05-23-2013 |
20130132353 | Compression Of Genomic Data - The present subject matter discloses a system and a method for compression of genomic data. In one embodiment, the method for compression of genomic data includes obtaining modified genomic data from genomic data based at least in part on intermediary data identified from the genomic data. In one implementation, the modified genomic data includes a plurality of primary characters. The genomic data may then be modified to generate one or more most-frequent character files based at least on a most-frequent character and a second most-frequent character from among the plurality of primary characters. Further, based at least on the one or more most-frequent character files and the modified genomic data, a least-frequent characters file may be created from the modified genomic data. | 05-23-2013 |
20130144849 | DELTA COMPRESSION USING MULTIPLE POINTERS - Encoding a new version of a data module includes constructing a delta data module having data for providing the new version of the data module. The delta data module may indicate an encoding for copying data at an offset from one of a number of pointers into different versions of the data module. Decoding a delta data module to provide a new version of a data module includes copying, to the new version of the data module, data relative to a target pointer when an encoding in the delta data module indicates a matching pattern relative to the target pointer, and copying, to the new version of the data module, data relative to at least one other pointer when an encoding in the delta data module indicates a matching pattern relative to the at least one other pointer. | 06-06-2013 |
20130144850 | STREAM COMPRESSION AND DECOMPRESSION - A method for compressing a sequence of records, each record comprising a sequence of fields, comprises steps of buffering a record in a line of a matrix, reordering the lines of the matrix according to locality sensitive hash values of the buffered records such that records with similar contents in corresponding fields are placed in proximity, and consolidating fields in columns of the matrix into a block of codes. In this, consolidating yields codes of one of a first type comprising a sequence of individual fields and a second type comprising a sequence of fields with at least one repetition. The second type of code comprises a presence field indicating repeated fields and an iteration field indicating a number of respective repetitions. Decompression of the records from the block codes compressed above is also described. | 06-06-2013 |
20130151485 | APPARATUS AND METHOD FOR STORING TRACE DATA - An apparatus and method to store trace data are provided. The apparatus includes a compression information generating unit configured to generate compression information to indicate the valid trace data in a trace data set. The apparatus further includes a compressing unit configured to extract the valid trace data from the trace data set based on the compression information. The apparatus further includes a write control unit configured to generate a write control signal for use in writing the valid trace data based on the compression information. The apparatus further includes a trace data buffer configured to store the valid trace data in response to the write control signal. | 06-13-2013 |
20130159263 | LOSSY COMPRESSION OF DATA POINTS USING POINT-WISE ERROR CONSTRAINTS - A method for compressing a cloud of points with imposed error constraints at each point is disclosed. Surfaces are constructed that approach each point to within the constraint specified at that point, and from the plurality of surfaces that satisfy the constraints at all points, a surface is chosen which minimizes the amount of memory required to store the surface on a digital computer. | 06-20-2013 |
20130159264 | DATA CONVERSION DEVICE, DATA CONVERSION METHOD, AND PROGRAM - There is realized a data conversion device that performs generation of a hash value with improved analysis resistance and a high degree of safety. There are provided a stirring processing section performing a data stirring process on input data; and a compression processing section performing a data compression process on input data including data segments which are divisions of message data, the message data being a target of a data conversion. Part of multi-stage compression subsections is configured to perform a data compression process based on both of output of the stirring processing section and the data segments in the message data. There is provided such a configuration that the stirring process is executed at least on fixed timing of a compression processing round of plural rounds and thus, there is realized a data conversion device that performs generation of a hash value with improved analysis resistance and a high degree of safety. | 06-20-2013 |
20130166518 | Compression Of Genomic Data File - Systems and methods for compression of a genomic data file are described herein. In one embodiment, genomic sequences, sequence headers, and quality sequences associated with a plurality of data streams provided in a genomic data file are identified. Each of the genomic sequences includes at least one of primary characters and secondary characters. Further, the secondary characters from each of the genomic sequences may be removed to obtain an intermediate genomic sequence file and a quality score corresponding to the secondary character may be modified in quality sequences to obtain an intermediate quality sequence file. Based on the intermediate genomic sequence file and the intermediate quality sequence file, a modified genomic sequence file and a modified quality sequence file, respectively are generated. A compressed genomic data file is obtained using at least the modified genomic sequence and the modified quality sequence. | 06-27-2013 |
20130166519 | DEVICE ACCESS SYSTEM - Storage system that includes: an address search section for i) storing an address for frequent use data and a data index assigned to the address, ii) acquiring an address of write or read data, and iii) searching stored addresses with the acquired address, a frequent use data storage section for i) storing a tag related to the use data and the index, ii) acquiring the index when an address acquired by the search section has hit a stored address, and iii) identifying frequent use data that corresponds to the tag, a data comparator for i) acquiring the frequent use data from the storage section, ii) comparing the data with write data, and iii) identifying frequent use data that hit the write data, and a compression-expansion section for acquiring and compressing the write data and the frequent use data from the comparator, and for acquiring the read data. | 06-27-2013 |
20130173564 | SYSTEM AND METHOD FOR DATA COMPRESSION USING MULTIPLE ENCODING TABLES - A system and method for compressing and decompressing multiple types of character data. The system and method employ multiple encoding tables, each designed for encoding a subset of character data, such as numeric data, uppercase letters, lowercase letters, Latin, or UNICODE data, to perform compressions and decompression of character data. The character encoding tables are smaller than the size of the alphabet of the uncompressed strings. | 07-04-2013 |
20130179409 | SEPARATION OF DATA CHUNKS INTO MULTIPLE STREAMS FOR COMPRESSION - For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback. | 07-11-2013 |
20130179410 | REAL-TIME SELECTION OF COMPRESSION OPERATIONS - Exemplary method, system, and computer program product embodiments for real-time selection of compression operations are provided. In one embodiment, by way of example only, available compression operations are initialized according to an assigned success factor. The available compression operations are tested for determining if at least one of the compression operations yields a compression ratio greater than a minimal compression ratio. The available compression operations selected in real time for compressing at least one of the data blocks are applied. Additional system and computer program product embodiments are disclosed and provide related advantages. | 07-11-2013 |
20130179411 | SEPARATION OF DATA CHUNKS INTO MULTIPLE STREAMS FOR COMPRESSION - For on-line separation of data chunks for compression, unrelated data chunks are classified based on various attributes. The classified data chunks are sent to at least one available compression context. The classified data chunks are related. The classified data chunks are encoded by at least one of the compression operations. A compression ratio is achieved and included as feedback. | 07-11-2013 |
20130179412 | QUERY-AWARE COMPRESSION OF JOIN RESULTS - A method is provided for compressing results of a join query. A join order of a result set is determined from the join query, where the result set includes a plurality of tuples. A plurality of dictionary entries for the result set is received. A nested hierarchy of dictionaries is created based on the join order and the dictionary entries. A plurality of encoded tuples is received. The nested hierarchy of dictionaries is used by a processor to decode the plurality of encoded tuples so as to produce the plurality of tuples of the result set. | 07-11-2013 |
20130179413 | Compressed Distributed Storage Systems And Methods For Providing Same - Disclosed are embodiments of a compressed distributed storage system that is designed to satisfy: reliability; minimum storage; efficient update; cost-effective access. An exemplary system can comprise a splitter, an encoder, a parameterizer, and a compressor. In contrast to the prior art, the encoding is performed before the compression. Furthermore, in the exemplary system parameterization, data classification, and memory-assisted compression are key features in efficient compression. The splitter can split an input data file into a plurality of original segments. The encoder can perform fault-tolerant encoding on the plurality of original segments, providing a plurality of redundant segments. The parameterizer can classify each redundant segment and form and memorize statistics (context) of each class of the redundant segments. With the class-based context, each redundant segment can be compressed and later decompressed individually. Each compressed redundant segment can be stored at a storage unit of a distributed storage system. | 07-11-2013 |
20130185267 | METHODS AND SYSTEMS FOR COMPRESSING AND COMPARING GENOMIC DATA - Systems and methods are disclosed for compressing and comparing data such as genomic data. The disclosed systems and methods may include selecting a segment, creating a delta representation of the segment, the delta representation comprising a script, and storing the script. Furthermore, the disclosed systems and methods may include receiving a first script comprising a compressed version of a first segment and receiving a second script comprising a compressed version of a second segment. The disclosed systems and methods may further include comparing the first script to the second script and determining if the first segment matches the second segment based upon the comparison of the first script to the second script. | 07-18-2013 |
20130185268 | METHODS OF COMPRESSING AND STORING DATA AND STORAGE DEVICES USING THE METHODS - A method of compressing and storing data may include generating target code words by compressing target data to be stored in a memory device by using an initially set compression algorithm; and/or storing in the memory device the target code words according to a data pattern that is repeated in the target code words. A method of compressing and storing data may include generating target code words by compressing target data to be stored in a memory device by using an initially set compression algorithm; and/or storing in the memory device delta-encoded information obtained by performing delta-encoding on the target code words based on source code words including the data pattern. | 07-18-2013 |
20130185269 | REAL-TIME SELECTION OF COMPRESSION OPERATIONS - Exemplary method, system, and computer program product embodiments for real-time selection of compression operations are provided. In one embodiment, by way of example only, available compression operations are initialized according to an assigned success factor. The available compression operations are tested for determining if at least one of the compression operations yields a compression ratio greater than a minimal compression ratio. The available compression operations selected in real time for compressing at least one of the data blocks are applied. Additional system and computer program product embodiments are disclosed and provide related advantages. | 07-18-2013 |
20130191351 | Compressing, storing and searching sequence data - The redundancy in genomic sequence data is exploited by compressing sequence data in such a way as to allow direct computation on the compressed data using methods that are referred to herein as “compressive” algorithms. This approach reduces the task of computing on many similar genomes to only slightly more than that of operating on just one. In this approach, the redundancy among genomes is translated into computational acceleration by storing genomes in a compressed format that respects the structure of similarities and differences important to analysis. Specifically, these differences are the nucleotide substitutions, insertions, deletions, and rearrangements introduced by evolution. Once such a compressed library has been created, analysis is performed on it in time proportional to its compressed size, rather than having to reconstruct the full data set every time one wishes to query it. | 07-25-2013 |
20130191352 | DYNAMIC PARTIAL UNCOMPRESSION OF A DATABASE TABLE - A database dynamic partial uncompression mechanism determines when to dynamically uncompress one or more compressed portions of a database table that also includes uncompressed portions. A query may include an express term that specifies whether or not to skip compressed portions. In addition, a query may include associated information that specifies whether or not to skip compressed portions, and one or more thresholds that may be used to determine if the system is too busy to perform uncompression. A display mechanism may also determine whether or not to display compressed portions. The uncompression may occur at the database server or at a client. The database dynamic partial uncompression mechanism thus performs dynamic uncompression in a way that preferably uncompresses one or more compressed portions of a partially compressed database table only when the compressed portions satisfy a query and/or need to be displayed. | 07-25-2013 |
20130191353 | DYNAMIC PARTIAL UNCOMPRESSION OF A DATABASE TABLE - A database dynamic partial uncompression mechanism determines when to dynamically uncompress one or more compressed portions of a database table that also includes uncompressed portions. A query may include an express term that specifies whether or not to skip compressed portions. In addition, a query may include associated information that specifies whether or not to skip compressed portions, and one or more thresholds that may be used to determine if the system is too busy to perform uncompression. A display mechanism may also determine whether or not to display compressed portions. The uncompression may occur at the database server or at a client. The database dynamic partial uncompression mechanism thus performs dynamic uncompression in a way that preferably uncompresses one or more compressed portions of a partially compressed database table only when the compressed portions satisfy a query and/or need to be displayed. | 07-25-2013 |
20130198151 | METHODS FOR FILE SHARING RELATED TO THE BIT FOUNTAIN PROTOCOL - An embodiment relates to distributing media over a peer-to-peer network by employing a digital fountain coding. Accordingly, the file is separated into file portions and the portions are combined to obtain encoded portions which are then transmitted. A file portion may form a part of a plurality of the encoded and transmitted file portions. The portions may be pieces and/or blocks of the file, wherein a piece includes a plurality of blocks. An embodiment further provides mechanisms for efficient block-request-transmission approaches in which the initial requests for blocks in the file are transmitted and additional requests for some random blocks are transmitted. The additional requests may be transmitted after each piece or after the entire file blocks have been requested, or both. | 08-01-2013 |
20130198152 | SYSTEMS AND METHODS FOR DATA COMPRESSION - In one example embodiment, an updated version of a file is encoded via differential encoding from an original version of the file ( | 08-01-2013 |
20130204850 | METHOD AND SYSTEM FOR COMPRESSING DATA RECORDS AND FOR PROCESSING COMPRESSED DATA RECORDS - System and method to compress data records by providing data records with a binary structure; dividing the data records into several bit vectors; reducing the size of each bit vector by dividing the bit vector into consecutive partial areas of equal size, each partial area consisting of n bits, classifying the partial areas as trivial partial areas, quasi-trivial partial areas and non-trivial partial areas, combining one non-trivial or several consecutive non-trivial partial areas into one so named R block, and removing the trivial partial areas; as well as combining one quasi-trivial or several consecutive quasi-trivial partial areas into one so named O block. | 08-08-2013 |
20130204851 | METHOD AND APPARATUS FOR COMPRESSING AND DECOMPRESSING GENETIC INFORMATION OBTAINED BY USING NEXT GENERATION SEQUENCING (NGS) - Provided are methods and apparatuses for compressing genetic information, the methods and apparatuses obtaining read information about reads and alignment information about positions of the reads that are aligned to a reference sequence, and generating a compressed file comprising information about an address of a block corresponding to the aligned reads. Also, a method and apparatus for decompressing genetic information obtains a compressed file with respect to the genetic information, determines an address of a block corresponding to input gene search information, from the compressed file, and selectively decompresses genetic information corresponding to the determined address. | 08-08-2013 |
20130204852 | APPARATUS AND METHOD FOR TRANSMITTING DATA - A data transmission apparatus and method. Information exchange between virtual worlds is achieved by encoding information on a virtual world into metadata and transmitting the metadata to another virtual world. In particular, since the information on the virtual world is encoded into a binary format, high speed data transmission may be achieved. | 08-08-2013 |
20130212075 | ASSESSING COMPRESSED-DATABASE RAW SIZE - A computer-implemented process ( | 08-15-2013 |
20130212076 | Generating Content Snippets Using a Tokenspace Repository - A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document. | 08-15-2013 |
20130212077 | DEMAND PAGING METHOD FOR MOBILE TERMINAL, CONTROLLER AND MOBILE TERMINAL - A demand paging method for a mobile terminal, a controller and a mobile terminal, wherein the demand paging method determines, when a mobile terminal needs to operate a compressed file, a storage location of the compressed file in an external part of the controller of the mobile terminal; a decoding unit of the internal part of the controller of the mobile terminal decompresses the compressed file in the storage location; the mobile terminal saves the decompressed file to a designated part of the memory, wherein the designated part of the memory comprises the memory in the internal part of the controller of the mobile terminal and/or the memory in the external part of the controller of the mobile terminal; the mobile terminal continues to operate on the basis of the decompressed file. The technical solution increases the processing efficiency of demand paging of the mobile terminal. | 08-15-2013 |
20130226885 | PATH-DECOMPOSED TRIE DATA STRUCTURES - Path-decomposed trie data structures are described, for example, for representing sets of strings in a succinct manner whilst still enabling fast operations on the string sets such as string retrieval or looking up a string with a specified identifier. A path-decomposed trie is a trie (tree data structure for storing a set of strings) where each node in the path decomposed trie represents a path in the trie. In various embodiments a path-decomposed trie data structure is represented succinctly by interleaving node labels and node degrees in an array and optionally by compressing the node labels using a static dictionary. Node labels may be string characters and a node degree may be a number of children of a node. In some embodiments a path-decomposed hollow trie data structure is used to provide a hash function for string sets. | 08-29-2013 |
20130238573 | CONTEXTUAL DATA COMPRESSION FOR GEO-TRACKING APPLICATIONS - Various exemplary embodiments relate to a method of compressing location data. The method may include: receiving original location data; selecting a contextual profile based at least in part on the original location data; selecting a compression method based on the contextual profile; and converting the original location data to a compressed format based on the compression method. Various exemplary embodiments relate to a system for compressing location data. The system may include: a location receiver configured to generate original location data based at least on signals from global navigation satellite system (GNSS) satellites; a location engine configured to select a contextual profile based at least in part on the original location data; and a contextual compression filter configured to generate compressed location data in a compressed format based on the selected contextual profile. | 09-12-2013 |
20130238574 | CLOUD SYSTEM AND FILE COMPRESSION AND TRANSMISSION METHOD IN A CLOUD SYSTEM - The present invention relates to a cloud system and a method for compressing and sending a file in the cloud system, wherein when a user who accesses a cloud computing system and uploads or downloads files requests that a target compression file be compressed, a check is made as to whether the same file as the target compression file is already stored in the cloud system and, if the check finds that the same file is present, an ID code of the target compression file is stored while compressing the target compression file. In accordance with the present invention, the load on the cloud system can be reduced and network bandwidth is spared, because a file that is already present in the cloud system is not stored redundantly. | 09-12-2013 |
20130246375 | METHOD AND SYSTEM FOR FACILITATING ACCESS TO RECORDED DATA - The present invention relates to a method and system for facilitating access to recorded data. The system comprises an interface and a processing device. The interface is arranged to receive data and the processing device is arranged to separate the received data in data subsets, compress each data subset and assign an identifier to each compressed data subset, thereby creating data units each comprising a compressed data subset and an associated identifier, the processing device further being arranged to establish an index on the basis of the assigned identifiers. | 09-19-2013 |
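The flow described in entry 20130246375 (separate the data into subsets, compress each subset, assign an identifier, build an index) can be sketched roughly as follows. This is only an illustration under stated assumptions: the fixed `subset_size`, the use of `zlib`, the sequence-number identifier, and the function names `build_units` and `read_subset` are hypothetical and are not taken from the abstract.

```python
import zlib


def build_units(data: bytes, subset_size: int = 1024):
    """Split data into fixed-size subsets, compress each, and index them by identifier."""
    units = []  # list of (identifier, compressed_subset) data units
    index = {}  # identifier -> position of the data unit in `units`
    for offset in range(0, len(data), subset_size):
        identifier = offset // subset_size  # assumed identifier: subset sequence number
        compressed = zlib.compress(data[offset:offset + subset_size])
        index[identifier] = len(units)
        units.append((identifier, compressed))
    return units, index


def read_subset(units, index, identifier: int) -> bytes:
    """Decompress only the subset with the given identifier."""
    _, compressed = units[index[identifier]]
    return zlib.decompress(compressed)
```

Because each subset is compressed independently, a reader can look up one identifier in the index and decompress that unit alone, without touching the rest of the recording.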
20130254171 | QUERY-BASED SEARCHING USING A VIRTUAL TABLE - A method of searching all tables in a data model is disclosed, using a non-materializing virtual table interface that acts as a view into the underlying data model. The virtual table is virtually built on the fly at query execution time, and maps to all columns and rows within the data model. A query on the virtual table is translated into a set of data model queries for searching the data model, based on columns selected from the virtual table and other specified search parameters, as well as the virtual table definition. The search process works in conjunction with data domains, and uses compaction and tokenization of data. | 09-26-2013 |
20130262407 | MULTIPLEX CLASSIFICATION FOR TABULAR DATA COMPRESSION - For multiplexer classification for column compression of tabular data, similar type data segments are classified into classes for grouping the data segments into compression streams associated with each one of the classes. The compression streams are encoded based on a class-specific optimized encoding operation. The compression streams into one output buffer, wherein the compression streams are extracted. | 10-03-2013 |
20130262408 | TRANSFORMATION FUNCTIONS FOR COMPRESSION AND DECOMPRESSION OF DATA IN COMPUTING ENVIRONMENTS AND SYSTEMS - One or more transformation functions can be used in connection or together with one or more compression/decompression techniques. A transformation function can transform data (e.g., a data object) into a form more suitable for compression and/or decompression. As a result, data can be compressed and/or decompressed more effectively. In addition, multiple data objects can be associated with various transformation functions and/or compression/decompression techniques. As a result, different approaches can be taken with respect to compression and decompression of data objects in an effort to find an optimum approach for compression of data objects that may vary significantly from each other and change over time. It will be appreciated that the objects can be associated with transformation functions in a dynamic manner to accommodate changes to data. Also, an extendible and/or extensible system can allow for growth and adaption of new data in forms not currently present or expected. | 10-03-2013 |
20130262409 | MULTIPLEX CLASSIFICATION FOR TABULAR DATA COMPRESSION - For column compression of tabular data, similar type data segments are classified into classes for grouping the data segments into compression streams associated with each one of the classes. The compression streams are encoded based on a class-specific optimized encoding operation. The compression streams into one output buffer, wherein the compression streams are extracted. | 10-03-2013 |
20130262410 | DATA PREVIEWING BEFORE RECALLING LARGE DATA FILES - Techniques for providing data preview before recalling large data files are disclosed. In one aspect, a data file is made accessible while being offline by converting the data file from a native format to a preview format, storing the data file in the preview format in a primary storage that is locally available and moving, after the conversion to the preview format, the data file in the native format to a secondary storage. When a viewing request is received for the data file, the data file in the preview format is displayed to fulfill the viewing request. | 10-03-2013 |
20130262411 | FILE MAP COMPRESSION - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing file maps. In one aspect, a method includes accessing a file maintained by a file system that manages access to a block device. The file includes a plurality of active blocks associated with a respective logical block number and a respective block index. The method also includes assigning a file index to the file, analyzing the file to determine a maximum block index and a minimum block index, and identifying runs of blocks in the plurality of active blocks. Each run of blocks includes a respective start block. For each of the runs of blocks, the method includes identifying a respective length. For each start block, the method includes generating a file map entry for each start block. The method also includes storing the file map entries in a file map. | 10-03-2013 |
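The run-identification step in entry 20130262411 might look like the sketch below. It assumes that a "run" means consecutive block indexes mapped to consecutive logical block numbers, and that a file map entry is a `(file_index, start_block_index, start_logical_block, length)` tuple; both are illustrative assumptions, as is the function name `build_file_map`.

```python
def build_file_map(file_index, blocks):
    """Collapse (block_index, logical_block_number) pairs into one entry per run.

    Each entry is (file_index, start_block_index, start_logical_block, length).
    """
    entries = []
    for block_index, lbn in sorted(blocks):
        if entries:
            _, start_index, start_lbn, length = entries[-1]
            # Extend the current run only if both numbers continue contiguously.
            if block_index == start_index + length and lbn == start_lbn + length:
                entries[-1] = (file_index, start_index, start_lbn, length + 1)
                continue
        # Otherwise this block is the start block of a new run.
        entries.append((file_index, block_index, lbn, 1))
    return entries
```

A file whose blocks are mostly contiguous then needs only a handful of entries, one per start block, instead of one entry per block.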
20130262412 | Method and System For Database Transaction Log Compression On SQL Server - The present invention provides a method and system for providing database transaction log compression, where the transaction log data that is written to the transaction log file is compressed independently from the database data. In accordance with the present invention, the method and system provide for obtaining data to be written to the database transaction log file, compressing the data to be written to the database transaction log file and writing the compressed transaction log data to the database transaction log file. | 10-03-2013 |
20130262413 | INFORMATION PROCESSING DEVICES THAT MERGE FILES, INFORMATION PROCESSING METHODS FOR MERGING FILES, AND COMPUTER-READABLE MEDIA STORING INSTRUCTIONS THAT INSTRUCT INFORMATION PROCESSING DEVICES TO MERGE FILES - An information processing device transmits a request including identifying information that identifies the information processing device. The information processing device receives a first file and first location information that represents a location of a first terminal device. The information processing device receives a second file and second location information that represents a location of a second terminal device. The information processing device determines a positional relationship between the first and second terminal devices based on the first and second location information. The information processing device merges the first and second files in an arrangement based on the positional relationship between the first and second terminal devices. Some information processing devices receive a first file request and particular identifying information identifying a particular terminal device. The information processing devices transmit a second file request and a particular response including a particular file in response to receiving the first file request. | 10-03-2013 |
20130262414 | SYSTEM AND METHOD FOR MULTI-RESIDUE MULTIVARIATE DATA COMPRESSION - A method for encoding data comprising generating a table having N moduli, where N is a positive integer equal to two or more, where each of a plurality of integers has a unique set of residue values associated with the moduli. Storing or transmitting the first data field value of a sequence of L data fields values, where L is an integer equal to or greater than 2. Storing or transmitting a set of K residue values, where K | 10-03-2013 |
20130262415 | AUTOMATED ENCODING OF INCREMENT OPERATORS - In one example a method includes: receiving a first input value associated with a first data field; responsive to determining that the first data field is associated with an increment operation, selecting a second input value associated with a corresponding second data field of a previously transmitted message; comparing the first input value and second input value to determine if the first input value includes a sum of the second input value and an increment value; when the first input value includes the sum of the second input value and increment value, generating a message that omits the first input value for the first data field, and providing an operator symbol indicating the increment operation to specify that the first data field of the message is to be associated with the sum of the increment value and second input value of the second data field in the previously transmitted message. | 10-03-2013 |
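A minimal sketch of the increment-operator idea in entry 20130262415 follows. It assumes messages are dictionaries of integer fields, that the fields subject to an increment operation are listed with their increment values in `increment_fields`, and that the string `"+"` serves as the operator symbol; all of these names and representations are hypothetical.

```python
INCREMENT_OP = "+"  # assumed operator symbol standing in for an omitted value


def encode_message(fields, previous, increment_fields):
    """Replace a field value with the operator symbol when it equals previous + increment."""
    encoded = {}
    for name, value in fields.items():
        step = increment_fields.get(name)
        if step is not None and name in previous and value == previous[name] + step:
            encoded[name] = INCREMENT_OP  # value omitted; receiver recomputes it
        else:
            encoded[name] = value
    return encoded


def decode_message(encoded, previous, increment_fields):
    """Rebuild the omitted values from the previously transmitted message."""
    decoded = {}
    for name, value in encoded.items():
        if value == INCREMENT_OP:
            decoded[name] = previous[name] + increment_fields[name]
        else:
            decoded[name] = value
    return decoded
```

Both sides must hold the same previously transmitted message, so a field such as a sequence number that advances by a fixed step costs only the operator symbol on the wire.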
20130268501 | SYSTEM AND METHOD FOR MONITORING DISTRIBUTED ASSET DATA - A computer-based monitoring system and monitoring method implemented in computer software for detecting, estimating, and reporting the condition states, their changes, and anomalies for many assets. The assets are of the same type, are operated over a period of time, and are outfitted with data collection systems. The proposed monitoring method accounts for variability of working conditions for each asset by using a regression model that characterizes asset performance. The assets are of the same type but not identical. The proposed monitoring method accounts for asset-to-asset variability; it also accounts for drifts and trends in the asset condition and data. The proposed monitoring system can perform distributed processing of massive amounts of historical data without discarding any useful information where moving all the asset data into one central computing system might be infeasible. The overall processing includes distributed preprocessing of data records from each asset to produce compressed data. | 10-10-2013 |
20130268502 | DATA MANAGEMENT APPARATUS AND DATA MANAGEMENT METHOD - A data management method is disclosed. The data management method includes receiving input image data including a plurality of frames, sorting a type of a frame included in the input image data, and erasing one or more I-frames among the plurality of frames included in the input image data or erasing at least a portion of data corresponding to the one or more I-frames among the plurality of frames included in the input image data. Thus, the data management method stores a low amount of image data in a limited storage space while minimizing loss of the image data, thereby effectively storing and managing data. | 10-10-2013 |
20130275396 | Systems and Methods for Selecting Data Compression for Storage Data in a Storage System - Storage systems and methods to improve space saving from data compression by providing a plurality of compression processes, and optionally, one or more parameters for controlling operation of the compression processes and selecting from the plurality of compression processes and the parameters to satisfy resource limits, such as CPU usage and memory usage. In one embodiment, the method takes into account the content type, such as a text file or video file, and selects the compression process and parameters that provide the greatest space savings for that content type while also remaining within a defined resource-usage limit. | 10-17-2013 |
20130275397 | TABLE BOUNDARY DETECTION IN DATA BLOCKS FOR COMPRESSION - Data is converted into a minimized data representation using a suffix tree by sorting data streams according to symbolic representations for building table boundary formation patterns. The converted data is fully reversible for reconstruction while retaining minimal header information. | 10-17-2013 |
20130275398 | CLOUD SERVICE ENABLED TO HANDLE A SET OF FILES DEPICTED TO A USER AS A SINGLE FILE IN A NATIVE OPERATING SYSTEM - Systems and methods enabling file actions to be performed on a folder structure in a cloud-based service are disclosed. In one aspect, embodiments of the present disclosure include a method, which may be implemented on a system, for representing the folder structure in a user interface to the cloud-based service as a file and enabling file actions to be performed on the file representing the folder structure in the user interface to the cloud-based service. In one embodiment, the folder structure and associated content is stored on a server which provides the cloud-based service in a compressed file format which is able to preserve the metadata associated with the folder structure which indicates its representation as the file in the user interface. | 10-17-2013 |
20130275399 | TABLE BOUNDARY DETECTION IN DATA BLOCKS FOR COMPRESSION - Data is converted into a minimized data representation using a suffix tree by sorting data streams according to symbolic representations for building table boundary formation patterns. The converted data is fully reversible for reconstruction while retaining minimal header information. | 10-17-2013 |
20130275400 | DATA CORESET COMPRESSION - An approach to compression of a large (n points or samples) data set has a combination of one or more desirable technical properties: for a desired level of accuracy (ε), the number of compressed points (a “coreset”) representing the original data is O(log n); the level of accuracy comprises a guaranteed bound expressed as a multiple of the error of an associated line simplification of the data set; for a desired level of accuracy and a complexity (e.g., number k of optimal line segments) of the associated line simplification, the computation time is O(n); and for a desired level of accuracy (ε) and a complexity of the associated line simplification, the storage required for the computation is O(log n). | 10-17-2013 |
20130282677 | DATA COMPRESSION SYSTEM FOR DNA SEQUENCE - The present invention discloses a data compression system for DNA sequence, which is a lossless compression system for DNA sequence data, based on the MA-ARV codebook, which is able to search the approximate repeat fragment of the MA-ARV code vector in the whole sequence, and use a heuristic optimization algorithm of memetic algorithm to optimize the construction process of the compressed codebook, so as to fully use the repeat nature of DNA sequence data, and eliminate the redundancy effectively. | 10-24-2013 |
20130290281 | STORAGE APPARATUS AND DATA MANAGEMENT METHOD - The processing load when rewriting portions of compressed data is alleviated. | 10-31-2013 |
20130290282 | Logless Atomic Data Movement - A system and method of logless atomic data movement. An internal transaction is started within a multi-level storage architecture, the internal transaction to merge data from the first level storage structure to the second level storage structure. Committed data is read from a first level storage structure of the multi-level storage architecture as specified by the internal transaction. The committed data from the first level storage structure is inserted into a second level storage structure in a bulk insertion process, and the committed data is marked as being deleted from the first level storage. The internal transaction is then committed to the multi-level storage architecture when the committed data has been inserted into the second level storage structure. | 10-31-2013 |
20130297573 | Character Data Compression for Reducing Storage Requirements in a Database System - A system, method, and computer program product for character data compression for reducing data storage requirements in a database system are described. Embodiments include identifying data of a particular character type in a full data page, and identifying usage frequency of each character of the particular character type. Each character is encoded based on the identified usage frequency and stored, so that storage requirements for the most frequently used characters are reduced. | 11-07-2013 |
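Entry 20130297573 does not name the encoding it uses. As one possible illustration of frequency-based character encoding, the sketch below swaps in classic Huffman coding, which gives the most frequently used characters the shortest codes. The function names and the bit-string representation are assumptions made for readability.

```python
import heapq
from collections import Counter


def build_codes(text):
    """Build a prefix-free code table in which frequent characters get shorter codes."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {char: code built so far}).
    heap = [(count, i, {ch: ""}) for i, (ch, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    """Encode text as a string of '0'/'1' characters."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a bit string produced by encode()."""
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)
```

A real database page would pack these bits into bytes and store the code table alongside the page; the sketch keeps bit strings as text to stay short.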
20130297574 | METHOD AND APPARATUS FOR COMPRESSING THREE-DIMENSIONAL POINT CLOUD DATA - A method and apparatus for compressing three-dimensional point cloud data is disclosed. In one aspect, a method for compressing a three-dimensional point cloud includes the steps of retrieving three-dimensional point cloud data; providing one or more grids to the three-dimensional point cloud data; assigning one binary digit to each three-dimensional grid voxel containing said point cloud data and assigning the other binary digit to each three-dimensional grid voxel that does not have said point cloud data; converting the three-dimensional grid into two-dimensional tiles; and storing information of a plurality of binary strings in said two-dimensional tiles. In one embodiment, the step of storing information of a plurality of binary strings in said two-dimensional tiles includes a step of storing the number of repeating times of each binary digit in the binary strings. The method can significantly reduce memory space while preserving small details of the point cloud. | 11-07-2013 |
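The steps listed in entry 20130297574 (voxel grid, one binary digit per voxel, two-dimensional tiles, repeat counts) can be sketched as below. The sketch assumes a single uniform grid with cubic cells, treats each z-slice as one two-dimensional tile, and flattens each tile row by row before run-length encoding; these choices and all function names are illustrative assumptions.

```python
def voxelize(points, cell, dims):
    """Mark each voxel 1 if it contains at least one point, else 0.

    Returns grid[k][j][i]; each grid[k] is one 2D tile (a z-slice).
    """
    nx, ny, nz = dims
    grid = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for x, y, z in points:
        i, j, k = int(x // cell), int(y // cell), int(z // cell)
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            grid[k][j][i] = 1
    return grid


def run_lengths(bits):
    """Store each binary digit together with the number of times it repeats."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return [(bit, count) for bit, count in runs]


def compress(points, cell, dims):
    """Voxelize the cloud, then run-length encode each tile row by row."""
    grid = voxelize(points, cell, dims)
    return [run_lengths([bit for row in tile for bit in row]) for tile in grid]
```

Point clouds are usually sparse, so most tiles reduce to a few long runs of the "empty" digit, which is where the memory saving comes from.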
20130297575 | DATA MANAGEMENT SYSTEMS AND METHODS USING COMPRESSION - The present disclosure is directed to systems and methods for providing fast and efficient data compression using a combination of content dependent, content estimation, and content independent data compression. In one aspect of the disclosure a method for compressing data comprises the steps of: analyzing a data block of an input data stream to identify a data type of the data block, the input data stream comprising a plurality of disparate data types; performing content dependent data compression on the data block, if the data type of the data block is identified; performing content estimation data compression if the content is estimable; and performing content independent data compression on the data block, if the data type of the data block is not identified or estimable. In another aspect of the present invention LZDR compression is applied to simultaneously perform one method of compression while computing statistics useful in estimating the optimal form of compression to be applied. | 11-07-2013 |
20130311435 | MINIMIZATION OF SURPRISAL DATA THROUGH APPLICATION OF HIERARCHY FILTER PATTERN - A method, computer product, and computer system of minimizing surprisal data comprising: at a source, reading and identifying characteristics of a genetic sequence of an organism; receiving an input of rank of at least two identified characteristics of the genetic sequence of the organism; generating a hierarchy of ranked, identified characteristics based on the rank of the at least two identified characteristics of the genetic sequence of the organism; comparing the hierarchy of ranked, identified characteristics to a repository of reference genomes; and if at least one reference genome from the repository matches the hierarchy of ranked, identified characteristics, breaking the matched reference genomes into pieces, combining pieces associated with the identified characteristics from at least one matched reference genome to form a filter pattern to be compared to the nucleotides of the genetic sequence of the organism, to obtain differences and create surprisal data. | 11-21-2013 |
20130332428 | Online and Workload Driven Index Defragmentation - The subject disclosure is directed towards defragmenting one or more ranges of a database index based upon actual usage statistics and policy. A range tracker tracks and uses statistics corresponding to actual I/O operations to determine whether the benefit of defragmenting a range sufficiently (based upon the policy) exceeds its cost. If so, the online range defragmenter automatically defragments the range in an online manner. The range tracker may be configurable to monitor less than all ranges of the index. | 12-12-2013 |
20130339321 | METHOD, SYSTEM, AND COMPUTER-READABLE MEDIUM FOR PROVIDING A SCALABLE BIO-INFORMATICS SEQUENCE SEARCH ON CLOUD - The present invention relates to a computer-implemented method, system and computer readable medium for providing a scalable bio-informatics sequence search on cloud. The method comprises the steps of partitioning a genome data into a plurality of datasets and storing the plurality of data sets in a database. Receiving at least one sequence search request input and searching for a genome sequence in the database corresponding to the search request input and scaling of the sequence search based on the sequence search request input. | 12-19-2013 |
20130339322 | REDUCING DECOMPRESSION LATENCY IN A COMPRESSION STORAGE SYSTEM - In a compression processing storage system, using a pool of compression cores, the compression cores are assigned to process either compression operations, decompression operations, or decompression and compression operations, which are scheduled for processing. A maximum number of the compression cores are set for processing only the decompression operations, thereby lowering a decompression latency. A minimal number of the compression cores are allocated for processing the compression operations, thereby increasing compression latency. Upon reaching a throughput limit for the compression operations that causes the minimal number of the plurality of compression cores to reach a busy status, the minimal number of the plurality of compression cores for processing the compression operations is increased. | 12-19-2013 |
20130339323 | METHODS AND SYSTEMS FOR ENCODING/DECODING FILES AND TRANSMISSIONS THEREOF - In one embodiment, the instant invention includes a computer system that includes at least the following components: a) a first computer that performs, in concurrent manner, at least the following tasks: dividing a computer file into a plurality of segments, compressing segments, and sending the compressed segments to a second computer over a network; b) the second computer that performs, in concurrent manner, at least the following tasks: decompressing the compressed segments and assembling the decompressed segment to reconstruct the computer file, where the compressing task performed by the first computer and the decompressing task performed by the second computer are synchronized and performed concurrently. | 12-19-2013 |
20130339324 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Methods and systems for transforming a logical data object for storage in a storage device configured to operate with at least one storage protocol. One method comprises creating in the storage device a transformed logical data object comprising one or more allocated storage sections with a predefined size and receiving one or more data chunks corresponding to the transformed logical data object. The method further comprises determining if each received data chunk comprises a predefined criterion, transforming each data chunk that comprises the predefined criterion, maintaining each data chunk in raw form that does not comprise the predefined criterion, and sequentially storing each transformed data chunk and data chunk in raw form into said one or more allocated storage sections in accordance with an order said transformed data chunks and data chunks in raw form are received. One system comprises a processor configured to perform the above method. | 12-19-2013 |
20130346378 | MEMORY COMPACTION MECHANISM FOR MAIN MEMORY DATABASES - The present invention extends to methods, systems, and computer program products for performing memory compaction in a main memory database. The main memory database stores records within pages which are organized in doubly linked lists within partition heaps. The memory compaction process uses quasi-updates to move records from a page to be emptied to an active page in a partition heap. The quasi-updates create a new version of the record in the active page, the new version having the same data contents as the old version of the record. The creation of the new version can be performed using a transaction that employs wait-for dependencies to allow the old version of the record to be read while the transaction is creating the new version, thereby minimizing the effect of the memory compaction process on other transactions in the main memory database. | 12-26-2013 |
20130346379 | STREAMING DYNAMICALLY-GENERATED ZIP ARCHIVE FILES - A method and system for streaming dynamically generated Zip archive file content using a standard, non-streaming Zip archive format. In response to a request from a client to receive one or more files, a Zip archive file is dynamically generated that includes at least one file that is altered while servicing the request, wherein the size of the altered file is unknown prior to completion of the alteration operation. For a Zip file entry corresponding to an altered file, a local file header including an overestimated file size and predetermined CRC32 value is generated. After alteration, the file entry content is adjusted using padding and a CRC32 adjustment such that the length and CRC32 values for the resulting Zip file entry match the overestimated file size and predetermined CRC32 value. Examples of file alteration operations include watermarking, compressing, and/or encrypting the file content. | 12-26-2013 |
20140006364 | MEDIA STREAM INDEX MERGING | 01-02-2014 |
20140006365 | MINIMIZATION OF EPIGENETIC SURPRISAL DATA OF EPIGENETIC DATA WITHIN A TIME SERIES | 01-02-2014 |
20140012824 | DATA FORMAT FOR WEBSITE TRAFFIC STATISTICS - A data format is optimized for storing data such as website traffic data. The data format enables easy access to and filtering of data, for example in generating website traffic reports. The data format also provides significant data compression. A method for generating a data file according to the data format employs linear compression and indexing to efficiently store the data. Data stored according to the format can be easily retrieved, particularly when a known value is specified and particular entries matching the known value are sought. | 01-09-2014 |
20140032509 | ACCELERATED ROW DECOMPRESSION - A method comprises streaming one or more pages of a database to a hardware accelerator, extracting one or more rows from each of the one or more pages of the database, determining whether a given one of the extracted rows is compressed, decompressing the given one of the extracted rows responsive to the determination and outputting the decompressed row. The decompressing step is performed in the hardware accelerator. The hardware accelerator may be a field-programmable gate array. The method allows for hardware accelerated row decompression. | 01-30-2014 |
20140040213 | AGGREGATING DATA IN A MEDIATION SYSTEM - Records received from one or more sources in a network are processed. For each of multiple intervals of time, a matching procedure is attempted on sets of one or more records, including comparing identifiers associated with different records to generate the sets and determining whether or not a completeness criterion is satisfied for one or more of the sets. The processing also includes, for at least some of the intervals of time, processing at least one complete set, consisting of one or more of the received records on which the matching procedure is first attempted during the interval of time and one or more records stored in a data store before the interval of time, and for at least some of the intervals of time, processing at least one incomplete set, consisting of one or more records stored in the data store before the interval of time. | 02-06-2014 |
20140040214 | Entropy Coding and Decoding Using Polar Codes - Technologies are described herein for compressing or decompressing data using polar codes. Some example technologies may receive a data string comprising a first set of symbols. The technologies may transform the data string into a generalized message comprising a second set of symbols by mapping the data string to the generalized message via an inverse of a transformation function. The technologies may identify, based on a polar code, fixed symbols of the generalized message. The technologies may generate a compressed data string by extracting the fixed symbols from the generalized message and concatenating the fixed symbols into the compressed data string. As a result, the generalized message may be transformed into the compressed data string. | 02-06-2014 |
20140040215 | METHOD FOR ENCODING A MESH MODEL, ENCODED MESH MODEL AND METHOD FOR DECODING A MESH MODEL - Many 3D mesh models have a large number of small connected components that are repeated in various positions, scales and orientations. The respective positions are defined by the position of at least one reference point per component. For an enhanced encoding of the positions of the respective reference points, a given space is divided into segments and the number of points lying in each particular segment is determined. When a cell with at least n points is subdivided into child cells, an indication is added indicating if all points of a parent are in only one child cell. If so, the index of the only non-empty child node is encoded, while otherwise the number of points in one of the two child cells is decremented and encoded. The invention avoids non-effective subdivisions of a cell, and therefore improves the compression efficiency. | 02-06-2014 |
20140052700 | Delta Version Clustering and Re-Anchoring - A system, a method, and a computer program product for delta version clustering and re-anchoring are provided. A first anchor having a plurality of delta-compressed versions of data dependent on the first anchor is generated. The first anchor and the plurality of delta-compressed versions form a cluster. A second anchor is generated. The first anchor is replaced with the second anchor. The replacing includes re-computing at least one delta-compressed version in the plurality of delta-compressed versions to be dependent on the second anchor. The second anchor replaces the first anchor as an anchor of the cluster. | 02-20-2014 |
20140052701 | SYSTEM AND METHOD FOR COMPRESSING PRODUCTION DATA STREAM AND FILTERING COMPRESSED DATA WITH DIFFERENT CRITERIA - Production data are streamed by a shop floor (a field) of a plant towards a data compression processor inside a MES/ERP server. The data stream is segmented in field data intervals of variable duration, each one carrying a tag composed of initial timespan s°, final timespan e°, and the variation v° undergone by the monitored variable. The processor takes a first incoming tag and calculates a data compression interval of constant duration y which is a function of e°, then it creates a vector [s°, e°, v°, m=v°, n=e°−s°]. Until the incoming tags fall into the current compression interval, subsequent variations v° are summed up and subsequent s° and e° updated, obtaining an updated vector [s, e, v, m, n], otherwise the compression vector is stored in a SQL database and a new compression interval entered. | 02-20-2014 |
20140059021 | FORMAT IDENTIFICATION FOR FRAGMENTED IMAGE DATA - Format identification for fragmented data is disclosed. In some embodiments, an input stream of information that includes a continuity property is received. A format identifier of at least a portion of the stream is determined, wherein the format identifier includes a data representation size, a group size, and an alignment that is consistent with the continuity property. The stream of information is compressed using a compression technique selected based on the format identifier to produce a compressed stream, and the compressed stream is stored. | 02-27-2014 |
20140059022 | FORMAT IDENTIFICATION FOR FRAGMENTED IMAGE DATA - Format identification for fragmented data is disclosed. In some embodiments, an input stream of information that is divided into fragments is received. Fragment boundaries are determined and a data format for each fragment is found based on continuity properties including by: dividing the stream of information into windows, determining whether each window has a known or unknown format; and comparing portions of windows having an unknown format with neighboring windows to determine fragment boundaries. The stream of information is compressed using a compression technique selected based on the data format, and the compressed stream is stored. | 02-27-2014 |
20140067777 | COMPRESSION OF TIMING DATA OF DATABASE SYSTEMS AND ENVIRONMENTS - Timing data associated with a database or database system can be stored in a reduced or compressed form which can be decompressed back to a full or original form. In doing so, timing data can be compressed by using a subset of a full set of possible values (e.g., a determined range which is more likely to occur) instead of using a full set of possible values. Timing data can also be compressed by eliminating redundant, insignificant duplicate and/or common values, for example, between one or more components (e.g., start and end times of a period of time) of the timing data. | 03-06-2014 |
20140074805 | STORING COMPRESSION UNITS IN RELATIONAL TABLES - A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user. | 03-13-2014 |
20140081929 | NEARSTORE COMPRESSION OF DATA IN A STORAGE SYSTEM - A storage server is configured to receive a request to store a data block from a client. The request to store the data block is serviced by the storage server by compressing the data block into a compression group, which includes a number of compressed data blocks. The storage server stores the compression group in a non-volatile memory and flushes the compression group from the non-volatile memory to a physical storage device in response to reaching a consistency point. By compressing data to be stored in system memory of a storage server, the amount of data that can be processed during a given time period by a data storage system is increased. Furthermore, an increase in performance can be achieved at a lower cost, since the cost of additional physical system memory modules can be avoided. | 03-20-2014 |
20140081930 | COMPRESSION SCHEME FOR IMPROVING CACHE BEHAVIOR IN DATABASE SYSTEMS - The apparatuses and methods described herein may operate to identify, from an index structure stored in memory, a reference minimum bounding shape that encloses at least one minimum bounding shape. Each of the at least one minimum bounding shape may correspond to a data object associated with a leaf node of the index structure. Coordinates of a point of the at least one minimum bounding shape may be associated with a set of first values to produce a relative representation of the at least one minimum bounding shape. The set of first values may be calculated relative to coordinates of a reference point of the reference minimum bounding shape such that each of the set of first values comprises a first number of significant bits fewer than a second number of significant bits representing a second value associated with a corresponding one of absolute coordinates of the point. | 03-20-2014 |
20140089276 | SEARCH UNIT TO ACCELERATE VARIABLE LENGTH COMPRESSION/DECOMPRESSION - Systems and methods to accelerate compression and decompression with a search unit implemented in the processor core. According to an embodiment, a search unit may be implemented to perform compression or decompression on an input stream of data. The search unit may use a look-up table to identify appropriate compression or decompression symbols. The look-up table may be populated with a table derived using the variable length coding symbols of a sequence of vertices to be compressed or extracted from a received data stream to be decompressed. A comparator and a finite state machine may be implemented in the search unit to facilitate traversal of the look-up table. | 03-27-2014 |
20140089277 | Method and Device for Compressing, Decompressing and Querying Document - A method for processing an XML document with a schema includes extracting structure content and data content of an XML document, determining path coding of a node in the structure content, and determining data content corresponding to the node according to a pre-stored preorder of the node, wherein the path coding of the node identifies a storage position of the node in the structure content through the node and other nodes in the structure content, and compressing respectively the node, the path coding of the node and the data content. | 03-27-2014 |
20140101116 | ROBUST TRANSMISSION OF DATA UTILIZING ENCODED DATA SLICES - A method begins by a processing module concurrently encoding a collection of data segments to produce sets of encoded data slices, where each set includes a total number of encoded data slices and where a decode threshold number of encoded data slices is required to recover a corresponding data segment. The method continues with the processing module determining a transmit number to be initially greater than the decode threshold number and less than the total number. The method continues with the processing module selecting a transmit number of encoded data slices from each set of encoded data slices to produce sets of transmit encoded data slices. The method continues with the processing module randomizing ordering of the sets of transmit encoded data slices to produce a random order of encoded data slices and transmitting encoded data slices of the random order of encoded data slices. | 04-10-2014 |
20140108360 | COMPRESSED NAVIGATION MAP DATA - A method for generating a compressed navigation map database from uncompressed navigation map data, wherein the uncompressed navigation map data contains different building blocks of navigation data, each building block addressing a functional aspect of the navigation data, each block containing strings of data. The method includes determining, for each block of the uncompressed navigation map data, most frequent substrings of the block; storing, for each block, the determined most frequent substrings of the block in a seed block; replacing, for each block, in the strings the determined most frequent substrings stored in the seed block by a reference to the seed block thereby generating a compressed block for each block; and storing, for each block, the compressed block and the seed block in order to generate the compressed navigation map database. | 04-17-2014 |
20140108361 | METHOD AND APPARATUS FOR PROVIDING LOCATION TRAJECTORY COMPRESSION BASED ON MAP STRUCTURE - An approach is provided for compressing location trajectories based on map structure. A compression platform causes, at least in part, a mapping of at least one location trajectory to at least one map to determine one or more intersections traveled along the at least one location trajectory. The compression platform further determines at least one compression key based, at least in part, on one or more outgoing roads of the one or more intersections. The compression platform also causes, at least in part, a compression of the at least one location trajectory based, at least in part, on the at least one compression key. | 04-17-2014 |
20140108362 | DATA COMPRESSION APPARATUS, DATA COMPRESSION METHOD, AND MEMORY SYSTEM INCLUDING THE DATA COMPRESSION APPARATUS - Provided are a data compression method, a data compression apparatus, and a memory system. The data compression method includes receiving input data and generating a hash key for the input data, searching a hash table with the generated hash key, and if it is determined that the input data is a hash hit, compressing the input data using the hash table; and searching a cache memory with the input data, and if it is determined that the input data is a cache hit, compressing the input data using the cache memory. | 04-17-2014 |
20140108363 | Encoding/Decoding Processing Method, Encoder/Decoder and Terminal - Embodiments of the present invention provide an encoding/decoding processing method, an encoder/decoder, and a terminal. The method includes determining whether a field in a file is a locally recognizable field. If the field is not a locally recognizable field, the method includes acquiring a field display identifier of the field, where the field display identifier is a content identifier of the field displayed by an application to a user, and decoding the field according to the field display identifier. According to the technical solutions in the embodiments of the present invention, it can be ensured that data is not lost in a process of exchanging a VCard file, thereby improving efficiency of exchanging the VCard file. | 04-17-2014 |
20140108364 | DATABASE COMPRESSION SYSTEM AND METHOD - A database compression system includes an analyzer, a counting engine, and a mapping engine. The analyzer analyzes a schema of a database by maintaining a list of attributes and corresponding values. The analyzer also analyzes a selection of entries in the database. The counting engine determines a frequency of occurrence of each attribute/value pair in the selection of entries. The mapping engine assigns a condensed code to a character string determined on the basis of the attribute/value pair with a highest frequency of occurrence. | 04-17-2014 |
20140114935 | METHOD OF COMPRESSING TEST FILE - A compression method for compressing an original test file is disclosed. The compression method includes the following steps: defining type modules; scanning the original test file line by line in bytes and matching data of the original test file with the type modules to determine types of the data; compressing continuous data of the same type in lines and representing each compressed portion with a thumbnail. The compression method enables a browser to read test files with a fast speed by compressing test files according to the types of data. | 04-24-2014 |
20140114936 | METHOD FOR GENERATING DATA IN STORAGE SYSTEM HAVING COMPRESSION FUNCTION - The present invention aims at improving the performance of a compression function in a storage system, and solves the prior art problem of having to decompress a whole compression unit even if a read request or a write request targets only a portion smaller than the compression unit, causing increase of overhead of decompression processing and elongation of processing time, and deteriorating performance. The present invention prevents unnecessary decompression processing and reduces the overhead of processing by suppressing the range of decompression processing to a minimum portion according to the relationship between the read/write request range and the compression unit. | 04-24-2014 |
20140114937 | METHOD TO SHORTEN HASH CHAINS IN LEMPEL-ZIV COMPRESSION OF DATA WITH REPETITIVE SYMBOLS - An apparatus having a circuit is disclosed. The circuit may be configured to (i) generate a sequence of hash values in a table from a stream of data values with repetitive values, (ii) find two consecutive ones of the hash values in the sequence that have a common value and (iii) create a shortened hash chain by generating a pointer in the table at an intermediate location that corresponds to a second of the two consecutive hash values. The pointer generally points forward in the table to an end location that corresponds to a last of the data values in a run of the data values. | 04-24-2014 |
20140114938 | DATA COMPRESSION APPARATUS AND METHOD - A data compression apparatus generates a global symbol table for an overlapping data using a part of the entire data to be compressed and a local symbol table that is not overlapped with the global symbol table, and compresses data with a block as a unit. The apparatus increases compression efficiency. | 04-24-2014 |
20140122451 | SYSTEM AND METHOD FOR PREVENTING DUPLICATE FILE UPLOADS FROM A MOBILE DEVICE - A method and system for preventing duplicate file uploads in a remote content management system is described. The user device receives a hash value list associated with the files stored in the remote content management system. The user device calculates a hash value associated with new files to be uploaded. The system then compares the hash value(s) associated with the new file(s) to be uploaded with the hash value list received from the remote file storage system. If the hash values of any of the new files to be uploaded match a hash value on the hash value list, then the system prevents the new files from being uploaded to the remote file storage system. | 05-01-2014 |
20140122452 | UNIFIED TABLE QUERY PROCESSING - A system and method of query processing in a multi-level storage system having a unified table architecture. A query is received by a common query execution engine connected with the unified table architecture, the query specifying a data record. The common query execution engine performs a look-up for the data record based on the query at the first level storage structure. If the data record is not present at the first level storage structure, the common query execution engine performs separate look-ups in each of the second level storage structure and the main store. | 05-01-2014 |
20140129529 | Storing Data Files in a File System - A mechanism is provided for storing data files in a file system. The file system provides a plurality of reference data files, where each reference data file in the plurality of data files represents a group of similar data files. The mechanism creates a new data file and associates the new data file with one reference data file in the plurality of data files thus defining an associated reference data file of the plurality of reference data files. The mechanism informs the file system about the association of the new data file with the associated reference data file. The mechanism compresses the new data file using the associated reference data file thereby forming a compressed data file. The mechanism stores the compressed data file together with information about the association of the new data file with the associated reference data file. | 05-08-2014 |
20140129530 | SYSTEM, METHOD AND DATA STRUCTURE FOR FAST LOADING, STORING AND ACCESS TO HUGE DATA SETS IN REAL TIME - A computerized system including a processor and a computer-readable non-transient memory in communication with the processor, the memory storing instructions that when executed manage a novel data structure and related group of algorithms that can be used as a method for representing a set and as a base for very efficient indexing, hash and compression. SHB is an improvement of hierarchical bitmap. An improved database system that can utilize the innovative data structure which includes a raw data stream provided to the system via a data processing module, data blocks, fields indexes tables and a keys table. There is provided an index creating process and a columns creating process, for transforming the data blocks and tables into index blocks and data columns. | 05-08-2014 |
20140136492 | SYSTEMS AND METHODS FOR LOSSLESS COMPRESSION OF IMAGE COLOR PROFILES - Techniques to allow for accurate color representation of images stored within and delivered by a social networking system. In an embodiment, a match between at least a portion of a longest tag value from a plurality of tag values and a subsequence of a tagged element data string in a tag-based file associated with an image is identified. The tagged element data string and a tag table are optimized based on the match. | 05-15-2014 |
20140136493 | METHOD AND APPARATUS FOR MANAGING STORAGE SPACE ON STORAGE DEVICE IN AN ELECTRONIC APPARATUS BY USING CONTEXT DATA AND USER PROFILE DATA - A method and apparatus for reserving a usable storage space on a storage device is provided. The method includes collecting context data representing an environment surrounding the storage device; selecting at least one file from among files stored in the storage device by using at least one of the context data and user profile data; and processing the selected file and reserving a usable storage space on the storage device. The method reserves the usable storage space by using the context data or user profile data, thereby allowing efficient reserving of usable storage space without a user's manual intervention and preventing waste of unnecessary resources. | 05-15-2014 |
20140149367 | Row Identification Column Authorization Profiles - Tables in a database can include an internal RowID column. For each new row or new version of a row in the table, a new RowID can be assigned and stored in the RowID column. RowID values can be stored using either or both of range compression and block compression, or other compression approaches. In response to receipt of a query of the database table, at least one of a forward look up and a reverse lookup of a DocID value associated with a specific RowID value can be performed. | 05-29-2014 |
20140156608 | EFFICIENCY OF COMPRESSION OF DATA PAGES - A system includes a processor executing code to compress a first page of data stored in memory and calculate an effectiveness of the compression on the first page. The processor further, in response to the calculated compression effectiveness being at least equal to a pre-determined/pre-established compression effectiveness threshold: identifies a plurality of second pages of data from memory that have similarities in content with the first page; and sequentially performs subsequent compressions of second pages from among the plurality of second pages in an order that is based on a relative ranking of the plurality of second pages. The ranking of the second pages is according to a calculated differential parameter associated with each of the second pages, which indicates a level of similarity that exists between the first page and a corresponding second page. Higher ranked second pages are compressed ahead of lower rank second pages, yielding greater compression efficiency. | 06-05-2014 |
20140156609 | DATABASE TABLE COMPRESSION - Embodiments relate to table compression in a database. The database is organized in tables including rows and columns. An aspect includes defining a range partition of a table of the database according to a first attribute of the table. Internal ranges of the table of the database are defined according to a second attribute of the table. A target internal range of the internal ranges is determined to insert a row as a new entry into the table. A determination is made as to whether an internal range compression directory exists for the target internal range. Based on determining that no internal range compression directory exists for the target internal range and a predefined threshold value of a number of rows is exceeded in the target internal range, the internal range compression directory for the target internal range is created. | 06-05-2014 |
20140156610 | SELF-GOVERNED CONTENTION-AWARE APPROACH TO SCHEDULING FILE DEFRAGMENTATION - A method, system, and computer program product for file storage defragmentation on a cluster of nodes. The method for self-governed, contention-aware scheduling of file defragmentation operations commences by calculating a score for candidate files of a storage volume, where the score is based on a fragmentation severity value. The process proceeds to determine an amount of contention for access to a candidate file (e.g., by accessing the candidate file to record the amount of time it takes to obtain access). If the fragmentation severity value and the amount of contention suggest a benefit from defragmentation, then the method initiates defragmentation operations on the candidate file. The method delays for a calculated wait time before performing a second defragmentation operation. Real-time monitors are used to determine when the contention is too high or when system utilization is too high. Only files that have ever been opened are considered candidates for defragmentation. | 06-05-2014 |
20140156611 | EFFICIENCY OF COMPRESSION OF DATA PAGES - A method includes compressing a first page of data stored in memory and calculating an effectiveness of the compression on the first page. The method further includes, in response to the calculated compression effectiveness being at least equal to a pre-determined/pre-established compression effectiveness threshold: identifying a plurality of second pages of data from memory that have similarities in content with the first page; and sequentially performing subsequent compressions of second pages from among the plurality of second pages in an order that is based on a relative ranking of the plurality of second pages. The ranking of the second pages is according to a calculated differential parameter associated with each of the second pages, which indicates a level of similarity that exists between the first page and a corresponding second page. Higher ranked second pages are compressed ahead of lower rank second pages, yielding greater compression efficiency. | 06-05-2014 |
20140156612 | PREPARING LC/MS DATA FOR CLOUD AND/OR PARALLEL IMAGE COMPUTING - Functionality is described for data management and querying LC/MS spectrometry data, therefore making it easier to store, retrieve, transfer, and process the mass spectrometry data. The functionality transforms a plurality of raw LC/MS files obtained from a biological experiment into a set of LC/MS images on a common M/Z and RT grid compatible for image processing (e.g., time alignment, peak detection and quantification, differential analysis, etc.). The functionality then splits large LC/MS images into smaller chunks, therefore making parallel querying and processing easier using cloud or high performance computing systems. | 06-05-2014 |
20140156613 | Methods and Apparatus for Increasing the Efficiency of Electronic Data Storage and Transmission - An electronic data storage and transmission system. A plurality of electronic data objects may be associated to a plurality of electronic data indicators, and the associations may be combined. Contextual awareness of a second location may allow generation of streamlined electronic data objects. Electronic spatial data objects may be automatically contiguously combined and compression may be leveraged with combination efficiencies. Combinations of electronic data objects may be threshold limited. Transmission of electronic data may achieve effective compression and effective transmission rates exceeding a benchmark network transmission rate of an electronic data communications network. | 06-05-2014 |
20140172806 | SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING DATA MASKING VIA COMPRESSION DICTIONARIES - In accordance with disclosed embodiments, there are provided methods, systems, and apparatuses for implementing data masking via compression dictionaries including, for example, means for receiving customer data at the host organization; compressing the customer data using dictionary based compression and a compression dictionary; storing the compressed customer data in a database of the host organization; retrieving the compressed customer data from the database of the host organization; and de-compressing the compressed customer data via a masked compression dictionary, in which the masked compression dictionary de-compresses the customer data into masked customer data. Other related embodiments are disclosed. | 06-19-2014 |
20140188820 | TECHNIQUES FOR FINDING A COLUMN WITH COLUMN PARTITIONING - Techniques for finding a column with column partitioning are provided. Metadata for a container row is expanded to include information for searching ranges of partitioned column values. The metadata identifies offsets to specific ranges and specific columns within a specific range. The offsets also identify where compressed data for a desired column resides, thereby permitting partitioned columns having compressed data to be located without being decompressed and to be decompressed on demand as needed. | 07-03-2014 |
20140188821 | Method and System to Avoid Space Bloating During Run-Time Compression - Methods, systems, and computer program products are provided to manage a database system. The method includes locking during a database system idle time access by the database system to a data page of a data allocation unit, compressing during the database system idle time a data stored in the locked data page, and recording during the database system idle time an indication that the compressed and locked data page includes free storage space, wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page. Thus, the data page may be compressed during idle time and the space freed therein may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild SQL command combination). | 07-03-2014 |
20140188822 | Efficient De-Duping Using Deep Packet Inspection - The efficiency of data de-duplication may be improved by storing related file data in a single container, or in multiple linked containers, of a history. Additionally, the efficiency of data de-duplication may be improved when shorter hash tables are used to reference historical data in a history. Shorter hash tables may be achieved by storing fewer than all the hash values obtained for a given amount of historical data. Further, the efficiency of data de-duplication may be improved by comparing related incoming file data with historical data from a container without hashing/chunking the remaining file data upon matching an earlier chunk of the incoming file data to the container. | 07-03-2014 |
20140188823 | REDUCING FRAGMENTATION IN COMPRESSED JOURNAL STORAGE - A data chunk is compressed into a storage block when emitting the data chunk. If the data chunk is unable to be completely compressed into the storage block, attributes of the data chunk are analyzed for determining whether the data chunk should be split. If the data chunk should be split, a remaining portion of the data chunk is compressed to a next chronological storage block. If the data chunk should not be split, all of the data chunk is moved to the next chronological storage block while leaving any remaining space in the storage block as unused. | 07-03-2014 |
20140188824 | REDUCING FRAGMENTATION IN COMPRESSED JOURNAL STORAGE - A data chunk is compressed into a storage block when emitting the data chunk. If the data chunk is unable to be completely compressed into the storage block, attributes of the data chunk are analyzed for determining whether the data chunk should be split. If the data chunk should be split, a remaining portion of the data chunk is compressed to a next chronological storage block. If the data chunk should not be split, all of the data chunk is moved to the next chronological storage block while leaving any remaining space in the storage block as unused. | 07-03-2014 |
20140195497 | REAL-TIME IDENTIFICATION OF DATA CANDIDATES FOR CLASSIFICATION BASED COMPRESSION - Identification of data candidates for data processing is performed in real time by a processor device in a computing environment. Data candidates are sampled for performing a classification-based compression upon the data candidates. A heuristic is computed on a randomly selected data sample from the data candidate for determining if the data candidate may benefit from the classification-based compression. A decision is provided for approving the classification-based compression on the data candidates according to the heuristic. | 07-10-2014 |
20140195498 | REAL-TIME REDUCTION OF CPU OVERHEAD FOR DATA COMPRESSION - Real-time reduction of CPU overhead for data compression is performed by a processor device in a computing environment. Non-compressing heuristics are applied on a randomly selected data sample from data sequences for determining whether to compress the data sequences. A compression potential is calculated based on the non-compressing heuristics. The compression potential is compared to a threshold value. The data sequences are either compressed if the compress threshold is matched, compressed using Huffman coding if Huffman coding threshold is matched, or stored without compression. | 07-10-2014 |
20140195499 | REAL-TIME CLASSIFICATION OF DATA INTO DATA COMPRESSION DOMAINS - For real-time classification of data into data compression domains, a decision is made for which of the data compression domains write operations should be forwarded by reading randomly selected data of the write operations for computing a set of classifying heuristics thereby creating a fingerprint for each of the write operations. The write operations having a similar fingerprint are compressed together in a similar compression stream. | 07-10-2014 |
20140195500 | REAL-TIME IDENTIFICATION OF DATA CANDIDATES FOR CLASSIFICATION BASED COMPRESSION - Identification of data candidates for data processing is performed in real time by a processor device in a computing environment. Data candidates are sampled for performing a classification-based compression upon the data candidates. A heuristic is computed on a randomly selected data sample from the data candidate for determining if the data candidate may benefit from the classification-based compression. A decision is provided for approving the classification-based compression on the data candidates according to the heuristic. | 07-10-2014 |
20140195501 | SYSTEM AND METHOD FOR FINGERPRINTING DATASETS - Systems and methods for the matching of datasets, such as input audio segments, with known datasets in a database are disclosed. In an illustrative embodiment, the use of the presently disclosed systems and methods is described in conjunction with recognizing known network message recordings encountered during an outbound telephone call. The methodologies include creation of a ternary fingerprint bitmap to make the comparison process more efficient. Also disclosed are automated methodologies for creating the database of known datasets from a larger collection of datasets. | 07-10-2014 |
20140195502 | MULTIDIMENSION COLUMN-BASED PARTITIONING AND STORAGE - A data storage system includes a storage engine to partition data across multiple dimensions. The storage engine determines chunks according to the partitioning, and performs column-based storage of the chunks. | 07-10-2014 |
20140201173 | FILE-BASED SOCIAL RECOMMENDATIONS IN A SOCIAL NETWORK - Example embodiments relate to file-based social recommendations. In example embodiments, a system may search a social network to identify uploaded files that are similar to a file from a user and then identify unconnected users that uploaded the uploaded files, where the unconnected users are not connected to the user in the social network. In response to identifying the unconnected users, the system may then communicate a user recommendation that at least one of the unconnected users be added to the social context. | 07-17-2014 |
20140201174 | ENCODING AND DECODING DELTA VALUES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for encoding and decoding delta values. In one aspect, a method includes accessing a compression buffer having a start position, a sentinel position, and a data storage region; obtaining a first value; determining that a second value stored in the sentinel position does not match a first sentinel value; determining that a third value stored in the start position matches a second sentinel value; and storing the first value at the start position of the compression buffer. | 07-17-2014 |
20140201175 | STORAGE APPARATUS AND DATA COMPRESSION METHOD - A storage apparatus includes a data storage unit, management information storage unit, compression judgment unit, and compression control unit. The data storage unit stores the data of files. The management information storage unit stores management information on the files. The compression judgment unit evaluates compression effectiveness for a file at prescribed execution timing and determines whether the compression of the file is appropriate or not. The compression control unit updates the management information so as to reflect the determination result obtained by the compression judgment unit, and then stores the compressed data of the file in a compressed format in the data storage unit if the determination result indicates that the compression is appropriate, and stores the uncompressed data of the file in an uncompressed format in the data storage unit if the determination result indicates the compression is inappropriate. | 07-17-2014 |
20140207744 | MERGING COMPRESSED DATA ARRAYS - Compressed data sets can be merged without unraveling the compressed data sets. Concatenations of vectors of a first compressed data set that extend beyond a second compressed data set with no data vectors are represented in a third compressed data set. The no data vectors represent lack of data to be contributed from the second compressed data set. The third compressed data set represents a merger of the first and the second compressed data sets. Counterpart vectors of the first and second compressed data sets are determined using compression information for the vectors. Concatenations of the counterpart vectors are represented in the third compressed data set, as well as compression information that accounts for the determined counterpart vectors. | 07-24-2014 |
20140207745 | DATA COMPRESSION ALGORITHM SELECTION AND TIERING - A data storage subsystem having a plurality of data compression engines configured to compress data, each having a different compression algorithm. A data handling system is configured to determine a present rate of access to data; select at least one sample of data; determine the greatest degree of compression of said data compression engines; determine the compression ratios of the operated data compression engines with respect to the selected sample(s); compressing said selected at least one sample with a plurality of said data compression engines at said selected tier; operate a selected data compression engines with respect to the selected sample and determine the greatest degree of compression of the data compression engines; compress the data from which the sample was selected with one of the operated data compression engines determined to have the greatest degree of compression; and store the compressed data in data storage repositories. | 07-24-2014 |
20140214779 | SYSTEM AND METHOD FOR APPLYING AN EFFICIENT DATA COMPRESSION SCHEME TO URL PARAMETERS - Disclosed are systems and methods for data compression and decompression. The systems and methods discussed herein include an encoder, dictionary, decoder, literal string and control output. The discussed systems and methods encode data transmitted over a communications channel through the use of a dynamically compiled dictionary. Upon reviewing the characters within the transmitted data in view of the dictionary, an encoded/compressed output string is created. Such output string may also be decoded in a similar fashion via a dynamically compiled dictionary. | 07-31-2014 |
20140214780 | SYSTEMS AND METHODS FOR GENETIC DATA COMPRESSION - Genetic data may be compressed efficiently by selecting for each bi-allelic marker, from among multiple compression algorithms with different associated storage requirements that depend on the minor allele frequency of the respective marker, the algorithm that has the lowest storage requirements. Efficient approaches compress, store, and load pedigree file data. A hybrid method is used that selects between multiple alternative compression algorithms whose performance depends on the frequency of certain observable genetic variations. The hybrid method may achieve higher compression ratios than PLINK or PBAT. Further, it results in a compressed data format that, generally, does not require any overhead memory space and CPU time for decompression, and, consequently, has shorter loading times for compressed files than the binary format in PLINK or PBAT. Moreover, the compressed data format supports parallel loading of genetic information, which decreases the loading time by a factor of the number of parallel jobs. | 07-31-2014 |
20140214781 | DATA COMPRESSION DEVICE, DATA COMPRESSION METHOD, AND COMPUTER PROGRAM PRODUCT - According to an embodiment, a data compression device includes a receiving unit, a generating unit, a selecting unit, and a compressing unit. The receiving unit receives input data pieces. The generating unit generates starting point candidates representing the data having an error within a threshold value with respect to starting point data input at a first timing. The selecting unit refers to the starting point candidates, end point data input at a second timing, and intermediate data input at a timing in between the first timing and the second timing; and selects the starting point candidate which, as compared to the other candidates, has a greater number of pieces of the intermediate data approximated using the starting point candidate and using the end point data in such a way that the error is within the threshold value. The compressing unit outputs the selected starting point candidate and the end point data. | 07-31-2014 |
20140236908 | METHOD AND APPARATUS FOR PROVIDING ENHANCED DATA RETRIEVAL WITH IMPROVED RESPONSE TIME - An approach for providing enhanced data retrieval with improved response time is described. An enhanced data retrieval platform may retrieve data associated with multiple accounts from multiple database records within multiple databases, wherein each database record of the multiple database records is associated with a different type of information. The enhanced data retrieval platform may further create a single database entry for each account of the multiple accounts based on the retrieved data. The enhanced data retrieval platform may also compress the single database entry. Additionally, the enhanced data retrieval platform may store the compressed single database entry in a cache. | 08-21-2014 |
20140244602 | SEMANTIC COMPRESSION OF STRUCTURED DATA - Systems and methods for the semantic compression of structured data include identifying attributes of elements in a collection structure, such as a table. The attributes may be grouped and the grouping used to consolidate attribute values used in the elements. An index of repeated attribute values may also be generated and used to replace the attribute values in elements of the structured data. | 08-28-2014 |
20140244603 | Multi-Level Memory Compression - According to one embodiment of the present disclosure, an approach is provided in which a processor selects a page of data that is compressed by a first compression algorithm and stored in a memory block. The processor identifies a utilization amount of the compressed page of data and determines whether the utilization amount meets a utilization threshold. When the utilization amount fails to meet the utilization threshold, the processor uses a second compression algorithm to recompress the page of data. | 08-28-2014 |
20140244604 | PREDICTING DATA COMPRESSIBILITY USING DATA ENTROPY ESTIMATION - The subject disclosure is directed towards predicting compressibility of a data block, and using the predicted compressibility in determining whether a data block if compressed will be sufficiently compressible to justify compression. In one aspect, data of the data block is processed to obtain an entropy estimate of the data block, e.g., based upon distinct value estimation. The compressibility prediction may be used in conjunction with a chunking mechanism of a data deduplication system. | 08-28-2014 |
20140250090 | COMPRESSION OF TABLES BASED ON OCCURRENCE OF VALUES - Methods and apparatus, including computer program products, for compression of tables based on occurrence of values. In general, a number representing an amount of occurrences of a frequently occurring value in a group of adjacent rows of a column is generated, a vector representing whether the frequently occurring value exists in a row of the column is generated, and the number and the vector are stored to enable searches of the data represented by the number and the vector. The vector may omit a portion representing the group of adjacent rows. The values may be dictionary-based compression values representing business data such as business objects. The compression may be performed in-memory, in parallel, to improve memory utilization, network bandwidth consumption, and processing performance. | 09-04-2014 |
20140258247 | ELECTRONIC APPARATUS FOR DATA ACCESS AND DATA ACCESS METHOD THEREFOR - Electronic apparatus for data access and data access method therefor are provided. The electronic apparatus includes: a memory unit and a processing unit. The processing unit includes a processor, a memory mapping unit, and a compression and decompression unit. The memory mapping unit is for performing conversion of a virtual address and a physical address for a read or write operation in the memory unit by the processor. The compression and decompression unit, coupled between the processor and memory unit, is for performing selectively data compression or decompression for the read or write operation in the memory unit by the processor. The processing unit enables the compression and decompression unit to compress data to be written and to accordingly output the corresponding compressed data to the memory unit when the processing unit determines that a compression criterion is satisfied, wherein the compression criterion includes: whether an idle rate of the processing unit is greater than a first threshold. | 09-11-2014 |
20140258248 | Delta Compression of Probabilistically Clustered Chunks of Data - The invention pertains to a method and Information Handling System (IHS) for performing delta compression on probabilistically clustered chunks of data. From a source of chunks a corresponding sketch to represent each chunk is generated. Then, from the generated sketches a subset of similar sketches is determined using a probabilistic based algorithm. Finally, delta compression is performed on the chunks which are represented by the similar sketches in the determined subset. | 09-11-2014 |
20140258249 | METHOD AND APPARATUS FOR PROVIDING COMPRESSED DATA STRUCTURE - An approach for providing a compressed data structure of data records sharing one or more common values is described. A data compression platform may process and/or facilitate a processing of a plurality of data records to determine one or more repeating values common across the plurality of data records. The data compression platform may also cause, at least in part, a storage of the one or more repeating values in at least one header record of a data structure. The data compression platform may further cause, at least in part, a storage of one or more non-repeating values of the plurality of data records in respective one or more point records of the data structure associated with the at least one header record. | 09-11-2014 |
20140279959 | OLTP COMPRESSION OF WIDE TABLES - A data block stores one or more rows of a database table or relation. An entire row may not fit in a data block. Part of the row is stored in one data block, and another part is stored in another data block. Each row part is referred to herein as a row segment and the data blocks are referred to as row-segmented data blocks. Data block dictionary compression is used to compress row-segmented data blocks. Each data block contains a dictionary that is used to compress rows in the data block, including row segments. The dictionary in a data block is used to compress row segments in the data block. Hence, multiple dictionaries may be used to decompress a row comprised of row segments. | 09-18-2014 |
20140279960 | Row Level Locking For Columnar Data - Row locking is performed at the row level of granularity for database data stored in columnar form. Row level locking entails use of a lock vector that is stored in a compression unit in a data block, the compression unit storing rows in columnar-major format. On an as needed basis, the lock vector is expanded to identify more transactions affecting the rows in the compression unit. | 09-18-2014 |
20140279961 | UNIFIED ARCHITECTURE FOR HYBRID DATABASE STORAGE USING FRAGMENTS - Data records of a data set can be stored in multiple main part fragments retained in on-disk storage. Each fragment can include a number of data records that is equal to or less than a defined maximum fragment size. Using a compression that is optimized for each fragment, each fragment can be compressed. After reading at least one of the fragments into main system memory from the on-disk storage, an operation can be performed on the fragment or fragments while in the main system memory. | 09-18-2014 |
20140279962 | CONSOLIDATION FOR UPDATED/DELETED RECORDS IN OLD FRAGMENTS - A plurality of data records of a data set can be stored in a plurality of main part fragments, at least one of which is an old fragment stored on-disk. A number of one or more data records in the old fragment that have been marked for deletion can be determined to be greater than a threshold number, and the old fragment can be loaded into main system memory. A merge of the old fragment can be performed to remove the one or more data records marked for deletion. | 09-18-2014 |
20140279963 | ASSIGNMENT OF DATA TEMPERATURES IN A FRAMENTED DATA SET - A plurality of data records that comprise a data set can be stored in a plurality of main part fragments such that each main part fragment includes a subset of the set of data records. Each fragment of the plurality of main part fragments can be assigned a relative data temperature. A newly arrived data record for storage in the data set can be placed in a delta part, and a merge can be performed to add the newly arrived data record to a corresponding main part fragment. The performing of the merge can occur more quickly if the corresponding main part fragment has a higher relative data temperature than if the corresponding main part fragment has a lower relative data temperature. | 09-18-2014 |
20140279964 | System and Method for Compressing Data in a Database - A method of compressing a plurality of multi-dimensional keys includes receiving, by a computer, the plurality of multi-dimensional keys, where the plurality of multi-dimensional keys have a first length and determining a first plurality of bit slots that are common among the plurality of multi-dimensional keys, wherein the first plurality of bit slots are not a prefix. Also, the method includes forming a mask indicating the first plurality of bit slots and forming a pattern indicating values of the first plurality of bit slots. Additionally, the method includes determining a second plurality of bit slots that vary among the plurality of multi-dimensional keys and forming a plurality of compressed multi-dimensional keys indicating values of the second plurality of bit slots. Further, the method includes storing the mask, the pattern, and the plurality of compressed multi-dimensional keys. | 09-18-2014 |
20140279965 | COMPRESSING TUPLES IN A STREAMING APPLICATION - A method, system, and computer program product to process data in a streaming application are disclosed. The method, system, and computer program product may include receiving a stream of tuples to be processed by a plurality of processing elements operating on a plurality of compute nodes. The method, system, and computer program product may determine whether a first processing element has additional processing capacity. In some embodiments, the method, system, and computer program product determine whether a second processing element, which receives its input from the first processing element, also has additional processing capacity. The method, system, and computer program product may employ compression at the first processing element if one of the first and the second processing element has additional processing capacity. | 09-18-2014 |
20140279966 | VOLUME HAVING TIERS OF DIFFERENT STORAGE TRAITS - A volume system that presents a volume having an extent of logical addresses to a file system. A volume exposure system exposes the volume to the file system in a manner that the volume has multiple tiers, each offering storage of different traits. This is performed using multiple heterogenic underlying storage systems, each having different storage system-specific traits. Each underlying storage system may be hardware, software, or a combination thereof that permits each storage system to expose storage having the particular storage system-specific traits to the file system. The volume system supports each tier by mapping logical addresses of the tier to portions of underlying storage systems that are consistent with the tier traits. | 09-18-2014 |
20140279967 | DATA COMPRESSION USING COMPRESSION BLOCKS AND PARTITIONS - Compression blocks are divided into partitions creating a two dimensional divide of the compression blocks by slicing the compression blocks forming a first dimension and sub-partitioning the compression blocks into the partitions forming a second dimension. Each one of the partitions are compressed in separate compression streams. | 09-18-2014 |
20140279968 | COMPRESSING TUPLES IN A STREAMING APPLICATION - A method, system, and computer program product to process data in a streaming application are disclosed. The method, system, and computer program product may include receiving a stream of tuples to be processed by a plurality of processing elements operating on a plurality of compute nodes. The method, system, and computer program product may determine whether a first processing element has additional processing capacity. In some embodiments, the method, system, and computer program product determine whether a second processing element, which receives its input from the first processing element, also has additional processing capacity. The method, system, and computer program product may employ compression at the first processing element if one of the first and the second processing element has additional processing capacity. | 09-18-2014 |
20140279969 | COMPRESSION/DECOMPRESSION ACCELERATOR PROTOCOL FOR SOFTWARE/HARDWARE INTEGRATION - Embodiments relate to providing a data stream interface for offloading the inflation/deflation processing of data to a stateless compression accelerator. An aspect includes transmitting a request to inflate or deflate a data stream to a compression accelerator. The request may include references to an input buffer for storing input data from the data stream, an output buffer for storing processed input data, and a state data control block for storing a stream state. The stream state is provided to the compression accelerator to continue processing the data stream responsive to the request being a subsequent request. The compression accelerator is instructed to store a current stream state in the state data control block responsive to the request being a non-final request. Accordingly, the current stream state is received from the compression accelerator responsive to the request being a non-final request. The processed input data is received from the compression accelerator. | 09-18-2014 |
20140279970 | COMPACTLY STORING GEODETIC POINTS - Mechanisms are provided for the compact storage of geographical geometries as a collection of points, where individual points are encoded as binary/ternary strings (with the property that points closer to each other share a longer binary/ternary prefix) and the geometry is encoded by compressing the binary/ternary representation of common-prefix points. Mechanisms are also provided for the representation of a geometry using a ternary string that allows efficient storage of arbitrary shapes (e.g., long line segments, oblong polygons) as opposed to binary representations that are more efficient when the geometries are square or nearly square shaped. | 09-18-2014 |
20140279971 | METHOD FOR RESOURCE DECOMPOSITION AND RELATED DEVICES - A method for processing textual resources may include decomposing the textual resources into a sequence of textual fragments, and searching the sequence of textual fragments for a match to a relational pattern including first and second tokens, and a word based relational bond therebetween. The searching may include searching each textual fragment of the sequence of textual fragments for a match to the word based relational bond, and when a given textual fragment matches the word based relational bond, determining whether the given textual fragment also matches the first and second tokens. The method may include when the given textual fragment also matches the first and second tokens, generating a node having the first and second tokens and the word based relational bond therebetween, and storing the node in a node pool. | 09-18-2014 |
20140289208 | DATA COMPRESSION APPARATUS, DATA COMPRESSION METHOD, DATA DECOMPRESSION APPARATUS, AND DATA DECOMPRESSION METHOD - In a data compression apparatus, a search unit examines the sequence of symbols in compression target data, and searches for a second symbol string having the same sequence of symbols as a first symbol string that occurred previously, and a code generation unit encodes the second symbol string into a code containing information that specifies a block to which the beginning of the first symbol string belongs. In a data decompression apparatus, a code acquisition unit sequentially acquires codes from the beginning of the compressed data, and when the code of the second symbol string is acquired, a decompression unit acquires, from a storage device, one or more blocks starting with a block to which the beginning of the decompressed first symbol string belongs, on the basis of the information contained in the acquired code, and decompresses the second symbol string. | 09-25-2014 |
20140297605 | DATABASE TABLE COMPRESSION - Embodiments relate to table compression in a database. The database is organized in tables including rows and columns. An aspect includes defining a range partition of a table of the database according to a first attribute of the table. Internal ranges of the table of the database are defined according to a second attribute of the table. A target internal range of the internal ranges is determined to insert a row as a new entry into the table. A determination is made as to whether an internal range compression directory exists for the target internal range. Based on determining that no internal range compression directory exists for the target internal range and a predefined threshold value of a number of rows is exceeded in the target internal range, the internal range compression directory for the target internal range is created. | 10-02-2014 |
20140297606 | METHOD AND DEVICE FOR PROCESSING A TIME SEQUENCE BASED ON DIMENSIONALITY REDUCTION - Disclosed is a method and device for processing a time sequence based on dimensionality reduction, belonging to the technical field of computers. The method includes: acquiring at least one to-be-processed time sequence; processing the at least one time sequence based on Piecewise Linear Approximation (PLA) where a time length of a time segment processed by PLA is unfixed and is an integral multiple of a preset unit time length. According to the present disclosure, a space for storing a time sequence may be reduced. | 10-02-2014 |
20140297607 | METHOD AND TERMINAL DEVICE FOR ORGANIZING STORAGE FILE - A method for organizing a storage file is described in the embodiments of the present disclosure. The method includes: performing predetermination for organizing and scanning on a selected storage file to determine whether the selected storage file needs to be scanned for fragmentation and organized; and scanning the selected storage file for fragmentation and organizing the storage file, if it is determined that the selected storage file needs to be scanned for fragmentation and organized. Thereby, the time spent on organizing the file is reduced and the efficiency for organizing the file is improved. | 10-02-2014 |
20140317068 | DETERMINATION OF COMPRESSION STATE INFORMATION FOR USE IN INTERACTIVE COMPRESSION - The invention is directed at a method and apparatus for determining compression state information which is to be used in the compression of data being transmitted between two communicating parties. The method of determining the compression state information for use in interactively compressing data comprises the steps of parsing the data to determine a hierarchical data structure of the data; traversing a shared hierarchical node index to determine common compression state information entries between the hierarchical data structure and the hierarchical node index; and selecting at least one of the common compression state information entries for use in compressing the data. | 10-23-2014 |
20140324799 | DATA PROCESSING APPARATUS, DATA PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT - According to an embodiment, a data processing apparatus includes a first generating unit, a second generating unit, a transfer instructing unit, and a transfer controller. The first generating unit generates transfer instruction information that specifies a storage position of transfer data in a data transfer source, a storage position of the transfer data in a data transfer destination, and a data transfer size, when the first generating unit determines that data transfer from a first storage unit to a second storage unit is required. The second generating unit generates fragment-transfer instruction information that is obtained by fragmenting the transfer instruction information into fragments of a predetermined data size. The transfer instructing unit instructs to perform data transfer based on the fragment-transfer instruction information. The transfer controller controls the data transfer between the first storage unit and the second storage unit according to instruction of the data transfer. | 10-30-2014 |
20140330796 | COMPRESSED POINTERS FOR CELL STRUCTURES - A system and method are provided for representing pointers. An encoding type for a pointer structure referenced by a first cell of a data structure is determined. A first field of the pointer structure is encoded to indicate the encoding type. Further, a second field of the pointer structure is encoded according to the encoding type to indicate a location in memory where a cell structure corresponding to a second cell of the data structure is stored. | 11-06-2014 |
20140330797 | SEGMENT DEDUPLICATION SYSTEM WITH COMPRESSION OF SEGMENTS - A system for storing compressed data comprises a processor and a memory. The processor is configured to receive a compressed segment. The compressed segment is determined by breaking a data stream, a data block, or a data file into one or more segments and compressing each of the one or more segments. The processor is further configured to determine whether the compressed segment has been previously stored, and in the event that the compressed segment has not been previously stored, store the compressed segment. The memory is coupled to the processor and configured to provide the processor with instructions. | 11-06-2014 |
20140330798 | VDI File Transfer Method and Apparatus - The present disclosure provides a virtual desktop infrastructure (VDI) file transfer method, which relates to the communications field and can improve compressibility in a VDI file transfer process. The method includes receiving VDI messages, where the VDI messages at least include a VDI file transfer message, separating the VDI file transfer message from the VDI messages, parsing the separated VDI file transfer message, obtaining a data portion of the VDI file transfer message, compressing the data portion, and sending a VDI file transfer message that includes compressed data. The present disclosure further provides a corresponding apparatus. | 11-06-2014 |
20140344230 | Methods and systems for node and link identification - Methods and systems for node and link detection in social network analysis. Interactive noise reduction allows reduction of the data set under analysis to enable substantially real time detection of links and nodes. | 11-20-2014 |
20140344231 | SYSTEM AND METHOD FOR INVESTIGATING LARGE AMOUNTS OF DATA - A data analysis system is proposed for providing fine-grained low latency access to high volume input data from possibly multiple heterogeneous input data sources. The input data is parsed, optionally transformed, indexed, and stored in a horizontally-scalable key-value data repository where it may be accessed using low latency searches. The input data may be compressed into blocks before being stored to minimize storage requirements. The results of searches present input data in its original form. The input data may include access logs, call data records (CDRs), e-mail messages, etc. The system allows a data analyst to efficiently identify information of interest in a very large dynamic data set up to multiple petabytes in size. Once information of interest has been identified, that subset of the large data set can be imported into a dedicated or specialized data analysis system for an additional in-depth investigation and contextual analysis. | 11-20-2014 |
20140351229 | EFFICIENT DATA COMPRESSION AND ANALYSIS AS A SERVICE - Data may be efficiently analyzed and compressed as part of a data compression service. A data compression request may be received from a client indicating data to be compressed. An analysis of the data or metadata associated with the data may be performed. In at least some embodiments, this analysis may be a rules-based analysis. Some embodiments may employ one or more machine learning techniques to historical compression data to update the rules-based analysis. One or more compression techniques may be selected out of a plurality of compression techniques to be applied to the data. Data compression candidates may then be generated according to the selected compression techniques. In some embodiments, a compression service restriction may be enforced. One of the data compression candidates may be selected and sent in a response. | 11-27-2014 |
20140358874 | COMPRESSION SYSTEM AND METHOD - A plurality of lines of data from a file are stored in a cache. The lines of data typically come from a file that is being compressed. The process gets an additional line of data to compress. Based on a compression level, the additional line of data is compared with the lines of data in the cache to determine if there is a best matched line of data from the plurality of lines in the cache. In response to determining the best matched line of data, the additional line of data is compressed with a first compression algorithm based on the best matched line of data to create a compressed line. The compressed line is written to the file. In response to not determining the best matched line of data, the additional line of data is written to the file. The additional line of data is stored in the cache. | 12-04-2014 |
20140358875 | SYSTEM AND METHOD OF FONT COMPRESSION USING SELECTABLE ENTROPY ENCODING - A request for a font file including a first font table and a second font table is received. A first entropy encoder is selected, based on characteristics of the first font table, from among a plurality of entropy encoders. A second entropy encoder is selected, based on characteristics of the second font table, from among the plurality of entropy encoders. The first entropy encoder is applied to the first font table. The second entropy encoder is applied to the second font table. Compressed data corresponding to the first and second font tables are combined to generate a compressed font file. The compressed font file is transmitted. | 12-04-2014 |
20140372388 | HASHING SCHEME USING COMPACT ARRAY TABLES - Embodiments include a method, system, and computer program product for creating an array table. In one embodiment the method includes identifying keys associated with values in a database and identifying bits common between the plurality of keys using logical functions and removing the common bits to form condensed keys. The method also includes modulating the condensed keys using identified common bits to create transformed keys and populating the plurality of array tables using the transformed keys and associated values. | 12-18-2014 |
20140372389 | Data Encoding and Processing Columnar Data - Aspects of the invention are provided for accessing a plurality of data elements. A page of column data is stored in a format that includes compressed and/or non-compressed elements, with the format including a plurality of arrays and a vector. Each of the arrays stores elements with common characteristics, with the vector functioning as a mapping to the stored data elements. The vector is leveraged to identify an array and determine an offset to support access to one or more of the data elements. | 12-18-2014 |
20140372390 | INFORMATION DEVICE, SERVER, RECORDING MEDIUM WITH IMAGE FILE RECORDED THEREON, IMAGE FILE GENERATING METHOD, IMAGE FILE MANAGEMENT METHOD, AND COMPUTER READABLE RECORDING MEDIUM - The information device includes an imaging unit that images a subject and generates image data of the subject, a meta information generating unit that generates meta information related to the image data generated by the imaging unit, a possibility information generating unit that generates, with respect to the meta information, possibility information setting whether or not change of original information is possible by an external device when the meta information is transmitted to the external device, and an image file generating unit that generates an image file associating the image data generated by the imaging unit, the meta information generated by the meta information generating unit, and the possibility information generated by the possibility information generating unit with one another. | 12-18-2014 |
20150012505 | CONFIGURABLE DATA MASKS SUPPORTING OPTIMAL DATA EXTRACTION AND DATA COMPACTION - Systems and methods are provided for accessing and storing variables. The systems comprise a standardized executable application module (SEAM), a computer readable storage device containing the configuration data including a data matrix recorded thereon, the computer readable storage medium comprising a dynamic data store (DDS) and a static data store (SDS), wherein the DDS includes a temporary storage location expansion to the data matrix recorded in the SDS. The systems further comprise a workflow service module, the workflow service module including an encode utility and a decode utility, the workflow service module being configured to direct communication between the SDS, the DDS and the SEAM including retrieving a variable from, and storing the variable to, the computer readable storage device based on the encode utility, the decode utility and the data matrix stored in the SDS and in the DDS expansion. | 01-08-2015 |
20150012506 | EFFECTIVE METHOD TO COMPRESS TABULAR DATA EXPORT FILES FOR DATA MOVEMENT - Compression of data for database movement, includes: selecting a first group of categorical columns for compression; selecting a next group of categorical columns from remaining columns for compression; repeating the selecting of the next group until a predetermined compression threshold is met; creating first compression files comprising compressed representations of the columns in the first group; creating next compression files comprising compressed representations of the columns in each of the next groups; storing initial row sort order, group identity, and column positions corresponding to each of the next groups; and storing any columns not selected for compression in an uncompressed file in the original row sort order. Decompression of the data includes: rebuilding categorical columns in each group of compression files using group identity and column positions corresponding to the group; and sorting rows comprising the rebuilt categorical columns to the initial row sort order. Sorted rows for each group are merged with rows in the uncompressed file. | 01-08-2015 |
20150012507 | SYSTEM AND METHODS FOR ACCELERATED DATA STORAGE AND RETRIEVAL - Systems and methods for providing accelerated data storage and retrieval utilizing lossless data compression and decompression. A data storage accelerator includes one or a plurality of high speed data compression encoders that are configured to compress data. The compressed data is subsequently stored in a target memory or other storage device whose input data storage bandwidth is lower than the original input data stream bandwidth. Similarly, a data retrieval accelerator includes one or a plurality of high speed data decompression decoders that are configured to decompress data at a rate equivalent to or faster than the input data stream from the target memory or storage device. The decompressed data is then output at a data rate that is greater than the output rate from the target memory or data storage device. | 01-08-2015 |
20150019514 | Method and Apparatus for Storage of Data Records - Method and data access unit for storage of data records for creating a serialized charging record formatted for insertion into a charging database. The method includes traversing the hierarchical charging record and for each part node of said hierarchical charging record identifying an attribute of the part node and determining if said attribute is a key attribute or a search attribute and if affirmative storing an attribute value of said attribute in a field of the serialized charging record based on a charging database configuration definition. A part segment comprising the attribute value and a data value token is stored in a payload body field of the serialized charging record with a part node indicator representing the location of the part node in the hierarchical charging record based on a hierarchical charging record configuration definition. A method and data access unit for creating a hierarchical charging record is also disclosed. An advantage is that a serialized charging record may be stored in one storage entity such as a table row. | 01-15-2015 |
20150019515 | HEAT INDICES FOR FILE SYSTEMS AND BLOCK STORAGE - Techniques and mechanisms are provided to allow for selective optimization, including deduplication and/or compression, of portions of files and data blocks. Data access is monitored to generate a heat index for identifying sections of files and volumes that are frequently and infrequently accessed. These frequently used portions may be left non-optimized to reduce or eliminate optimization I/O overhead. Infrequently accessed portions can be more aggressively optimized. | 01-15-2015 |
20150026141 | In-Memory Bitmap for Column Store Operations - Disclosed herein are system, method, and computer program product embodiments for implementing a bitmap for a column store database. An embodiment operates by creating, by at least one processor, a bitmap identifying rows in a column store database. The bitmap may include a list of bit chunks, a bit chunk including an offset being a natural number indicating a chunk size, and a bit specification including one of an ordered row id list, a contiguous row id sequence, and a bit vector. In addition, the embodiment includes performing database operations using the bitmap. | 01-22-2015 |
20150026142 | TRAJECTORY DATA COMPRESSION - Disclosed is an effective and efficient compression system and technique for large amounts of data. The data compression is particularly useful for compressing locational data. The compressed locational data is efficient and effective in tracing a moving object. By selecting appropriate input compression parameters, the accuracy and efficiency of the data compression can be tailored to the needs of the user. | 01-22-2015 |
20150026143 | DATA HANDLING - The concepts relate to data handling, and more specifically to data handling scenarios where data is revised on one computer and stored on another computer. One example can obtain a set of blobs relating to revisions of a file. The example can determine a target size of datastore blobs. In an instance where a total size of the set of blobs is less than the target size, this example can aggregate the set of blobs into an individual datastore blob. Otherwise, the example can identify new or edited individual blobs of the set and aggregate the new or edited individual blobs into first datastore blobs. The example can also aggregate other individual blobs of the set into second datastore blobs. | 01-22-2015 |
20150032704 | APPARATUS AND METHOD FOR PERFORMING COMPRESSION OPERATION IN HASH ALGORITHM - An apparatus and method for performing a compression operation in a hash algorithm are provided. The apparatus includes an interface unit, a message extension unit, a chain variable initial conversion unit, a compression function computation unit, and a chain variable final conversion unit. The interface unit receives a message and chain variable data. The message extension unit generates a plurality of extended messages from the message. The chain variable initial conversion unit converts the chain variable data into initial state data for a compression function. The compression function computation unit repeatedly computes extended message binding and step functions based on the initial state data and the plurality of extended messages, and performs combination with a final extended message, thereby computing final state data. The chain variable final conversion unit generates and outputs chain variable data, into which the chain variable data has been updated, using the final state data. | 01-29-2015 |
20150032705 | INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND COMPUTER PRODUCT - An information processing system includes a processor configured to create, when object data is compressed for each word in units of records, count data that indicates for each record of the object data, an appearance count of each word, the count data being added to the object data that has been compressed; and identify based on the count data, a second character string that corresponds to a first character string defined as a search condition for the object data. | 01-29-2015 |
20150032706 | Enveloping for Cloud Computing via Wavefront Muxing - Data files with digital envelopes may be used for many new applications for cloud computing. The new applications include games and entertainments such as digital fortune cookies, and treasure hunting, unique techniques for digital right management, or even additional privacy and survivability on data storage and transport on cloud computing. Wavefront multiplexing/demultiplexing process (WF muxing/demuxing) embodying an architecture that utilizes multi-dimensional waveforms has found applications in data storage and transport on cloud. Multiple data sets are preprocessed by WF muxing before being stored/transported. WF muxed data is aggregated data from multiple data sets that have been “customized processed” and disassembled into any scalable number of sets of processed data, with each set being stored on a storage site. The original data is reassembled via WF demuxing after retrieving a lesser but scalable number of WF muxed data sets. A customized set of WF muxing on multiple digital files as inputs including at least a data message file and a selected digital envelope file, is configured to guarantee at least one of the multiple outputs comprising a weighted sum of all inputs with an appearance to human natural sensors substantially identical to the appearance of the selected digital envelope in a same image, video or audio format. Enveloping processing is a subset of WF muxing processing. The output file is the file with enveloped or embedded messages. The embedded message may be reconstituted by a corresponding WF demuxing processor at destination with the known a priori information of the original digital envelope. In short, digital enveloping/de-enveloping can be implemented via WF muxing and demuxing formulations. WF muxed data features enhanced privacy and redundancy in data transport and storage on cloud. On the other hand, data enveloping is an application in an opposite direction for conventional WF muxing applications as far as redundancy is concerned. Enveloped data are intended only for limited receivers who have access to associated digital envelope data files with enhanced privacy for no or minimized redundancy. | 01-29-2015 |
20150039573 | COMPRESSING A MULTI-VERSION DATABASE - Managing a multi-version database is provided. A logical record identifier to physical record row identifier indirection mapping table on a solid-state storage device is extended to include a plurality of delta blocks. A delta block within the plurality of delta blocks is maintained for each primary key in a plurality of primary keys associated with a data table on a magnetic hard disk storage device. | 02-05-2015 |
20150046411 | Managing and Querying Spatial Point Data in Column Stores - A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a spatial data set is mapped to physical storage in the database using a space-filling curve. The spatial data set is then compacted and such compacted data can be used to retrieve data from the database that is responsive to the query. Related apparatus, systems, techniques and articles are also described. | 02-12-2015 |
20150058304 | STOPPING FUNCTIONS FOR GROUPING AND DIFFERENTIATING FILES BASED ON CONTENT - Methods and apparatus teach a digital spectrum of a data file. The digital spectrum is used to map a file's position in multi-dimensional space. This position relative to another file's position reveals closest neighbors. Certain of the closest neighbors are grouped together, while others are differentiated. Grouping ceases upon application of a stopping function so that rightly sized, optimum numbers of file groups are obtained. Embodiments of stopping functions relate to curve types in a mapping of numbers of groups per sequential rounds of grouping, recognizing whether groups have overlapping file members or not, and/or determining whether groups meet predetermined numbers of members, to name a few. Properly grouped files can then be further acted upon. | 02-26-2015 |
20150066878 | Efficient Context Save/Restore During Hardware Decompression of DEFLATE Encoded Data - An approach is provided in which a hardware accelerator receives a request to decompress a data stream that includes multiple deflate blocks and multiple deflate elements compressed according to block-specific compression configuration information. The hardware accelerator identifies a commit point that is based upon an interruption of a first decompression session of the data stream and corresponds to one of the deflate blocks. As such, the hardware accelerator configures a decompression engine based upon the corresponding deflate block's configuration information and, in turn, recommences decompression of the data stream at an input bit location corresponding to the commit point. | 03-05-2015 |
20150066879 | DEFRAGMENTING SLICES IN DISPERSED STORAGE NETWORK MEMORY - A method begins by a dispersed storage (DS) processing module receiving access requests, processing data set requests and issuing access responses. The method continues by monitoring slice access requests to generate access records by either storing time stamped access records indicating identities of slices requested by a timestamp or by commonality of slice names. The method continues with determining a correlation of two or more slice accesses based on the access records when a correlation is greater than a correlation threshold and identifying two or more slices for co-location. The method continues when the two or more slices are not co-located by selecting one or more of the two or more slices for migration to a common memory device. | 03-05-2015 |
20150066880 | Checkpoint and Restart - A method of performing a checkpoint on a set of connected processors and memories comprises the steps of creating one or more statefiles for one or more of the processors, querying available processing and/or memory resources, allocating data from one or more statefiles to the available resources, compressing the allocated data, storing the compressed data, and repeating the querying, allocating, compressing and storing steps until all of the statefile(s) are compressed and stored. | 03-05-2015 |
20150066881 | COMPRESSING TRACE DATA - Trace data are compressed by storing a compression table in a memory. The table corresponds to results of processing a set of training trace data using a table-driven compression algorithm. The trace data are compressed using the table according to the algorithm. The stored compression table is accessed read-only. The table can be determined by automatically processing a set of training trace data using the algorithm and transforming the compression table produced thereby into a lookup-efficient form. A network device includes a network interface, memory, and a processor that stores the table in the memory, compresses the trace data using the stored compression table according to the table-driven compression algorithm, the stored table being accessed read-only during the compressing, and transmits the compressed trace data via the network interface. | 03-05-2015 |
20150074066 | DATABASE OPERATIONS ON A COLUMNAR TABLE DATABASE - A computer system includes at least one processor and at least one memory operably coupled to the at least one processor. The memory includes a memory pool and a database partitioned into multiple fragments. Each of the fragments is allocated a block of memory from the memory pool and the fragments store compressed data in a columnar table format. A database operation is applied in a compressed format to the compressed data in at least one of the fragments. | 03-12-2015 |
20150081650 | DATA ACCESS PERFORMANCE USING DECOMPRESSION MAPS - Methods and apparatus, including computer program products, implementing and using techniques for decompressing data in a database system. A query is received, which pertains to a subset of data within a compressed set of data. One or more decompression strategies are evaluated using a cost model. The cost model includes an estimated filter factor. A low cost decompression strategy is selected based on the results of the evaluation of the one or more decompression strategies. One or more bytes representing the requested subset of data are located within the compressed set of data. Only a portion of the compressed data that corresponds to the subset of data is decompressed, using the selected decompression strategy, while leaving the remaining set of data in a compressed state. | 03-19-2015 |
20150081651 | Data access using decompression maps - Methods and apparatus, including computer program products, implementing and using techniques for decompressing data in a database system. A query is received, which pertains to a subset of data within a compressed set of data. One or more decompression strategies are evaluated using a cost model. The cost model includes an estimated filter factor. A low cost decompression strategy is selected based on the results of the evaluation of the one or more decompression strategies. One or more bytes representing the requested subset of data are located within the compressed set of data. Only a portion of the compressed data that corresponds to the subset of data is decompressed, using the selected decompression strategy, while leaving the remaining set of data in a compressed state. | 03-19-2015 |
20150095292 | DATA FRAGMENTATION TUNING AND CANDIDACY PERSISTENCE - A method for implementing defragmentation of a data area is provided. The method may include receiving a data change event for the data area and determining whether the data area has exceeded a defragment threshold based on a defragment threshold value. The method may further include adding the data area to a candidacy list when the data area is determined to have exceeded the defragment threshold based on the defragment threshold value. The method may also include defragmenting the data area when the data area is determined to have exceeded the defragment threshold based on the defragment threshold value and removing the data area from the candidacy list following the determination. | 04-02-2015 |
20150095293 | MINIMIZATION OF SURPRISAL DATA THROUGH APPLICATION OF HIERARCHY FILTER PATTERN - A computer product and system of minimizing surprisal data comprising: at a source, reading and identifying characteristics of an organism's background associated with a genetic sequence of the organism; receiving an input of rank of at least two identified characteristics of the genetic sequence; generating a hierarchy of ranked, identified characteristics based on the rank of the identified characteristics; comparing the hierarchy of ranked, identified characteristics to a repository of reference genomes; and if at least one reference genome from the repository matches the ranked characteristics, breaking the matched reference genomes into pieces, combining pieces associated with the identified characteristics from the matched reference genome to form a filter pattern to be compared to the nucleotides of the genetic sequence of the organism. The differences from the comparison are used to create surprisal data representing an entire genome of the organism. | 04-02-2015 |
20150095294 | Elimination of Fragmentation of Files in Storage Medium by Utilizing Head Movement Time - Accessing a file on a sequentially accessed storage device such as a magnetic tape often involves bypassing valid files and gaps between valid files. Presently taught is a method of copying valid files being bypassed to a second sequentially accessed storage device while not copying the gaps. When a read target file is reached, the read target file is read. During a write to a file writing position, valid files are copied to the second sequentially accessed storage device until the file writing position is reached and the file is written at the end of the valid files on the second sequentially accessed storage device. | 04-02-2015 |
20150095295 | REDUCING DECOMPRESSION LATENCY IN A COMPRESSION STORAGE SYSTEM - In a compression processing storage system, using a pool of compression cores, the compression cores are assigned to process either compression operations, decompression operations, or decompression and compression operations, which are scheduled for processing. A minimal number of the compression cores are allocated for processing the compression operations, thereby increasing compression latency. Upon reaching a throughput limit for the compression operations that causes the minimal number of the plurality of compression cores to reach a busy status, the minimal number of the plurality of compression cores for processing the compression operations is increased. | 04-02-2015 |
20150100555 | METHODS AND APPARATUS FOR POINT CLOUD DATA MANAGEMENT - Methods and apparatus are provided for processing of data representing points in space wherein each is represented by components defining its position in a coordinate system and at least one parameter. For each point, the data are separated into a layer per component, and each component is assigned to a cell of a two-dimensional grid of cells such that corresponding cells of multiple layers contain the components of a point. A component of a point is retrieved by reference to a grid position corresponding to the point and to a layer corresponding to the component. Each layer is segmented into patches of cells such that a component of a point can be retrieved by reference to a grid position of a patch within a layer and to a grid position of a cell within a patch. A layer is compressed using an associated codec. | 04-09-2015 |
20150100556 | Data Compression/Decompression Device - When compressing an arrangement of fixed-length records in a columnar direction, a data compression device carries out data compression aligned with the performance of a data decompression device by computing a number of rows processed with one columnar compression from the performance on the decompression device side, such as the memory cache capacity of the decompression device or the capacity of a primary storage device which may be used by an application, and the size of one record. Thus, while improving compression ratios of large volumes of data, including an alignment of a plurality of fixed-length records, decompression performance is improved. | 04-09-2015 |
20150120683 | DATA COMPRESSION APPARATUS, DATA COMPRESSION METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM - A data compression apparatus includes a lossless compression unit performing lossless compression of each data unit of original data to be compressed to output compressed data; a measuring unit measuring a data amount of the compressed data; and a generating unit generating and outputting compression result management data indicating a result of the compression of each data unit of the original data. The generating unit records data indicating a range of the original data of the data unit if the data amount of the compressed data is larger than or equal to the data amount of the data unit before completion of the lossless compression. The generating unit records data indicating a range of the compressed data of the data unit if the data amount of the compressed data is smaller than the data amount of the data unit upon completion of the lossless compression of the data unit. | 04-30-2015 |
20150120684 | SELECTING FILES FOR COMPACTION - Methods, systems, and apparatus for identifying two or more files, each of which include multiple entries, determining a respective size of each of the files, each size being an estimate of how many distinct entries exist in the respective file that are not garbage entries, determining a combined size of the files, where the combined size of the files is an arithmetic sum of the respective sizes of the files, estimating a compacted size of the files, where the estimated compacted size of the files is an estimate of how many distinct entries exist in the files that are not garbage entries, selecting the two or more files for compaction, based at least on a comparison of the combined size of the files to the estimated compacted size of the files, and compacting the two or more selected files. | 04-30-2015 |
20150127623 | ALLOCATION AWARE HEAP FRAGMENTATION METRICS - An illustrative embodiment of a computer-implemented method for estimating heap fragmentation in real time, models a runtime view of free heap memory, models a runtime view of heap allocation patterns for the heap memory and takes a snapshot of the heap memory. A batch allocator simulator is executed at a predetermined event and a remaining amount of memory unused in the simulation is identified as fragmented memory. | 05-07-2015 |
20150134626 | PARTITION-BASED DATA STREAM PROCESSING FRAMEWORK - A control node of a multi-tenant stream processing service receives a request indicating an operation to be performed on data records of a particular data stream. Based on a stream partitioning policy, the control node determines an initial number of worker nodes to be used. The control node configures a worker node to perform the operation on received records. In response to a determination that the worker node is in an unhealthy state, the control node configures a replacement worker node. | 05-14-2015 |
20150142761 | Changing the Compression Level of Query Plans - In an embodiment, a query plan is compressed to data in a cache at a high compression level if a runtime of a query that the query plan implements is greater than a high time threshold. The query plan is compressed to the data in the cache at a medium compression level if the runtime of the query that the query plan implements is less than the high time threshold and greater than a low time threshold. The query plan is stored to the data in the cache at an uncompressed level if the runtime of the query that the query plan implements is less than the low time threshold. | 05-21-2015 |
20150142762 | Changing the Compression Level of Query Plans - In an embodiment, a query plan is compressed to data in a cache at a high compression level if a runtime of a query that the query plan implements is greater than a high time threshold. The query plan is compressed to the data in the cache at a medium compression level if the runtime of the query that the query plan implements is less than the high time threshold and greater than a low time threshold. The query plan is stored to the data in the cache at an uncompressed level if the runtime of the query that the query plan implements is less than the low time threshold. | 05-21-2015 |
20150142763 | BITMAP COMPRESSION FOR FAST SEARCHES AND UPDATES - Bitmap compression for fast searches and updates is provided. Compressing a bitmap includes receiving a bitmap to compress, and reading the bitmap to determine a value of a bit location for all bits in the bitmap. In one embodiment, a compressed bitmap is created by encoding a variable number of bytes to represent a distance between adjacent 1s in the uncompressed bitmap. In another embodiment, a compressed bitmap is created by representing a distance between adjacent 1s in the uncompressed bitmap using a plurality of bits, and encoding a marker word to indicate the number of bits used to represent the distance. | 05-21-2015 |
20150293934 | STORING DIFFERENCES BETWEEN PRECOMPRESSED AND RECOMPRESSED DATA FILES - A system comprises a processor and a memory. The processor is configured to decompress a precompressed file; recompress the decompressed file; and determine a difference file. The memory is coupled to the processor and configured to provide the processor with instructions. | 10-15-2015 |
20150294002 | DATA ACCELERATOR FOR MANAGING DATA TRANSMISSION - Systems and methods for managing the flow of data between a client device and a data source system including a data accelerator implemented, at least in part, between the client device and the data source. The data accelerator can function to intercept queries and determine whether responses stored locally with respect to the client device can satisfy, at least in part, the request for data of the client device. If a locally stored response can satisfy the data request at least in part, the data accelerator is configured to retrieve the response from the local storage and send it to the client device. The data accelerator is also configured to modify the query based on whether responses are locally stored that can satisfy the request. Specifically, the data accelerator can modify the query to only request the remaining data that is not included as part of the locally stored responses. | 10-15-2015 |
20150312379 | HIGH EFFICIENCY BINARY ENCODING - A method and a system are provided for encoding and processing digital information. The digital information is encoded according to binary encoding formats corresponding to primitive data types. The primitive data types comprise scalar data types including Boolean, integer, float, decimal, time stamp, string, symbol, binary large object, and character large object data types. The primitive data types also comprise composite data types including structure, list and S-expression data types. The binary-encoded digital information is stored in a message with a predetermined format for transmission. No metadata is included in the message. | 10-29-2015 |
20150317327 | Hierarchical Index Based Compression - Computer-readable media, systems, and methods for hierarchical index based compression are described. In embodiments, a hierarchical data log or key-value pair based data log, such as a JSON log, is received and a tree-structured index (index tree) is recursively constructed. In one embodiment, the log comprises search-engine user interaction information. Structural information of the log is preserved by the index tree structure; for example, each node of the log has a corresponding index-tree node. Frequently repeating keys, values, and correlated key-value pairs may be stored in the index-tree node, which may be indexed using multiple levels of detail including a raw-string level for raw string representations of the node, a first level for indexing keys and common values, and a second level for indexing correlated key-value pairs. The index tree may be used to compress rows of the data log and also used to decompress and restore the log. | 11-05-2015 |
20150317381 | REAL-TIME IDENTIFICATION OF DATA CANDIDATES FOR CLASSIFICATION BASED COMPRESSION - Identification of data candidates for data processing is performed in real time by a processor device in a distributed computing environment. Data candidates are sampled for performing a classification-based compression upon the data candidates. A heuristic is computed on a randomly selected data sample from the data candidate, the heuristic computed by, for each one of the data classes, calculating an expected number of characters to be in a data class, calculating an expected number of characters that will not belong to a predefined set of the data classes, and calculating an actual number of the characters for each of the data classes and the non-classifiable data. | 11-05-2015 |
20150324371 | Data Processing Method and Device in Distributed File Storage System - A data processing method and a device in a distributed file storage system, where the method includes receiving, by a client agent, a data processing request which carries a file identifier, an offset address, a file length, and other information of a target file; obtaining, by the client agent, redundancy information according to the file identifier carried in the data processing request, where the redundancy information includes a quantity of data strips, N, of the distributed file storage system and a quantity of parity strips, M, of the distributed file storage system; determining a quantity of valid strips, DSC, of the target file according to the offset address and the length information; determining a quantity of actual strips, N′, of the target file according to the DSC and the M; and determining corresponding strips according to the N′ and processing the corresponding strips. | 11-12-2015 |
20150324384 | OFFLINE GENERATION OF COMPRESSED RADIX TREE WITH KEY SEQUENCE SKIP - Systems and methods are disclosed for compressing a radix tree. An example method includes traversing a radix tree including a plurality of containers. The method also includes identifying, based on the traversing, a parent container having a plurality of child containers, each child container including a sequence of elements. The method further includes for one or more child containers of the plurality of child containers, identifying a unique prefix of the sequence of elements included in the respective child container, identifying a remainder sequence after the unique prefix in the sequence of elements, and removing the remainder sequence from the respective child container. | 11-12-2015 |
20150324385 | SYSTEM AND METHOD FOR APPLYING AN EFFICIENT DATA COMPRESSION SCHEME TO URL PARAMETERS - Disclosed are a system and methods for data compression and decompression. The systems and methods discussed herein include an encoder, dictionary, decoder, literal string and control output. The discussed systems and methods encode data transmitted over a communications channel through the use of a dynamically compiled dictionary. Upon reviewing the characters within the transmitted data in view of the dictionary, an encoded/compressed output string is created. Such output string may also be decoded in a similar fashion via a dynamically compiled dictionary. | 11-12-2015 |
20150324401 | OFFLINE COMPRESSION FOR LIMITED SEQUENCE LENGTH RADIX TREE - Systems and methods are disclosed for compressing a radix tree. An example method of compressing a radix tree includes traversing a radix tree including a plurality of containers. The method also includes identifying, based on the traversing, a parent container having a single immediate child container. The parent container includes a first set of elements, and the child container includes a second set of elements. The method further includes determining whether a length of the first set of elements included in the parent container satisfies a threshold. The method also includes when the length of the first set of elements is determined to satisfy the threshold, combining the parent and child containers into a single container. | 11-12-2015 |
20150324484 | OFFLINE RADIX TREE COMPRESSION WITH KEY SEQUENCE SKIP - Systems and methods are disclosed for compressing a radix tree. An example method of compressing a radix tree including a plurality of containers includes traversing a radix tree including a plurality of containers. The method also includes identifying, based on the traversing, a parent container that represents a sequence of elements and has a single immediate child container. The parent container includes a prefix of the sequence of elements that is represented by the parent container, and the immediate child container includes a single element. The method further includes determining whether a length of the sequence of elements that is represented by the parent container satisfies a container threshold. The method also includes when the length is determined to satisfy the container threshold, selecting one of the parent container and immediate child container, incrementing a length of the selected container, and removing the non-selected container from the radix tree. | 11-12-2015 |
20150326245 | STORAGE OF A MATRIX ON A STORAGE COMPUTE DEVICE - A compressed format is selected for storage of a matrix based on a computation to be performed using the matrix and architecture of a storage compute device to which the matrix is stored. Data of the matrix is stored on the storage compute device according to the compressed format. The computation is performed using the data via a computation unit that resides within the storage compute device. | 11-12-2015 |
20150331913 | DATA COMPRESSION SYSTEM - The system includes a correlation extraction means for extracting at least one candidate for a correlation from a collected given data set, based on a relationship between units of data in the given data set; a correlation verification means for verifying whether or not the units of data in the given data set satisfy the correlation extracted by the correlation extraction means; and a data compression means for compressing the given data set with use of the correlation, based on the result of verification by the correlation verification means. | 11-19-2015 |
20150339341 | TECHNIQUES FOR ALIGNED RUN-LENGTH ENCODING - Techniques for Aligned Run-Length Encoding (ARLE) are described. ARLE is an encoding scheme that transforms sets of same-valued consecutive rows into one or more runs, while enforcing boundaries between the runs at set intervals (e.g. every predetermined number of rows). Consecutive rows that contain the same value, but which cross one or more interval boundaries, are encoded as multiple runs that are divided along those interval boundaries. According to one technique, a database server accelerates query processing by setting the interval size to the word size of the processor performing the predicate comparisons. According to another technique, a database server accelerates row lookup by maintaining an offset array that stores the run offsets into the ARLE data of the run that begins each interval. | 11-26-2015 |
20150347426 | REORDERING OF DATABASE RECORDS FOR IMPROVED COMPRESSION - According to embodiments of the present invention, apparatus, systems, methods and computer program products for sorting and compressing an unordered set of data records from a structured database are provided. Fields of the unordered set of data records are prioritized based on an impact of those fields to a compression scheme for column-oriented compression. The unordered set of data records are sorted based on the prioritized field(s) with a greatest impact on the performance metric. Data of the sorted data records are compressed according to a compression scheme. In some embodiments, prioritizing the fields may be based on an anticipated level of usage of data within those fields and/or a cost function associated with a performance metric as well as optimization of compression. A performance metric may include a faster computational time, reduced I/O computation, faster scan time, etc. | 12-03-2015 |
20150347441 | MEDIA ASSET PROXIES - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for creating and using media asset proxies. The media asset proxies represent a digital media asset and are created by filtering and modifying elements from the digital media asset. The media asset proxies can be queried in the same manner as their corresponding digital media asset. | 12-03-2015 |
20150347443 | SEARCHABLE DATA ARCHIVE - A method and apparatus are provided to store transaction records in a retrievable form and to enable subsequent search and retrieval of stored transaction records. Transaction records are captured and then grouped according to predetermined grouping criteria such that they may be indexed to a first level and then efficiently compressed for bulk storage. In the event that records need to be retrieved subsequently, the first level index may be used to select one or more groups of records satisfying first level search criteria and, following retrieval of the selected groups from storage and de-compression, a second level index may be created to enable a more detailed record-level search for matching records in the retrieved groups. Preferably, the same indexing technique is used for both the first and second level of indexing. | 12-03-2015 |
20150347629 | DISTANCE QUERIES ON MASSIVE NETWORKS - Distance query techniques are provided that are robust to network structure, scale to large and massive networks, and are fast, straightforward, and efficient. A hierarchical hub labeling (HHL) technique is described to determine a distance between two nodes or vertices on a network. The HHL technique provides indexing by ordering vertices by importance, then transforming the ordering into an index, which enables fast exact shortest-path distance queries. The index may be compressed without sacrificing its correctness. | 12-03-2015 |
20150356147 | SYSTEMS, METHODS AND COMPUTER-ACCESSIBLE MEDIUMS FOR UTILIZING PATTERN MATCHING IN STRINGOMES - Exemplary systems, methods and computer-accessible mediums can receive first data related to at least one first string arranged in a directed acyclic graph, compress the first data into second data, and can search the second data for a match of at least one second string. A node of the directed acyclic graph can encode at least one substring, and an edge of the directed acyclic graph can encode instructions for concatenating substrings. | 12-10-2015 |
20150363456 | HIERARCHICAL DATABASE COMPRESSION AND QUERY PROCESSING - Embodiments relate to hierarchical database compression. An aspect includes applying a first level of a first type of compression to a first partition of a column of a database. Another aspect includes applying a second level of the first type of compression to a subset of the first partition, wherein the first level of the first type of compression comprises a first first-level dictionary and the second level of the first type of compression comprises a first second-level dictionary, and wherein a code size of the first first-level dictionary is larger than a code size of the first second-level dictionary. | 12-17-2015 |
20150363510 | INDEXED SHAPED GRAPH CREATION - Index shaped graph creation can receive a number of bit-strings through a communication link. Index shaped graph creation can create a binary tree from a number of nodes that represent the number of bit-strings. Index shaped graph creation can define an index table based on the binary tree that includes the number of bit-strings and a number of indexes for the number of bit-strings. Index shaped graph creation can create a shaped graph based on the binary tree, wherein the shaped graph compresses a portion of the number of nodes. Index shaped graph creation can convert the shaped graph into an indexed shaped graph by assigning each of a compressed number of nodes in the shaped graph an offset value that can be associated with the number of indexes in the index table. | 12-17-2015 |
20150370822 | METHOD AND SYSTEM FOR HASH KEY MEMORY REDUCTION - A system and method are disclosed for storing data in a hash table. The method includes receiving data, determining a location identifier for the data wherein the location identifier identifies a location in the hash table for storing the data and the location identifier is derived from the data, compressing the data by extracting the location identifier; and storing the compressed data in the identified location of the hash table. | 12-24-2015 |
20150379008 | MAXIMIZING THE INFORMATION CONTENT OF SYSTEM LOGS - In a method for maximizing information content of logs, a log message from an executing software program is received. The log message includes a timestamp, a source code location ID, a session ID, and a log message text. The timestamp, the source code location ID, and the session ID of the log message are stored in a lossless buffer. A hash function value of the session ID is determined. It is determined that the hash function value of the session ID is less than a hash value threshold. The log message text is stored in a session buffer in response to determining that the hash function value of the session ID is less than the hash value threshold, wherein the session buffer contains log message texts of log messages with corresponding hash function values less than the hash value threshold. | 12-31-2015 |
20150379056 | Transparent access to multi-temperature data - A system, a computer-implemented method, and a computer readable medium having stored thereon a computer executable program code for providing access to a database on the system. The database comprises entries stored across partitions. The system comprises a first storage device, a second storage device, and a computing device. The first storage device comprises one partition of the partitions. The second storage device comprises the other partitions except the one of the first storage device. Each of the partitions has a respective partition identification. Each of the entries comprises at least one data value indicative of allocation of the each of the entries in one of the partitions. Each of the entries is stored in one or more data rows of data tables stored in the database. Each of the data rows comprises a respective primary key for identification of that data row. The computing device comprises a memory storing processor-executable program code and a computer processor to execute the processor-executable program code in order to cause the computing device to perform the computer-implemented method. | 12-31-2015 |
20150379068 | TABLE BOUNDARY DETECTION IN DATA BLOCKS FOR COMPRESSION - Data is converted into a minimized data representation using a suffix tree by sorting data streams according to symbolic representations for building table boundary formation patterns. The converted data is fully reversible for reconstruction while retaining minimal header information. A scanning operation is performed by searching a suffix of each of the sorted data streams for identifying a data sequence that includes a first symbol representing textual data, and a second symbol representing numerical data. The suffix tree for the converted data is then built. | 12-31-2015 |
20150379072 | INPUT PROCESSING FOR MACHINE LEARNING - A record extraction request for a data set is received at a machine learning service. A plan to perform one or more chunk-level operations (such as sampling, shuffling, splitting or partitioning for parallel computation) on chunks of the data set is generated. A set of data transfers that results in a particular chunk being stored in a particular server's memory is initiated to implement the first chunk-level operation of the sequence. A second operation such as another filtering operation or a feature processing operation is performed on a result set of the first chunk-level operation. | 12-31-2015 |
20160004715 | Minimizing Metadata Representation In A Compressed Storage System - Embodiments of the invention relate to compressed storage systems, and reducing metadata representing compressed data. Compressed data is stored in units referred to as partitions, with each partition having a header that contains a virtual address of data stored in the partition. A linear function is provided to represent a mapping between a virtual address segment and a compressed data extent, with a slope of the function representing an associated compression ratio. A read operation is supported by consulting the mapping and using the mapping to locate the corresponding compressed extent. Similarly, a write operation is supported by writing a new segment, compressing content in the segment, and computing a new mapping of the compressed segment metadata in memory. The new mapping is represented in the linear function. | 01-07-2016 |
20160004735 | Column Store Optimization Using Telescope Columns - A data set of spatial data having a plurality of dimensions and including linestrings can be processed by decomposing each linestring of the plurality of linestrings into a plurality of line segments. Each coordinate dimension that appears in at least one line segment of the plurality of line segments can be listed in one of a plurality of dimensional dictionaries that each correspond to a dimension of the plurality of dimensions. A linestring of the plurality of linestrings can be represented as a set of the line segments using the plurality of dimensional dictionaries. | 01-07-2016 |
20160006456 | COMPRESSION DEVICE, COMPRESSION METHOD, DICTIONARY GENERATION DEVICE, DICTIONARY GENERATION METHOD, DECOMPRESSION DEVICE, DECOMPRESSION METHOD, INFORMATION PROCESSING SYSTEM, AND RECORDING MEDIUM - A compression device includes a processor configured to execute a process. The process includes: storing dictionary information in which a first compressed code assigned to a plurality of pieces of character information different from one another is associated with the pieces of character information; acquiring, when a first piece of character information among the pieces of character information is acquired, the first compressed code associated with the first piece of character information from the dictionary information; and writing the acquired first compressed code in a storage area to store compressed data. | 01-07-2016 |
20160012085 | COMPRESSING TIME STAMP COLUMNS | 01-14-2016 |
20160012089 | MAIN MEMORY DATABASE MANAGEMENT USING PAGE INDEX VECTORS | 01-14-2016 |
20160019265 | DATABASE CONSOLIDATION ADVISOR - Techniques are described for generating automated advice with respect to consolidating a plurality of sources. In an embodiment, a set of one or more parameters relating to a proposed consolidation for a plurality of consolidation sources is received. In response to receiving the set of one or more parameters, a set of one or more recommendations for consolidating the plurality of consolidation sources is generated and stored on at least one of a volatile or non-volatile computer-readable storage medium. In some embodiments, the set of one or more recommendations may indicate how to improve a performance associated with consolidating the plurality of sources to a set of one or more destinations based on a particular consolidation scenario. The set of one or more recommendations may be displayed during consolidation planning for the plurality of consolidation sources. | 01-21-2016 |
20160034487 | SELECTIVE FRAGMENTATION REPAIR - Selective repair of fragmentation in a synthetic backup, based at least in part on a dynamically-determined repair criteria, is disclosed. In various embodiments, a locality measure is computed with respect to a group of segments comprising a portion of a file. The computed locality measure is compared to an at least partly dynamically determined fragmentation repair criteria, and a repair decision is made based at least in part on the comparison. | 02-04-2016 |
20160034488 | METHOD AND APPARATUS FOR MODIFYING COMPRESSED FILES - A method, apparatus and computer program product are provided for preparing and installing update packages for compressed files. In the context of a method, a method for preparing an update package is provided that includes receiving an original file and a modified file, causing the original file and the modified file to be decompressed, and generating one or more delta files based on the decompressed original file and the decompressed modified file. A corresponding method for installing an update package is also provided that includes receiving the update package comprising one or more delta files corresponding to an original file, causing the original file to be decompressed, generating one or more modified subfiles based on the one or more delta files and the decompressed original file, and generating a compressed modified file by compressing the one or more modified subfiles. | 02-04-2016 |
20160034522 | AGGREGATING DATA IN A MEDIATION SYSTEM - Records received from one or more sources in a network are processed. For each of multiple intervals of time, a matching procedure is attempted on sets of one or more records, including comparing identifiers associated with different records to generate the sets and determining whether or not a completeness criterion is satisfied for one or more of the sets. The processing also includes, for at least some of the intervals of time, processing at least one complete set, consisting of one or more of the received records on which the matching procedure is first attempted during the interval of time and one or more records stored in a data store before the interval of time, and for at least some of the intervals of time, processing at least one incomplete set, consisting of one or more records stored in the data store before the interval of time. | 02-04-2016 |
20160034527 | ACCURATE PARTITION SIZING FOR MEMORY EFFICIENT REDUCTION OPERATIONS - Embodiments of the invention relate to processing data records, and for a multi-phase partitioned data reduction. The first phase relates to processing data records and partitioning the records into a first partition of records having a common characteristic and a second partition of records that are not members of the first partition. The data records in each partition are subject to intra-partition data reduction responsive to a resource constraint. The data records in each partition are also subject to an inter-partition data reduction, also referred to as an aggregation to reduce a footprint for storing the records. Partitions and/or individual records are logically aggregated and a data reduction operation for the logical aggregation of records takes place in response to available resources. | 02-04-2016 |
20160042006 | System and Method of Optimizing the User Application Experience - A system and method of optimizing the performance of an information handling system is disclosed herein. One or more data samples are generated by identifying one or more files accessed during the user application experience while in a sampling interval. An identifier and access frequency for each of the identified files are stored in a data sample. One or more data samples are merged into a merged data sample. A compression ratio is calculated for each of the identified files. One or more of the files identified in the merged data sample are selected for uncompression. The files selected for uncompression are uncompressed. | 02-11-2016 |
20160042037 | QUERY-AWARE COMPRESSION OF JOIN RESULTS - A method is provided for compressing results of a join query. A join order of a result set is determined from the join query, where the result set includes a plurality of tuples. A plurality of dictionary entries for the result set is received. A nested hierarchy of dictionaries is created based on the join order and the dictionary entries. A plurality of encoded tuples is received. The nested hierarchy of dictionaries is used by a processor to decode the plurality of encoded tuples so as to produce the plurality of tuples of the result set. | 02-11-2016 |
20160048531 | Adaptive Rate Compression Hash Processor - An input file is processed according to a hash algorithm that references sets of literals to preceding sets of literals to facilitate copy-offset command generation. Preceding instances are identified by generating a hash of the literal set and looking up a corresponding entry in a hash table. The hash table may be accessed by placing look-up requests in a FIFO buffer. When the FIFO buffer is full, generation of the hash chain is suspended until it is no longer full. When repeated literals are found, generation of the hash chain is likewise suspended. The hash chain is used to generate a command file, such as according to the LZ algorithm. Runs of consecutive literals are replaced by a run-length command. The command file may then be encoded using Huffman encoding. | 02-18-2016 |
20160050297 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Methods and systems for compressing a logical data object for storage in a storage device in a distributed network are provided. One method includes creating, in the storage device, a compressed logical data object including a header and one or more allocated compressed sections with a predefined size, compressing the one or more obtained chunks of data corresponding to the logical data object, thus giving rise to the compressed data chunks, and storing the compressed data chunks in the compressed sections, wherein the compressed sections serve as atomic elements of compression or decompression operations during input/output transactions on the logical data object. | 02-18-2016 |
20160055190 | EVENT DETECTION AND CHARACTERIZATION IN BIG DATA STREAMS - Methods, systems, and apparatus, including computing device programs encoded on computing device storage media, for characterizing events in a data stream. In one of the methods, a General Method is used to construct a Specific Method, which performs the characterization of behavioral types of a particular system or set of systems. The Specific Method includes event extraction, dimensional reduction, and signature identification in the reduced dimensional space that map the events of a specific system into behavioral types. | 02-25-2016 |
20160055213 | SYSTEM AND METHOD FOR PERFORMING LONGEST COMMON PREFIX STRINGS SEARCHES - A method and system for compressing and searching a plurality of strings. The method includes inputting a plurality of strings into a compression engine. The method also includes converting each of the plurality of strings into a new, prefix-preserving compressed string, using the compression engine. For every string P that is a strict prefix of a string S, P's resulting compressed string is a strict prefix of S's resulting compressed string. | 02-25-2016 |
20160063009 | PROACTIVELY CLEARING DIGITAL STORAGE - A device may monitor an amount of storage available on a user device; determine, based on the monitoring, that the amount of storage is below a particular threshold; score multiple data files stored by the user device; determine particular data files, of the multiple data files, that should be deleted from the user device based on the scoring and based on the amount of storage available on the user device; and cause, based on determining that the amount of storage is below the particular threshold, the user device to delete the particular data files from the user device. The user device may have an amount of available storage space exceeding the particular threshold after deleting the particular data files. | 03-03-2016 |
20160070859 | FAST AND SECURE RETRIEVAL OF DNA SEQUENCES - Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are identified as being most similar to a query DNA or RNA sequence based on fitting of the retrieved sequence models to the query DNA or RNA sequence. The sequence models may be context tree weighting (CTW) models {S | 03-10-2016 |
20160078027 | METHOD AND APPARATUS FOR DATA PROCESSING METHOD - A method and apparatus for data processing. The present invention provides a data processing apparatus that includes: a series acquisition section for acquiring a data series in which multiple pieces of data are arranged; a fragmentation section for fragmenting the data series to obtain multiple partial data series; a pattern extraction section for extracting multiple patterns of one or more pieces of data appearing in at least one of the multiple partial data series; and a generation section for generating a feature vector having element values, which vary according to whether to include each of the multiple patterns, for each of the multiple partial data series, respectively. There is also provided a method for data processing. The present invention allows for the generation of a feature vector from time-series data indicating a phenomenon the occurrence time of which is temporally irregular to detect features. | 03-17-2016 |
20160078045 | SELECTIVE COMPRESSION OF OBJECTS IN A STORAGE COMPUTE DEVICE - Methods and apparatuses facilitate receiving a command via a host interface of a storage compute device to perform a computation on one or more data objects. The computations produce intermediate objects that are stored in a data storage section of the storage compute device. A determination is made to compress and decompress the intermediate objects as they are moved between the data storage section and a compute section based on wear of a storage medium being reduced in response to the compression and decompression. The intermediate objects are compressed and decompressed as they are moved between the data storage section and the compute section in response to the determination. | 03-17-2016 |
20160078069 | METHOD FOR IMPROVING ENERGY EFFICIENCY OF MAP-REDUCE SYSTEM AND APPARATUS THEREOF - This technique improves the energy efficiency of a MapReduce system by using a system performance model without changing any component of the MapReduce system. This involves determining the presence of any hardware bottleneck in any node of the MapReduce system based on the system performance model; if a hardware bottleneck is present in any node, the maximum bandwidth value of the hardware associated with the bottleneck of each node is determined. Thereafter, an energy-efficient value of the Central Processing Unit (CPU) frequency of each node having the bottleneck is determined by using the system performance model and the maximum bandwidth value of the hardware associated with the bottleneck. Further, the CPU frequency of each node having the bottleneck is set to the energy-efficient value determined in the earlier step. | 03-17-2016 |
20160085817 | SYSTEM AND METHOD FOR INVESTIGATING LARGE AMOUNTS OF DATA - A data analysis system is proposed for providing fine-grained low latency access to high volume input data from possibly multiple heterogeneous input data sources. The input data is parsed, optionally transformed, indexed, and stored in a horizontally-scalable key-value data repository where it may be accessed using low latency searches. The input data may be compressed into blocks before being stored to minimize storage requirements. The results of searches present input data in its original form. The input data may include access logs, call data records (CDRs), e-mail messages, etc. The system allows a data analyst to efficiently identify information of interest in a very large dynamic data set up to multiple petabytes in size. Once information of interest has been identified, that subset of the large data set can be imported into a dedicated or specialized data analysis system for an additional in-depth investigation and contextual analysis. | 03-24-2016 |
20160085834 | PRIORITIZING REPOPULATION OF IN-MEMORY COMPRESSION UNITS - To prioritize repopulation of in-memory compression units (IMCU), a database server compresses, into an IMCU, a plurality of data units from a database table. In response to changes to any of the plurality of data units within the database table, the database server performs the steps of: (a) invalidating corresponding data units in the IMCU; (b) incrementing an invalidity counter of the IMCU that reflects how many data units within the IMCU have been invalidated; (c) receiving a data request that targets one or more of the plurality of data units of the database table; (d) in response to receiving the data request, incrementing an access counter of the IMCU; and (e) determining a priority for repopulating the IMCU based, at least in part, on the invalidity counter and the access counter. | 03-24-2016 |
20160092461 | INLINE KEYED METADATA - An encoding system may include a metadata manager, a key manager, and an encoder. The metadata manager may interface with one or more metadata sources to determine whether to include a metadata item from the one or more metadata sources. The key manager may determine whether the metadata item can be represented using one of already-allocated keys or an inline key must be used to represent the metadata item. The encoder may encode the metadata. If an inline key must be used to represent the metadata item, the encoder may associate the inline key and the type of the metadata item to the media file, and the encoder may encode the metadata item using the inline key in the media file. | 03-31-2016 |
20160092492 | SHARING INITIAL DICTIONARIES AND HUFFMAN TREES BETWEEN MULTIPLE COMPRESSED BLOCKS IN LZ-BASED COMPRESSION ALGORITHMS - A string of data is partitioned into a set of blocks. Each block is compressed based on a set of initial dictionaries and a set of Huffman trees. Each block is associated by a pointer with an initial dictionary in the set of initial dictionaries and a Huffman tree in the set of Huffman trees used to compress that block. A compressed string of data includes the set of initial dictionaries, the set of Huffman trees, and the compressed blocks and associated pointers. | 03-31-2016 |
20160092493 | EXECUTING MAP-REDUCE JOBS WITH NAMED DATA - Various embodiments execute MapReduce jobs. In one embodiment, at least one MapReduce job is received from one or more user programs. At least one input file associated with the MapReduce job is divided into a plurality of data blocks each including a plurality of key-value pairs. A first unique name is associated with each of the data blocks. Each of a plurality of mapper nodes generates an intermediate dataset for at least one of the plurality of data blocks. A second unique name is associated with the intermediate dataset generated by each of the plurality of mapper nodes. The second unique name is based on at least one of the first unique name, a set of mapping operations performed on the at least one of the plurality of data blocks, and a number associated with a reducer node in a set of reducer nodes assigned to the intermediate dataset. | 03-31-2016 |
20160092497 | DATA DICTIONARY WITH A REDUCED NEED FOR REBUILDING - A processor receives statistical information about a data set included in a column of a data table. The processor receives additional information about the data set that indicates a data format utilized by the data set and a type of information represented by the data set. The processor generates a data dictionary for compression of the data set based, at least in part, on the statistical information and the additional information. The data dictionary is created such that the data dictionary is capable of compressing data that is statistically predicted to be received at a future point. | 03-31-2016 |
20160098420 | HARDWARE ACCELERATION FOR A COMPRESSED COMPUTATION DATABASE - According to embodiments of the present invention, machines, systems, methods and computer program products for hardware acceleration are presented. A plurality of computational nodes for processing data is provided, each node performing a corresponding operation for data received at that node. A metric module is used to determine a compression benefit metric pertaining to performance of the corresponding operations of one or more computational nodes with recompressed data. An accelerator module recompresses data for processing by the one or more computational nodes based on the compression benefit metric indicating a benefit gained by using the recompressed data. A distribution function may be used to distribute data among a plurality of nodes. | 04-07-2016 |
20160098439 | HARDWARE ACCELERATION FOR A COMPRESSED COMPUTATION DATABASE - According to embodiments of the present invention, machines, systems, methods and computer program products for hardware acceleration are presented. A plurality of computational nodes for processing data is provided, each node performing a corresponding operation for data received at that node. A metric module is used to determine a compression benefit metric pertaining to performance of the corresponding operations of one or more computational nodes with recompressed data. An accelerator module recompresses data for processing by the one or more computational nodes based on the compression benefit metric indicating a benefit gained by using the recompressed data. A distribution function may be used to distribute data among a plurality of nodes. | 04-07-2016 |
20160103869 | SYSTEM, METHOD AND DATA STRUCTURE FOR FAST LOADING, STORING AND ACCESS TO HUGE DATA SETS IN REAL TIME - A computerized system including a processor and a computer-readable non-transient memory in communication with the processor, the memory storing instructions that when executed manage a novel data structure and related group of algorithms that can be used as a method for representing a set and as a base for very efficient indexing, hash and compression. SHB is an improvement of hierarchical bitmap. An improved database system that can utilize the innovative data structure which includes a raw data stream provided to the system via a data processing module, data blocks, fields indexes tables and a keys table. There is provided an index creating process and a columns creating process, for transforming the data blocks and tables into index blocks and data columns. | 04-14-2016 |
20160117343 | PREDICATE APPLICATION THROUGH PARTIAL COMPRESSION DICTIONARY MATCH - Methods and apparatus, including computer program products, implementing and using techniques for predicate application using partial compression dictionary match. A search strategy is developed for each predicate to be applied to compressed data. The compressed data is searched using the search strategy to locate the compression symbols identified in the search strategy. In response to locating a compression symbol from the search strategy in the compressed data, a respective row is decompressed and the predicate is applied, and a respective row that matches the predicate is returned to a database engine or an application. | 04-28-2016 |
20160124983 | SECURE COMPRESSION - In embodiments, secure compression algorithms are provided that may be employed as a single operation on raw data to produce compressed and encrypted data. In embodiments, the algorithms described herein may be performed using any type of dictionary based encryption. In one embodiment, upon adding a new prefix to a dictionary table, the dictionary table may be permuted to randomize the entries into the table. The randomization may be based upon a permutation value generated by a deterministic pseudo-random generator and/or pseudo-random function. Other embodiments of randomization may be employed to provide secure compression. For example, instead of permuting the entire table upon adding a prefix, the prefix may be randomly added to the table. | 05-05-2016 |
20160124984 | STORAGE AND COMPRESSION OF AN AGGREGATION FILE - A method and system for storage of an aggregation file and method and system for compression of the same. The method for compressing an aggregation file includes: acquiring the aggregation file to be compressed; copying remaining files in the acquired aggregation file into a new aggregation file based on metadata of a deleted object stored in a deletion file corresponding to the acquired aggregation file; and removing the acquired aggregation file. The present invention also provides a system for compressing an aggregation file and a method and system for storing an aggregation file. | 05-05-2016 |
20160140196 | COMPUTER PRODUCT, PROCESSING SYSTEM, AND PROCESSING METHOD - A non-transitory, computer-readable recording medium having stored therein a processing program causes a computer to execute a process including reconstituting a specific portion of received data, based on reference information specifying a referenced portion of the received data for selecting processing data from the received data, the processing data being subject to a query processing; determining whether to discard the received data, based on the specific portion of the received data and a selecting condition for selecting the processing data; and reconstituting the received data when the determining determines not to discard the received data. | 05-19-2016 |
20160142519 | METHODS AND SYSTEMS FOR ENCODING/DECODING FILES AND TRANSMISSIONS THEREOF - In one embodiment, the instant invention includes a computer system that includes at least the following components: a) a first computer that performs, in a concurrent manner, at least the following tasks: dividing a computer file into a plurality of segments, compressing the segments, and sending the compressed segments to a second computer over a network; b) the second computer that performs, in a concurrent manner, at least the following tasks: decompressing the compressed segments and assembling the decompressed segments to reconstruct the computer file, where the compressing task performed by the first computer and the decompressing task performed by the second computer are synchronized and performed concurrently. | 05-19-2016 |
20160147784 | SYSTEMS AND METHODS FOR TRANSFORMATION OF LOGICAL DATA OBJECTS FOR STORAGE - Systems and methods for compressing a raw logical data object ( | 05-26-2016 |
20160147820 | Variable Sized Database Dictionary Block Encoding - Dictionary encoding in a table of a database system is initiated using a single page chain. The database system includes a plurality of processor cores and each page chain includes a plurality of chained pages. Thereafter, n additional page chains are generated for use by the dictionary encoding when the count of pages used by the dictionary encoding reaches a pre-determined limit. Generation of additional page chains is later ceased once the number of additional page chains n is equivalent to a number of available processor cores. Related apparatus, systems, techniques and articles are also described. | 05-26-2016 |
20160154815 | UNIFIED ARCHITECTURE FOR HYBRID DATABASE STORAGE USING FRAGMENTS | 06-02-2016 |
20160154831 | COMPRESSION-AWARE PARTIAL SORT OF STREAMING COLUMNAR DATA | 06-02-2016 |
20160154835 | COMPRESSION-AWARE PARTIAL SORT OF STREAMING COLUMNAR DATA | 06-02-2016 |
20160162504 | INFORMATION SEARCHING APPARATUS, INFORMATION MANAGING APPARATUS, INFORMATION SEARCHING METHOD, INFORMATION MANAGING METHOD, AND COMPUTER PRODUCT - A computer-readable recording medium stores therein an information searching program that causes a computer having access to archives including a compressed file group of compressed files that are to be searched and that have described therein character strings, to execute: sorting the compressed files in descending order of access frequency of the compressed files; combining the compressed files in descending order of access frequency after the sorting at the sorting such that a storage capacity of a cache area for a storage area that stores therein the compressed file group is not exceeded by a combined size of the compressed files combined; and writing, from the storage area into the cache area, the compressed files combined at the combining, the compressed files combined being written prior to a search of the compressed files combined. | 06-09-2016 |
20160162505 | System and Methods for Accelerated Data Storage and Retrieval - Systems and methods for providing accelerated data storage and retrieval utilizing lossless data compression and decompression. A data storage accelerator includes one or a plurality of high speed data compression encoders that are configured to compress data. The compressed data is subsequently stored in a target memory or other storage device whose input data storage bandwidth is lower than the original input data stream bandwidth. Similarly, a data retrieval accelerator includes one or a plurality of high speed data decompression decoders that are configured to decompress data at a rate equivalent to or faster than the input data stream from the target memory or storage device. The decompressed data is then output at a data rate that is greater than the output rate from the target memory or data storage device. | 06-09-2016 |
20160171053 | ADAPTIVE INDEX LEAF BLOCK COMPRESSION | 06-16-2016 |
20160171074 | MULTI-DIMENSIONAL DECOMPOSITION COMPUTING METHOD AND SYSTEM | 06-16-2016 |
20160173122 | System That Reconfigures Usage of a Storage Device and Method Thereof | 06-16-2016 |
20160173125 | SEMICONDUCTOR DEVICE AND OPERATING METHOD THEREOF | 06-16-2016 |
20160179837 | DEFINING PAIRING RULES FOR CONNECTIONS | 06-23-2016 |
20160179858 | OPTIMIZATION OF METADATA VIA LOSSY COMPRESSION | 06-23-2016 |
20160179896 | OPTIMIZATION OF METADATA VIA LOSSY COMPRESSION | 06-23-2016 |
20160188622 | Using a distributed prime data sieve for efficient lossless reduction, search, and retrieval of data - Systems and techniques for losslessly reducing input data using a distributed system comprising multiple computers that maintain portions of a data structure that organizes prime data elements based on names of the prime data elements. During operation, a first computer can determine a first name for the element, and send the element to a second computer based on the first name. The second computer can losslessly reduce the element by determining a second name for the element, and using the second name to navigate through a portion of the data structure maintained at the second computer. | 06-30-2016 |
20160188646 | EFFICIENT DATABASE SCREENING AND COMPRESSION - There is provided, in accordance with an embodiment, a method comprising using one or more hardware processors for receiving two or more electronic documents from two or more computerized sources, where each of the electronic documents comprises alphanumeric text. A hierarchical mapping database is retrieved, where the hierarchical mapping database comprises records that map between two or more map terms, each comprising two or more words, phrases, and codes, and between a tree structure of unique codes, where the tree structure comprises unique codes for each of at least four classes, and where each of the map terms is mapped to one of the unique codes. The electronic documents are screened to obtain a subset of electronic documents by locating a matching between some of the map terms in some of the classes. The subset is stored in a database on a non-transitory storage medium. | 06-30-2016 |
20160191076 | Compressively-accelerated read mapping framework for next-generation sequencing - A method of compressive read mapping. A high-resolution homology table is created for the reference genomic sequence, preferably by mapping the reference to itself. Once the homology table is created, the reads are compressed to eliminate full or partial redundancies across reads in the dataset. Preferably, compression is achieved through self-mapping of the read dataset. Next, a coarse mapping from the compressed read data to the reference is performed. Each read link generated represents a cluster of substrings from one or more reads in the dataset and stores their differences from a locus in the reference. Preferably, read links are further expanded to obtain final mapping results through traversal of the homology table, and final mapping results are reported. As compared to prior techniques, substantial speed-up gains are achieved through the compressive read mapping technique due to efficient utilization of redundancy within read sequences as well as the reference. | 06-30-2016 |
20160196276 | SOURCE-TO-PROCESSING FILE CONVERSION IN AN ELECTRONIC DISCOVERY ENTERPRISE SYSTEM | 07-07-2016 |
20160196277 | DATA RECORD COMPRESSION WITH PROGRESSIVE AND/OR SELECTIVE DECOMPRESSION | 07-07-2016 |
20160196278 | HIERARCHICAL DATA COMPRESSION AND COMPUTATION | 07-07-2016 |
20160197622 | HIERARCHICAL DATA COMPRESSION AND COMPUTATION | 07-07-2016 |
20160203151 | ENHANCED COMPRESSION, ENCODING, AND NAMING FOR RESOURCE STRINGS | 07-14-2016 |
20160203152 | ENHANCED COMPRESSION, ENCODING, AND NAMING FOR RESOURCE STRINGS | 07-14-2016 |
20160203154 | ENHANCED COMPRESSION, ENCODING, AND NAMING FOR RESOURCE STRINGS | 07-14-2016 |
20160203155 | Storing Data Files in a File System | 07-14-2016 |
20160203167 | MEDIA COMPRESSION IN A DIGITAL DEVICE | 07-14-2016 |
20160253339 | DATA MIGRATION SYSTEMS AND METHODS INCLUDING ARCHIVE MIGRATION | 09-01-2016 |
20160378834 | MEANS FOR CONSTRUCTING AND POPULATING A TIER SET, A COMPACTABLE TIER SET AND/OR A PRIMARY SORT-ORDER MAX TIER SET - A method of generating one or more primary sort-order location n-tuples and placing one or more of the primary sort-order location n-tuples into one or more tier sets. The method includes obtaining two or more component sequences and selecting one of the component sequences as the primary sort-order component sequence. The method also includes creating one or more tier sets and populating a locations index for at least one component sequence other than the primary sort-order component sequence. The method further includes adding each location index to a location index set and creating a primary sort-order item counter. | 12-29-2016 |
20170235752 | SHARED DECOMPRESSION ENGINE | 08-17-2017 |
20170235774 | COMPRESSING DATA IN DEPENDENCE UPON CHARACTERISTICS OF A STORAGE SYSTEM | 08-17-2017 |
20180026649 | METHOD FOR DATA COMPRESSION | 01-25-2018 |
20190146950 | Method and System for Content Agnostic File Indexing | 05-16-2019 |
20190146953 | DATA HANDLING | 05-16-2019 |
20190147069 | METADATA JOURNAL IN A DISTRIBUTED STORAGE SYSTEM | 05-16-2019 |
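
Illustrative sketch for entry 20160085834 (PRIORITIZING REPOPULATION OF IN-MEMORY COMPRESSION UNITS). The abstract states only that the repopulation priority is based, at least in part, on an invalidity counter and an access counter; it gives no formula. The Python below is a minimal sketch under that reading, not the method claimed in the application. The class name `Imcu`, the function `repopulation_order`, and the scoring rule (stale fraction multiplied by access count) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Imcu:
    """Hypothetical in-memory compression unit holding the two counters."""
    name: str
    total_units: int
    invalid_count: int = 0  # step (b): number of invalidated data units
    access_count: int = 0   # step (d): number of data requests targeting this unit

    def invalidate(self, n: int = 1) -> None:
        # Steps (a)-(b): record invalidated units, capped at the unit's size.
        self.invalid_count = min(self.total_units, self.invalid_count + n)

    def record_access(self) -> None:
        # Steps (c)-(d): count one data request that targets this unit.
        self.access_count += 1

    def priority(self) -> float:
        # Step (e), ASSUMED formula: stale fraction weighted by access count.
        return (self.invalid_count / self.total_units) * self.access_count


def repopulation_order(units):
    """Return the units sorted from highest to lowest repopulation priority."""
    return sorted(units, key=lambda u: u.priority(), reverse=True)
```

Under this assumed rule, a frequently read unit with a modest number of stale entries can outrank a rarely read unit that is mostly stale; a different weighting would change the ordering.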