Entries |
Document | Title | Date |
20100070508 | INFORMATION CORRELATION SYSTEM, USER INFORMATION CORRELATING METHOD, AND PROGRAM - When a collation result of user information is not matched, user information in one system needs to be prevented from being leaked to the other system. When hash values for an item serving as a key of correlation match each other between a plurality of user information lists, it is determined that relevance is present, and then user information of the corresponding user is correlated. | 03-18-2010 |
20100070509 | System And Method For High-Dimensional Similarity Search - A computer-implemented method for searching a plurality of stored objects. Data objects are placed in a hash table, an ordered sequence of locations (probing sequence) in the hash table from a query object is generated and data objects in the hash table locations in the generated ordered sequence are examined to find objects whose relationships with the query object satisfy a certain predetermined function defined on pairs of objects. | 03-18-2010 |
20100082636 | Methods and Apparatus for Content-Defined Node Splitting - A region of a node is searched to find a content-defined split point. A split point of a node is determined based at least in part on hashes of entries in the node and the node is split based on the determined split point. The search region is searched for the first encountered split point and the node is split based on that split point. That split point is based on a predetermined bitmask of the hashes of the entries in the node satisfying a predetermined condition. | 04-01-2010 |
20100106729 | SYSTEM AND METHOD FOR METADATA SEARCH - A system, a method, and a computer readable article of manufacture for metadata searching. The system includes: a receiving module for receiving a search query with respect to a plurality of metadata resources; a query module for parsing the search query and searching related metadata resources and the structural information among the related metadata resources according to the parsing of the search query; and an output module for outputting the search results so as to realize a fuzzy structural search with respect to the plurality of metadata resources. The method includes the steps of: receiving a search query that does not designate complete structural information of the metadata resources; parsing the search query and searching related metadata resources to form search results that include the complete structural information; and outputting the search results. | 04-29-2010 |
20100114906 | METHOD FOR PAGINATING A DOCUMENT STRUCTURE OF A DOCUMENT FOR VIEWING ON A MOBILE COMMUNICATION DEVICE - A process for transmitting a document from a server to a mobile device on a per page basis, comprising building a graph structure within the server representing a map of the document, transmitting a page size limit from the mobile device to the server indicative of the size of a single page of the document to be displayed by the mobile device, traversing and paginating the graph structure into successive pages based on the page size limit, caching the pages within the server, and transmitting the successive pages from the server to the mobile device for display by the mobile device. | 05-06-2010 |
20100121856 | METHOD AND APPARATUS FOR GENERATING INDEX AS WELL AS SEARCH METHOD AND SEARCH APPARATUS - The present invention provides a method and apparatus for generating an index as well as a search method and a search apparatus. An index entry comprises at least a search item identifier corresponding to a search item, one or a plurality of index items generated from one or plural pieces of search information, and an accumulator for the search information. The accumulator is generated by accumulating the search information, or accumulating ciphertext of information containing the search information, or accumulating data mapped from information containing the search information. At the time of searching, the index items and the accumulator are provided to a searcher. The searcher extracts search information from the index items and checks whether the extracted search information is complete by using the accumulator. In one embodiment, the accumulator is incorporated in an encrypted inverted index. | 05-13-2010 |
20100125584 | INTELLIGENT EVENT QUERY PUBLISH AND SUBSCRIBE SYSTEM - Indexing and routing to event data is described. Event data is assigned an identifier that identifies the data type and the contents of event data within an enterprise system. The event data may be real-time event data. With the identifier, a source of the event data is determined, and the source can be queried for the event data in real-time. The identifier is indexed along with other event data identifiers. Based on the location of the event data, the system sends a query toward the data source to obtain the information, routing the query to the data source rather than attempting to pull data towards the query source and process it at the query source. | 05-20-2010 |
20100145951 | METHODS FOR ESTABLISHING LEGITIMACY OF COMMUNICATIONS - A method for sending a message to a recipient, which comprises determining a data set associated with the message; accessing an ensemble of precomputed tags corresponding to respective initial data sets, each precomputed tag representing a solution to a computational problem involving the respective initial data set; identifying the precomputed tag in the ensemble for which the corresponding initial data set corresponds to the data set associated with the message; and sending the message and informing the recipient of the identified precomputed tag. The recipient executes a method comprising obtaining a tag associated with the message; determining a data set associated with the message, the data set in some embodiments including a portion extrinsic to the message; determining whether the tag represents a solution to a computational problem involving the data set associated with the message; determining whether the tag was specifically generated for the message, based on the portion extrinsic to the message; and establishing the legitimacy of the message based on the outcomes of the previous steps. | 06-10-2010 |
20100145952 | ELECTRONIC DOCUMENT PROCESSING APPARATUS AND METHOD - An electronic document processing apparatus includes: a document set storage unit storing hash tables including hash values of documents to be processed; a content extraction unit for extracting body contents from a newly input electronic document; and a sentence separation unit for separating sentences from the extracted body contents. The apparatus further includes a duplicate document determination unit for converting the separated sentences into unique hash values by a hash algorithm, checking whether each of the separated sentences is a duplicate sentence depending on whether or not there is a collision between the converted hash values and the hash values in the hash tables of the document set storage unit, and determining if the electronic document is a duplicate document based on the ratio of duplicate sentences to all of the sentences in the electronic document. | 06-10-2010 |
20100153403 | METHOD FOR DATA ENCRYPTION AND METHOD FOR CONJUNCTIVE KEYWORD SEARCH OF ENCRYPTED DATA - A server provides the user's desired data without the server knowing the contents or keywords of data by using a method of searching the desired data without decrypting the encrypted data, such that the privacy for the important data of the user can be secured. Also, the present invention shortens the operation time when searching the encrypted data, such that it can prevent the degradation in efficiency due to excess operation involved in the previous existing methods based on the pairing operation. | 06-17-2010 |
20100179954 | Quick Mass Data Manipulation Method Based on Two-Dimension Hash - For the massive data of physical memory on the computer system, data indexing can be created based on the two-dimensional hash indexing algorithm, using a specific mapping relationship conversion between the index keyword and index sequence address under the hash algorithm, which realizes fast addressing while introducing a two-dimensional hash list to solve the ‘conflict’ problem of mapping relations in the hash queue, which is caused by the same keyword index or hash algorithm. | 07-15-2010 |
20100211573 | INFORMATION PROCESSING UNIT AND INFORMATION PROCESSING SYSTEM - A recording medium stores a program that causes a processor to execute a procedure. The procedure includes: calculating registration positions of data based on a total amount of data of existing tables and a hash method, and registering the data at the registration positions, when registering the data in a plurality of tables; adding or deleting a table from the plurality of tables; calculating the registration position of the data based on the total amount of data of the existing tables and the hash method and judging whether data to be referred to is present at the registration position, when data registered in a table is referred to after the table is added or deleted; and when the data to be referred to is not present at the registration position, recalculating the registration position of the data. | 08-19-2010 |
20100228736 | Recognizing a disc - A method and a system are provided for recognizing a disc. In one example, the system receives a disc, such as, for example, a Blu-ray Disc. The disc includes a content certificate and data files. The content certificate includes a unique text file that certifies the disc complies with predetermined disc parameters. The data files are the actual audio and/or video content on the disc. The system reads the content certificate or reads the data related to the data files. The system generates a hash value by applying a hash function to the content certificate or to the data related to the data files. The hash value is a unique identifier for the disc. | 09-09-2010 |
20100228737 | HTTP Range Checksum - A method and apparatus that include a hashing and retrieval module that receives an indicator of a portion of a file to access, generates a hash value of a local copy of the portion and generates a request for a remote copy of the portion of the file, the request including the hash value. A verification and response module receives the request for a portion of a server copy of a data file, the request including a first hashing value. The verification and response module retrieves the portion from a server storage device, generates a second hashing value from the portion, compares the first hash value and the second hash value and returns the portion of the file in response to a failed comparison of the first hash value and second hash value. | 09-09-2010 |
20100241632 | SYSTEMS AND METHODS OF DIRECTORY ENTRY ENCODINGS - In general, the invention relates to supporting multiple different character encodings in the same file system. In one embodiment, a method is provided for filename lookup that supports multiple character encodings. The method comprises storing filename data in a first character encoding into an indexed data structure. The method further comprises receiving filename data in a second encoding. The method also comprises looking up filename data in the indexed data structure using the second encoding. | 09-23-2010 |
20100262609 | SYSTEM AND METHOD FOR LINKING MULTIMEDIA DATA ELEMENTS TO WEB PAGES - A method for linking between a multimedia data element (MMDE) and a web page. The method comprises receiving a MMDE from a source; generating a signature representative of the MMDE using a signature generator; matching the generated signature with a plurality of signatures stored in a database to find at least one matching signature; extracting a universal resource locator (URL) of a web page associated with the matching signature, wherein the URL is part of a metadata of the matching signature; and providing the URL to the source over a network. | 10-14-2010 |
20100299333 | METHOD FOR IMPROVING THE EFFECTIVENESS OF HASH-BASED DATA STRUCTURES - A method to improve the effectiveness of hash-based data structures includes configuration of a data structure and transformation of hash codes as produced by a hash function, to yield a more uniform distribution of data amongst the slots in a data structure. Transformation results in a non-uniform but predictable distribution of hash codes. Configuration exploits the predictable nature of the transformed hash codes to accomplish more uniform and therefore more efficient distribution of items stored in a hash-based data structure. | 11-25-2010 |
20100332481 | Secure and scalable detection of preselected data embedded in electronically transmitted messages - A method and apparatus for detecting preselected data embedded in electronically transmitted messages is described. In one embodiment, the method comprises monitoring messages electronically transmitted over a network for embedded preselected data and performing content searches on the messages to detect the presence of the embedded preselected data using an abstract data structure derived from the preselected data. | 12-30-2010 |
20110004599 | A SYSTEM AND METHOD FOR WORD INDEXING IN A CAPTURE SYSTEM AND QUERYING THEREOF - Searching of objects captured by a capture system can be improved by eliminating irrelevant objects from a query. In one embodiment, the present invention includes receiving such a query for objects captured by a capture system, the query including at least one search term. This search term is then hashed to a term bit position using a hash function. Then objects can be eliminated if, in a word index associated with the object, the term bit position is not set. | 01-06-2011 |
20110016132 | ARCHIVE DEVICE - An archive device includes: a storage for storing divided data and attribute information, the divided data being received from an external device and divided from original data by a predetermined size, the attribute information being associated with a hash value and identification information, the hash value being calculated from the divided data, the identification information identifying the original data before being divided; and a controller for calculating a hash value for divided data that is received from the external device, writing the divided data and the attribute information corresponding to the divided data to the storage when the calculated hash value is not included in the attribute information stored in the storage, and adding the identification information corresponding to the calculated hash value to the attribute information when the calculated hash value is included in the attribute information stored in the storage. | 01-20-2011 |
20110022601 | BLOCK LEVEL TAGGING WITH FILE LEVEL INFORMATION - Embodiments for data tagging in a computing environment are provided. A write operation in an operating system (OS) file system level storage layer is intercepted. A set of signatures in a sub-chunk level is calculated. The set of signatures are aligned to the beginning of an OS file system-level object and stored in a memory location, such as a cache, along with file system information relating to the write operation and to the file system-level object that the data is written into. Following file system processing, and as the data is written into storage in blocks, the write operation is intercepted in the block level storage layer. A secondary set of signatures in a sub-block level is calculated using a common algorithm used to create the original set of signatures. The sets of signatures are compared against each other, and blocks of the data having matching signatures are tagged with the file system information stored in the memory location. | 01-27-2011 |
20110047165 | Network cache, a user device, a computer program product and a method for managing files - A network cache ( | 02-24-2011 |
20110055221 | METHOD AND APPARATUS FOR OBTAINING DECISION DIAGRAMS FROM UNKNOWN HASH IDENTIFIERS - An approach is provided for reducing decision diagram related communication traffic and cost by querying unknown hash identifiers. A hash identifier application receives a hash identifier that is computed based on a reduced ordered binary decision diagram constructed for a resource description framework graph. The hash identifier application determines whether the received hash identifier matches a predetermined one of a plurality of hash identifiers. The hash identifier application reconstructs or queries for the decision diagram, when there is no match. | 03-03-2011 |
20110078152 | METHOD AND SYSTEM FOR PROCESSING TEXT - An exemplary embodiment of the present invention provides a method of processing an electronic text document. The method includes obtaining a character from the document. The method also includes obtaining a hash input code from a character map, the hash input code corresponding to the character. The method also includes modifying a hash value based on the hash input code if the hash input code indicates that the character is part of a token, or asserting the hash value if the hash input code indicates that the character is not part of a token. | 03-31-2011 |
20110078153 | EFFICIENT RETRIEVAL OF VARIABLE-LENGTH CHARACTER STRING DATA - Prefixes are registered on a first list as index elements for respective registration patterns. Each prefix is selected as the longest of different-length prefixes that are extractable from a registration pattern in accordance with an extraction rule. Suffixes, which are the remaining parts of the registration patterns excluding the respective prefixes, are registered on a second list. Using different-length prefixes that are extracted from a retrieval key in accordance with the extraction rule, a prefix retriever searches the first list to retrieve a registration pattern whose prefix matches any of the prefixes of the retrieval key. A suffix checker carries out a check on the suffix of the registration pattern retrieved by the prefix retriever, among the suffixes on the second list, as to whether the suffix of the registration pattern matches the suffix of the retrieval key. | 03-31-2011 |
20110093471 | LEGAL COMPLIANCE, ELECTRONIC DISCOVERY AND ELECTRONIC DOCUMENT HANDLING OF ONLINE AND OFFLINE COPIES OF DATA - Systems and methods of electronic document handling permit organizations to comply with legal or regulatory requirements, electronic discovery and legal hold requirements, and/or other business requirements. The systems described provide a unified approach to data management that enables compliance, legal and IT personnel to focus efforts on, e.g., a single data repository. The systems permit users to define and utilize information governance policies that help automate and systematize different compliance tasks. In some examples, organizations may push data in any third-party data format to the systems described herein. The systems may permit compliance or IT personnel to detect when a legally sensitive production file has been changed or deleted. The systems may also provide a unified dashboard user interface. From a dashboard interface, users may perform searches, participate in collaborative data management workflows, obtain data management reports, and adjust policies. Other elements and features are disclosed herein. | 04-21-2011 |
20110099175 | PLUPERFECT HASHING - Various embodiments herein include one or more of systems, methods, software, and/or data structures to implement a “pluperfect” hash function. Generally, a pluperfect hash function is a hash function that maps distinct elements in a set S to distinct hash values H with no collisions (i.e., perfect hash function) and also includes an additional constraint that the hash function does not map other elements outside the set S into the set of distinct hash values H. In some example embodiments, pluperfect hash functions are used to implement a multi-way branch statement in a computer programming language. The implementation may include generating hash values for each of the case labels of the branch statement according to a pluperfect hash function. | 04-28-2011 |
20110106815 | Method and Apparatus for Selectively Re-Indexing a File System - A method and apparatus are disclosed for re-indexing a file system. A detection module detects a reconnection of a storage device to an electronic device. The storage device was previously connected to and then disconnected from the electronic device. The storage device comprises a file system and the electronic device stores first metadata indexing the file system. A determination module determines if the file system is changed since the previous connection. An access module accesses the file system using the first metadata in response to the file system not changing since the previous connection. A re-index module re-indexes the file system in response to the file system changing since the previous connection. | 05-05-2011 |
20110113036 | EFFICIENT FILE ACCESS IN A LARGE REPOSITORY USING A TWO-LEVEL CACHE - A two-level cache to facilitate resolving resource path expressions for a hierarchy of resources is described, which includes a system-wide shared cache and a session-level cache. The shared cache is organized as a hierarchy of hash tables that mirrors the structure of a repository hierarchy. A particular hash table in a shared cache includes information for the child resources of a particular resource. A database management system that manages a shared cache may control the amount of memory used by the cache by implementing a replacement policy for the cache based on one or more characteristics of the resources in the repository. The session-level cache is a single level cache in which information for target resources of resolved path expressions may be tracked. In the session-level cache, the resource information is associated with the entire path expression of the associated resource. | 05-12-2011 |
20110113037 | Matching a Fingerprint - A method and a system are provided for matching a fingerprint, for example, an audio fingerprint. In one example, the system receives, from a user device, a chapter and a query about the chapter. The chapter includes computer readable data generated from a waveform of an audio signal. The query is a request to receive data related to the chapter. The system generates, at a computer, a fingerprint of the chapter. The fingerprint includes at least a digital measure of certain properties of the waveform of the audio signal. The system generates, at a computer, a hash value of the fingerprint by applying a hash function to at least a portion of the fingerprint of the chapter. The hash value serves as an identifier for the fingerprint. The system looks up, in a database system, a matching hash value for the hash value of the fingerprint. | 05-12-2011 |
20110119274 | FILE LISTENER SYSTEM AND METHOD - The system enables a data processing framework that polls for files using a listener, controls the workflow using event driven logic, processes financial transactions, and creates a business to business trading partner network. The listener ensures the reliability and data integrity of input files and performs archiving functions. The payment management functions automatically process transactions, including payor (e.g., the buyer) initiated payments to a payee (e.g., a supplier) by utilizing a flexible, decoupled processing architecture. A payment management computer identifies a universal identifier for each entity and forms relationships and hierarchies in order to increase efficiency of the trading partner network. Metadata describes the format, validation and relationships for a wide variety of financial account data. | 05-19-2011 |
20110137915 | SINGLE PARSE, DIAGRAM-ASSISTED IMPORT INTO A UNIFIED MODELING LANGUAGE BASED META-MODEL - Systems and methods for conversion and importation of models that describe system behavior into a UML meta model-based representation, include parsing through the textual model for the plurality of elements, searching for an element semantic definition, element view definitions corresponding to a semantic definition, or an element view containing diagram definition within the textual model for each of the plurality of elements, generating element reference nodes for placement on an internally constructed custom tree, attaching a listener to each of the element reference nodes, wherein the listener is configured to await population of the element reference node with an equivalent unified modeling language semantic element, wherein a listener awaiting population is an awaiting sequenced listener, completing an inheritance hierarchy between the element reference nodes up to a parent node inferred from the diagramming definitions and resolving awaiting sequenced listeners that are made aware of an awaited unified modeling language value. | 06-09-2011 |
20110137916 | SYSTEM AND METHOD FOR SYNCHRONIZED CONTENT DIRECTORIES ON CLUSTER DEVICES - According to embodiments of the invention, a system, method and computer program product for synchronizing content directories on cluster devices are provided. Embodiments generate a binary tree for each device in a cluster of devices, the binary tree representing the locations of all copies of content residing in the device. The binary tree for a plurality of other devices in the cluster may be stored in each device. The binary trees for the plurality of other devices may be used to determine availability of content and the available content may be displayed to a user. | 06-09-2011 |
20110137917 | RETRIEVING A DATA ITEM ANNOTATION IN A VIEW - A method of retrieving an annotation associated with a data item in a view generated by an information management system querying a data source, includes receiving an output of a query; analyzing the output of the query to identify one or more data items having a data value and an attribute associated therewith; for each identified data value and attribute, identifying a unique value associated with the data value and the attribute, wherein an identified unique value associated with the data value and an identified unique value associated with the attribute form a unique set of values; identifying from a data store a previously logged set of unique values corresponding to the set of unique values; in response to a positive determination, determining whether the previously logged unique set of values has an associated annotation; and in response to a positive second determination retrieving the annotation from the data store. | 06-09-2011 |
20110145259 | SYSTEM AND METHOD FOR IDENTIFYING DATA FIELDS FOR REMOTE ADDRESS CLEANSING - A system and method for identifying data fields for remote address cleansing, whereby a plurality of address file hash values are stored and associated with a plurality of known address data file profiles. An uploaded address file is received at the processing site from a sender who wishes to have his address list processed. A received address data file profile is identified for the uploaded address file. A first hash value is calculated based on the identified received address data file profile. The first hash value is compared with the stored plurality of address file hash values. If the first hash value matches one of the stored plurality of hash values, then the known address data profile of the matching stored hash value is associated with the uploaded address file. If the first hash value does not match any of the stored plurality of hash values, then preparing a new address file profile, generating a new hash of the new profile, and storing the new profile along with the associated new hash. | 06-16-2011 |
20110145260 | SEARCH DEVICE, A SEARCH METHOD AND A PROGRAM - The present invention provides a search device, a search method and a program which improves the search speed in a longest prefix or suffix match search. | 06-16-2011 |
20110167072 | INDEXING AND FILTERING USING COMPOSITE DATA STORES - Data stores may be combined into a composite data store. A method includes referencing a first index entry for a user specified first parameter pattern. The first index entry includes references to record addresses for records in the composite data store which include the first parameter pattern. A first beginning composite data store address of a first selected data store is referenced. A determination is made that the first beginning composite data store address is at or above a predetermined threshold above the first record address. Based on determining that the first beginning composite data store address is at or above a predetermined threshold above the first record address, a speed-up data structure is used to eliminate one or more comparisons of record entries in the first index entry between the first record address and the first beginning composite data store address. | 07-07-2011 |
20110173209 | METHOD FOR LOSSLESS DATA REDUCTION OF REDUNDANT PATTERNS - The present application describes methods and systems for compressing and/or decompressing data. As blocks of data are processed, the processed blocks are placed into a circular buffer at a compressor and indexed based on patterns of data present in the processed blocks. A circular buffer is maintained at the decompressor so that the decompressor circular buffer is consistent with the compressor circular buffer. When a new block of data is processed, the compressor checks the index to the circular buffer to determine whether the new block of data contains a pattern that is redundant with a pattern in one or more blocks of data that have already been processed. If a redundancy is detected, the compressor informs the decompressor of the redundancy and provides information allowing the decompressor to reconstruct the redundant pattern from the decompressor's circular buffer. In this way, redundant data need not be retransmitted or stored. | 07-14-2011 |
20110179040 | NAME HIERARCHIES FOR MAPPING PUBLIC NAMES TO RESOURCES - A resource set comprising a set of resources may be provided to the public. It may be desirable to associate with the resources a set of public names, such as friendly URLs that may be more memorable, may indicate to users the type of resource so named, and may promote indexing of the resources by search engines. A name hierarchy (such as a portion of a file system) may store at least one reference that associates a public name with a resource. A name hierarchy navigation logic may facilitate navigation through the name hierarchy, and may specify a particular location within the name hierarchy where a reference associated with the public name is to be stored. This manner of associating public names with resources may promote the scalability and efficiency in associating public names with resources and in retrieving a resource associated with a particular public name. | 07-21-2011 |
20110196874 | SYSTEM AND METHOD FOR DISCOVERING STORY TRENDS IN REAL TIME FROM USER GENERATED CONTENT - The present invention is directed towards systems and methods for discovering story trends. The method and system according to one embodiment includes receiving a fixed size data stream, identifying a first set of words within the data stream and electronically determining which words in the first set of words are present in a word cache. The method and system then identifies a second set of words within the data stream for each word present within the word cache and electronically determines which words in the second set of words are present in a subword cache, updating the subword cache based on the determination and identifying a third set of words based on the determination. The method and system then electronically determines at least one story trend associated with the third set of words and electronically generates a story hash associated with the third set of words. Finally, the method and system stores the third set of words in a story lookup table and stores the story hash in a story trend cache. | 08-11-2011 |
20110208746 | SYSTEMS AND METHODS FOR MESSAGE-BASED DISTRIBUTED COMPUTING - Systems and methods are provided for message-based distributed computing systems and execution of message-based distributed applications on such systems. The present invention provides a Distributed Application Platform (DAP). The DAP architecture is “distributed” because functions of an application may be performed by processes within a single node, spread across nodes in a network, or spread across processor cores within CPUs. Some embodiments of the DAP provide efficient programming constructs, called sets. A set is a data structure describing an N-dimensional space. Each spatial location is either empty or holds a member. The set construct allows automatic extraction and processing of members with a single query, and makes programming an application for a distributed, parallel, or single computer environment easier for a user. In some embodiments, the DAP may be a message-based distributed computing system. The system receives instructions, builds and initiates an application, and may create a set in memory. | 08-25-2011 |
20110208747 | MEMORY EFFICIENT INDEXING FOR DISK-BASED COMPRESSION - A network optimization device may receive a stream of data and generate a signature for a plurality of fixed length overlapping windows of the stream of data. The device may select a predetermined number of the generated signatures for each L | 08-25-2011 |
20110213784 | SEMANTIC OBJECT CHARACTERIZATION AND SEARCH - Semantic object characterization and its use in indexing and searching a database directory is presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object which when compared to a characterized version of a second representation or view of the same semantic object in the form of a second binary hash code, the first and second binary hash codes exhibit a degree of similarity indicative of the objects being the same object. In one implementation the semantic objects correspond to peoples' names and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language. | 09-01-2011 |
20110219010 | METHOD AND APPARATUS FOR PACKET CLASSIFICATION USING BLOOM FILTER - The present disclosure provides an apparatus and method for packet classification using a Bloom filter and includes determining a matching length of how long each field value of one or more fields in an input packet coincides with a field value of the corresponding field stored in a rule set by performing a field-by-field search on the fields in the input packet, and generating a tuple list made up of a combination of one or more of the matching length for the respective fields; selecting particular tuples existing in the rule set from the tuple list; filtering each of the selected tuples by using the Bloom filter; and searching for a best matching rule as a search pool exclusively within the tuples with the positive result of the filtering. According to the present disclosure, the number of tuples to search can be substantially reduced, improving search performance. | 09-08-2011 |
20110225167 | METHOD AND SYSTEM TO STORE RDF DATA IN A RELATIONAL STORE - A method (and structure) of storing schema-less data of a dataset in a relational database, includes constructing a hash table for the schema-less data, using a processor on a computer. Data in the dataset is stored in a tuple format including a subject along with at least one other entity associated to the subject. Each row of the hashtable will be dedicated to a subject of the dataset, and at least one of the at least one other entity associated with the subject in the row is to be stored in a pair-wise manner in that row of the hashtable. In an exemplary embodiment, RDF data that uses triples (subject, predicate, object) is stored with the predicate/object stored in the pair-wise manner in its associated subject row. | 09-15-2011 |
20110225168 | HASH PROCESSING IN A NETWORK COMMUNICATIONS PROCESSOR ARCHITECTURE - Described embodiments provide coherent processing of hash operations of a network processor having a plurality of processing modules. A hash processor of the network processor receives hash operation requests from the plurality of processing modules. A hash table identifier and bucket index corresponding to the received hash operation request are determined. An active index list is maintained for active hash operations for each hash table identifier and bucket index. If the hash table identifier and bucket index of the received hash operation request are in the active index list, the received hash operation request is deferred until the hash table identifier and bucket index corresponding to the received hash operation request clear from the active index list. Otherwise, the active index list is updated with the hash table identifier and bucket index of the received hash operation request and the received hash operation request is processed. | 09-15-2011 |
20110246480 | SYSTEM AND METHOD FOR INTERACTING WITH A PLURALITY OF DATA SOURCES - System and method for interacting with a plurality of data sources are provided. A request may be parsed and an identification parameter identifying a data set may be determined. A field included in the request may be designated as a distribution key. At least one data source may be selected based on a value associated with the distribution key. At least a portion of the request may be sent to a selected data source. Other embodiments are described and claimed. | 10-06-2011 |
20110264669 | METHOD FOR COMPRESSING A .NET FILE - The invention discloses a method for compressing a .net file, characterized by at least one of the following steps of obtaining and compressing reference type in a .net file; obtaining and compressing definition method in a .net file; obtaining and compressing method body of the definition method in a .net file; obtaining and compressing Namespace in a .net file; obtaining and compressing definition type in a .net file. By compressing the .net file, the invention efficiently reduces the storage space occupied by the .net file, and makes it stored in a small-sized medium, such as a smart card. | 10-27-2011 |
20120011128 | Non-Parametric Measurement of Media Fingerprint Weak Bits - A value is computed for a feature in an instance of query content and compared to a threshold value. Based on the comparison, first and second bits in a hash value, which is derived from the query content feature, are determined. Conditional probability values are computed for the likelihood that quantized values of the first and the second bits equal corresponding quantized bit values of a target or reference feature value. The conditional probabilities are compared and a relative strength determined for the first and second bits, which directly corresponds to the conditional probability. The bit with the lowest bit strength is selected as the weakbit. The value of the weakbit is toggled to generate a variation of the query hash value. The query may be extended using the query hash value variation. | 01-12-2012 |
20120016882 | DELTA CHUNKS AND DELTA HASHES - Example apparatus, methods, and computers control processing delta chunks with delta hashes. One example method includes computing a first hash for a chunk for which a duplicate determination is to be made. The first hash is suitable for making the duplicate chunk determination. The method also includes computing a delta hash for the chunk. The delta hash is suitable for making a delta chunk determination. The method controls a de-duplication logic to process the chunk as a duplicate upon determining that the first hash matches a stored first hash. The method controls the de-duplication logic to process the chunk as a delta chunk upon determining that the first hash does not match a stored first hash and that the delta hash matches a stored delta hash. Processing a chunk as a delta chunk may include storing a reference to a stored chunk and storing delta hash information. | 01-19-2012 |
20120016883 | Enhanced Query Performance Using Fixed Length Hashing of Multidimensional Data - Methods, systems and apparatus, including computer program products, for enhancing query performance through fixed length hashing of multidimensional data. According to one method, a fixed length hash of a multidimensional data record is created where the hash has respective fixed length sections for each data dimension of the record being hashed. The composite fixed length hash is stored with a reference to the original data record to which it corresponds. Query parameters are hashed and compared to a corresponding section of the fixed length hash to determine a set of candidate records. | 01-19-2012 |
20120016884 | PERSONAL COMPUTING DEVICE-BASED MECHANISM TO DETECT PRESELECTED DATA - A method and apparatus for detecting pre-selected data stored on a personal computing device is described. In one embodiment, contents of data storage media of a personal computing device are searched for pre-selected sensitive data. In one embodiment, if at least a portion of the pre-selected sensitive data is detected, a notification of the detection of the pre-selected data is sent to a system via a network. In another embodiment, if at least a portion of pre-selected sensitive data is detected, the access to this data is blocked. | 01-19-2012 |
20120036133 | COMPUTING DEVICE AND METHOD FOR SEARCHING FOR PARAMETERS IN A DATA MODEL - A computing device and method for searching for parameters in a data model uses a first hash table to store index values of all parameter names in a data model, and uses a second hash table to store index values of data paths of duplicate parameter names in the data model, and changes the comparison sequence when searching for parameters in the data model. That is, comparing beginning from the last character in the input data string, if a leaf node in the data model matches the last character, then comparing parent nodes of the leaf node with the remaining characters of the input data string. Then, a data path of a parameter having a unique name is located according to the first hash table. Furthermore, data paths of parameters having duplicate names are located according to the second hash table. | 02-09-2012 |
20120036134 | PERFORMING CONCURRENT REHASHING OF A HASH TABLE FOR MULTITHREADED APPLICATIONS - In one embodiment, the present invention includes a method for allocating a second number of buckets for a hash table shared concurrently by a plurality of threads, where the second number of buckets are logically mapped onto a corresponding parent one of the first number of buckets, and publishing an updated capacity of the hash table to complete the allocation, without performing any rehashing, such that the rehashing can later be performed in an on-demand, per bucket basis. Other embodiments are described and claimed. | 02-09-2012 |
20120041958 | EFFICIENT RETRIEVAL OF VARIABLE-LENGTH CHARACTER STRING DATA - Prefixes are registered on a first list as index elements for respective registration patterns. Each prefix is selected as the longest of different-length prefixes that are extractable from a registration pattern in accordance with an extraction rule. Suffixes, which are the remaining parts of the registration patterns excluding the respective prefixes, are registered on a second list. Using different-length prefixes that are extracted from a retrieval key in accordance with the extraction rule, a prefix retriever searches the first list to retrieve a registration pattern whose prefix matches any of the prefixes of the retrieval key. A suffix checker carries out a check on the suffix of the registration pattern retrieved by the prefix retriever, among the suffixes on the second list, as to whether the suffix of the registration pattern matches the suffix of the retrieval key. | 02-16-2012 |
20120066229 | SYSTEMS AND METHODS FOR OPERATING A SATURATED HASH TABLE - Systems and methods for operating a saturated hash table are disclosed. In one embodiment, a system includes a hash table located in memory of a computer and a hash program in communication with the hash table. The hash table may include a plurality of index positions, and the hash program may be operable to populate the hash table with a first new digest value, where the first new digest value is associated with a first data item. The first new digest value may be stored at least at a first index position and a second index position of the hash table. | 03-15-2012 |
20120078915 | SYSTEMS AND METHODS FOR CLOUD-BASED DIRECTORY SYSTEM BASED ON HASHED VALUES OF PARENT AND CHILD STORAGE LOCATIONS - Embodiments relate to systems and methods for a cloud-based directory system based on hashed values of parent and child storage locations. Platforms and techniques are provided to store a data object to cloud storage resources in two or more locations recorded in a consistent hash structure. A file management tool can store one copy of the data object to a location corresponding to the hashed value of the file path or name, and a second copy to a location corresponding to the hashed value of the parent directory of the data object. All files sharing a common parent directory or other location therefore have at least one copy stored to the same location, in common with the parent. Directory-wide read, write, and/or search operations can therefore be performed more efficiently, since the constituent files of a directory or other location can be accessed from one location rather than distributed locations. | 03-29-2012 |
20120089612 | FLEXIBLE FULLY INTEGRATED REAL-TIME DOCUMENT INDEXING - A system for real-time document indexing is provided that includes a browser that is executing on a client system. The browser includes functionalities allowing it to communicate with a remote computer system. A query interface executes within the framework of the browser. The query interface receives one or more query searches from an end-user and sends the one or more query searches to be processed by the remote computer system. The remote computer system sends to the query interface the results of the one or more query searches via the browser. The query interface assigns the results of the one or more query searches to a folder where the folder includes a unique identifier. The query interface indexes the results of the one or more query searches to the unique identifier of the folder. | 04-12-2012 |
20120117080 | INDEXING AND QUERYING HASH SEQUENCE MATRICES - Embodiments are directed to indexing and querying a sequence of hash values in an indexing matrix. A computer system accesses a document to extract a portion of text from the document. The computer system applies a hashing algorithm to the extracted text. The hash values of the extracted text form a representative sequence of hash values. The computer system inserts each hash value of the sequence of hash values into an indexing matrix, which is configured to store multiple different hash value sequences. The computer system also queries the indexing matrix to determine how similar the plurality of hash value sequences are to the selected hash value sequence based on how many hash values of the selected hash value sequence overlap with the hash values of the plurality of stored hash value sequences. | 05-10-2012 |
20120117081 | REPRESENTING AND MANIPULATING RDF DATA IN A RELATIONAL DATABASE MANAGEMENT SYSTEM - Techniques for generating hash values for instances of distinct data values. In the techniques, each distinct data value is mapped to hash value generation information which describes how to generate a unique hash value for instances of the distinct data value. The hash value generation information for a distinct data value is then used to generate the hash value for an instance of the distinct data value. The hash value generation information may indicate whether a collision has occurred in generating the hash values for instances of the distinct data values and if so, how the collision is to be resolved. The techniques are employed to normalize RDF triples by generating the UIDS employed in the normalization from the triples' lexical values. | 05-10-2012 |
20120143876 | Method and Apparatus for Efficiently Organizing Hierarchical QoS Policies - Consistent with embodiments of the present invention, a method may be provided comprising receiving a search string corresponding to a desired node comprising a target parameter, a policy parameter, and a class parameter. The target parameter may be referenced with a target index table to determine which interfaces to search. The policy parameter may be referenced with a policy index table to determine a node-id of a policy node corresponding to the policy parameter. A level for the desired node may be determined based on the node-id. The class parameter may be referenced with the determined node-id with a class index table to access a bucket location. The desired node may then be searched for with the determined node-id at the determined level. | 06-07-2012 |
20120143877 | Method and Apparatus for High Performance, Updatable, and Deterministic Hash Table for Network Equipment - An apparatus comprising a storage device comprising a hash table including a plurality of buckets, each bucket being capable of storing at least one data item, and a processor configured to apply at least a first and a second hash function upon receiving a key to generate a first index and a second index, respectively, the first and second indices identifying first and second potential buckets in the hash table for storing a new data item associated with the key, determine whether at least one of the first and second potential buckets have space available to store the new data item, and responsive to determining that at least one of the first and second potential buckets have available space, insert the new data item into one of the first or second potential buckets determined to have available space. | 06-07-2012 |
20120150869 | METHOD FOR CREATING A INDEX OF THE DATA BLOCKS - A method for creating an index of data blocks, applicable in a data de-duplication procedure, includes: loading an index file, where the index file includes a plurality of location blocks, each location block includes a plurality of storage fields, and each storage field records a primary Hash value corresponding to a data block; performing a first Hash procedure on a primary Hash value of the data block and calculating a block number; performing a second Hash procedure on the primary Hash value of the same data block and calculating a field number; loading a location conflict list; comparing the field number with the field numbers in the location conflict list to determine whether the same field number is stored in the location conflict list; and writing the primary Hash value into the corresponding block number and field number if the field number does not exist in the location conflict list. | 06-14-2012 |
20120158736 | VIRTUAL R-TREE MAPPED TO AN EXTENDIBLE-HASH BASED FILE SYSTEM - Techniques for mapping a virtual R-Tree to an extensible-hash based file system for databases are provided. Spatial data is identified within an existing file system, which stores data for a database. Rows of the spatial data are organized into collections; each collection represents a virtual block. The virtual blocks are used to form an R-Tree spatial index that overlays an existing index for the database on the existing file system. Each row within its particular virtual block includes a pointer to its native storage location within the existing file system. | 06-21-2012 |
20120166448 | Adaptive Index for Data Deduplication - The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks. | 06-28-2012 |
20120173541 | Distributed Cache for Graph Data - A distributed caching system for storing and serving information modeled as a graph that includes nodes and edges that define associations or relationships between nodes that the edges connect in the graph. | 07-05-2012 |
20120179691 | DEVICE DISCOVERY IN A UBIQUITOUS COMPUTING ENVIRONMENT - Technologies are generally described for methods, instructions, and client applications for device discovery in a ubiquitous computing environment. In some examples, the methods, instructions, and client applications may facilitate the organization of features of devices in a ubiquitous computing environment into a series of hierarchical hash numbers, the ordering of the hierarchical hash numbers corresponding to the respective devices, and the searching for a particular one of the devices by attempting to match hashed search criteria to the ordered hierarchical hash numbers at one of the devices in the ubiquitous computing environment. | 07-12-2012 |
20120185487 | METHOD, DEVICE AND SYSTEM FOR PUBLICATION AND ACQUISITION OF CONTENT - A method for establishing content indexes includes: determining the size of a content space; determining a content address space according to the size of the content space; establishing the mapping relationship from the content space to the content address space and obtaining the content address; monitoring the corresponding content address and accepting the content publication or the content acquisition request of the content mapping space, by the content indexing node. | 07-19-2012 |
20120191724 | STORAGE OF DATA OBJECTS BASED ON A TIME OF CREATION - Techniques for storage of data objects based on a time of creation are disclosed. A computing device may receive a request to store a data object and, in response, identify a particular storage location that maintains data for the interval of time including a time of creation of the data object. | 07-26-2012 |
20120197901 | Public Electronic Document Dating List - Systems and methods are disclosed which enable the establishment of file dates and the absence of tampering, even for documents held in secrecy and those stored in uncontrolled environments, but which do not require trusting a timestamping authority or document archival service. A trusted timestamping authority (TTSA) may be used, but even if the TTSA loses credibility or a challenger refuses to acknowledge the validity of a timestamp, a date for an electronic document may still be established. Systems and methods are disclosed which enable detection of file duplication in large collections of documents, which can improve searching for documents within the large collection. | 08-02-2012 |
20120215789 | Media Fingerprinting and Identification System - The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy. | 08-23-2012 |
20120226699 | DEDUPLICATION WHILE REBUILDING INDEXES - Systems and methods of deduplicating while loading index entries are disclosed. An example method includes loading a first group of index entries into an index. The example method also includes deduplicating data using the index before loading the first group of index entries is completed. | 09-06-2012 |
20120233176 | EFFICIENT INDEXING AND SEARCHING OF ACCESS CONTROL LISTED DOCUMENTS - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing a plurality of documents in computer-readable memory, each document of the plurality of documents having a corresponding access control list (ACL), each ACL defining a plurality of users that are authorized to access a respective document, generating an index based on the plurality of users, the index comprising a plurality of partitions, each partition corresponding to a user of the plurality of users, and, for each document of the plurality of documents: ranking the users of the plurality of users, selecting a user as an indexing user based on the ranking, and storing the document in a partition of the index, the partition corresponding to the indexing user. | 09-13-2012 |
20120254193 | Processing Data in a Mapreduce Framework - A computer-implemented method for processing input data in a mapreduce framework includes: receiving, in the mapreduce framework, a data processing request for input data; initiating, based on the data processing request, a map operation on the input data by multiple mappers in the mapreduce framework, each of the mappers using an aggregator to partially aggregate the input data into one or more intermediate key/value pairs; initiating a reduce operation on the intermediate key/value pairs by multiple reducers in the mapreduce framework, wherein, without sorting the intermediate key/value pairs, those of the intermediate key/value pairs with a common key are handled by a same one of the reducers, each of the reducers using the aggregator to aggregate the intermediate key/value pairs into one or more output values; and providing the output values in response to the data processing request. | 10-04-2012 |
20120254194 | MANAGEMENT AND STORAGE OF DISTRIBUTED BOOKMARKS - Managing user bookmark information includes receiving a bookmark-related action request and determining a type of action associated with the bookmark-related action request and user information associated with the bookmark-related action request. In the event that the type of action corresponds to an add bookmark action, managing user bookmark information further includes generating a bookmark data record, the bookmark data record comprising the user information and information to be bookmarked; determining, using the user information, bookmark database information associated with a bookmark database to which the bookmark data record is to be stored, the bookmark database being one of a plurality of bookmark databases; generating index information based on the user information and the bookmark database information; storing the index information in an index database that is separate from the plurality of bookmark databases; and storing the bookmark data record in the bookmark database. | 10-04-2012 |
20120259863 | Low Level Object Version Tracking Using Non-Volatile Memory Write Generations - Data versioning in a non-volatile memory. An object key associated with a data object is created. An index into an object table is generated using the object key. A version number is stored in conjunction with the data object stored in the non-volatile memory. In an object linked-list, the object key and the location information of the data object in the non-volatile memory are stored. A record associated with the data object is created in an object table. The record includes an index, a reference to the object linked-list, and the version number. The index is generated based on the object key. | 10-11-2012 |
20120265765 | SELF-INDEXER AND SELF INDEXING SYSTEM - An improved self-indexer comprising a find function that caches a last found position and occurrence count of a symbol on each node level of a word-based wavelet tree for a particular symbol lookup and only uses a select function to call on data to the right of the position. | 10-18-2012 |
20120265766 | COMPRESSION ON THIN PROVISIONED VOLUMES USING EXTENT BASED MAPPING - For facilitating data compression, a set of logical extents, each having compressed logical tracks of data, is mapped to a head physical extent and, if the head physical extent is determined to have been filled, to at least one overflow extent having spatial proximity to the head physical extent. Pursuant to at least one subsequent write operation and destage operation, the at least one subsequent write operation and destage operation determined to be associated with the head physical extent, the write operation is mapped to one of the head physical extent, the at least one overflow extent, and an additional extent having spatial proximity to the at least one overflow extent. | 10-18-2012 |
20120303634 | In-Memory Data Grid Hash Scheme Optimization - Systems and methods of managing an in-memory data grid (IMDG) may involve conducting a data distribution analysis of the IMDG on a periodic basis, and selecting a hash scheme from a plurality of hash schemes based on the data distribution analysis. In one example, the selected hash scheme is used to conduct a repopulation of the IMDG, wherein the repopulation increases the distribution evenness of database records across the IMDG. | 11-29-2012 |
20130007006 | System and Method for Using Network Equipment to Provide Targeted Advertising - A search request received from a user is converted to a search request integer value using an operational portion of a chip in network equipment. The search request integer value is compared to representative data integer values that were previously converted from a dataset of search terms using the operational portion, the representative integer values being stored on the chip. If the comparing is successful, a signal is transmitted to a second database, the signal being used to determine a message to be transmitted to the user that corresponds to the representative data integer. | 01-03-2013 |
20130007007 | METHOD AND APPARATUS FOR PROVIDING A LIST-BASED INTERFACE TO KEY-VALUE STORES - An approach is provided for providing a list-based interface to key-value stores. The library interface platform determines one or more key-value pairs of at least one key-value store, the one or more key-value pairs comprising one or more data entries. Next, the library interface platform causes, at least in part, an association of at least one list object with the one or more key-value pairs, one or more sub-list objects, or a combination thereof. Then, the library interface platform provides at least one interface for performing one or more operations on the at least one list object to interact with the one or more data entries, the one or more key-value pairs, the one or more sub-list objects, or a combination thereof. | 01-03-2013 |
20130007008 | HASH ALGORITHM-BASED DATA STORAGE METHOD AND SYSTEM - A hash algorithm-based data storage method and apparatus are disclosed, including: pre-configuring L number of backend storage modules and a mapping relationship between identifiers of the backend storage modules and a modulo L operation; calculating a key value of data to be stored using a hash algorithm; performing a modulo L operation on the obtained key value and, using the mapping relationship between identifiers of the backend storage modules and the modulo L operation, outputting the key value in the modulo L operation and the corresponding data to a backend storage module with a corresponding backend storage module identifier; determining a preconfigured hash table in the backend storage module does not contain data to be stored, and storing the data to be stored and the corresponding key value. By using the present invention, requirements on storage devices can be lowered and the storage efficiency can be improved. | 01-03-2013 |
20130013618 | METHOD OF REDUCING REDUNDANCY BETWEEN TWO OR MORE DATASETS - A method for reducing redundancy between two or more datasets of potentially very large size. The method improves upon current technology by oversubscribing the data structure that represents a digest of data blocks and using positional information about matching data so that very large datasets can be analyzed and the redundancies removed: having found a match on digest, the method expands the match in both directions in order to detect and eliminate large runs of data by replacing duplicate runs with references to common data. The method is particularly useful for capturing the states of images of a hard disk. The method permits several files to have their redundancy removed and the files to later be reconstituted. The method is appropriate for use on a WORM device. The method can also make use of L2 cache to improve performance. | 01-10-2013 |
20130013619 | PEER-TO-PEER REDUNDANT FILE SERVER SYSTEM AND METHODS - Peer-to-peer redundant file server system and methods include clients that determine a target storage provider to contact for a particular storage transaction based on a pathname provided by the filesystem and a predetermined scheme such as a hash function applied to a portion of the pathname. Servers use the same scheme to determine where to store relevant file information so that the clients can locate the file information. The target storage provider may store the file itself and/or may store metadata that identifies one or more other storage providers where the file is stored. A file may be replicated in multiple storage providers, and the metadata may include a list of storage providers from which the clients can select (e.g., randomly) in order to access the file. | 01-10-2013 |
20130046767 | APPARATUS AND METHOD FOR MANAGING BUCKET RANGE OF LOCALITY SENSITIVE HASH - An apparatus for managing a bucket range of Locality Sensitive Hash is provided. The apparatus includes a range setting unit configured to set bucket ranges of Locality Sensitive Hash by dividing at least one vector based on distribution of data that are projected to the at least one vector. | 02-21-2013 |
20130066883 | DATA MANAGEMENT APPARATUS AND SYSTEM - A data management apparatus sends specific data and key information corresponding to the specific data to another apparatus, when executing a process to change a storage destination of the specific data in which the hash value obtained by applying a predetermined hash function to corresponding key information belongs to a certain range, from the data management apparatus to the other apparatus, and sends the identification information of the other apparatus stored in correspondence with the certain range to a request source of an operation request, when the operation request with respect to data corresponding to key information is received after the process. | 03-14-2013 |
20130086073 | REJECTING ROWS WHEN SCANNING A COLLISION CHAIN - Provided are techniques for rejecting rows while locating a target row. For a row that is stored in a hash space, a row filter value is generated for that row, and the row filter value is stored with the row. While trying to locate the target row in a collision chain in the hash space, a row filter value is calculated for the target row. For the row in the collision chain, the stored row filter value of the row in the collision chain is compared with the computed row filter value of the target row. In response to determining that the stored row filter value does not match the computed row filter value, it is determined that the row in the collision chain is not the target row. | 04-04-2013 |
20130086074 | EXTENDED WIDTH ENTRIES FOR HASH TABLES - A hash table supports extended entries. The extended entries permit a base entry to extend its associated data into one or more neighboring entries. Extended entries thereby provide a mechanism through which a hash table entry may store additional data compared to a base entry. Extended entries may coexist with base entries in the hash table. The hash table thereby provides the flexibility to adapt dynamically to meet system requirements and to balance the needs of additional data storage by blending the number of extended entries (that each store more data than a base entry) and the number of base entries (each storing less data than an extended entry). | 04-04-2013 |
20130086075 | Methods and Systems for Providing Unique Signatures - Presented are systems and methods for creating a set of signatures including acquiring a data set and converting the data set into a plurality of data matrices. The system determines a prime number and determines a plurality of primitive roots to the prime number. The system calculates a template matrix using a first and second primitive root, of the plurality of primitive roots, and selects a data matrix property of interest. The system calculates a first hash function for each of the data matrices to create a first signature for each data matrix such that a main set of signatures is formed, wherein the first hash function is calculated using said property of interest, the prime number, and the template matrix. The system generates a main set of signatures based on the first hash functions, wherein the main set of signatures comprises a first signature for each data matrix. | 04-04-2013 |
20130097175 | Efficient File Access In A Large Repository Using A Two-Level Cache - A two-level cache to facilitate resolving resource path expressions for a hierarchy of resources is described, which includes a system-wide shared cache and a session-level cache. The shared cache is organized as a hierarchy of hash tables that mirrors the structure of a repository hierarchy. A particular hash table in a shared cache includes information for the child resources of a particular resource. A database management system that manages a shared cache may control the amount of memory used by the cache by implementing a replacement policy for the cache based on one or more characteristics of the resources in the repository. The session-level cache is a single level cache in which information for target resources of resolved path expressions may be tracked. In the session-level cache, the resource information is associated with the entire path expression of the associated resource. | 04-18-2013 |
20130103694 | PREFIX AND PREDICTIVE SEARCH IN A DISTRIBUTED HASH TABLE - In one embodiment, a method comprises identifying prefix groups for searchable character symbols, each prefix group having a corresponding searchable character symbol comprising at least one searchable character; assigning at least one prefix group to each of a plurality of distributed hash table nodes in a network, each distributed hash table node containing at least one of the prefix groups, each distributed hash table node assigned a corresponding prescribed keyspace range of a prescribed keyspace, each distributed hash table node configured for storing data records having respective primary data record keys within the corresponding prescribed keyspace range; and assigning secondary indexes that start with one of the searchable character symbols to the corresponding prefix group in the associated distributed hash table node, enabling any prefix search starting with the one searchable character symbol to be directed to the corresponding prefix group in the associated distributed hash table node. | 04-25-2013 |
20130117276 | METHODS AND APPARATUS FOR DISCOVERY OF ATTRIBUTES USING A SOCIAL MOBILE APPLICATION - In one general aspect, a computer-readable storage medium can be configured to store instructions that when executed cause a processor to perform a process. The instructions can include instructions to receive, at a first device, a target attribute associated with a first user account and to access a code representing the target attribute and including a plurality of values. The instructions can include instructions to send, to the second device, a portion of the code and an indicator of a relative location within the code of the portion of the code, and to receive an indicator from the second device that the portion of the code is included at the relative location within at least one code from a plurality of codes associated with a plurality of attributes associated with a second user account. | 05-09-2013 |
20130132399 | INCREMENTAL CONTEXT ACCUMULATING SYSTEMS WITH INFORMATION CO-LOCATION FOR HIGH PERFORMANCE AND REAL-TIME DECISIONING SYSTEMS - Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context. | 05-23-2013 |
20130132400 | INCREMENTAL CONTEXT ACCUMULATING SYSTEMS WITH INFORMATION CO-LOCATION FOR HIGH PERFORMANCE AND REAL-TIME DECISIONING SYSTEMS - Provided are techniques for incrementally integrating and persisting context over an available observational space. At least one feature associated with a new observation is used to create at least one index key. The at least one index key is used to query one or more reverse lookup tables to locate at least one previously persisted candidate observation. The new observation is evaluated against the at least one previously persisted candidate observation to determine at least one relationship. In response to determining the at least one relationship, a threshold is used to make a new assertion about the at least one relationship. The new observation is used to review previous assertions to determine whether a previous assertion is to be reversed. In response to reversing the previous assertion, the new observation, the new assertion, and the reversed assertion are incrementally integrated into persistent context. | 05-23-2013 |
20130151535 | DISTRIBUTED INDEXING OF DATA - Indexing a data set of objects, where the data set is partitioned into plural work units with plural objects and distributed to multiple data processing nodes. Each data processing node maps the plural objects in corresponding work units into respective ones of given sub-indexes. A composite index is constructed for the objects in the data set by reducing the mapped objects, where reducing the mapped objects is distributed among multiple data processing nodes. | 06-13-2013 |
20130166569 | INTELLIGENT EVENT QUERY PUBLISH AND SUBSCRIBE SYSTEM - Indexing and routing to event data is described. Event data is assigned an identifier that identifies the data type and the contents of event data within an enterprise system. The event data may be real-time event data. With the identifier, a source of the event data is determined, and the source can be queried for the event data in real-time. The identifier is indexed along with other event data identifiers. Based on the location of the event data, the system sends out a query toward the data source to obtain the information, but also to route the query to the data source, rather than attempting to pull data towards the query source and process it at the query source. | 06-27-2013 |
20130179452 | Media Fingerprinting and Identification System - The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy. | 07-11-2013 |
20130204879 | WEB PAGE RETRIEVAL METHOD AND DEVICE - Embodiments of the present application relate to a web page retrieval method, a web page retrieval device, and a computer program product for retrieving a web page. A web page retrieval method is provided. The method includes receiving a query, retrieving an attribute identifier of a web page to be retrieved and a query range related to the attribute identifier, based on the query, obtaining a range of attribute values, determining whether an intersection between the range of attribute values to be retrieved and a plurality of index ranges established in advance in a system receiving the inputted query exists, in the event that the intersection exists, retrieving a web page relating to intersecting index ranges, the attribute identifier of the web page corresponding to the attribute identifier of the web page to be retrieved, and the attribute values intersecting the query range, and returning the retrieved web page. | 08-08-2013 |
20130218901 | CORRELATION FILTER - In one embodiment, the correlation filter can use one of several data structures to track each migration unit and reject successive accesses within a period of time to each migration unit. In one embodiment, the correlation filter uses a space efficient data structure, such as a hash indexed correlation array to store the address of referenced migration units, and to filter accesses to a single migration unit that are correlated accesses resulting from multiple accesses to the same migration unit during a sequential I/O stream. In one embodiment, the correlation array contains a global timeout, which resets each element to a default value, clearing all stored migration unit address values from the correlation array. In one embodiment, each element of the migration array can time-out separately. | 08-22-2013 |
20130238632 | SYSTEM AND METHOD FOR INDEXING OF GEOSPATIAL DATA USING THREE-DIMENSIONAL CARTESIAN SPACE - Embodiments of a system and method for indexing of geospatial data using three-dimensional Cartesian space are generally described herein. In an aspect, such example methods may include calculating endpoints of a segment, wherein the endpoints are specified in Cartesian coordinates and are located on a substantially spherical surface, defining a boundary of a polygon according to the segment, computing one or more normals corresponding to one or more planes, wherein each of the one or more planes contain a test point and a boundary point associated with the boundary, obtaining a boundary sine value of an angle defined by an arc subtended by the endpoints, summing each of a group of angle values derived from the boundary sine value to obtain an angle sum, wherein the group contains the boundary sine value, and determining whether the test point is inside the polygon based on the angle sum. | 09-12-2013 |
20130262472 | DATA EXISTENCE JUDGING DEVICE AND DATA EXISTENCE JUDGING METHOD - A data existence judging device includes: L number of first storage areas each associated with one of L hash values; M number of second storage areas each associated with some of the L hash values; an information setting part, for each data in a data set, to calculate k hash values about the data, and, for each calculated hash value, when a count value in the first storage area associated with the calculated hash value is less than 2 | 10-03-2013 |
20130282732 | ELECTRONIC DEVICE AND METHOD FOR MANAGING NAMES OF ELECTRONIC DEVICE - In a method for naming an electronic device, a hash table including indices and corresponding character strings is preset. The electronic device is controlled to enter into a searchable state, and inserts a default name of the electronic device in a searchable device list. A shaking angle of the electronic device is calculated according to detected coordinate values acquired from a gravity sensor, when the electronic device is shaking in the searchable state and the default name needs to be changed. A hash value is calculated using the shaking angle, a UNIX timestamp, a number of the indices in the hash table, and a predetermined formula. The method further determines an index that is the same as the hash value, determines a character string corresponding to the determined index, and renames the electronic device using the determined character string. | 10-24-2013 |
20130332466 | Linking Data Elements Based on Similarity Data Values and Semantic Annotations - Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked. | 12-12-2013 |
20130332467 | Linking Data Elements Based on Similarity Data Values and Semantic Annotations - Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked. | 12-12-2013 |
20140032569 | SYSTEMS, METHODS AND COMPUTER PROGRAM PRODUCTS FOR REDUCING HASH TABLE WORKING-SET SIZE FOR IMPROVED LATENCY AND SCALABILITY IN A PROCESSING SYSTEM - System, method and computer program products for storing data by computing a plurality of hash functions of data values in a data item, and determining a corresponding memory location for one of the plurality of hash functions of data values in the data item. Each memory location is of a cacheline size wherein a data item is stored in a memory location. Each memory location can store a plurality of data items. A key portion of all data items is contiguously stored within the memory location, and a payload portion is contiguously stored within the memory location. Payload portions are packed as bit-aligned in a fixed-sized memory location, comprising a bucket in a bucketized hash table, each bucket sized to store multiple key portions and payload portions that are packed as bit-aligned in a fixed-sized bucket. Corresponding key portions are stored as compressed keys in said fixed-sized bucket. | 01-30-2014 |
20140052736 | CUSTOM OBJECT-IN-MEMORY FORMAT IN DATA GRID NETWORK APPLIANCE - Techniques are disclosed for implementing custom object-in-memory formats in a data grid network appliance. The techniques include maintaining a record of format definitions on a client device of the data grid and a corresponding record of format definitions on a server device of the data grid. Each format definition may indicate one or more attributes of an object class and data types and byte ranges of the attributes. The client device may serialize one or more objects for storage in the data grid based on respective format definitions associated with the one or more objects and retrieved from the record of format definitions maintained on the client device. Further, the server device may perform one or more data grid operations using format definitions retrieved from the record of format definitions maintained on the server device. | 02-20-2014 |
20140052737 | Media Fingerprinting and Identification System - The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy. | 02-20-2014 |
20140067824 | DATABASE TABLE FORMAT CONVERSION BASED ON USER DATA ACCESS PATTERNS IN A NETWORKED COMPUTING ENVIRONMENT - An approach for conversion between database formats (e.g., from a relational database format to a hash table or a “big table” database format) based on user data access patterns in a networked computing environment is provided. A first set of database tables having a first format is identified based on a set of access patterns stored in a computer storage device. A second set of database tables having a second database format corresponding to the first set of database tables may then be provided (e.g., accessed, augmented, and/or generated). A mapping between the first set of database tables and the second set of database tables may then be created. A column set may then be generated based on at least one condition of the set of queries. The column set may then be used as a key for the second set of database tables. | 03-06-2014 |
20140081986 | COMPUTING DEVICE AND METHOD FOR GENERATING SEQUENCE INDEXES FOR DATA FILES - In a method for generating sequence indexes for data files of a computing device, index information of a data file is received from an input device of the computing device. The last index of a data index list stored in a database of a storage device is established, and an m-digits number is generated according to the storage capacity of the database. The m-digits number of the last index is calculated to obtain a sequence number, and a sequence index of the data file is generated according to the sequence number of the last index and the index information of the data file. The sequence index of the data file is inserted into the last position of the data index list, and the data index list is saved into the database of the storage device. | 03-20-2014 |
20140095512 | RANKING SUPERVISED HASHING - Aspects of the present invention provide a tool for hash-based indexing. In an embodiment, a ranked dataset having a plurality of data items is obtained. Every data item in the ranked dataset has a ranking with respect to every other data item in the ranked dataset. A ranking triplet matrix is created based on the ranked dataset. The ranking triplet matrix has a set of ranking triplets, each of which indicates the relative ranking for a pair of the data items in the ranked dataset. This ranking triplet can be merged with a hash table obtained using a standard hash function and the data items can be indexed based on the results. | 04-03-2014 |
20140108421 | PARTITIONING DATABASE DATA IN A SHARDED DATABASE - A sharded database system configured for partitioning data amongst a plurality of shard servers is provided. In one implementation the sharded database system comprises a sharded database including a first shard server, a second shard server, and a shard control record. The shard control record is configured to define a first data structure for distributing a first plurality of data records or rows based on a first sharding by monotonic key range across the first and second shard servers. The sharded database is also configured to further distribute the first plurality of records or rows across the first shard server and the second shard server via a subsidiary hashing method. A method of partitioning data of a database is also provided. | 04-17-2014 |
20140129568 | REDUCED COMPLEXITY HASHING - Hashing complexity is reduced by exploiting a hashing matrix structure that permits a corresponding hashing function to be implemented such that an output vector of bits is produced in response to an input vector of bits without combining every bit in the input vector with every bit in any row of the hashing matrix. | 05-08-2014 |
20140164391 | DATA BLOCK SAVING SYSTEM AND METHOD - An assignment server receives a data block of a file from a client. The assignment server determines if the obtained data block is a repetitive data block. The assignment server uploads the obtained data block from the client into a storage server when the obtained data block is not the repetitive data block. | 06-12-2014 |
20140181119 | METHOD AND SYSTEM FOR ACCESSING FILES ON A STORAGE SYSTEM - A method for accessing files on a storage system is provided. A hash memory table including a plurality of hash buckets respectively corresponding to a plurality of index hash codes is built. Each of the hash buckets has a pointer pointing towards at least one entry. Each of the entries has a physical address field and a hash code field. The physical address fields respectively record physical addresses storing the files, and the hash code fields respectively record verification hash codes corresponding to the files. The index hash codes are generated by inputting keys of the files to an index hash function and the verification hash codes are generated by inputting keys of the files to a verification hash function. Then, the hash memory table is loaded into the buffer with a bucket-based replacement policy so that the files are able to be accessed according to the hash memory table. | 06-26-2014 |
20140188893 | DATA RETRIEVAL APPARATUS, DATA STORAGE METHOD AND DATA RETRIEVAL METHOD - A computer executes a process including dividing a data set into a plurality of data sets, determining hash functions for the data sets, producing hash coefficient value information for specifying the hash functions and correspondence information between the hash coefficient values and the data sets, and producing hash information for the data sets. In the hash function determination, a hash value is calculated using a candidate hash function based on keys of the data of the data set, and the candidate hash function is determined as the hash function of the data set with regard to which it is decided that all data can be stored into a first address based on the hash value or a second address contiguous to the first address. In the hash information production, the hash information for the data set is produced by storing the data and keys into the first or second address. | 07-03-2014 |
20140195545 | HIGH PERFORMANCE HASH-BASED LOOKUP FOR PACKET PROCESSING IN A COMMUNICATION NETWORK - The present invention relates to methods and apparatus for performing a lookup on a hash table stored in external memory. An index table stored in local memory is used to perform an enhanced lookup on the hash table stored in external memory. The index table stores signature patterns that are derived from the hash keys stored in the hash entries. Using the stored signature patterns, the packet processing node predicts which hash key is likely to store the desired data. The prediction may yield a false positive, but will never yield a false negative. Thus, the hash table is accessed only once during a data lookup. | 07-10-2014 |
20140214855 | REDUCING COLLISIONS WITHIN A HASH TABLE - Collisions in hash tables are reduced by removing each empty bucket from a hash table and compacting the non-empty buckets, generating a map of the hash table indicating a status of the buckets of the hash table, and accessing data in the hash table by applying a hash key to the generated map to determine a corresponding bucket containing the data. | 07-31-2014 |
20140214856 | PROVIDING A CONTENT PREVIEW - A content preview of a content item stored in an online storage system can be viewed on a client device without the content item itself being downloaded to the client device and without the use of software associated with the content item being installed on the client device. Furthermore, data storage and processing requirements can be minimized by creating and storing only one content preview for each unique content item. The content item can be identified by using the content item as a hash key in a hashing algorithm. The resulting unique identifier can be used to search a preview index that lists all created content previews and their location. A content preview is only created if one does not exist. The unique identifier can be used to locate the content preview and return it in response to a preview request by a client device. | 07-31-2014 |
20140222829 | Systems for Storing Files in a Distributed Environment - A system and method for storing data-files stored on distributed devices connected to a network. Data-elements of the data-files are allocated to data-blocks stored on the distributed devices. Key-identifiers are calculated for each of the data-blocks based on the allocated data-elements. The key-identifiers are stored in distributed tables stored on the distributed devices. Index-nodes are generated for the data-files based on the data-blocks. A Paxos algorithm is executed for the index-nodes based on the key-identifiers to provide a consensus of the data-files that are stored on the plurality of distributed devices. | 08-07-2014 |
20140236963 | IMAGE RETRIEVAL METHOD - A system and method for linking a hash code to a portion of an image. A plurality of lattice points is selected in a multidimensional lattice to form a smallest enclosing region about a feature vector representing the portion of the image and a lattice point is determined from the selected plurality of lattice points according to a distribution criteria. The determined lattice point is common to the smallest enclosing region and a region of the lattice adjacent to the smallest enclosing region located within a query radius distance of the feature vector. When the feature vector is located within the query radius of a query vector the feature vector is considered a match. The method assigns the feature vector to the determined lattice point and stores a link between a hash code associated with the determined lattice point and the portion of the image. | 08-21-2014 |
20140280201 | ATTRIBUTE DETECTION - The present disclosure is directed to computer-implemented methods and systems for identifying an attribute and/or attribute value in a text string. In embodiments, the text string comprises a search query submitted by a user. Embodiments of the present disclosure include identifying an attribute value from a search query by comparing the search query string to a list of known attribute values and comparing the candidate attribute value to a knowledge base to confirm that the string represents an attribute value rather than a non-attribute concept. In embodiments, a Bloom filter is employed to execute a relatively efficient comparison between a candidate attribute value and known non-attribute concepts. | 09-18-2014 |
20140297653 | ONTOLOGY-BASED QUERY METHOD AND APPARATUS - An ontology-based query method and apparatus include acquiring a to-be-queried triple input by the user, where a known element is a query condition. One or more unknown elements in the to-be-queried triple is a query object and searching is performed, in the key-value pairs stored in each of the plurality of computing nodes, for a key-value pair matching the query condition. An element corresponding to the query object is determined from three elements included in a key value of the matched key-value pair, to acquire elements corresponding to the query objects determined in each of the plurality of computing nodes. A query result is acquired according to the elements corresponding to the query objects determined in each of the plurality of computing nodes. | 10-02-2014 |
20140297654 | RECORD ADDRESSING INFORMATION RETRIEVAL BASED ON USER DATA DESCRIPTORS - Record addressing information retrieval is achieved using a plurality of user data descriptors. When a threshold number of user data descriptors from a set of user data descriptors are received, the threshold number of user data descriptors can be converted into a computed record index that is compared to a list of record indexes associated with a plurality of records. When the computed record index compares favorably to a record index in the list of record indexes, the record addressing information for a particular record is retrieved based on the record index. | 10-02-2014 |
20140304275 | OPTIMIZING WIDE DATA-TYPE STORAGE AND ANALYSIS OF DATA IN A COLUMN STORE DATABASE - Data structures can provide for a column store of a database table. Methods can use the data structures for efficiently responding to a query. Unique field values of a column of a database table can be identified. The unique values can be stored in a dictionary table along with reference keys that point to a row of the database table. A reference store column can replace the original column, where the reference store column stores index values of the dictionary table. A hash table can be used in accessing the database. A hash function can provide a hash value of a query term, and the hash value can be used to access a hash table to obtain a stored value of an index value of the dictionary table. The index value can be used to access the dictionary table to obtain reference keys corresponding to rows of the database table. | 10-09-2014 |
20140330839 | SYSTEMS AND METHODS INVOLVING A MULTI-PASS ALGORITHM FOR HIGH CARDINALITY DATA - This disclosure describes methods, systems, computer-readable media, and apparatuses for calculating a summary statistic. Calculating the summary statistic can be performed by identifying multiple subsets of a set of variable observations and assigning the subsets to grid-computing devices such that no two of the subsets are assigned to a same one of the grid-computing devices. A parallel processing operation that involves multiple processing phases at each of the grid-computing devices is then coordinated. The parallel processing operation includes each of the grid-computing devices inventorying the respectively assigned subset and generating inventory information representative of the respectively assigned subset. Subsequently, the inventory information generated by the grid-computing devices is received, and a summary statistic is determined by synthesizing the received inventory information. | 11-06-2014 |
20140330840 | Distributed Cache for Graph Data - In one embodiment, a system includes a database; and a cache layer comprising one or more cache nodes, the one or more cache nodes operative to: maintain in a memory one or more data structures storing association information describing associations between nodes in a graph a plurality of distributed cache clusters for storing information in the form of a graph, the graph comprising a plurality of nodes, each uniquely identified by a node identifier, and edge information indicating associations between nodes; respond to queries for associations between nodes in the graph by accessing the memory; and forward other queries to the database for processing. | 11-06-2014 |
20140344285 | String Hashing Using a Random Number Generator - String hashing using a random number generator is disclosed. A method of implementations includes dividing an input stream provided to a hashing module into a plurality of subsets of bits, wherein each subset comprises a same number of bits and wherein each of the subsets of bits comprises an overlapping subset, augmenting a subset of the subsets of bits with a constant, entangling, by a mixer of the hashing module, the subset by an output of a number generator, adding a result of the entangling to an accumulator of the hashing module, repeating the augmenting, the entangling, and the adding on at least a portion of a next sequential subset of the subset of bits, and when all of the subsets of bits have been processed, returning a value in the accumulator as a hash result value. | 11-20-2014 |
20140358937 | SYSTEMS AND METHODS FOR SNP ANALYSIS AND GENOME SEQUENCING - In one embodiment, a system comprising a processor and a memory storing instructions executable by the processor creates an index for a nucleic acid sequence. The index comprises a plurality of elements. Each element corresponds to a permutation of a nucleic acid sequence. Data representing a nucleic acid sequence is received. A subsequence of the nucleic acid sequence is identified in the data at a first position of the nucleic acid sequence. A hash of the subsequence is computed to determine a corresponding element of the index. Position data reflecting the first position is stored in the corresponding element of the index. | 12-04-2014 |
20140358938 | FILE UPLOAD BASED ON HASH VALUE COMPARISON - A server determines whether a file stored on a computing device matches a file stored on the server by comparing hash values for a first portion of the files. Based on the comparing, the server determines whether to upload the first portion of the file. The server uploads a second portion of the file. The server generates the file for download by appending the first portion of the file stored on the server to the second portion of the file uploaded from the computing device. | 12-04-2014 |
20140365501 | CONTENT DISTRIBUTION METHOD AND CONTENT DISTRIBUTION SERVER - A content distribution method executed by a computer includes referring to a result of comparing information identifying content data stored in a storage unit with information identifying content data stored in one or more other storage units included in one or more other computers; and collecting representative image data of content data not stored in the storage unit from the one or more other storage units included in the one or more other computers storing the content data not stored in the storage unit, starting from one of the one or more other computers having a greatest number of pieces of the content data. | 12-11-2014 |
20150019563 | Label Masked Addressable Memory - A network device receives data packets and derives a key from headers in the packets. A search engine in the device searches, or performs a table lookup, for information based on the key and multiple programmable masks. The search engine includes a hash based search engine that comprises multiple mask modules each to mask an input key with a respective programmable mask, to produce multiple masked keys. The search engine also includes an array of hash modules each corresponding to a respective one of the masked keys and including a hash table. Each of the hash modules searches its hash table for a data value based on a hash of the corresponding masked key, and outputs a found data value, if any, resulting from the search. A selector selects among the found data values and outputs the selected data value. | 01-15-2015 |
20150039626 | BUILDING A HASH TABLE USING VECTORIZED INSTRUCTIONS - Techniques for performing database operations using vectorized instructions are provided. In one technique, a hash table build phase involves executing vectorized instructions to determine whether a bucket in a hash table includes a free slot for inserting a key. A number of data elements from the bucket are loaded in a register. A vectorized instruction executed against the register may be used to determine a position, within the register, that contains the “smallest” data element. If the data element at that position is zero (or negative), then it is determined that the corresponding position in the bucket is an available slot for inserting a key and corresponding data value. | 02-05-2015 |
20150039627 | PROBING A HASH TABLE USING VECTORIZED INSTRUCTIONS - Techniques for performing database operations using vectorized instructions are provided. In one technique, a hash table probe phase involves executing vectorized instructions to determine where in a bucket a particular key is located. This determination may be preceded by one or more vectorized instructions that are used to determine whether the bucket contains the particular key. | 02-05-2015 |
20150039628 | PERFORMING AN AGGREGATION OPERATION USING VECTORIZED INSTRUCTIONS - Techniques for performing database operations using vectorized instructions are provided. In one technique, an aggregation operation involves executing vectorized instructions to update a data value that corresponds to a particular key. The aggregation operation may be one of count, sum, minimum, maximum, or average. | 02-05-2015 |
20150039629 | METHOD FOR STORING AND SEARCHING TAGGED CONTENT ITEMS IN A DISTRIBUTED SYSTEM - A method for storing tagged content items in a distributed data exchange system, comprising: A1. generating a Bloom 1 filter for each tag associated with a content item; A2. generating a key consisting of the juxtaposition of a membership word of the Bloom 1 filter and a membership word index inside the Bloom 1 filter; A3. generating a value comprising a compact representation of all tags, and a reference to the content item; and A4. adding the key-value pair to a distributed hash table; and for searching tagged content items, comprising: B1. receiving a multiple keyword search query; B2. choosing a keyword; B3. retrieving from the distributed hash table a first list of content items having the keyword as associated tag; and B4. filtering the first list via the compact representation of all tags to obtain a second list of content items that comprise all keywords as associated tags. | 02-05-2015 |
20150052150 | On-Demand Hash Index - Disclosed herein are system, method, and computer program product embodiments for populating a hash index and returning a handle to the hash index. An embodiment operates by determining, by at least one processor, during query optimization that a first database query has a query execution plan comprising a sub-query which executes N times a correlated predicate having an operator being one of equal and not equal to a base column. A cost of creating and probing the hash index N times and a cost of fully scanning the base column N times are compared based on the correlated predicate. Based on the comparing, it is determined whether to create on-demand a hash index. | 02-19-2015 |
20150058356 | REJECTING ROWS WHEN SCANNING A COLLISION CHAIN THAT IS ASSOCIATED WITH A PAGE FILTER - Provided are techniques for locating a row. A page filter in a page is stored, wherein the page filter is associated with a collision chain and includes a portion of a hash value of the row in the collision chain that has overflowed to an overflow area. In response to a request to locate a target row, the page filter is used to determine that the row has overflowed based on a portion of a hash value of the target row matching the portion of the hash value of the row that has overflowed. | 02-26-2015 |
20150081720 | REFERENCE COUNT PROPAGATION - Methods and systems are provided for tracking object instances stored on a plurality of network nodes, which tracking enables a global determination of when an object has no references across the networked nodes and can be safely de-allocated. According to one aspect of the invention, each node has a local object store for tracking and optionally storing objects on the node, and the local object stores collectively share the locally stored instances of the objects across the network. One or more applications, e.g., a file system and/or a storage system, use the local object stores for storing all persistent data of the application as objects. | 03-19-2015 |
20150095346 | EXTENT HASHING TECHNIQUE FOR DISTRIBUTED STORAGE ARCHITECTURE - In one embodiment, a technique is provided for distributing data and associated metadata within a distributed storage architecture. A set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations are stored for write data of write requests organized into extents. A hash value is generated from a hash function applied to each extent. The hash value is overloaded and used for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a storage location for the extent. | 04-02-2015 |
20150095347 | EXTENT HASHING TECHNIQUE FOR DISTRIBUTED STORAGE ARCHITECTURE - In one embodiment, an extent hashing technique is used to efficiently distribute data and associated metadata substantially evenly among nodes of a cluster. The data may be write data associated with a write request issued by a host and received at a node of the cluster. The write data may be organized into one or more extents. A hash function may be applied to the extent to generate a result which may be truncated or trimmed to generate a hash value. A hash space of the hash value may be divided into a plurality of buckets representative of the write data, i.e., the extents, and the associated metadata, i.e., extent metadata. A number of buckets may be assigned to each extent store instance of the nodes to distribute ownership of the buckets, along with their extents and extent metadata, across all of the extent store instances of the nodes. | 04-02-2015 |
20150100586 | SOCIAL MEDIA CONTENT MANAGEMENT SYSTEM AND METHOD - A social media content management system coupled to a social media network includes database files and execution instructions to assemble, manage and transmit social media content. The database files store a feed content file, a social content management file, and management tables including a schedule file and an export content file. Execution instructions include a feed channel content system for receiving and storing feed content items in the feed content file. A selection subsystem is used for selecting and storing export content as social media posts in the social content management file. A scheduling subsystem enables a user to create and store schedules, to select a schedule and to merge the selected schedule with the stored content in the management file to form the export content file. An export system transmits the stored content as social media content posts to the social media network according to the selected schedule. | 04-09-2015 |
20150120754 | Systems and Methods for Generating Bit Matrices for Hash Functions Using Fast Filtering - A lookup circuit evaluates hash functions that map keys to addresses in lookup tables. The circuit includes multiple hash function sub-circuits, each of which applies a respective hash function to an input key, producing a hash value. Candidate pairs of hash functions to be implemented by the hash function sub-circuits may be generated and tested for suitability in hashing a particular collection of keys. The suitability testing may include computing hash value bit vectors by applying each hash function in a candidate pair to a given key, and determining (using a modified union-find type operation that organizes objects in each set as a directed graph whose root points to itself) whether the resulting hash value bit vectors belong to the same set. The union-find type operation may include a limited distance-from-root test, path compression, or exception handling for special cases, but not a rank test. | 04-30-2015 |
20150127658 | KEY_VALUE DATA STORAGE SYSTEM - According to an aspect, a key-value store (KVS) system includes a data management unit that stores a data KVS storing a pair of a data KVS key including information on a storage location of application data to be an access target object and the application data; and a key KVS storing a pair of an application key and the data KVS key. The data KVS includes a normal partition in which a size of a record for storing one pair is a predetermined specific size; and a special partition in which the size of the record for storing one pair is a size set according to a data size of the pair to be stored. A data relocation unit relocates a pair of a relocation target object to the special partition having the record size suitable for the data size of the pair. | 05-07-2015 |
20150134672 | Data Copy Management Apparatus and Data Copy Method Thereof - A data copy management apparatus and a data copy method thereof. The data copy method includes obtaining, using a hash algorithm, hash values of multiple source data blocks obtained by dividing source data; sending the hash values to a target storage side, so that the target storage side determines, based on the received hash values, whether the target storage side directly generates the source data blocks or a source storage side sends the source data blocks to the target storage side; ignoring the source data blocks when a first feedback fed back by the target storage side is received; and sending the source data blocks to the target storage side when a second feedback fed back by the target storage side is received. Thus, a speed of copying a special data block can be improved, saving central processing unit (CPU) and network resources and reducing copy time. | 05-14-2015 |
20150149481 | TABLE AS QUERY LANGUAGE PARAMETER - A system includes generation of a query to retrieve, from a first database table, a result set conforming to query parameters for all entries of a second table stored in a volatile memory of a query client, serialization of the second table into the volatile memory, copying of the serialized second table into a second volatile memory of a data server, de-serialization of the serialized second table into the second volatile memory, determination of a plurality of entries of the first database table which are associated with the second table, and determination of the result set from the plurality of entries based on the query parameters. | 05-28-2015 |
20150317307 | PROVIDING A CONTENT PREVIEW - A content preview of a content item stored in an online storage system can be viewed on a client device without the content item itself being downloaded to the client device and without the use of software associated with the content item being installed on the client device. Furthermore, data storage and processing requirements can be minimized by creating and storing only one content preview for each unique content item. The content item can be identified by using the content item as a hash key in a hashing algorithm. The resulting unique identifier can be used to search a preview index that lists all created content previews and their location. A content preview is only created if one does not exist. The unique identifier can be used to locate the content preview and return it in response to a preview request by a client device. | 11-05-2015 |
20150317323 | INDEXING AND SEARCHING HETEROGENOUS DATA ENTITIES - A method of performing a search of heterogeneous data based on an input query includes: generating an index including at least two hash tables, where each hash table corresponds to a different data domain of the heterogeneous data and includes hash code sets, where at least one of the hash code sets is mapped to a hash code set of another one of the tables. The method further includes performing a hash on the input query to generate a hash code, by referring to the index, determining a first hash code set that the generated hash code belongs to, and determining a second hash code set that the determined first hash code set is mapped to, and providing at least one result based on the determined second hash code set. | 11-05-2015 |
20150365418 | EFFICIENT INDEXING AND SEARCHING OF ACCESS CONTROL LISTED DOCUMENTS - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing a plurality of documents in computer-readable memory, each document of the plurality of documents having a corresponding access control list (ACL), each ACL defining a plurality of users that are authorized to access a respective document, generating an index based on the plurality of users, the index comprising a plurality of partitions, each partition corresponding to a user of the plurality of users, and, for each document of the plurality of documents: ranking the users of the plurality of users, selecting a user as an indexing user based on the ranking, and storing the document in a partition of the index, the partition corresponding to the indexing user. | 12-17-2015 |
20150370794 | HASH BASED READ AND WRITE OPERATIONS IN A STORAGE SYSTEM - A method for hash-based writing, the method comprises: receiving a received data entity to be stored in a storage system, wherein the received data entity is associated with received data entity metadata; selecting a selected data structure out of a set of data structures that comprises K data structures; wherein K is a positive integer; wherein for each value of a variable k that ranges between 2 and K, stored data entity metadata that is stored in a k'th data structure out of the set collided with stored data entity metadata that is stored in each one of a first till (k−1)'th data structures of the set; calculating an index by applying, on the received data entity metadata, a hash function that is associated with the selected data structure; determining whether an entry that is associated with the index and belongs to the selected data structure is empty; writing to the entry, if the entry is empty, the received data entity metadata, and storing the received data entity in the storage system in response to a location of the entry in the set; and selecting, if the entry is not empty, a new data structure of the set and repeating at least the stages of calculating and determining. | 12-24-2015 |
20150379009 | DISTRIBUTED KEY-VALUE STORE - Techniques are disclosed for managing a high performance, fault-tolerant, strongly consistent, distributed key-value store system. The key-value store may store information, such as metadata for a distributed file system. Fault-tolerance means that the distributed key-value store continues to provide access to values in the key-value store in spite of a certain number of node failures. To provide this capability, the key-value store may store copies of a (key, value) pair on N+1 nodes in order to provide fault tolerance for the failure of up to N nodes. In addition, metadata describing which nodes store a given value is stored on 2N+1 nodes and the distributed key-value store is sized such that there are 3N+1 nodes in a cluster. Doing so allows the key-value store to tolerate a failure of N nodes, while still maintaining a consistent and available key-value store. | 12-31-2015 |
20150379032 | AUGMENTED DIRECTORY HASH FOR EFFICIENT FILE SYSTEM OPERATIONS AND DATA MANAGEMENT - Embodiments relate to scheduling operations to perform on objects. A method for scheduling operations to perform on objects is provided. The method identifies a plurality of operations to perform on a plurality of objects each having at least one attribute. At least one of the operations has scheduling dependency on another operation. The method generates a numeric value for each of the objects from the attribute of the object. The method schedules the operations to perform on the objects based on the numeric values of the objects and the scheduling dependency. | 12-31-2015 |
20160019210 | SHARING UNMANAGED CONTENT USING A CONTENT MANAGEMENT SYSTEM - In some implementations, a subscriber to an online content management system can share content items that are external to the subscriber's content library. A computing device can include managed content items associated with the subscriber's content library. The computing device can include unmanaged content items that are stored externally to the subscriber's content library. The subscriber can provide input associated with an unmanaged content item to generate a link (e.g., URL, file path, location reference, etc.) for the unmanaged content item. When generating the link, the unmanaged content item can be uploaded to the online content management system and stored separately from the subscriber's content library. The generated link can be shared with recipient user (e.g., subscriber or non-subscriber). The generated link can be used by the recipient to access the unmanaged content item uploaded to and stored by the online content management system. | 01-21-2016 |
20160019211 | A PROCESS FOR OBTAINING CANDIDATE DATA FROM A REMOTE STORAGE SERVER FOR COMPARISON TO A DATA TO BE IDENTIFIED - The invention presents a process for obtaining candidate reference data to compare to a data to be identified, implemented in a system comprising a client unit and a storage server comprising two databases, in which: the first database comprises indexed memory blocks each comprising a corresponding encrypted indexed reference data, and the second database comprises memory blocks indexed by all possible hash values obtained by a plurality of k indexed hash functions, and wherein each block contains a list of the indexes of the reference data whose hashing by one of said hash functions results in the hash value corresponding to said block, said process comprising the steps during which: the client unit hashes the data to be identified with each of the plurality of hash functions, and reads the k memory blocks of the second database corresponding to the hash values thus obtained; the client unit identifies indexes contained in at least t out of k read memory blocks; and the client unit reads the memory blocks of the first database indexed by the identified indexes in order to obtain the corresponding indexed reference data, said data being candidate data to compare to the data to be identified, the steps of reading memory blocks of the databases being carried out by executing a protocol preventing the storage server from learning which memory blocks of the databases are read. Another object of the invention is a system for the secure comparison of data. | 01-21-2016 |
20160034452 | Media Fingerprinting and Identification System - The overall architecture and details of a scalable video fingerprinting and identification system that is robust with respect to many classes of video distortions is described. In this system, a fingerprint for a piece of multimedia content is composed of a number of compact signatures, along with traversal hash signatures and associated metadata. Numerical descriptors are generated for features found in a multimedia clip, signatures are generated from these descriptors, and a reference signature database is constructed from these signatures. Query signatures are also generated for a query multimedia clip. These query signatures are searched against the reference database using a fast similarity search procedure, to produce a candidate list of matching signatures. This candidate list is further analyzed to find the most likely reference matches. Signature correlation is performed between the likely reference matches and the query clip to improve detection accuracy. | 02-04-2016 |
20160034486 | Multi-Range and Runtime Pruning - A system, apparatus, and method for managing data storage and data access with querying data and filtering value ranges using only a constant amount of computer memory in the implementation of bloom filters based on a first consumption of a relation. | 02-04-2016 |
20160063021 | Metadata Index Search in a File System - An apparatus comprising an input/output (IO) port configured to couple to a large-scale storage device, a memory configured to store a plurality of metadata databases (DBs) for a file system of the large-scale storage device, wherein the plurality of metadata DBs comprise key-value pairs with empty values, and a processor coupled to the IO port and the memory, wherein the processor is configured to partition the file system into a plurality of partitions by grouping directories in the file system by a temporal order, and index the file system by storing metadata of different partitions as keys in separate metadata DBs. | 03-03-2016 |
20160070701 | INDEXING ACCELERATOR WITH MEMORY-LEVEL PARALLELISM SUPPORT - According to an example, an indexing accelerator with memory-level parallelism (MLP) support may include a request decoder to receive indexing requests. The request decoder may include a plurality of configuration registers. A controller may be communicatively coupled to the request decoder to support MLP by assigning an indexing request of the received indexing requests to a configuration register of the plurality of configuration registers. A buffer may be communicatively coupled to the controller to store data related to an indexing operation of the controller for responding to the indexing request. | 03-10-2016 |
20160078031 | SORT-MERGE-JOIN ON A LARGE ARCHITECTED REGISTER FILE - Methods and arrangements for joining data sets. There are accepted: a first data set which forms a table in a relational database, and a second data set which forms a table in a relational database, each of the first and second data sets comprising a key value. Each of the first and second data sets is hashed based on the key value, and is thereupon sorted based on the key value. The sorted first and second data sets are joined with one another based on the key value. Other variants and embodiments are broadly contemplated herein. | 03-17-2016 |
20160085839 | Computer Implemented Method for Dynamic Sharding - The present disclosure relates to systems and methods for dynamic sharding of a database comprising data identifiable by keys comprised within a global ordered range. When handling a request for data of at least one key: providing the request to a predetermined shard store; the predetermined shard store verifying, by means of its local subrange collection, whether the at least one key is present in a local subrange of a shard stored on the predetermined shard store; and when at least one key is not present in a local subrange of a shard stored on the predetermined shard store, returning a message comprising the local subrange collection of the predetermined shard store. | 03-24-2016 |
20160103831 | DETECTING HOMOLOGIES IN ENCRYPTED AND UNENCRYPTED DOCUMENTS USING FUZZY HASHING - Techniques are provided for automatically detecting homologies between documents based on structural characteristics. Various statistics relating to the COS structure of a PDF document are compiled. The statistics are input into a rolling hash function to generate a digital fingerprint of the document. Fingerprints from two similar documents will have small edit distances between them, and can therefore be classified similarly or provided as results to a fingerprint-based search. For example, an unclassified document may be classified in the same class as a representative document where the fingerprints of the two documents have a small edit distance between them. Since the structure of the document is used instead of the text content or renderings, it is possible to operate on encrypted documents. Further, representative elements of a particular class of documents can be selected for comparison against a target document for improved resolution of the results. | 04-14-2016 |
20160103865 | TECHNOLOGY FOR PROVIDING CONTENT OF A PUBLISH-SUBSCRIBE TOPIC TREE - Content of a publish-subscribe topic tree is provided. This includes receiving a path for a requested topic. The path specifies topics leading from a highest to a lowest level topic in the path. Content in the topic tree is retrieved for matching topics that match the lowest level topic in the path independently of whether the matching topics are on the path in the tree. | 04-14-2016 |
20160110356 | HASH TABLE CONSTRUCTION FOR UTILIZATION IN RECOGNITION OF TARGET OBJECT IN IMAGE - Technologies are generally described to construct a hash table for utilization in a recognition of a target object in an image. According to some examples, a system to serve topical image recognition hash tables to user devices may construct a lookup hash table union from disjoint hash tables of particular objects. For example, a server may receive a request for a category or list of items, interpret which objects to send, and compose a joined image hash lookup table from the disjoint objects that match the target set. In other examples, the category information may be expanded into an object list and hash collections associated with the object list may be retrieved. | 04-21-2016 |
20160117323 | BUILDING A HASH TABLE USING VECTORIZED INSTRUCTIONS - Techniques for performing database operations using vectorized instructions are provided. In one technique, a hash table build phase involves executing vectorized instructions to determine whether a bucket in a hash table includes a free slot for inserting a key. A number of data elements from the bucket are loaded in a register. A vectorized instruction executed against the register may be used to determine a position, within the register, that contains the “smallest” data element. If the data element at that position is zero (or negative), then it is determined that the corresponding position in the bucket is an available slot for inserting a key and corresponding data value. | 04-28-2016 |
20160124950 | DATA PROCESSING DEVICE, DATA PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM - According to one embodiment, a data processing device, includes: a request receiver, a buffer, a first circuitry and a second circuitry. The request receiver receives a write request containing a first key and first data. The buffer temporarily buffers the first data contained in the write request. The first circuitry, according to a buffering status of the first data in the buffer, reads second data which is partial data of the first data that is not read yet out of the first data buffered in the buffer and generates a second key according to a position of the second data in the first data, based on the first key. The second circuitry associates a data structure containing the second data with the second key and adds the data structure into a data structure set whose elements are associated with second keys. | 05-05-2016 |
20160125023 | DERIVED TABLE JOIN PROCESSING - Systems and methods for processing tables for query operations referencing the tables are described. A method may include determining whether a table is referenced one time or more than one time in a query that includes at least one operation referencing the table. The method may further include creating a single materialized view of the table when the table is determined to be referenced more than one time in the query. The method may also include creating two or more hash tables based, at least in part, on the single materialized view of the table by creating a hash table for each operator in the query that references the table, and evaluating the query using the two or more hash tables. | 05-05-2016 |
20160132535 | ACCELERATION METHOD FOR DATABASE USING INDEX VALUE OPERATION AND MIXED-MODE LEVELED CACHE - The present invention provides an acceleration method for database using index value operation and mixed-mode leveled cache. While building a database, an algorithm is adopted for operating a plurality of field conditions and giving an index value. At least a file record in the database satisfying the plurality of field conditions is related to the index value. While querying, the input plurality of field conditions are operated using the algorithm, giving the index value. According to the index value, the file records in the database satisfying the plurality of field conditions are listed. Thereby, the time for comparing the plurality of fields can be saved. | 05-12-2016 |
20160147750 | Versioned Insert Only Hash Table for In-Memory Columnar Stores - At least one read operation is concurrently performed with at least one write operation that each insert a key/value pair into a backing array of a backing hash table of a hash table forming part of a columnar in-memory database. The backing array maps a plurality of pointers each to a respective bucket. Each bucket includes at least one state bit and a hashed value of a corresponding key. Thereafter, for each write operation, a first available position in the backing array at which a pointer to a new bucket containing the key/value pair can be inserted is iteratively determined (such that each first available position has no corresponding pre-existing pointer). Subsequently, for each write operation, the pointer to the new bucket containing the key/value pair is inserted at the corresponding first determined position in the backing array. Related apparatus, systems, techniques and articles are also described. | 05-26-2016 |
20160154853 | BATCHING TUPLES | 06-02-2016 |
20160162523 | MULTIDIMENSIONAL DATA STORAGE AND RETRIEVAL METHOD AND DEVICE FOR MONITORING SYSTEM - The storage method comprises: acquiring a plurality of monitoring dimensionalities associated with each monitoring service and at least one pair of monitoring indicator data including an indicator name and an indicator value; converting the plurality of monitoring dimensionalities into a plurality of row keyword mapping values having the same length, respectively; combining monitoring time with the plurality of row keyword mapping values to form a row keyword, and setting the monitoring time in a fixed time position in the row keyword; and using the row keyword as an index to store a monitoring indicator in a distributed database. A dynamic adjustment of monitoring dimensionalities of data storage for the monitoring system is achieved in accordance with different monitoring services, thereby reducing maintenance costs and improving scalability. | 06-09-2016 |
20160170987 | Efficient Reference Counting in Content Addressable Storage | 06-16-2016 |
20160171025 | METHOD AND APPARATUS FOR OBJECT STORAGE | 06-16-2016 |
20160179802 | METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR IMPROVED STORAGE OF KEY-VALUE PAIRS | 06-23-2016 |
20160188590 | SYSTEMS AND METHODS FOR NEWS EVENT ORGANIZATION - Generally discussed herein are systems, apparatuses, and methods for organizing and/or searching news events. In one or more embodiments, a method can include encoding a news event based on named entities, actors, and actions mentioned in the news event, calculating a locality sensitive hash (LSH) key on the news event encoding, comparing the calculated LSH key to a plurality of LSH keys of respective stories, wherein each story of the respective stories comprises one or more associated news events that include LSH keys that are within a specified distance from each other, and associating the news event with a story of the respective stories that includes an LSH key that has a smallest distance from the LSH key of the received news event and is less than the specified distance. | 06-30-2016 |
20160188748 | METHOD FOR PROVIDING INFORMATION TO DETERMINE A GRAPH ASSOCIATED WITH A DATA ITEM - In one embodiment, a method is proposed for providing information to determine a graph associated with a data item, said graph being representative of a history of said data item in a network, and said graph comprising a set of vertices and a set of edges, each vertex being associated with an electronic device of said network, and each edge linking at least two vertices being representative of a dataflow between said at least two vertices. The method is executed by an electronic device and is remarkable in that it comprises: | 06-30-2016 |
20160203135 | IN-MEMORY LATCH-FREE INDEX STRUCTURE | 07-14-2016 |
20160253322 | METHOD AND APPARATUS FOR DATA STORAGE AND RETRIEVAL | 09-01-2016 |
20160378750 | DATABASE VALUE IDENTIFIER HASH MAP - The subject matter disclosed herein provides methods for inserting and retrieving value identifiers from a dictionary encoded database using hash maps. A first value identifier and a first value can be accessed from a dictionary storing one or more value identifiers and one or more values. Each value identifier can correspond to a different value. The hash map and the first value can be used to determine a first index in a bucket list for inserting the first value identifier. The bucket list can have one or more indices. Each index can store at least one value identifier. The hash map can include a vector of one or more pointers. Each pointer can refer to at least one of the indices. Based on the determining, the first value identifier can be inserted at the first index without inserting the first value. Related apparatus, systems, techniques, and articles are also described. | 12-29-2016 |
20160378751 | FAST QUERY PROCESSING IN COLUMNAR DATABASES WITH GPUS - According to one exemplary embodiment, a method for processing a query associated with a database is provided. The method may include receiving the query. The method may include estimating a number of groups. The method may include copying a plurality of data from the database to graphics processing unit (GPU) memory. The method may include creating a hash table in GPU memory. The method may include determining if a group associated with the database is present in the hash table. The method may include adding the group to the hash table based on determining that the group is not present in the hash table. The method may include aggregating a value associated with the group in the hash table based on determining that the group is present in the hash table. The method may include determining a plurality of results. The method may then include retrieving the plurality of results. | 12-29-2016 |
20160378752 | Comparing Data Stores Using Hash Sums on Disparate Parallel Systems - Aspects described herein relate to methods and systems for comparing data stored in disparate parallel systems using hash sums. A database having a parallel system architecture may comprise a plurality of nodes each storing a plurality of records. A central node may initiate parallel calculation of a set of node hash sums for each individual node. Calculating a node hash sum for an individual node may comprise calculating, by the individual node, a set of hash values for each individual record of the plurality of records stored by the individual node and combining each hash value of the set of hash values to generate the node hash sum for the individual node. The central processor may combine each node hash sum to generate a database hash sum. The central processor may store the generated database hash sum and/or utilize it in comparisons with database hash sums for other databases. | 12-29-2016 |
20160378753 | DATASTORE FOR AGGREGATED MEASUREMENTS FOR METRICS - A computing resource monitoring service receives a request to store a measurement for a metric associated with a computing resource. The request includes the measurement itself and metadata for the measurement, which specifies attributes of the measurement. Based at least in part on the metadata, the computing resource monitoring service generates a fully qualified metric identifier and, using the identifier, selects a logical partition for placement of the measurement. From the logical partition, the computing resource monitoring service transmits the measurement to an aggregator sub-system comprising one or more in-memory datastores. The computing resource monitoring service stores the measurement in an in-memory datastore within the aggregator sub-system. | 12-29-2016 |
20160378754 | FAST QUERY PROCESSING IN COLUMNAR DATABASES WITH GPUS - According to one exemplary embodiment, a method for processing a query associated with a database is provided. The method may include receiving the query. The method may include estimating a number of groups. The method may include copying a plurality of data from the database to graphics processing unit (GPU) memory. The method may include creating a hash table in GPU memory. The method may include determining if a group associated with the database is present in the hash table. The method may include adding the group to the hash table based on determining that the group is not present in the hash table. The method may include aggregating a value associated with the group in the hash table based on determining that the group is present in the hash table. The method may include determining a plurality of results. The method may then include retrieving the plurality of results. | 12-29-2016 |
20160380942 | HIGHLY PARALLEL SCALABLE DISTRIBUTED EMAIL THREADING ALGORITHM - Systems, apparatuses, methods, and computer readable mediums for implementing a scalable distributed email threading algorithm. A database is created for storing a plurality of emails organized by subjects and relaxed checksums. Each node of a plurality of nodes retrieves a different subject for processing, and each node reconstructs an email discussion thread from a corresponding retrieved subject. A given node may merge incomplete threads which are related but which have different subjects. Then, the nodes may write the reconstructed threads back to the database. | 12-29-2016 |
20170235749 | File System | 08-17-2017 |
20190146962 | SYSTEMS AND METHODS FOR SNP ANALYSIS AND GENOME SEQUENCING | 05-16-2019 |
20190147110 | INTERNET OF THINGS SEARCH AND DISCOVERY GRAPH ENGINE CONSTRUCTION | 05-16-2019 |
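
As an illustration of the hash-sum comparison summarized in entry 20160378752 above, the following is a minimal sketch and not the method claimed in that application. The abstract does not say how per-record hash values are combined; this sketch assumes addition modulo 2^64 so that the result does not depend on record order or on how records are partitioned across nodes. The function names (`record_hash`, `node_hash_sum`, `database_hash_sum`) and the use of SHA-256 are hypothetical choices made for the example.

```python
import hashlib

# Assumption: hash sums are kept to 64 bits; the abstract gives no width.
MASK = (1 << 64) - 1


def record_hash(record: tuple) -> int:
    """Hash one record to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(repr(record).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_hash_sum(records) -> int:
    """Combine the record hashes of one node by addition modulo 2**64."""
    total = 0
    for record in records:
        total = (total + record_hash(record)) & MASK
    return total


def database_hash_sum(nodes) -> int:
    """Combine the per-node hash sums into a single database-level hash sum."""
    total = 0
    for records in nodes:
        total = (total + node_hash_sum(records)) & MASK
    return total


# Two stores holding the same records, partitioned differently across nodes.
db_a = [[(1, "alice"), (2, "bob")], [(3, "carol")]]
db_b = [[(3, "carol"), (1, "alice")], [(2, "bob")]]
# A store in which one record differs.
db_c = [[(1, "alice"), (2, "bob")], [(3, "carla")]]

same = database_hash_sum(db_a) == database_hash_sum(db_b)
different = database_hash_sum(db_a) != database_hash_sum(db_c)
```

Because addition is commutative and associative, each node can compute its sum independently and the combining step only needs one integer per node. Equal sums indicate matching contents only up to hash collisions, so a production comparison would need to account for that.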