Patent application number | Description | Published |
20140089294 | SQL GENERATION FOR ASSERT, UPDATE AND DELETE RELATIONAL TREES - Presented is a system and method for evaluating relational database queries in a distributed system. An optimized query plan is received by a control node. The query plan is decoded to a SQL statement that is semantically equivalent to the query plan, even though the query plan contains elements that have no direct analogue in SQL. The decoded SQL is transmitted to a compute node for execution. | 03-27-2014 |
20140114950 | FORMULATING GLOBAL STATISTICS FOR DISTRIBUTED DATABASES - The present invention extends to methods, systems, and computer program products for formulating global statistics for parallel databases. In general, embodiments of the invention merge (combine) information in multiple compute node level histograms to create a global histogram for a table that is distributed across a number of compute nodes. Merging can include aligning histogram step boundaries across the compute node histograms. Merging can include aggregating histogram step-level information, such as, for example, equality rows and average range rows (or alternately equality rows, range rows, and distinct range rows), across the compute node histograms into a single global step. Merging can account for distinct values that do not appear at one or more compute nodes as well as distinct values that are counted at multiple compute nodes. A resulting global histogram can be coalesced to reduce the step count. | 04-24-2014 |
20140114952 | OPTIMIZING QUERIES OF PARALLEL DATABASES - The present invention extends to methods, systems, and computer program products for optimizing queries of parallel databases. Queries can be partially optimized at an optimizer that is unaware of its use to optimize queries for parallel processing. The optimizer can produce a data structure (e.g., a SQL Server MEMO) that encapsulates a logical serial plan search space. The logical serial plan search space may not incorporate any notion of parallelism into the plan space itself. A parallel-aware optimizer can parallelize the logical serial plan search space by augmenting the data structure (e.g., transforming the SQL Server MEMO into a parallel MEMO). Augmentation can be with data movement operations that move data associated with one or more compute nodes in a distributed architecture. Cost estimates can be calculated for the operations contained in the parallelized data structure. The parallel plan with the lowest estimated cost can be selected for the query. | 04-24-2014 |
20140137237 | SINGLE SYSTEM IMAGE VIA SHELL DATABASE - A single system image is provided for a parallel data warehouse system by exposing a shell database within a database management system comprising metadata and statistics regarding externally stored data. Further, functionality of the database management system can be exploited to perform pre-execution tasks. In one instance, one or more execution plans can be generated by the database management system for an input command and subsequently employed to generate a distributed execution plan. | 05-15-2014 |
20140164353 | OPTIMIZING PARALLEL QUERIES USING INTERESTING DISTRIBUTIONS - The present invention extends to methods, systems, and computer program products for optimizing parallel queries using interesting distributions. For each logical operator in a SQL Server MEMO, in a top-down manner from a root operator to the leaf operators, interesting distributions for the operators can be identified based on the properties of the operators. Identified interesting distributions can be propagated down to lower operators by annotating the lower operators with the interesting distributions. Thus, a SQL Server MEMO can be annotated with interesting distributions propagated top-down from root to leaf logical operators to generate an annotated SQL Server MEMO. Parallel query plans can then be generated from the annotated SQL Server MEMO in a bottom-up manner from leaf operators to a root operator. Annotated interesting properties can be used to prune operators, thereby facilitating a more tractable search space for a parallel query plan. | 06-12-2014 |
20140379692 | SKEW-AWARE STORAGE AND QUERY EXECUTION ON DISTRIBUTED DATABASE SYSTEMS - Distributing rows of data in a distributed table distributed across a plurality of nodes. A method includes identifying skewed rows of a first table to be distributed in a distributed database system. The skewed rows include a common data value in a column such that the skewed rows are skewed, according to a predetermined skew factor, with respect to other rows in the first table not having the common data value. Non-skewed rows of the first table that are not skewed according to the skew factor are identified. The skewed rows of the first table are distributed across nodes in a non-deterministic fashion. The non-skewed rows of the first table are distributed across nodes in a deterministic fashion. The rows of the first table distributed across the nodes, whether distributed in a deterministic fashion or non-deterministic fashion, are stored in a single table at each of the nodes. | 12-25-2014 |
20150379083 | CUSTOM QUERY EXECUTION ENGINE - A custom query execution engine can be generated that captures a query. More particularly, the custom query execution engine can be generated based on a combination of a query and an execution engine. Subsequent to generation, a custom query execution engine can be submitted to a system configured to execute the custom query execution engine and evaluate the query over a data store. | 12-31-2015 |
20160078090 | OPTIMIZING PARALLEL QUERIES USING INTERESTING DISTRIBUTIONS - The present invention extends to methods, systems, and computer program products for optimizing parallel queries using interesting distributions. For each logical operator in a SQL Server MEMO, in a top-down manner from a root operator to the leaf operators, interesting distributions for the operators can be identified based on the properties of the operators. Identified interesting distributions can be propagated down to lower operators by annotating the lower operators with the interesting distributions. Thus, a SQL Server MEMO can be annotated with interesting distributions propagated top-down from root to leaf logical operators to generate an annotated SQL Server MEMO. Parallel query plans can then be generated from the annotated SQL Server MEMO in a bottom-up manner from leaf operators to a root operator. Annotated interesting properties can be used to prune operators, thereby facilitating a more tractable search space for a parallel query plan. | 03-17-2016 |
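
As an illustration of the row placement described in application 20140379692 (skew-aware storage), the following is a minimal sketch and not the claimed implementation. It assumes a value counts as skewed when it accounts for more than a `skew_factor` fraction of all rows, uses round-robin as a stand-in for the non-deterministic placement of skewed rows, and uses a CRC32 hash for the deterministic placement of the remaining rows. The function name, parameters, and the skew test are all hypothetical.

```python
import zlib
from collections import Counter
from itertools import count


def distribute_rows(rows, key, num_nodes, skew_factor=0.5):
    """Return one list of rows per node (a single table per node)."""
    counts = Counter(row[key] for row in rows)

    # Assumed skew test: a value is skewed if it appears in more than
    # skew_factor * total rows. The abstract only says "a predetermined
    # skew factor"; this threshold is an illustrative choice.
    threshold = skew_factor * len(rows)
    skewed_values = {v for v, c in counts.items() if c > threshold}

    nodes = [[] for _ in range(num_nodes)]
    round_robin = count()

    for row in rows:
        value = row[key]
        if value in skewed_values:
            # Placement independent of the column value: round-robin here
            # stands in for the non-deterministic distribution.
            target = next(round_robin) % num_nodes
        else:
            # Deterministic placement: the same value always maps to the
            # same node. CRC32 is stable across Python processes.
            target = zlib.crc32(repr(value).encode("utf-8")) % num_nodes
        nodes[target].append(row)
    return nodes


rows = (
    [{"k": "hot", "i": i} for i in range(8)]
    + [{"k": "a", "i": 8}, {"k": "a", "i": 9}, {"k": "b", "i": 10}]
)
nodes = distribute_rows(rows, key="k", num_nodes=4, skew_factor=0.5)
```

In this example the eight `"hot"` rows are spread evenly over the four nodes, while both `"a"` rows land on a single node, so a query that joins or groups on a non-skewed value can still rely on co-location.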