Patent application number | Description | Published |
20100250563 | PROFILING IN A MASSIVE PARALLEL PROCESSING ENVIRONMENT - A computer-implemented method of profiling a data set in a parallel processing environment includes vertically partitioning an initial data set. One or more attribute subsets are then profiled. A list of subjects is generated, each corresponding to a specific attribute value identified in the profiling. Values of multiple attributes are extracted for each identified subject, and the sample results are assembled and merged. | 09-30-2010 |
20100250648 | CLIENT-SERVER SYSTEMS AND METHODS FOR ACCESSING METADATA INFORMATION ACROSS A NETWORK USING PROXIES - Embodiments of the present invention include computer-implemented systems and methods for accessing metadata across a network. A metadata server receives requests to access a data source from one or more clients. The metadata server is coupled between one or more backend servers and the clients. The backend servers may be coupled to the data sources of interest. The metadata server provides a metadata service proxy for establishing communications with the backend servers and for signaling the backend servers to establish connections to data sources. Data sources may be stateful or stateless. For stateless data sources, the metadata server may dynamically create reusable metadata service provider proxies that receive metadata from metadata service providers on the backend servers. For stateful data sources, unique metadata service provider proxies may be dynamically created and used to service client requests. | 09-30-2010 |
20110145005 | METHOD AND SYSTEM FOR AUTOMATIC BUSINESS CONTENT DISCOVERY - A system and method for automatic business content discovery are described. In various embodiments, a system includes modules to bind business terms to data validation rules and search data sources for data matching data validation rules. In various embodiments, the system binds matching data to data validation rules. In various embodiments, a user interface is provided for creating and managing business terms and data validation rules. In various embodiments, a method for profiling and monitoring data via graphical controls is presented. | 06-16-2011 |
20120284223 | Dataset Previews for ETL Transforms - Disclosed is a user interface on a display for editing data transformations comprising an ETL process. A first display area presents a data representation of a data transformation. A second display area presents a view of input data, and a third display area presents a view of output data. User input to modify the data transformation is received. In response to receiving the user input, the third display area is updated with output data generated by applying the modified data transformation to the input data. | 11-08-2012 |
20130159385 | System and Method for Performing Centralized Common Tasks for a Set of Functions - Methods, systems, and apparatuses for processing function calls using hooking routines to perform pre-execution tasks are disclosed. A server computer receives a function call including a hooking routine and a request to run a particular function from a client computer. Based on a function group identifier and the credentials of the user or client computer, the server computer can perform various pre-execution tasks. The tasks can be common to all functions in the function group or be customized. Both types of pre-execution tasks can be stored remotely from the client computer, so that updates to the pre-execution tasks can be made without the need to update computer readable code stored on the client computer. If the pre-execution tasks, such as an authorization, pass, then the server computer can execute the requested function; otherwise, the server computer can reject or abort the requested function. | 06-20-2013 |
20130166515 | GENERATING VALIDATION RULES FOR A DATA REPORT BASED ON PROFILING THE DATA REPORT IN A DATA PROCESSING TOOL - In one embodiment, the method includes profiling a data file comprising one or more fields of data. The one or more fields of data contain an item of data; that is, a character, or group of characters that are related. Further, the method includes generating one or more profiling attributes based on profiling the data file. In an example, the one or more profiling attributes refer to profiling information relating to pattern, structure, content and format of data. Further, the method includes selecting at least one of the generated one or more profiling attributes and generating a validation rule based on the selected at least one profiling attribute. | 06-27-2013 |
20130238669 | Using Target Columns in Data Transformation - A data transform leverages a known hierarchy within a target data structure, in order to improve query and mapping capabilities and enhance performance. Where a target data structure is hierarchical, output data of that target data structure is often built in the document order of the nodes in the structure (from top down and from left to right). Hence, when the data for a child node in the target structure is being built, the data for the parent nodes of the child node has been built. Embodiments utilize this available portion of the target data in the form of target columns, to increase processing efficiency of the transformation process. Use of target columns according to embodiments may also allow powerful and concise expression of mapping logic in the transform, facilitating the use of functions such as selection (e.g. Where clauses), uniqueness (e.g. DISTINCT), ordering (Order By, Group By), and Aggregation. | 09-12-2013 |
20130262417 | Graphical Representation and Automatic Generation of Iteration Rule - Embodiments relate to graphical representation and/or automatic generation of an iteration rule in mapping design that is to integrate or transform one or more input data sets into another target data set. The input and output data sets can be flat or hierarchical in nature. In an embodiment, a graphical interface allows users to specify an iteration rule (e.g. JOIN operation in a relational database) in a tree-like structure (e.g. a JOIN tree). The interface allows users to visualize and implement complicated and powerful combinations of multiple data sets, including data sets exhibiting hierarchical structure. Drag-and-drop techniques may be employed to reduce the need for manual typing. Also disclosed are procedures for automatically generating an iteration rule based on the data mapping information, thereby reducing a need for manual mapping. | 10-03-2013 |
20130282740 | System and Method of Querying Data - A system and method of querying data. The method includes transforming first data according to a unified data model. The unified data model has a hierarchical structure with tree nodes and leaf nodes. A leaf node contains a table. The method further includes executing a unified data model query on the first data (having been transformed) to result in second data. The method further includes outputting the second data. | 10-24-2013 |
20140129965 | GUIDED ACTIVITY WITH USER'S DEFINED STEPS - A dynamic wizard having guided activity with user-defined steps is disclosed. A predefined guided procedure framework is presented to an end user. The predefined guided procedure framework comprises at least one step. The predefined guided procedure framework is modified in response to prompting from the end user. In some embodiments, modifying the predefined guided procedure framework comprises creating a user-defined step and inserting the user-defined step into the predefined guided procedure framework. | 05-08-2014 |
20140136593 | RETRY MECHANISM FOR DATA LOADING FROM ON-PREMISE DATASOURCE TO CLOUD - A method and system of retrying to load data from a data source to a cloud target system are disclosed. A client device sends a data packet to a cloud server via a communication connection. The data packet comprises data. The client device receives an indication of a failure in the communication connection. The client device configures, in response to receiving the indication of the failure in the communication connection, the data packet to prompt the cloud server to perform an upsert operation with the data in the data packet. The client device sends the configured data packet to the cloud server. The client device can wait a predetermined amount of time before sending the configured data packet to the cloud server. | 05-15-2014 |
20150058292 | RESUMING BIG DATA TRANSFORMATIONS - Systems and methods for resuming data transformations, such as broken or otherwise unsuccessful data transformations, are described. In some example embodiments, the systems and methods receive a message that indicates a broken data transformation of a data table between a source database and a destination database, identify a maximum value for a date attribute contained within an index column for all rows of the data table that were successfully loaded to the destination database during the data transformation, and select a group of rows of data of the data table stored in the source database by querying the source database to identify rows that include a value for the date attribute that is greater than the identified value. | 02-26-2015 |
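
The resume step described in the last entry (20150058292) can be illustrated with a minimal sketch. This is not the patented implementation; it only shows the general idea of finding the latest date already loaded into the destination and then selecting source rows with a later date. The `orders` table, its `id` and `loaded_on` columns, the use of in-memory SQLite, and all function names are assumptions made for illustration.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, loaded_on TEXT)")
    conn.executemany("INSERT INTO orders (id, loaded_on) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def rows_to_resume(source, destination):
    """Select source rows dated after the latest date found in the destination."""
    # Step 1: the maximum date among rows that already reached the destination.
    (watermark,) = destination.execute(
        "SELECT MAX(loaded_on) FROM orders"
    ).fetchone()
    if watermark is None:
        # Nothing was loaded before the failure, so every source row is pending.
        return source.execute(
            "SELECT id, loaded_on FROM orders ORDER BY loaded_on, id"
        ).fetchall()
    # Step 2: query the source for rows with a strictly greater date.
    # ISO-8601 date strings compare correctly as text.
    return source.execute(
        "SELECT id, loaded_on FROM orders WHERE loaded_on > ? "
        "ORDER BY loaded_on, id",
        (watermark,),
    ).fetchall()


def resume_transformation(source, destination):
    """Copy the pending rows to the destination and return how many were copied."""
    pending = rows_to_resume(source, destination)
    destination.executemany(
        "INSERT INTO orders (id, loaded_on) VALUES (?, ?)", pending
    )
    destination.commit()
    return len(pending)
```

In this sketch the comparison is strictly greater-than, so any row that shares the latest loaded date but was not yet copied would be skipped; a real system would need to account for that case, for example by using a finer-grained timestamp.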