Ab Initio Technology LLC Patent applications
Patent application number | Title | Published |
20160125197 | Database Security - A method includes automatically determining a component of a security label for each first record in a first table of a database having multiple tables, including: identifying a second record related to the first record according to a foreign key relationship; identifying a component of the security label for the second record; and assigning a value for the component of the security label for the first record based on the identified component of the security label for the second record. The method includes storing the determined security label in the record. | 05-05-2016 |
20160070733 | CONDITIONAL VALIDATION RULES - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating conditional validation rules. One of the methods includes rendering a plurality of cells arranged in a two-dimensional grid having a first axis and a second axis, the two-dimensional grid including one or more subsets of the cells, each subset associated with a respective field of an element of the dataset, and multiple subsets of the cells extending in a direction along the second axis of the two-dimensional grid, one or more of the multiple subsets associated with a respective validation rule. The method includes applying one or more validation rules to an element of the dataset based on user input received from at least some of the cells. A condition cell associated with a field includes an input element for receiving input. | 03-10-2016 |
20150347193 | WORKLOAD AUTOMATION AND DATA LINEAGE ANALYSIS - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformations, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information. | 12-03-2015 |
20150302075 | PROCESSING DATA FROM MULTIPLE SOURCES - In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data. | 10-22-2015 |
20150301861 | INTEGRATED MONITORING AND CONTROL OF PROCESSING ENVIRONMENT - A method of managing components in a processing environment is provided. The method includes monitoring (i) a status of each of one or more computing devices, (ii) a status of each of one or more applications, each application hosted by at least one of the computing devices, and (iii) a status of each of one or more jobs, each job associated with at least one of the applications; determining that one of the status of one of the computing devices, the status of one of the applications, and the status of one of the jobs is indicative of a performance issue associated with the corresponding computing device, application, or job, the determination being made based on a comparison of a performance of the computing device, application, or job and at least one predetermined criterion; and enabling an action to be performed associated with the performance issue. | 10-22-2015 |
20150261796 | SPECIFYING AND APPLYING LOGICAL VALIDATION RULES TO DATA - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for specifying logical rules. One of the methods includes defining a logical rule, the logical rule applying operations based on a term. The method includes defining a mapping between fields and terms, the mapping including a mapping between a field and the term. The method includes storing the logical rule in association with the term. The method also includes applying the logical rule to data identified by the first field where respective fields are assigned to respective terms. | 09-17-2015 |
20150242093 | COMPOUND CONTROLS - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for specifying a compound control. One of the methods includes identifying a first application. The method includes displaying a canvas. The method includes displaying, in the canvas, a first display object associated with the first application. The method includes identifying a second application, the second application being a computer executable program. The method includes displaying, in the user interface, a second display object associated with a second application. The method includes, in response to a user action that associates the second display object with the first display object, configuring the first application to invoke the second application. The method includes creating a third display object that includes the first set of selector objects and the second set of selector objects. | 08-27-2015 |
20150212891 | RESTARTING PROCESSES - Techniques are disclosed that include a computer-implemented method, including storing information related to an initial state of a process upon being initialized, wherein execution of the process includes executing at least one execution phase and upon completion of the executing of the execution phase storing information representative of an end state of the execution phase; aborting execution of the process in response to a predetermined event; and resuming execution of the process from one of the saved initial and end states without needing to shut down the process. | 07-30-2015 |
20150212796 | SORTING - Systems and techniques are disclosed that include, in one aspect, a computer-implemented method that includes storing a received stream of data elements in a buffer, applying a boundary condition to the data elements stored in the buffer after receiving each individual data element of the stream of data elements, and producing one or more data elements from the buffer based on the boundary condition as an output stream of data elements sorted according to a predetermined order. | 07-30-2015 |
20150106818 | DYNAMICALLY LOADING GRAPH-BASED COMPUTATIONS - Processing data includes: receiving units of work that each include one or more work elements, and processing a first unit of work using a first compiled dataflow graph ( | 04-16-2015 |
20150106341 | DATA PROFILING - Processing data includes profiling data from a data source, including reading the data from the data source, computing summary data characterizing the data while reading the data, and storing profile information that is based on the summary data. The data is then processed from the data source. This processing includes accessing the stored profile information and processing the data according to the accessed profile information. | 04-16-2015 |
20150066862 | MANAGING AN ARCHIVE FOR APPROXIMATE STRING MATCHING - In one aspect, in general, a method is described for managing an archive for determining approximate matches associated with strings occurring in records. The method includes: processing records to determine a set of string representations that correspond to strings occurring in the records; generating, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and storing entries in the archive that each represent a potential approximate match between at least two strings based on their respective close representations. | 03-05-2015 |
20140344508 | MANAGING MEMORY AND STORAGE SPACE FOR A DATA OPERATION - Processing a plurality of data units to generate result information includes: performing a data operation for each data unit of a first subset of data units from the plurality of data units, and storing information associated with a result of the data operation in a first set of one or more data structures stored in working memory space of a memory device; after an overflow condition on the working memory space is satisfied, storing information in overflow storage space of a storage device; and repeating an overflow processing procedure multiple times during the processing of the plurality of data units, the overflow processing procedure including: updating a new set of one or more data structures stored in the working memory space using at least some information stored in the overflow storage space. | 11-20-2014 |
20140282418 | RECORDING PROGRAM EXECUTION - Among other things, a method includes, at a computer system on which one or more computer programs are executing, receiving a specification defining types of state information, receiving an indication that an event associated with at least one of the computer programs has occurred, the event associated with execution of a function of the computer program, collecting state information describing the state of the execution of the computer program when the event occurred, generating an entry corresponding to the event, the entry including elements of the collected state information, the elements of state information formatted according to the specification, and storing the entry. The log can be parsed to generate a visualization of computer program execution. | 09-18-2014 |
20140273930 | AUDITING OF DATA PROCESSING APPLICATIONS - A method includes determining a first quantity of data records of a group of data records from a stream of data records received by an application having a plurality of modules. The method includes, for one or more of the modules of the application, determining a respective second quantity of data records output by the module during processing of the group of data records. The method includes determining whether the first and second quantities of data records satisfy a rule. The rule is indicative of a target relationship among a quantity of data records received by the application and a quantity of data records output by one or more modules of the application. | 09-18-2014 |
20140258652 | MANAGING OPERATIONS ON STORED DATA UNITS - A system for managing storage of data units includes a data storage system configured to store multiple data blocks, at least some of the data blocks containing multiple data units, with at least a group of the data blocks being stored contiguously, thereby supporting a first read operation that retrieves data units from at least two adjacent data blocks in the group. The system is configured to perform one or more operations with respect to data units, the operations including a delete operation that replaces a first data block containing a data unit to be deleted with a second data block that does not contain the deleted data unit, with the second data block having the same size as the first data block. | 09-11-2014 |
20140258651 | MANAGING OPERATIONS ON STORED DATA UNITS - A system for managing storage of data units includes a data storage system configured to store multiple data blocks, at least some of the data blocks containing multiple data units, and configured to store, for at least some of the data blocks, corresponding historical information about prior removal of one or more data units from that data block, the removal affecting at least some addresses of data units in that data block. The system is configured to perform at least one operation that accesses at least a first data unit stored in a first data block according to address information interpreted based on any stored historical information corresponding to the first data block. | 09-11-2014 |
20140222752 | DATA RECORDS SELECTION - A computer-implemented method includes accessing a plurality of data records, each data record having a plurality of data fields. The method further includes analyzing values for one or more of the data fields for at least some of the plurality of data records and generating a profile of the plurality of data records based on the analyzing. The method further includes formulating at least one subsetting rule based on the profile; and selecting a subset of data records from the plurality of data records based on the at least one subsetting rule. | 08-07-2014 |
20140189653 | CONFIGURABLE TESTING OF COMPUTER PROGRAMS - Configurable testing of a computer program includes: storing a set of one or more testing specifications, and attribute information defining one or more attributes of a recognizable portion of the computer program; and processing, using at least one processor, the computer program according to at least a first testing specification associated with the computer program. The processing includes: traversing a representation of the computer program that includes elements that represent recognizable portions of the computer program, and while traversing the representation, recognizing recognizable portions of the computer program, and storing values of attributes, defined by the attribute information, of the recognized portions of the computer program. | 07-03-2014 |
20140164495 | MANAGING OBJECTS USING A CLIENT-SERVER BRIDGE - A method for supporting communication between a client and a server includes receiving a first message from a client. The method also includes creating an object in response to the first message. The method also includes sending a response to the first message to the client. The method also includes receiving changes to the object from a server. The method also includes storing the changes to the object. The method also includes receiving a second message from the client. The method also includes sending the stored changes to the client with a response to the second message. | 06-12-2014 |
20140143760 | DYNAMIC GRAPH PERFORMANCE MONITORING - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving multiple units of work that each include one or more work elements. The method includes determining a characteristic of the first unit of work. The method includes identifying, by a component of the first dataflow graph, a second dataflow graph from multiple available dataflow graphs based on the determined characteristic, the multiple available dataflow graphs being stored in a data storage system. The method includes processing the first unit of work using the second dataflow graph. The method includes determining one or more performance metrics associated with the processing. | 05-22-2014 |
20140143757 | DYNAMIC COMPONENT PERFORMANCE MONITORING - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing. | 05-22-2014 |
20140114968 | PROFILING DATA WITH LOCATION INFORMATION - Profiling data includes processing an accessed collection of records, including: generating, for a first set of distinct values appearing in a first set of one or more fields, corresponding location information; generating, for the first set of fields, a corresponding list of entries identifying a distinct value from the first set of distinct values and the location information for the distinct value; generating, for a second set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from a second set of distinct values appearing in the second set of fields; and generating result information, based at least in part on: locating at least one record of the collection using the location information for at least one value appearing in the first set of fields, and determining at least one value appearing in the second set of fields of the located record. | 04-24-2014 |
20140053159 | FAULT TOLERANT BATCH PROCESSING - Among other aspects disclosed are a method and system for processing a batch of input data in a fault tolerant manner. The method includes reading a batch of input data including a plurality of records from one or more data sources and passing the batch through a dataflow graph. The dataflow graph includes two or more nodes representing components connected by links representing flows of data between the components. At least one but fewer than all of the components includes a checkpoint process for an action performed for each of multiple units of work associated with one or more of the records. The checkpoint process includes opening a checkpoint buffer stored in non-volatile memory at the start of processing for the batch. | 02-20-2014 |
20130318062 | MAPPING DATASET ELEMENTS - Among other things, one aspect includes receiving one or more mapped relationships between a given output and one or more inputs represented by input variables, at least one of the mapped relationships including a transformational expression, the transformational expression defining an output of a mapped relationship based on at least one input variable mapped to an element of an input dataset; receiving identification of elements of an output dataset mapped to outputs of respective mapped relationships; generating output data according to the transformational expression based on input data from the input dataset associated with the element of the input dataset mapped to the input variable; determining validation information in response to the generated output data based on validation criteria defining one or more characteristics of valid values associated with one or more of the identified elements of the output dataset; and presenting visual feedback based on the determined validation information. | 11-28-2013 |
20130007584 | Editing and Compiling Business Rules - A component in a graph-based computation having data processing components connected by linking elements representing data flows is updated by receiving a rule specification, generating a transform for transforming data based on the rule specification, associating the transform with a component in the graph-based computation, and in response to determining that a new rule specification has been received or an existing rule specification has been edited, updating the transform associated with the component in the graph-based computation according to the new or edited rule specification. A computation is tested by receiving a rule specification including a set of rule cases, receiving a set of test cases, each test case containing a value for one or more of the potential inputs, and for each test case, identifying one of the rule cases that will generate an output given the input values of the test case. | 01-03-2013 |
20120311588 | FAULT TOLERANT BATCH PROCESSING - Among other aspects disclosed are a method and system for processing a batch of input data in a fault tolerant manner. The method includes reading a batch of input data including a plurality of records from one or more data sources and passing the batch through a dataflow graph. The dataflow graph includes two or more nodes representing components connected by links representing flows of data between the components. At least one but fewer than all of the components includes a checkpoint process for an action performed for each of multiple units of work associated with one or more of the records. The checkpoint process includes opening a checkpoint buffer stored in non-volatile memory at the start of processing for the batch. | 12-06-2012 |
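
As a rough illustration of the checkpointing idea in the FAULT TOLERANT BATCH PROCESSING entries, the sketch below opens a checkpoint buffer on disk at the start of a batch and records the result of an action for each unit of work. The abstract only states that a checkpoint buffer is opened in non-volatile memory at the start of processing; reusing saved results after a restart and deleting the buffer once the batch completes are assumptions made for this example. The names `process_batch`, `action`, and `checkpoint_path`, and the use of a JSON-lines file as the buffer, are hypothetical and not taken from the applications.

```python
import json
import os


def process_batch(units, action, checkpoint_path):
    """Run `action` once per unit of work, reusing results saved by an earlier,
    interrupted run of the same batch (assumed behavior, see lead-in)."""
    saved = {}
    # Load results recorded before a previous failure, if any.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            for line in f:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # ignore a partially written final line
                saved[entry["id"]] = entry["result"]

    results = []
    # Open the checkpoint buffer at the start of processing for the batch.
    with open(checkpoint_path, "a") as buf:
        for unit_id, payload in units:
            key = str(unit_id)
            if key in saved:
                # Result already checkpointed: do not repeat the action.
                result = saved[key]
            else:
                result = action(payload)
                buf.write(json.dumps({"id": key, "result": result}) + "\n")
                buf.flush()
                os.fsync(buf.fileno())  # force the entry to stable storage
            results.append(result)

    # The whole batch succeeded, so the buffer is no longer needed.
    os.remove(checkpoint_path)
    return results
```

If `action` raises partway through a batch, the checkpoint file is left in place, and calling `process_batch` again with the same units and path performs the action only for the units that have no saved result.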