Patent application number | Description | Published |
20100138388 | MAPPING INSTANCES OF A DATASET WITHIN A DATA MANAGEMENT SYSTEM - Mapping data stored in a data storage system for use by a computer system includes processing specifications of dataflow graphs that include nodes representing computations interconnected by links representing flows of data. At least one of the dataflow graphs receives a flow of data from at least one input dataset and at least one of the dataflow graphs provides a flow of data to at least one output dataset. A mapper identifies one or more sets of datasets. Each dataset in a given set matches one or more criteria for identifying different versions of a single dataset. A user interface is provided to receive a mapping between at least two datasets in a given set. The mapping received over the user interface is stored in association with a dataflow graph that provides data to or receives data from the datasets of the mapping. | 06-03-2010 |
20120102029 | MANAGING DATA SET OBJECTS - Managing data set objects for graph-based data processing includes: storing a group of one or more data set objects in a data storage system, the data set objects each representing a respective data set; and generating an association between at least a first data set object in the group and at least a first node of a dataflow graph for processing data in a data processing system, the first node representing a source or sink of data in a flow of data represented by a link in the dataflow graph, and the first data set object including a plurality of modes in which different transformational logic is applied to data processed by the first node. | 04-26-2012 |
20120185449 | MANAGING CHANGES TO COLLECTIONS OF DATA - Managing changes to a collection of records includes storing a first set of records in a data storage system, the first set of records representing a first version of the collection of records, and validating a proposed change to the collection of records specified by an input received over a user interface. The data storage system is queried based on validation criteria associated with the proposed change, and a first result is received in response to the querying. A second set of records is processed representing changes not yet applied to the collection of records to generate a second result. The first result is updated based on the second result to generate a third result. The third result is processed to determine whether the proposed change is valid according to the validation criteria. | 07-19-2012 |
20160062736 | SPECIFYING COMPONENTS IN GRAPH-BASED PROGRAMS - User input is received specifying components of a graph-based program specification. User input is received specifying links, at least some connecting an output port of an upstream component to an input port of a downstream component. The graph-based program specification is processed to identify one or more subsets of the components, including: identifying one or more subset entry points and one or more subset exit points that occur between components in different subsets based at least in part on data processing characteristics of linked components, and forming the subsets based on the identified subset entry points and exit points. A visual representation of the formed subsets is rendered within a user interface. Prepared code is generated for each formed subset that when used for execution by a runtime system causes processing tasks corresponding to the components in each formed subset to be performed. | 03-03-2016 |
20160062737 | SPECIFYING CONTROL AND DATA CONNECTIONS IN GRAPH-BASED PROGRAMS - A first component of a graph-based program specification includes an output control port. A second component includes an input control port and an input data port. A third component includes an output data port. The output control port is connected to the input control port, and the output data port is connected to the input data port. The first component includes control code that when executed causes the output control port to provide, to the input control port, at least one of suppression information or invocation information. The second component includes control code that when executed causes a computing system configured by the graph-based program specification to begin processing data received at the input data port in response to the invocation information if no suppression information is received at the input control port before the invocation information is received at the input control port. | 03-03-2016 |
20160062749 | EXECUTING GRAPH-BASED PROGRAM SPECIFICATIONS - A graph-based program specification includes components corresponding to tasks and directed links between ports of the components, including: a first type of link configuration between ports of linked components, corresponding to transfer of control or transfer of a single data element, and a second type of link configuration between ports of linked components, corresponding to transfer of multiple data elements. A compiler generates a target program specification including control code representing at least one control graph including graph nodes representing the components, where at least two are connected based on links of the first type. A computing node initiates execution of the target program specification, and manages computing resources for links of the second type, the computing resources including at least one of: (1) a buffer for storing data elements provided by an output port, or (2) a buffer for storing data elements provided to an input port. | 03-03-2016 |
20160062776 | EXECUTING GRAPH-BASED PROGRAM SPECIFICATIONS - A graph-based program specification includes components corresponding to tasks and directed links between ports of the components, including: a first type of link configuration defined by respective output and input ports of linked components, and a second type of link configuration defined by respective output and input ports of linked components. A compiler recognizes different types of link configurations and provides in a target program specification occurrences of a target primitive for executing a function for each occurrence of a data element flowing over a link of the second type. A computing node initiates execution of the target program specification, and determines at runtime, for components associated with the occurrences of the target primitive, an order in which instances of tasks corresponding to the components are to be invoked, and/or a computing node on which instances of tasks corresponding to the components are to be executed. | 03-03-2016 |
20160062800 | CONTROLLING DATA PROCESSING TASKS - Information representative of a graph-based program specification has a plurality of components, each of which corresponds to a task, and directed links between ports of said components. A program corresponding to said graph-based program specification is executed. A first component includes a first data port, a first control port, and a second control port. Said first data port is configured to receive data to be processed by a first task corresponding to said first component, or configured to provide data that was processed by said first task corresponding to said first component. Executing a program corresponding to said graph-based program specification includes: receiving said first control information at said first control port, in response to receiving said first control information, determining whether or not to invoke said first task, and after receiving said first control information, providing said second control information from said second control port. | 03-03-2016 |