Patent application number | Description | Published |
20090119238 | METHOD AND SYSTEM FOR PREDICTING RESOURCE USAGE OF REUSABLE STREAM PROCESSING ELEMENTS - A method is provided for generating a resource function estimate of resource usage by an instance of a processing element configured to consume zero or more input data streams in a stream processing system having a set of available resources that comprises receiving at least one specified performance metric for the zero or more input data streams and a processing power of the set of available resources, wherein one specified performance metric is stream rate; generating a multi-part signature of executable-specific information for the processing element and a multi-part signature of context-specific information for the instance; accessing a database of resource functions to identify a static resource function corresponding to the executable-specific information and a context-dependent resource function corresponding to the context-specific information; combining the static resource function and the context-dependent resource function to form a composite resource function for the instance; and applying the resource function to the at least one specified performance metric and the processing power to generate the resource function estimate of the at least one specified performance metric for processing by the instance. | 05-07-2009 |
20090300615 | METHOD FOR GENERATING A DISTRIBUTED STREAM PROCESSING APPLICATION - Techniques for generating a distributed stream processing application are provided. The techniques include obtaining a declarative description of one or more data stream processing tasks, wherein the declarative description expresses at least one stream processing task, and generating one or more execution units from the declarative description of one or more data stream processing tasks, wherein the one or more execution units are deployable across one or more distributed computing nodes, and comprise a distributed data stream processing application. | 12-03-2009 |
20090313614 | METHOD FOR HIGH-PERFORMANCE DATA STREAM PROCESSING - Techniques for optimizing data stream processing are provided. The techniques include employing a pattern, wherein the pattern facilitates splitting of one or more incoming streams and distributing processing across one or more operators, obtaining one or more operators, wherein the one or more operators support at least one group-independent aggregation and join operation on one or more streams, generating code, wherein the code facilitates mapping of the application onto a computational infrastructure to enable workload partitioning, using the one or more operators to decompose the application into one or more granular components, and using the code to reassemble the one or more granular components into one or more deployable blocks to map the application to a computational infrastructure, wherein reassembling the one or more granular components to map the application to the computational infrastructure optimizes data stream processing of the application. | 12-17-2009 |
20100094861 | SYSTEM AND METHOD FOR APPLICATION SESSION TRACKING - A system and method for application session tracking includes activating an application component for execution using an application session tracking facility (ASTF) and intercepting resource requests by the ASTF acting on behalf of this application. Resources are managed by allocating or releasing resources in accordance with resource usage profiles determined by system or application administrators of an application. The ASTF approach allows for controlling the usage of (potentially) distributed resources such as temporary space, database assets such as materialized views, directory services, shared memory segments among others during runtime and their return to their respective free pools and any necessary subsequent cleanup tasks upon a session termination. | 04-15-2010 |
20100100518 | Rules-Based Cross-FSM Transition Triggering - A method for cross-triggering transitions in independent finite state machines is provided. For a given plurality of finite state machine definitions having a plurality of states and a plurality of transitions among the states, two or more independent instances of the plurality of finite state machine definitions are created. In addition, associations between two or more of the independent finite state machine instances are identified. The method uses cross-triggering rules that identify a condition in a first one of the associated independent finite state machine instances that triggers a transition action in a second one of the associated independent finite state machine instances. Each cross-triggering rule is triggered upon an occurrence of the cross-triggering rule condition, and the transition action in the second associated independent finite state machine instance is performed in response to the triggering of the cross-triggering rule. | 04-22-2010 |
20100292980 | APPLICATION RESOURCE MODEL COMPOSITION FROM CONSTITUENT COMPONENTS - Techniques for composing an application resource model in a data stream processing system are disclosed. The application resource model may be used to understand what resources will be consumed by an application when executed by the data stream processing system. For example, a method for composing an application resource model for a data stream processing system comprises the following steps. One or more operator-level metrics are obtained from an execution of a data stream processing application in accordance with a first configuration. The application is executed by one or more nodes of the data stream processing system, and the application is comprised of one or more processing elements that are comprised of one or more operators. One or more operator-level resource functions are generated based on the obtained one or more operator-level metrics. A processing element-level resource function is generated based on the one or more generated operator-level resource functions. The processing element-level resource function represents an application resource model usable for predicting one or more characteristics of the application executed in accordance with a second configuration. | 11-18-2010 |
20100293283 | ON-DEMAND MARSHALLING AND DE-MARSHALLING OF NETWORK MESSAGES - One embodiment of a method for on-demand de-marshalling of a network message includes receiving a first network message at a receiver device, wherein the first network message was sent from a sender device, and wherein the first network message comprises one or more attributes that characterize an object at the sender device, and further wherein the first network message is in an encoded format according to a network protocol, storing the first network message in a buffer of the receiver device in the encoded format, identifying, at the receiver device and prior to de-marshalling of any of the attributes, that at least one of the attributes is to be manipulated by the receiver device, de-marshalling, in response to the identifying, the identified attribute(s), and manipulating the identified attribute(s) at the receiver device. | 11-18-2010 |
20100293301 | Dynamically Composing Data Stream Processing Applications - Techniques for dynamically modifying inter-connections between components in an application are provided. The techniques include receiving a data producer profile for each output port within a software application to be executed on one or more processors, receiving a data subscription profile for each input port of each component of the application, establishing connections between the output ports and the input ports of the components in the application based on a comparison of each data producer profile and each data subscription profile, executing the application on one or more processors to process streams of data, receiving either or both of a new data producer profile or a new data subscription profile during the execution of the application, and establishing at least one new connection between an output port and an input port based upon a revised comparison of the received data profiles that include the new data profile. | 11-18-2010 |
20100293532 | FAILURE RECOVERY FOR STREAM PROCESSING APPLICATIONS - In one embodiment, the invention is a method and apparatus for failure recovery for stream processing applications. One embodiment of a method for providing a failure recovery mechanism for a stream processing application includes receiving source code for the stream processing application, wherein the source code defines a fault tolerance policy for each of the components of the stream processing application, and wherein respective fault tolerance policies defined for at least two of the plurality of components are different, generating a sequence of instructions for converting the state(s) of the component(s) into a checkpoint file comprising a sequence of storable bits on a periodic basis, according to a frequency defined in the fault tolerance policy, initiating execution of the stream processing application, and storing the checkpoint file, during execution of the stream processing application, at a location that is accessible after failure recovery. | 11-18-2010 |
20100293533 | INCREMENTALLY CONSTRUCTING EXECUTABLE CODE FOR COMPONENT-BASED APPLICATIONS - One embodiment of a method for constructing executable code for a component-based application includes receiving a request to compile source code for the component-based application, wherein the request identifies the source code, and wherein the source code comprises a plurality of source code components, each of the source code components implementing a different component of the application, and performing a series of steps for each source code component where the series of steps includes: deriving a signature for the source code component, retrieving a stored signature corresponding to a currently available instance of executable code for the source code component, comparing the derived signature with the stored signature, compiling the source code component into the executable code when the derived signature does not match the stored signature, and obtaining the executable code for the source code component from a repository when the derived signature matches the stored signature. | 11-18-2010 |
20100293534 | USE OF VECTORIZATION INSTRUCTION SETS - In one embodiment, the invention is a method and apparatus for use of vectorization instruction sets. One embodiment of a method for generating vector instructions includes receiving source code written in a high-level programming language, wherein the source code includes at least one high-level instruction that performs multiple operations on a plurality of vector operands, and compiling the high-level instruction(s) into one or more low-level instructions, wherein the low-level instructions are in an instruction set of a specific computer architecture. | 11-18-2010 |
20100293535 | Profile-Driven Data Stream Processing - Techniques for compiling a data stream processing application are provided. The techniques include receiving, by a compiler executing on a computer system, source code for a data stream processing application, wherein the source code comprises source code for a plurality of operators, each of which performs a data processing function, determining, by the compiler, one or more characteristics of operators within the data stream processing application, grouping, by the compiler, the operators into one or more execution containers based on the one or more characteristics, and compiling, by the compiler, the source code for the data stream processing application into executable code, wherein the executable code comprises a plurality of execution units, wherein each execution unit contains one or more of the operators, wherein each operator is assigned to an execution unit based on the grouping, and wherein each execution unit is to be executed in a partition. | 11-18-2010 |
20100325621 | PARTITIONING OPERATOR FLOW GRAPHS - Techniques for partitioning an operator flow graph are provided. The techniques include receiving source code for a stream processing application, wherein the source code comprises an operator flow graph, wherein the operator flow graph comprises a plurality of operators, receiving profiling data associated with the plurality of operators and one or more processing requirements of the operators, defining a candidate partition as a coalescing of one or more of the operators into one or more sets of processing elements (PEs), using the profiling data to create one or more candidate partitions of the processing elements, using the one or more candidate partitions to choose a desired partitioning of the operator flow graph, and compiling the source code into an executable code based on the desired partitioning. | 12-23-2010 |
20110061060 | Determining Operator Partitioning Constraint Feasibility - Techniques for determining feasibility of a set of one or more operator partitioning constraints are provided. The techniques include receiving one or more sets of operator partitioning constraints, wherein each set of one or more constraints defines one or more desired conditions for grouping together of operators into partitions and placing partitions on hosts, wherein each operator is embodied as software that performs a particular function, processing each set of one or more operator partitioning constraints to determine feasibility of each set of one or more operator partitioning constraints, creating and outputting one or more candidate partitions and one or more host placements for each set of feasible partitioning constraints, and creating and outputting a certificate of infeasibility for each set of infeasible partitioning constraints, wherein the certificate of infeasibility outlines one or more reasons for infeasibility. | 03-10-2011 |
20110185339 | AUTOMATING THE CREATION OF AN APPLICATION PROVISIONING MODEL - An application provisioning model is automatically created. The model is created from a high-level application and specifies dependencies of the application. It is used to provision the application on one or more nodes or to perform other actions. | 07-28-2011 |
20110185346 | AUTOMATED BUILDING AND RETARGETING OF ARCHITECTURE-DEPENDENT ASSETS - Architecture-dependent assets are automatically built and retargeted. An asset originally built for one architecture is downloaded and automatically retargeted on another architecture. This automatic retargeting may be performed on demand, at runtime. | 07-28-2011 |
20110191759 | Interactive Capacity Planning - Techniques for performing capacity planning for applications running on a computational infrastructure are provided. The techniques include instrumenting an application under development to receive one or more performance metrics under a physical deployment plan, receiving the one or more performance metrics from the computational infrastructure hosting one or more applications that are currently running, using a predictive inference engine to determine how the application under development can be deployed, and using the determination to perform capacity planning for the applications on the computational infrastructure. | 08-04-2011 |
20110225584 | MANAGING MODEL BUILDING COMPONENTS OF DATA ANALYSIS APPLICATIONS - Data analysis applications include model building components and stream processing components. To increase utility of the data analysis application, in one embodiment, the model building component of the data analysis application is managed. Management includes resource allocation and/or configuration adaptation of the model building component, as examples. | 09-15-2011 |
20110239048 | PARTIAL FAULT TOLERANT STREAM PROCESSING APPLICATIONS - In one embodiment, the invention comprises partial fault tolerant stream processing applications. One embodiment of a method for implementing partial fault tolerance in a stream processing application comprising a plurality of stream operators includes: defining a quality score function that expresses how well the application is performing quantitatively, injecting a fault into at least one of the plurality of operators, assessing an impact of the fault on the quality score function, and selecting at least one partial fault-tolerant technique for implementation in the application based on the quantitative metric-driven assessment. | 09-29-2011 |
20110246972 | METHOD OF SELECTING AN EXPRESSION EVALUATION TECHNIQUE FOR DOMAIN-SPECIFIC LANGUAGE COMPILATION - A method and computer program product for selecting an expression evaluation technique for domain-specific language (DSL) compilation. An application written in DSL for a programming task is provided, the application including a plurality of components configured by expressions. A technique that most quickly implements the programming task is selected from a plurality of techniques for evaluating the expressions. The DSL application is compiled in accordance with the selected expression evaluation technique to generate general-purpose programming language (GPL) code. | 10-06-2011 |
20110247007 | OPERATORS WITH REQUEST-RESPONSE INTERFACES FOR DATA STREAM PROCESSING APPLICATIONS - Processing streaming data in a data processing system is facilitated by: declaring and defining, by a processor, a request-response interface as part of a stream processing operator defined using a stream processing language; processing a stream of data using the stream processing operator with the request-response interface defined as a part thereof; and communicating with the stream processing operator through the request-response interface via a communication path separate from the stream of data, the communicating accessing or controlling a state of the stream processing operator while the stream processing operator is processing the stream of data. | 10-06-2011 |
20110289301 | Tracing Flow of Data in a Distributed Computing Application - A method is provided for tracing dataflow in a distributed computing application. For example, the method includes incrementally advancing a dataflow in a dataflow path of one or more dataflow paths according to two or more directives encoded in two or more data messages. The method further includes performing the two or more directives. The dataflow path includes one or more operators including at least one merge operator operative to merge the two or more data messages and merge the two or more directives. One or more of the incrementally advancing of the dataflow and the performing of the two or more directives are implemented as instruction code performed on a processor device. | 11-24-2011 |
20110295939 | STATE SHARING IN A DISTRIBUTED DATA STREAM PROCESSING SYSTEM - State sharing is facilitated in stream processing environments, including distributed stream processing environments. A customized shared state implementation representing the state to be shared is automatically created based on at least one of user preferences, hints of usage, and system performance. | 12-01-2011 |
20120059839 | PROXYING OPEN DATABASE CONNECTIVITY (ODBC) CALLS - An Open Database Connectivity (ODBC) proxy infrastructure to transparently route incoming queries to one or more selected query engines. The ODBC proxy receives a query from an application, and determines based on the characteristics of the query and the capabilities of the query engines which one or more query engines are to perform the query. The proxy then routes the query to the one or more query engines, which perform the query. The results are then returned to the proxy, which provides the results to the application. | 03-08-2012 |
20120117423 | FAULT TOLERANCE IN DISTRIBUTED SYSTEMS - Fault tolerance is provided in a distributed system. The complexity of replicas and rollback requests is avoided; instead, a local failure in a component of a distributed system is tolerated. The local failure is tolerated by storing state related to a requested operation on the component, persisting that stored state in a data store, such as a relational database, asynchronously processing the operation request, and if a failure occurs, restarting the component using the stored state from the data store. | 05-10-2012 |
20120259910 | Generating a Distributed Stream Processing Application - Techniques for generating a distributed stream processing application are provided. The techniques include obtaining a declarative description of one or more data stream processing tasks from a graph of operators, wherein the declarative description expresses at least one stream processing task, generating one or more containers that encompass a combination of one or more stream processing operators, and generating one or more execution units from the declarative description of one or more data stream processing tasks, wherein the one or more execution units are deployable across one or more distributed computing nodes, and comprise a distributed data stream processing application binary. | 10-11-2012 |
20120297391 | APPLICATION RESOURCE MODEL COMPOSITION FROM CONSTITUENT COMPONENTS - Techniques for composing an application resource model are disclosed. The techniques include obtaining operator-level metrics from an execution of a data stream processing application according to a first configuration, wherein the application is executed by nodes of the data stream processing system and the application includes processing elements comprised of multiple operators, wherein two or more of the operators are combined in a first combination to form a processing element according to the first configuration, generating operator-level resource functions from the first combination of operators based on the obtained operator-level metrics, and generating a processing element-level resource function using the generated operator-level resource functions to predict a model for the processing element formed by a second combination of operators, the processing element-level resource function representing an application resource model usable for predicting characteristics of the application executed according to a second configuration. | 11-22-2012 |
20130018943 | DATA SHARING IN A DISTRIBUTED DATA STREAM PROCESSING SYSTEM - State sharing is facilitated in stream processing environments, including distributed stream processing environments. A customized shared state implementation representing the state to be shared is automatically created based on at least one of user preferences, hints of usage, and system performance. | 01-17-2013 |
20130238936 | PARTIAL FAULT TOLERANT STREAM PROCESSING APPLICATIONS - In one embodiment, the invention comprises partial fault tolerant stream processing applications. One embodiment of a method for implementing partial fault tolerance in a stream processing application comprising a plurality of stream operators includes: defining a quality score function that expresses how well the application is performing quantitatively, injecting a fault into at least one of the plurality of operators, assessing an impact of the fault on the quality score function, and selecting at least one partial fault-tolerant technique for implementation in the application based on the quantitative metric-driven assessment. | 09-12-2013 |
20130239100 | Partitioning Operator Flow Graphs - Techniques for partitioning an operator flow graph are provided. The techniques include receiving source code for a stream processing application, wherein the source code comprises an operator flow graph, wherein the operator flow graph comprises a plurality of operators, receiving profiling data associated with the plurality of operators and one or more processing requirements of the operators, defining a candidate partition as a coalescing of one or more of the operators into one or more sets of processing elements (PEs), using the profiling data to create one or more candidate partitions of the processing elements, using the one or more candidate partitions to choose a desired partitioning of the operator flow graph, and compiling the source code into an executable code based on the desired partitioning. | 09-12-2013 |
20140033173 | Generating Layouts for Graphs of Data Flow Applications - An embodiment of the invention provides a method of displaying a data flow, wherein a description of a data flow application to be displayed is received. The data flow application includes nodes and edges connecting the nodes, wherein the nodes represent operators and the edges represent data connections for data flowing between the operators. A reason that a user is to view the data flow and/or a user constraint on a complexity of the data flow application to be displayed is determined with a processor, and the time required to render a display of the data flow application is estimated. A transformed representation of the data flow application is created with the processor. The transformed representation is created based upon the user reason, the user constraint, the estimated time of rendering, and/or a layout strategy. The transformed representation is displayed on a graphical user interface. | 01-30-2014 |
20140215484 | MANAGING MODEL BUILDING COMPONENTS OF DATA ANALYSIS APPLICATIONS - Data analysis applications include model building components and stream processing components. To increase utility of the data analysis application, in one embodiment, the model building component of the data analysis application is managed. Management includes resource allocation and/or configuration adaptation of the model building component, as examples. | 07-31-2014 |