Patent application number | Description | Published |
20080235698 | METHOD AND APPARATUS FOR ASSIGNING CANDIDATE PROCESSING NODES IN A STREAM-ORIENTED COMPUTER SYSTEM - A method of choosing jobs to run in a stream based distributed computer system includes determining jobs to be run in a distributed stream-oriented system by deciding a priority threshold above which jobs will be accepted, below which jobs will be rejected. Overall importance is maximized relative to the priority threshold based on importance values assigned to all jobs. System constraints are applied to ensure jobs meet set criteria. | 09-25-2008 |
20080271036 | METHOD AND APPARATUS FOR ASSIGNING FRACTIONAL PROCESSING NODES TO WORK IN A STREAM-ORIENTED COMPUTER SYSTEM - An apparatus and method for making fractional assignments of processing elements to processing nodes for stream-based applications in a distributed computer system includes determining an amount of processing power to give to each processing element. Based on a list of acceptable processing nodes, a determination of fractions of which processing nodes will work on each processing element is made. To update allocations of the amount of processing power and the fractions, the process is repeated. | 10-30-2008 |
20090119238 | METHOD AND SYSTEM FOR PREDICTING RESOURCE USAGE OF REUSABLE STREAM PROCESSING ELEMENTS - A method is provided for generating a resource function estimate of resource usage by an instance of a processing element configured to consume zero or more input data streams in a stream processing system having a set of available resources that comprises receiving at least one specified performance metric for the zero or more input data streams and a processing power of the set of available resources, wherein one specified performance metric is stream rate; generating a multi-part signature of executable-specific information for the processing element and a multi-part signature of context-specific information for the instance; accessing a database of resource functions to identify a static resource function corresponding to the executable-specific information and a context-dependent resource function corresponding to the context-specific information; combining the static resource function and the context-dependent resource function to form a composite resource function for the instance; and applying the resource function to the at least one specified performance metric and the processing power to generate the resource function estimate of the at least one specified performance metric for processing by the instance. | 05-07-2009 |
20090238178 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR IMPLEMENTING STREAM PROCESSING USING A RECONFIGURABLE OPTICAL SWITCH - A method, system, and computer program product for implementing stream processing are provided. The system includes an application framework and applications containing dataflow graphs managed by the application framework running on a first network. The system also includes at least one circuit switch in the first network having a configuration that is controlled by the application framework, a plurality of processing nodes interconnected by the first network over one of wireline and wireless links, and a second network for providing at least one of control and additional data transfer over the first network. The application framework reconfigures circuit switches in response to monitoring aspects of the applications and the first network. | 09-24-2009 |
20090241123 | METHOD, APPARATUS, AND COMPUTER PROGRAM PRODUCT FOR SCHEDULING WORK IN A STREAM-ORIENTED COMPUTER SYSTEM WITH CONFIGURABLE NETWORKS - A method, apparatus, and computer program product for scheduling stream-based applications in a distributed computer system with configurable networks are provided. The method includes choosing, at a highest temporal level, jobs that will run, an optimal template alternative for the jobs that will run, network topology, and candidate processing nodes for processing elements of the optimal template alternative for each running job to maximize importance of work performed by the system. The method further includes making, at a medium temporal level, fractional allocations and re-allocations of the candidate processing elements to the processing nodes in the system to react to changing importance of the work. The method also includes revising, at a lowest temporal level, the fractional allocations and re-allocations on a continual basis to react to burstiness of the work, and to differences between projected and real progress of the work. | 09-24-2009 |
20090300623 | METHODS AND SYSTEMS FOR ASSIGNING NON-CONTINUAL JOBS TO CANDIDATE PROCESSING NODES IN A STREAM-ORIENTED COMPUTER SYSTEM - A system and method for choosing non-continual jobs to run in a stream-based distributed computer system includes determining a total amount of resources to be consumed by non-continual jobs. A priority threshold is determined above which jobs will be accepted, below which jobs will be rejected. Overall penalties are minimized relative to the priority threshold based on estimated completion times of the jobs. System constraints are applied to ensure that jobs meet set criteria such that a plurality of non-continual jobs are scheduled which consider the system constraints and minimize overall penalties using available resources. | 12-03-2009 |
20100242042 | Method and apparatus for scheduling work in a stream-oriented computer system - An apparatus and method for scheduling stream-based applications in a distributed computer system includes a scheduler configured to schedule work using three temporal levels. Each temporal level includes a method. A macro method is configured to schedule jobs that will run, in a highest temporal level, in accordance with a plurality of operation constraints to optimize importance of work. A micro method is configured to fractionally allocate, at a medium temporal level, processing elements to processing nodes in the system to react to changing importance of the work. A nano method is configured to revise, at a lowest temporal level, fractional allocations on a continual basis. | 09-23-2010 |
20100325621 | PARTITIONING OPERATOR FLOW GRAPHS - Techniques for partitioning an operator flow graph are provided. The techniques include receiving source code for a stream processing application, wherein the source code comprises an operator flow graph, wherein the operator flow graph comprises a plurality of operators, receiving profiling data associated with the plurality of operators and one or more processing requirements of the operators, defining a candidate partition as a coalescing of one or more of the operators into one or more sets of processing elements (PEs), using the profiling data to create one or more candidate partitions of the processing elements, using the one or more candidate partitions to choose a desired partitioning of the operator flow graph, and compiling the source code into an executable code based on the desired partitioning. | 12-23-2010 |
20110061060 | Determining Operator Partitioning Constraint Feasibility - Techniques for determining feasibility of a set of one or more operator partitioning constraints are provided. The techniques include receiving one or more sets of operator partitioning constraints, wherein each set of one or more constraints defines one or more desired conditions for grouping together of operators into partitions and placing partitions on hosts, wherein each operator is embodied as software that performs a particular function, processing each set of one or more operator partitioning constraints to determine feasibility of each set of one or more operator partitioning constraints, creating and outputting one or more candidate partitions and one or more host placements for each set of feasible partitioning constraints, and creating and outputting a certificate of infeasibility for each set of infeasible partitioning constraints, wherein the certificate of infeasibility outlines one or more reasons for infeasibility. | 03-10-2011 |
20110246999 | METHOD AND APPARATUS FOR ASSIGNING CANDIDATE PROCESSING NODES IN A STREAM-ORIENTED COMPUTER SYSTEM - A method of choosing jobs to run in a stream based distributed computer system includes determining jobs to be run in a distributed stream-oriented system by deciding a priority threshold above which jobs will be accepted, below which jobs will be rejected. Overall importance is maximized relative to the priority threshold based on importance values assigned to all jobs. System constraints are applied to ensure jobs meet set criteria. | 10-06-2011 |
20120174110 | AMORTIZING COSTS OF SHARED SCANS - Techniques for scheduling a plurality of jobs sharing input are provided. The techniques include partitioning one or more input datasets into multiple subcomponents, analyzing a plurality of jobs to determine which of the plurality of jobs require scanning of one or more common subcomponents of the one or more input datasets, and scheduling a plurality of jobs that require scanning of one or more common subcomponents of the one or more input datasets, facilitating a single scanning of the one or more common subcomponents to be used as input by each of the plurality of jobs. | 07-05-2012 |
20120304186 | Scheduling Mapreduce Jobs in the Presence of Priority Classes - Techniques for scheduling one or more MapReduce jobs in a presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs. | 11-29-2012 |
20120304188 | Scheduling Flows in a Multi-Platform Cluster Environment - Techniques for scheduling multiple flows in a multi-platform cluster environment are provided. The techniques include partitioning a cluster into one or more platform containers associated with one or more platforms in the cluster, scheduling one or more flows in each of the one or more platform containers, wherein the one or more flows are created as one or more flow containers, scheduling one or more individual jobs into the one or more flow containers to create a moldable schedule of one or more jobs, flows and platforms, and automatically converting the moldable schedule into a malleable schedule. | 11-29-2012 |
20130031558 | Scheduling Mapreduce Jobs in the Presence of Priority Classes - Techniques for scheduling one or more MapReduce jobs in a presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs. | 01-31-2013 |
20130031561 | Scheduling Flows in a Multi-Platform Cluster Environment - Techniques for scheduling multiple flows in a multi-platform cluster environment are provided. The techniques include partitioning a cluster into one or more platform containers associated with one or more platforms in the cluster, scheduling one or more flows in each of the one or more platform containers, wherein the one or more flows are created as one or more flow containers, scheduling one or more individual jobs into the one or more flow containers to create a moldable schedule of one or more jobs, flows and platforms, and automatically converting the moldable schedule into a malleable schedule. | 01-31-2013 |
20130151536 | Vertex-Proximity Query Processing - A method, an apparatus and an article of manufacture for processing a random-walk based vertex-proximity query on a graph. The method includes computing at least one vertex cluster and corresponding meta-information from a graph, dynamically updating the clustering and corresponding meta-information upon modification of the graph, and identifying a vertex cluster relevant to at least one query vertex and aggregating corresponding meta-information of the cluster to process the query. | 06-13-2013 |
20130239100 | Partitioning Operator Flow Graphs - Techniques for partitioning an operator flow graph are provided. The techniques include receiving source code for a stream processing application, wherein the source code comprises an operator flow graph, wherein the operator flow graph comprises a plurality of operators, receiving profiling data associated with the plurality of operators and one or more processing requirements of the operators, defining a candidate partition as a coalescing of one or more of the operators into one or more sets of processing elements (PEs), using the profiling data to create one or more candidate partitions of the processing elements, using the one or more candidate partitions to choose a desired partitioning of the operator flow graph, and compiling the source code into an executable code based on the desired partitioning. | 09-12-2013 |