SAS INSTITUTE INC. Patent applications |
Patent application number | Title | Published |
20160117205 | TECHNIQUES TO COMPUTE ATTRIBUTE RELATIONSHIPS UTILIZING A LEVELING OPERATION IN A COMPUTING ENVIRONMENT - Various embodiments include a system having interfaces, storage devices, memory, and processing circuitry. The system may be coupled with one or more storage devices and may receive episode information for a patient from a storage device via one or more wired or wireless links, the episode information includes a plurality of episodes associated with the patient, each of the plurality of episodes is a specific instance of a medical condition. The system may generate a candidate episode pairs list comprising a plurality of candidate episode pairs. Embodiments may also include the system generating a transition list comprising episode pairs from the plurality of candidate episode pairs in the candidate episode pairs list and determining attribute relationships between the plurality of episodes for the patient based on episode pairs in the transition list, the attribute relationships used to attribute items between the plurality of episodes. | 04-28-2016 |
20160077833 | AUTOMATED DECOMPOSITION FOR MIXED INTEGER LINEAR PROGRAMS WITH EMBEDDED NETWORKS REQUIRING MINIMAL SYNTAX - Embodiments include techniques to receive computer-executable query instructions to solve a MILP problem, the query instructions including a first expression conveying an objective function and side constraint that define a master problem of the MILP problem, a second expression conveying a mapping of graph data to a graph, and a third expression conveying a selection of a graph-based algorithm to solve a subproblem of the MILP problem; a subproblem component to replace the third expression with a fourth expression during decomposition of the MILP problem, the fourth expression including instructions to implement the graph-based algorithm to solve the subproblem; and an execution control component to perform iterations of solving the MILP problem that include executing the first expression to derive a solution to the master problem; and executing the fourth expression to derive a solution to the subproblem based on the mapping and the master problem solution. | 03-17-2016 |
20160070778 | TECHNIQUES FOR DYNAMIC PARTITIONING IN A DISTRIBUTED PARALLEL COMPUTATIONAL ENVIRONMENT - An apparatus includes an organization component to retrieve from task instructions an indication of a type of organization of data set subportions prior to performance of a computation and a data item by which the data set subportions are to be organized, organize the data set subportion among others based on the data item and type of organization, monitor availability of a first processing resource and a first storage resource of a node device employed to organize the data set subportions, and based on insufficient availability of at least one of the first processing resource or the first storage resource, interrupt the organization of the data set subportions, and dispatch a first set of one or more organized data set subportions to be processed; and a performance component to execute the task instructions to process the organized data set subportion. | 03-10-2016 |
20160044092 | DYNAMIC ASSIGNMENT OF TRANSFERS OF BLOCKS OF DATA - A computer-program causing a computing device to transmit a command to a data storage cluster for multiple data transfer threads thereof to request assignment of a data transfer from a distribution thread; await receipt of a request for assignment from a data transfer thread; compare the quantity of data transfer threads to the quantity of computation threads of a data processing cluster; assign to the data transfer thread an exchange of a block of data with a single computation thread in response to receipt of the request and to the multitude of data transfer threads comprising a greater quantity of threads than the multitude of computation threads; and assign to the data transfer thread exchanges of multiple blocks of data with multiple computation threads in response to receipt of the request and to the multitude of data transfer threads comprising a lesser quantity of threads than the multitude of computation threads. | 02-11-2016 |
20160041901 | DYNAMIC ASSIGNMENT OF TRANSFERS OF BLOCKS OF DATA - A computer-program causing a computing device to transmit, from a data transfer thread of a multitude of data transfer threads executed within a data storage cluster and to a distribution thread at a network address on a network, a request for an assignment of an exchange of data with at least one computation thread of a multitude of computation threads executed within a data processing cluster; exchange a block of data with a single computation thread of the multitude of computation threads in response to receipt of an assignment to exchange the block of data with the single computation thread; and exchange multiple blocks of data with multiple computation threads of the multitude of computation threads in a round robin manner among the multiple computation threads in response to receipt of an assignment to exchange the multiple blocks of data with the multiple computation threads. | 02-11-2016 |
20150347149 | AUTOMATED DECOMPOSITION FOR MIXED INTEGER LINEAR PROGRAMS WITH EMBEDDED NETWORKS REQUIRING MINIMAL SYNTAX - An apparatus includes a communications component to receive computer-executable query instructions to solve a MILP problem, the query instructions including a first expression conveying an objective function and side constraint that define a master problem of the MILP problem, a second expression conveying a mapping of graph data to a graph, and a third expression conveying a selection of a graph-based algorithm to solve a subproblem of the MILP problem; a subproblem component to replace the third expression with a fourth expression during decomposition of the MILP problem, the fourth expression including instructions to implement the graph-based algorithm to solve the subproblem; and an execution control component to perform iterations of solving the MILP problem that include executing the first expression to derive a solution to the master problem; and executing the fourth expression to derive a solution to the subproblem based on the mapping and the master problem solution. | 12-03-2015 |
20150331932 | TECHNIQUES FOR GENERATING A CLUSTERED REPRESENTATION OF A NETWORK BASED ON NODE DATA - An apparatus includes a communications component to receive a specified variable and one or more specified criteria to select a final clustered representation of a network, the specified criteria including a maximum degree of loss of information for the specified variable for the final clustered representation; and an iterative collapse component to perform iteration(s) of deriving the final clustered representation. Each iteration includes calculating the degree of loss from each possible combination of two linked nodes of a current clustered representation to generate a next clustered representation; selecting the possible combination associated with a smallest degree of loss; determining whether to cease iterations based on whether the smallest degree associated with the selected combination exceeds the maximum degree; effecting the selected combination if the smallest degree doesn't exceed the maximum degree; and selecting the current clustered representation as the final clustered representation if the smallest degree exceeds the maximum degree. | 11-19-2015 |
20150324328 | TECHNIQUES TO PROVIDE SIGNIFICANCE FOR STATISTICAL TESTS - Techniques to provide significance for statistical tests are described. An apparatus may comprise a data handler component to receive a real data set from a client device, the real data set to comprise data representing at least one measurable phenomenon, a statistical test component to receive a computational representation arranged to generate an approximate probability distribution for statistics of a statistical test based on a parameter vector, the statistics of the statistical test to follow a probability distribution, generate statistics for the statistical test using the real data set, generate the approximate probability distribution of the computational representation, and a significance generator component to generate a set of statistical significance values for the statistics through interpolation using the approximate probability distribution, the set of statistical significance values comprising one or more p-values. Other embodiments are described and claimed. | 11-12-2015 |
20150324327 | TECHNIQUES TO PERFORM INTERPOLATION FOR STATISTICAL TESTS - Techniques to perform interpolation for statistical tests are described. An apparatus may comprise processor circuitry and a simulated data component for execution by the processor circuitry to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution. The apparatus may further comprise a statistic simulator component for execution by the processor circuitry to simulate statistics for the parameter vectors from the simulated data, each parameter vector represented with a single point in a grid of points. The apparatus may further comprise a code generator component for execution by the processor circuitry to remove selective points from the grid of points to form a subset of points, and generate interpolation code to interpolate a statistic of the statistical test on any point. Other embodiments are described and claimed. | 11-12-2015 |
20150324326 | TECHNIQUES TO PERFORM CURVE FITTING FOR STATISTICAL TESTS - Techniques to perform curve fitting for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, the statistical test based on parameter vectors to follow a probability distribution. The apparatus may further comprise a statistic simulator component to simulate statistics for the parameter vectors from the simulated data, each parameter vector represented with a single point in a grid of points, calculate quantiles for the parameter vectors from the simulated data, and fit an estimated cumulative distribution function (CDF) curve to quantiles for each point in the grid of points using a monotonic cubic spline interpolation technique in combination with a transform to satisfy a defined level of precision. Other embodiments are described and claimed. | 11-12-2015 |
20150324325 | TECHNIQUES TO PERFORM DATA REDUCTION FOR STATISTICAL TESTS - Techniques to perform data reduction for statistical tests are described. An apparatus may comprise an evaluation component to receive a computational representation arranged to generate an approximate probability distribution for statistics of a statistical test, the computational representation to include a simulated data structure with information for estimated cumulative distribution function (CDF) curves for one or more parameter vectors of the statistical test, each parameter vector represented with a single point in a grid of points, the evaluation component to evaluate the simulated data structure to determine whether any points in the grid of points are removable from the simulated data structure with a target level of precision, and a data reduction generator to reduce the simulated data structure in accordance with the evaluation to produce a reduced simulated data structure having a smaller data storage size relative to the simulated data structure. Other embodiments are described and claimed. | 11-12-2015 |
20150324221 | TECHNIQUES TO MANAGE VIRTUAL CLASSES FOR STATISTICAL TESTS - Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed. | 11-12-2015 |
20150301860 | TECHNIQUES FOR GENERATING INSTRUCTIONS TO CONTROL DATABASE PROCESSING - An apparatus includes a task selector to receive an indication of a database task to be performed, wherein the database task includes a set of subtasks; a source selector to receive an indication of a source device to perform the set of subtasks, and to retrieve from the source device an indication of a processing environment currently available within the source device that includes an identity and version level of a database routine of the source device; and an instruction generator to determine a set of languages able to be interpreted by the database routine based on the identity and version level, select a language of the set of languages in which to generate instructions for each subtask based on the processing environment, and generate and transmit the instructions to the source device. | 10-22-2015 |
20150278220 | ONLINE REPUTATION IMPACTED INFORMATION SYSTEMS - Various embodiments may be generally directed to techniques and an apparatus including a network interface, processing circuitry coupled with the network interface, and one or more modules operable on the processing circuitry to receive historical rate information and historical reputation information for one or more products, generate a plurality of rate indices from the historical rate information for one or more products, each of the rate indices associated with a different lead time, and determine a rate index from the plurality of rate indices associated with an optimal lead time based on a maximum correlation between the rate index and a reputation index, the reputation index based on the historical reputation information for the one or more products. In addition, a multiple linear regression model comprising one or more parameters may be generated using the rate index, the reputation index, and one or more indicator values, the multiple linear regression model may be used to determine a reputation impacted rate for a product. | 10-01-2015 |
20150261573 | TECHNIQUES FOR GENERATING INSTRUCTIONS TO CONTROL DATABASE PROCESSING - An apparatus includes a task selector to receive an indication of a database task to be performed, wherein the database task includes first and second subtasks; a source selector to receive an indication of a source device to perform the first and second subtasks, and to retrieve from the source device an indication of a processing environment currently available within the source device that includes an identity and version level of a database routine of the specified source device; and an instruction generator to determine a set of languages able to be interpreted by the database routine based on the identity and version level, determine whether to perform the first and second subtasks in parallel based on the processing environment, select a language in which to generate instructions to perform the first subtask based on the determination, and generate and transmit the instructions to the source device. | 09-17-2015 |
20150234956 | TECHNIQUES FOR COMPRESSING A LARGE DISTRIBUTED EMPIRICAL SAMPLE OF A COMPOUND PROBABILITY DISTRIBUTION INTO AN APPROXIMATE PARAMETRIC DISTRIBUTION WITH SCALABLE PARALLEL PROCESSING - Techniques for estimated compound probability distribution are described. An apparatus may comprise a configuration component, perturbation component, sample generation controller, an aggregation component, a distribution fitting component, and statistics generation component. The configuration component may be operative to receive a compound model specification and candidate distribution definition. The perturbation component may be operative to generate a plurality of models from the compound model specification. The sample generation controller may be operative to initiate the generation of a plurality of compound model samples from each of the plurality of models. The distribution fitting component may generate parameter values for the candidate distribution definition based on the compound model samples. The statistics generation component may generate approximated aggregate statistics. Other embodiments are described and claimed. | 08-20-2015 |
20150234955 | TECHNIQUES FOR ESTIMATING COMPOUND PROBABILITY DISTRIBUTION BY SIMULATING LARGE EMPIRICAL SAMPLES WITH SCALABLE PARALLEL AND DISTRIBUTED PROCESSING - Techniques for estimated compound probability distribution are described. An apparatus may comprise a configuration component, perturbation component, sample generation controller, an aggregation component, a distribution fitting component, and statistics generation component. The configuration component may be operative to receive a compound model specification and candidate distribution definition. The perturbation component may be operative to generate a plurality of models from the compound model specification. The sample generation controller may be operative to initiate the generation of a plurality of compound model samples from each of the plurality of models. The distribution fitting component may generate parameter values for the candidate distribution definition based on the compound model samples. The statistics generation component may generate approximated aggregate statistics. Other embodiments are described and claimed. | 08-20-2015 |
20150222485 | DYNAMIC SERVER CONFIGURATION AND INITIALIZATION - A computer-program product causing a processor component to identify requisite configuration information for executing an application routine by the processor component, wherein the requisite configuration information enables the application routine to communicate with another computing device via a network during execution of the application routine; generate an API that specifies the requisite configuration information to a resource device via the network when the resource device accesses the API via the network; operate a network interface that couples the processor component to the network to make the API accessible to the resource device via the network to enable the resource device to provide the requisite configuration information; and relay the requisite configuration information following provision of the requisite configuration information by the resource device to the application routine for use during execution of the application routine and to a start component to trigger a start of execution of the application routine. | 08-06-2015 |
20150200807 | DYNAMIC SERVER TO SERVER CONFIGURATION AND INITIALIZATION - An apparatus includes a discovery component to identify a first application routine within a storage for execution, to identify execution of a remote application routine within a node device as a first requisite for execution of the first application routine from a first application requisites data, and to provide indications of storage of the first application routine and the first requisite to a control master; a start component to restart an earlier started execution of the first application routine in response to receipt of an indication that execution of the remote application routine within the node device has been restarted in accordance with a catalog received from the control master, the catalog including indications of the first and remote application routines and an indication of the first requisite; and a status component to provide an indication to the control master of the restart of execution of the first application routine. | 07-16-2015 |
20150200805 | DYNAMIC SERVER TO SERVER CONFIGURATION AND INITIALIZATION - An apparatus includes a discovery component to identify a first application routine within a storage for execution, to identify execution of a remote application routine within a node device as a first requisite for execution of the first application routine from a first application requisites data, and to provide indications of storage of the first application routine and the first requisite to a control master; a start component to start execution of the first application routine in response to receipt of an indication that execution of the remote application routine within the node device has started in accordance with a catalog received from the control master, the catalog including indications of the first and remote application routines and an indication of the first requisite; and a status component to provide an indication to the control master of the start of execution of the first application routine. | 07-16-2015 |
20150120669 | TECHNIQUES FOR CREATING A BOOTABLE IMAGE IN A CLOUD-BASED COMPUTING ENVIRONMENT - Various embodiments are generally directed to an apparatus, method and other techniques for receiving a request to generate a bootable image in a cloud-based computing environment, creating a block storage volume in the cloud-based computing environment in response to receiving the request, the block storage volume having one or more partitions. Further, an apparatus, method and so forth may include installing software comprising one or more files in a file system on the block storage volume in the cloud-based computing environment, creating a snapshot of the file system including the software in the cloud-based computing environment, and creating a bootable image from the snapshot of the file system in the cloud-based computing environment. | 04-30-2015 |
20150120647 | LINKED NETWORK SCORING UPDATE - A method of updating a score in a network of linked nodes is provided. Scoring information including a node identifier and a score value for a node in a network of nodes is received. The score value is determined using an analytic model and a parameter value. An anchored network record for which the node is an anchor is identified using the node identifier. A node record for the node is identified in the identified anchored network record. A network score value is computed based on the score value. The identified node record is updated with the score value and the computed network score value. A next anchored network record that includes the node is identified using the node identifier. A second node record for the node is identified in the identified next anchored network record. The second node record is updated based on the updated, identified node record. | 04-30-2015 |
20150117262 | LINK ADDITION TO A NETWORK OF LINKED NODES - A method of adding a link to a network of linked nodes is provided. Scoring information is received. The scoring information includes a first node identifier, a second node identifier, and a link value. The link value is determined using an analytic model. A first anchored network record for which a first node associated with the first node identifier is an anchor is identified. A first link record is added to the identified first anchored network record using the first node identifier, the second node identifier, and the link value. A first node record associated with the second node identifier is added to the identified first anchored network record. A node record is identified for the first node in the identified anchored network record. A network score value included in the identified node record is computed based on the link value. The identified node record is updated with the computed network score value. | 04-30-2015 |
20150117254 | OBJECT STORE CREATION - A method of creating an object store is provided. Node table information and link table information are read. The node table information includes node information for a plurality of nodes. The link table information includes link information between pairs of nodes of the plurality of nodes. An anchored network record is created for each node of the plurality of nodes based on the node information and the link information and a defined maximum degree of separation. The anchored network record includes anchor node information associated with an anchor node of the anchored network record and a node record for each node of the plurality of nodes that is within the defined maximum degree of separation from the anchor node of the anchored network record. The created anchored network record is stored for each node of the plurality of nodes. | 04-30-2015 |
20150106663 | HASH LABELING OF LOGGING MESSAGES - Systems and methods for labeling text with alphanumeric identifiers are included. A logging string that includes a block of output text may be determined during program code execution. A computing device may generate a first alphanumeric identifier for the logging string using a hashing algorithm. The computing device may remove a portion of the logging string to determine a modified string. The computing device may generate a second alphanumeric identifier for the modified string using the hashing algorithm. The first alphanumeric identifier and the second alphanumeric identifier are presented with the logging string. | 04-16-2015 |
20150077428 | VECTOR GRAPH GRAPHICAL OBJECT - Various embodiments are generally directed to techniques for increasing the amount of information conveyed per graphical object in graphical presentations of data. A non-transitory machine-readable storage medium includes instructions, that when executed, cause a computing device to determine a major range of values occurring during a major period, the major period including a shorter minor period; and generate a vector graph including a graphical object and an axis indicating a scale. The graphical object may include a major period line parallel to the axis and indicating the major range; and a minor period arrow overlying and pointing in a direction parallel to the length of the major period line, the point and base of the minor period arrow overlying the major period line at locations indicating values at an end and at a start, respectively, of the minor period. Other embodiments are described and claimed. | 03-19-2015 |
20150029213 | VISUALIZING HIGH-CARDINALLY DATA - A method of visualizing high-cardinally data is provided. A graph is presented on a display. The graph includes a first axis, a second axis, and a plurality of value markers. The first axis includes a minimum value and a maximum value and the second axis includes a plurality of category values. A selection indicator identifying selection of a first value marker of the plurality of value markers is received. The first value marker indicates a value for a category value of the plurality of category values. A second plurality of category values is determined based on the category value. The graph and a second graph are presented on the display. The second graph includes a third axis, a fourth axis, and a second plurality of value markers. The third axis includes a second minimum value and a second maximum value. | 01-29-2015 |
20150019554 | NUMBER OF CLUSTERS ESTIMATION - A method of determining a number of clusters for a dataset is provided. Centroid locations for a defined number of clusters are determined using a clustering algorithm. Boundaries for each of the defined clusters are defined. A reference distribution that includes a plurality of data points is created. The plurality of data points are within the defined boundary of at least one cluster of the defined clusters. Second centroid locations for the defined number of clusters are determined using the clustering algorithm and the reference distribution. A gap statistic for the defined number of clusters based on a comparison between a first residual sum of squares and a second residual sum of squares is computed. The processing is repeated for a next number of clusters to create. An estimated best number of clusters for the received data is determined by comparing the gap statistic computed for each iteration of the number of clusters. | 01-15-2015 |
20140372090 | INCREMENTAL RESPONSE MODELING - A method of selecting a one-class support vector machine (SVM) model for incremental response modeling is provided. Exposure group data generated from first responses by an exposure group receiving a request to respond is received. Control group data generated from second responses by a control group not receiving the request to respond is received. A response is either positive or negative. A one-class SVM model is defined using the positive responses in the control group data and an upper bound parameter value. The defined one-class SVM model is executed with the identified positive responses from the exposure group data. An error value is determined based on execution of the defined one-class SVM model. A final one-class SVM model is selected by validating the defined one-class SVM model using the determined error value. | 12-18-2014 |
20140365198 | TECHNIQUES TO SIMULATE PRODUCTION EVENTS - Techniques to simulate production events are described. Some embodiments are particularly directed to techniques to simulate production events based on randomization across a distribution of production events. In one embodiment, for example, an apparatus may comprise a simulation application operative to simulate one or more commands in a simulated environment using a task hierarchy, the simulation application comprising a configuration component, a command generation component, and an execution component, wherein simulating the one or more commands comprises executing one or more task commands. The configuration component may be operative to receive the task hierarchy from the data store, the task hierarchy comprising a plurality of task entries, each task entry comprising a list of task entries or a task command, the list of task entries comprising probabilities associated with each task entry in the list of task entries, wherein task commands correspond to the simulated environment representing a production environment. The command generation component may be operative to determine the one or more task commands by traversing through the task hierarchy based on the associated probabilities until the one or more task commands are reached. The execution component may be operative to execute the one or more task commands as one or more simulated commands in the simulated environment. Other embodiments are described and claimed. | 12-11-2014 |
20140351196 | METHODS AND SYSTEMS FOR USING CLUSTERING FOR SPLITTING TREE NODES IN CLASSIFICATION DECISION TREES - Systems and methods for determining an optimal splitting scheme for a node in a classification decision tree. A computing system may receive input data related to a decision tree to be generated from a data set. The input data identifies a target attribute of the data set and a set of candidate attributes of the data set to be used as nodes in the decision tree. The computing system may determine, using a clustering algorithm and the set of candidate attributes, a number of potential splitting schemes to be used to split a node in the decision tree. The computing system may calculate a splitting measurement for each of the plurality of potential splitting schemes. The computing system may select an optimal splitting scheme from the plurality of potential splitting schemes for each node in the decision tree based on the splitting measurement. | 11-27-2014 |
20140330884 | SYSTEMS AND METHODS INVOLVING A MULTI-PASS ALGORITHM FOR HIGH CARDINALITY DATA - This disclosure describes methods, systems, computer-readable media, and apparatuses for calculating a summary statistic. Calculating the summary statistic can be performed by identifying multiple subsets of a set of variable observations and assigning the subsets to grid-computing devices such that no two of the subsets are assigned to a same one of the grid-computing devices. A parallel processing operation that involves multiple processing phases at each of the grid-computing devices is then coordinated. The parallel processing operation includes each of the grid-computing devices inventorying the respectively assigned subset and generating inventory information representative of the respectively assigned subset. Subsequently, the inventory information generated by the grid-computing devices is received, and a summary statistic is determined by synthesizing the received inventory information. | 11-06-2014 |
20140330827 | METHODS AND SYSTEMS TO OPERATE ON GROUP-BY SETS WITH HIGH CARDINALITY - This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set. | 11-06-2014 |
20140330826 | METHODS AND SYSTEMS FOR DATA REDUCTION IN CLUSTER ANALYSIS IN DISTRIBUTED DATA ENVIRONMENTS - Systems and methods for data reduction of a data set are included. A computing system may group data points in a data set into a number of data point bubbles represented by a number of representative points. A data point bubble may include one or more data points from the data set and a representative point from the data set. The computing system may calculate a cluster assignment for the representative point by executing a clustering algorithm using the number of representative points. | 11-06-2014 |
20140330536 | TECHNIQUES TO SIMULATE STATISTICAL TESTS - Techniques to simulate statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, where statistics of the statistical test are based on parameter vectors to follow a probability distribution, a statistic simulator component to generate statistics for the parameter vectors from the simulated data, each parameter vector represented with a single point in a grid of points, the statistic simulation component to distribute portions of the simulated data or simulated statistics across multiple nodes of a distributed computing system in accordance with a column-wise or column-wise-by-group distribution algorithm, and a code generator component to create a computational representation arranged to generate an approximate probability distribution for each point in the grid of points from the simulated statistics, the approximate probability distribution to comprise an empirical cumulative distribution function (CDF). Other embodiments are described and claimed. | 11-06-2014 |
20140330441 | TECHNIQUES TO DETERMINE SETTINGS FOR AN ELECTRICAL DISTRIBUTION NETWORK - Techniques to determine settings for an electrical distribution network are described. Some embodiments are particularly directed to techniques to determine settings for an electrical distribution network using power flow heuristics. In one embodiment, for example, an apparatus may comprise a model reception component, a forecast component, and an optimization component. The model reception component may be operative to receive a model of an electrical distribution network having multiple capacitor banks and multiple voltage regulators, each of the multiple capacitor banks represented in the model by a model capacitor bank, each of the multiple voltage regulators represented in the model by a model voltage regulator, the electrical distribution network having a radial layout in which power flows from a source to multiple nodes in which each node is associated with one voltage regulator. The forecast reception component may be operative to receive a forecast for demand on the electrical distribution network. The optimization component may be operative to receive the model capacitor banks and model voltage regulators and determine one or more settings for the multiple capacitor banks and multiple voltage regulators that allow for providing power within predetermined limits while reducing power loss as compared to a power loss of the existing settings or reducing power usage as compared to a power usage of the existing settings, the one or more settings for the multiple voltage regulators determined according to a heuristic in which potential settings are iteratively determined for each of the model voltage regulators based on a least squares model of load flow analysis. Other embodiments are described and claimed. | 11-06-2014 |
20140324762 | COMPUTATION OF RECEIVER OPERATING CHARACTERISTIC CURVES - A method of determining a false and/or a true positive rate is provided. A true count value and a false count value are initialized for probability bins. For a plurality of records, a truth of event occurrence and a probability of occurrence are read; a probability bin that includes the probability of occurrence is determined; the true count value of the determined probability bin is incremented when the truth of event occurrence indicates true; and the false count value of the determined probability bin is incremented when the truth of event occurrence indicates false. A true positive rate and a false positive rate are computed for each probability bin based on the true count value, the false count value, a determined total number of true event occurrences, and a determined total number of false event occurrences. | 10-30-2014 |
20140324738 | VISUALIZING HIGH CARDINALITY CATEGORICAL DATA - A computer-program causing a computing device to perform an association measurement between a target variable and each non-target variable of a data set; select non-target variables for inclusion in a visualization based on the degree of association; perform correspondence analysis between target values of the target variable and non-target values of each selected non-target variable; order target value markers within a target row based on the degrees of closeness; order non-target value markers within each non-target row based on the degrees of closeness; determine a width of each target value marker based on a frequency of occurrence of its target value in the data set; determine a width of each non-target value marker based on a frequency of occurrence of its non-target value in the data set; and cause generation of the visualization with connection markers emanating from the target value markers and extending among the non-target value markers. | 10-30-2014 |
20140297997 | AUTOMATED COOPERATIVE CONCURRENCY WITH MINIMAL SYNTAX - Various embodiments are generally directed to techniques for reducing syntax requirements in application code to cause concurrent execution of multiple iterations of at least a portion of a loop thereof to reduce overall execution time in solving a large scale problem. At least one non-transitory machine-readable storage medium includes instructions that when executed by a computing device, cause the computing device to parse an application code to identify a loop instruction indicative of an instruction block that includes instructions that define a loop of which multiple iterations are capable of concurrent execution, the instructions including at least one call instruction to an executable routine capable of concurrent execution; and insert at least one coordinating instruction into an instruction sub-block of the instruction block to cause sequential execution of instructions of the instruction sub-block across the multiple iterations based on identification of the loop instruction. Other embodiments are described and claimed. | 10-02-2014 |
20140282856 | RULE OPTIMIZATION FOR CLASSIFICATION AND DETECTION - This disclosure describes methods, systems, and computer-program products for determining classification rules to use within a fraud detection system. The classification rules are determined by accessing distributional data representing a distribution of historical transactional events over a multivariate observational sample space defined with respect to multiple transactional variables. Each of the transactional events is represented by data with respect to each of the variables, and the distributional data is organized with respect to multi-dimensional subspaces of the sample space. A classification rule that references at least one of the subspaces is accessed, and the rule is modified using local optimization applied using the distributional data. A pending transaction is classified based on the modified classification rule and the transactional data. | 09-18-2014 |
20140282246 | LIST FOR TOUCH SCREENS - Various embodiments are generally directed to techniques for increasing the accuracy with which list items may be selected on a touch screen. A machine-readable storage medium includes instructions that when executed cause a computing device to present a list of multiple list items on a touch screen, each associated with a touch area and including a wide area marking a location of the touch area and a narrow area narrower than the wide area, the wide and narrow areas defining a presentation area wherein the wide areas of adjacent first and second list items are positioned at different first and second widthwise positions, respectively, and wherein the touch areas of the first and second list items coincide with the wide areas of the first and second list items, respectively. Other embodiments are described and claimed. | 09-18-2014 |
20140282152 | LIST WITH TARGETS FOR TOUCH SCREENS - Various embodiments are generally directed to techniques for increasing the accuracy with which list items may be selected on a touch screen. A machine-readable storage medium includes instructions that when executed cause a computing device to present a list of multiple list items on a touch screen, each associated with a touch area and including a presentation area and a visible target marking a location of the touch area, wherein the targets and coinciding touch areas of adjacent first and second list items are positioned at different first and second widthwise positions of the presentation areas of the first and second list items, respectively, and wherein the touch areas of each of the first and second list items coincide with a portion of the presentation area of the other of the first and second list items. Other embodiments are described and claimed. | 09-18-2014 |
20140280986 | DELIVERY ACKNOWLEDGMENT IN EVENT STREAM PROCESSING - A method of acknowledging receipt of an event block object is provided. First connection information for connecting to an event stream processing (ESP) engine executing at a first computing device is received. A first connection to the ESP engine is established using the received first connection information. Second connection information for connecting to a publishing client executing at a second computing device is received. A second connection to the publishing client is established using the received second connection information, wherein the first connection differs from the second connection. An event block object is received from the ESP engine using the established first connection, wherein the event block object includes a unique identifier for the event block object. Successful processing of the event block object is determined. Responsive to the successful processing determination, an acknowledgment message including the unique identifier is sent to the publishing client using the established second connection. | 09-18-2014 |
20140280343 | SIMILARITY DETERMINATION BETWEEN ANONYMIZED DATA ITEMS - A method of determining a similarity between records in a data set is provided. Data organized into a plurality of records is received. First characters associated with a field and a first record of the plurality of records are selected. The selected first characters are encoded and subdivided into a first sliding series of a defined number of characters. Second characters associated with the field and a second record of the plurality of records are selected. The selected second characters are encoded and subdivided into a second sliding series of the defined number of characters. Whether or not the first sliding series and the second sliding series are similar is determined by comparing the encoded and subdivided first characters to the encoded and subdivided second characters using a fuzzy matching algorithm. | 09-18-2014 |
20140280331 | FILTERING OF A SHARED, DISTRIBUTED CUBE - A method of performing a query on a cube of data is provided. An access key associated with a user is created at a computing device. The access key defines the user's access to a cube of data distributed onto a plurality of computing devices with each computing device of the plurality of computing devices storing a different portion of the cube of data. A plurality of access masks is stored in association with the portion of the cube of data stored on the computing device. A process space associated with the user is created. A query on the cube of data is received by the computing device. The query is associated with the user. The query is processed while masking the created access key with the stored plurality of access masks, wherein the masking controls access to the stored portion of the cube of data. A result of the processed query is sent to a requesting computing device. | 09-18-2014 |
20140280330 | PERTURBATION OF A SHARED, DISTRIBUTED CUBE - A method of performing a query on a cube of data is provided. A cube of data is distributed onto a plurality of computing devices with each computing device of the plurality of computing devices storing a different portion of the cube of data. A perturbation rule configured for application to the cube of data and associated with a user is received. A process space associated with the user is created. The received perturbation rule is compiled in association with the created process space. A query on the portion of the cube of data stored at the computing device is received. The received query is associated with the created process space. The query is processed while applying the compiled perturbation rule to data extracted from the portion of the cube of data stored at the computing device. A result of the processed query is sent to a requesting computing device. | 09-18-2014 |
20140280247 | TECHNIQUES FOR DATA RETRIEVAL IN A DISTRIBUTED COMPUTING ENVIRONMENT - Enhanced techniques for data retrieval in a distributed computing environment are described. A computing node of a distributed computing environment may receive a data request. The computing node may include one or more subsets of data. The computing node may be configured to search among the one or more subsets of data for a beginning of a data range that is responsive to the data request. The computing node may be further configured to forward a data range responsive to the search to another computing node of the distributed computing system to be merged with one or more additional data ranges. Other embodiments are described and claimed. | 09-18-2014 |
20140280239 | SIMILARITY DETERMINATION BETWEEN ANONYMIZED DATA ITEMS - A method of determining a similarity between records in a data set is provided. Data organized into a plurality of records is received. First characters associated with a field and a first record of the plurality of records are selected. The selected first characters are subdivided into a first sliding series of a defined number of characters. Second characters associated with the field and a second record of the plurality of records are selected. The selected second characters are subdivided into a second sliding series of the defined number of characters. A similarity score between the first sliding series and the second sliding series is calculated. Whether or not the first sliding series and the second sliding series are similar is determined based on the calculated similarity score. | 09-18-2014 |
20140280220 | SCORED STORAGE DETERMINATION - A method of determining a storage device on which to store received data is provided. Data is received. A score indicating a value associated with the received data is computed. A storage device is determined from a plurality of types of storage devices on which to store the received data based on the computed score. The received data is sent to the determined storage device. | 09-18-2014 |
20140279833 | METHOD TO REDUCE LARGE OLAP CUBE SIZE USING CELL SELECTION RULES - Various embodiments are directed to techniques for providing one or more reduced-size rule cubes indicating cell rules. A computer-program product embodied in a machine-readable storage medium includes instructions to cause a computing device to select a cell rule to include in a rule cube based on applicability of the cell rule to a selected portion of a data cube; analyze the cell rule to identify a wildcarded dimension in a specification of cells of the data cube that are subject to the cell rule; and generate the rule cube indicating applicability of the cell rule to the selected portion of the data cube, wherein a cell of the rule cube corresponds to multiple cells of the data cube, and wherein the wildcarded dimension of the rule cube is reduced in length in comparison to a length of the wildcarded dimension of the data cube. Other embodiments are described and claimed. | 09-18-2014 |
20140279819 | COMPACT REPRESENTATION OF MULTIVARIATE POSTERIOR PROBABILITY DISTRIBUTION FROM SIMULATED SAMPLES - Various embodiments are directed to techniques for selecting a subset of a set of simulated samples. A computer-program product including instructions to cause a computing device to order a plurality of UPDFs by UPDF value, wherein the plurality of UPDFs is associated with a chain of draws of a set of simulated samples, wherein each draw comprises multiple parameters and the UPDF values map to parameter values of the parameters; select a subset of the plurality of UPDFs based on the subset of the plurality of UPDFs having UPDF values within a range corresponding to a range of parameter values to include in a subset of the set of simulated samples; and transmit an indication of a draw comprising parameters having parameter values to include in the subset of the set of simulated samples, wherein the indication identifies the draw by associated UPDF. Other embodiments are described and claimed. | 09-18-2014 |
20140279816 | TECHNIQUES FOR PRODUCING STATISTICALLY CORRECT AND EFFICIENT COMBINATIONS OF MULTIPLE SIMULATED POSTERIOR SAMPLES - Various embodiments are generally directed to techniques for producing statistically correct and efficient combinations of multiple simulated posterior samples from MCMC and related Bayesian sampling schemes. One or more chains from a Bayesian posterior distribution of values may be generated. It may be determined whether the one or more chains have reached stationarity through parallel processing on a plurality of processing nodes. Based upon the determination, each of the one or more chains that have reached stationarity through parallel processing on the plurality of processing nodes may be sorted. The one or more sorted chains may be resampled through parallel processing on the plurality of processing nodes. The one or more resampled chains may be combined. Other embodiments are described and claimed. | 09-18-2014 |
20140279527 | Enterprise Cascade Models - Methods, systems, computer-readable media, and apparatuses for detecting unauthorized activity are disclosed. Detecting unauthorized activity is done by accessing first data that represents activity involving a first service provided to a customer and accessing second data that represents activity involving a second service provided to a customer. The activity involving the second service and the activity involving the first service both include authorized customer activity, and the activity associated with the second service further includes unauthorized activity. The first data is filtered using filtering criteria and a portion of the first data is selected to be retained. The second data and the retained portion of the first data are analyzed, and the analysis includes classifying the activity associated with the second service in a way that distinguishes the unauthorized activity from the authorized activity associated with the second service. | 09-18-2014 |
20140278335 | TECHNIQUES FOR AUTOMATED BAYESIAN POSTERIOR SAMPLING USING MARKOV CHAIN MONTE CARLO AND RELATED SCHEMES - Techniques for automated Bayesian posterior sampling using Markov Chain Monte Carlo and related schemes are described. In an embodiment, one or more values in a stationarity phase for a system configured for Bayesian sampling may be initialized. Sampling may be performed in the stationarity phase based upon the one or more values to generate a plurality of samples. The plurality of samples may be evaluated based upon one or more stationarity criteria. The stationarity phase may be exited when the plurality of samples meets the one or more stationarity criteria. Other embodiments are described and claimed. | 09-18-2014 |
20140278239 | APPROXIMATE MULTIVARIATE POSTERIOR PROBABILITY DISTRIBUTIONS FROM SIMULATED SAMPLES - Various embodiments are directed to techniques for deriving a sample representation from a random sample. A computer-program product includes instructions to cause a first computing device to fit an empirical distribution function to a marginal probability distribution of a variable within a first sample portion of a random sample to derive a partial marginal probability distribution approximation, wherein the random sample is divided into multiple sample portions distributed among multiple computing devices; fit a first portion of a copula function to a multivariate probability distribution of the first sample portion, wherein the copula function is divided into multiple portions; and transmit an indication of a first likelihood contribution of the first sample portion to a coordinating device to cause a second computing device to fit a second portion of the copula function to a multivariate probability distribution of a second sample portion. Other embodiments are described and claimed. | 09-18-2014 |
20140278236 | TECHNIQUES FOR AUTOMATED BAYESIAN POSTERIOR SAMPLING USING MARKOV CHAIN MONTE CARLO AND RELATED SCHEMES - Techniques for automated Bayesian posterior sampling using Markov Chain Monte Carlo and related schemes are described. In an embodiment, one or more values in an accuracy phase for a system configured for Bayesian sampling may be initialized. Sampling may be performed in the accuracy phase based upon the one or more values to generate a plurality of samples. The plurality of samples may be evaluated based upon one or more accuracy criteria. The accuracy phase may be exited when the plurality of samples meets the one or more accuracy criteria. Other embodiments are described and claimed. | 09-18-2014 |
20140258454 | PARALLEL COMMUNITY DETECTION - Various embodiments are directed to techniques for countering oscillation in community assignments of nodes in a network during detection of its communities. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium includes instructions operable to cause a computing device to derive a first connectedness metric of a first community to which a first node of a network belongs and a second connectedness metric of a second community to which a second node of the network belongs in parallel in an iteration of parallel detection of communities in the network, wherein the first and second nodes are connected in the network; randomly pin the first node to prevent its reassignment to another community during the iteration; compare the first and second connectedness metrics during the iteration; and reassign the second node from the second community to the first community based on the comparison. Other embodiments are described and claimed. | 09-11-2014 |
20140258193 | TECHNIQUES TO REFINE SOLUTIONS TO LINEAR OPTIMIZATION PROBLEMS USING SYMMETRIES - Techniques to refine solutions to linear optimization problems using symmetries are described. Some embodiments are particularly directed to techniques to refine solutions to linear optimization problems using symmetries to permute an existing solution into other feasible solutions that may improve upon the objective function. In one embodiment, for example, an apparatus may comprise a configuration component, an optimization component, a symmetries component, and an improvement component. The configuration component may be operative to receive an optimization problem described by an objective and constraints on a plurality of variables. The optimization interface component may be operative to receive an initial feasible solution to the optimization problem, the initial feasible solution comprising an assignment of values to the plurality of variables satisfying all the constraints, the initial feasible solution producing an initial objective value when applied to the objective. The symmetries interface component may be operative to receive one or more symmetries of the plurality of variables for the constraints, the one or more symmetries defining permutations of the plurality of variables guaranteed to produce only additional feasible solutions given the constraints when applied to an existing feasible solution. The improvement component may be operative to use the symmetries to produce permutations of the assignment of values to the plurality of variables, determine which of the permutations results in an improved objective value when applied to the objective, the improved objective value improving on the initial objective value produced by the initial feasible solution, and select the permutation that results in the improved objective value as an improved feasible solution to the optimization problem, the improved feasible solution improving on the initial feasible solution according to the objective. Other embodiments are described and claimed. | 09-11-2014 |
20140258162 | TECHNIQUES TO BLOCK RECORDS FOR MATCHING - Techniques to block records for matching are described. Some embodiments are particularly directed to techniques to block records for matching entities with inconsistent identifying information. In one embodiment, for example, an apparatus may comprise a configuration component, a coding component, a blocking component, and a matching component. The configuration component may be operative to receive a data set comprising a plurality of records and operative to receive a set of blocking variables, the blocking variables present as variables in each of the plurality of records. The coding component may be operative to generate match codes based on the blocking variables. The blocking component may be operative on the processor circuit to produce a plurality of blocks of records from the data set based on the match codes. The matching component may be operative to match records within each of the plurality of blocks by performing deterministic or probabilistic entity resolution based on similar variables of the records within each of the blocks. Other embodiments are described and claimed. | 09-11-2014 |
20140257913 | STORM RESPONSE OPTIMIZATION - A method of predicting equipment failures is provided. Potentially-affected equipment located in a path projection for a weather event is identified based on current characteristic data describing equipment supporting a service. A likelihood of failure for each equipment of the identified potentially-affected equipment is calculated by executing a failure prediction model with the current characteristic data and weather event data describing characteristics of the weather event. Equipment failures of the identified potentially-affected equipment are predicted by comparing the calculated likelihood of failure for each equipment of the identified potentially-affected equipment to a predefined threshold. Information identifying the predicted equipment failures is output. | 09-11-2014 |
20140257895 | CONSTRAINED SERVICE RESTORATION - A method of determining service routes for a plurality of crews is provided. Outage data identifying service outage source locations, a number of affected customers associated with each location, and a type of repair to perform at each location is received. Crew data identifying a start location and a crew skill indicator for each crew is received. A service route is determined for each crew using a mixed integer linear program minimizing a total customer time without the service subject to the crew skill indicator satisfying the type of repair to perform at each location. The service route for a crew includes the start location as a first location and at least one location of the plurality of locations. | 09-11-2014 |
20140257778 | Devices for Forecasting Ratios in Hierarchies - Systems and methods for forecasting ratios in hierarchies are provided. Hierarchies can be formed that have components, including a numerator time series with values from input data, a denominator time series with values from input data, and a ratio time series of the numerator time series over the denominator time series. The components can be modeled to generate forecasted hierarchies. The forecasted hierarchies can be reconciled so that the forecasted hierarchies are statistically consistent throughout nodes of the forecasted hierarchies. | 09-11-2014 |
20140257694 | CONSTRAINED SERVICE RESTORATION WITH HEURISTICS - A method of determining service routes for a plurality of crews is provided. Outage data identifying a plurality of service outage source locations, a number of affected customers associated with each location of the plurality of locations, and a type of repair to perform at each location of the plurality of locations is received. A repair time is estimated for each location. Crew data identifying a plurality of crews and a start location for the plurality of crews is received. A service route for each crew is determined based on a crew skill indicator associated with each crew satisfying the type of repair to perform at each location and based on the estimated repair time for each location. The service route for a crew of the plurality of crews includes the start location as a first location and at least one location of the plurality of locations. | 09-11-2014 |
20140249795 | TECHNIQUES TO AUTOMATICALLY GENERATE SIMULATED INFORMATION - Techniques to automatically generate simulated information are described. A method comprises receiving, by a program builder component executed on a processor, a structured input file comprising one or more data libraries and one or more directive files to generate simulated data for a simulation database. The method further comprising producing, by the program builder component executed on the processor, a data generator program based on the structured input file, the data generator program arranged to generate the simulated data for the simulation database using multiple data generating sessions executed concurrently or sequentially. Other embodiments are described and claimed. | 09-04-2014 |
20140249776 | System and Method for Multivariate Outlier Detection - A computer-implemented method of determining actions outside of a norm is provided. The method comprises: generating an actor state vector and a peer group state vector, wherein the actor state vector identifies a characteristic for an actor in each of a plurality of categories and the peer group state vector identifies a characteristic for a peer group in each of the plurality of categories, transforming the actor state vector into a first sampled wave series representation using a first wave series transformation, transforming the population state vector into a second sampled wave series representation using a second wave series transformation, and filtering the first sampled wave series representation and the second sampled wave series representation to identify a deviation of the first wave series representation from the second wave series representation in a phase or a magnitude. | 09-04-2014 |
20140245305 | Systems and Methods for Multi-Tenancy Data Processing - Systems and methods are provided for rotating real time execution of data models using an application instance. Input data are received for real time execution of a plurality of data models. An application instance is assigned for executing the plurality of data models simultaneously. Resources of the application instance are automatically distributed based on a set of rotation factors. The plurality of data models are executed simultaneously using one or more data processors. Execution results for one or more of the plurality of data models are output. | 08-28-2014 |
20140237001 | System And Method For Fast Identification Of Variable Roles During Initial Data Exploration - Systems and methods are provided for identifying data variable roles during initial data exploration. In one example, a computer-implemented method of determining a role for a data variable is disclosed. The method comprises identifying to a plurality of data nodes a set of data records containing data values assigned to each data node, a maximum number of levels to record in a sorted data structure at the data nodes, and the data node responsible for each of a plurality of variables. The method further comprises receiving for each variable from the data node responsible for the variable a plurality of unique data values for the variable, a count for each of the unique data values and an overflow count for the variable, wherein the number of unique data values does not exceed the maximum number of levels. A role for a variable can be determined based upon the unique data values, counts and overflow count for the variable. | 08-21-2014 |
20140222491 | Systems and Methods for Determining Pack Allocations - Systems and methods are provided for determining a distribution of each of a plurality of inner packs to a plurality of stores. Mismatch cost data and product demand data are received for the plurality of stores. A first inner pack quantity for distribution is determined based on the product demand data. A supply difference amount is determined, where the supply difference amount is a difference between the first inner pack quantity and the number of first inner packs available for distribution. A determination is made that adjusting the first inner pack quantity for a particular store based on the supply difference amount would have less effect on mismatch costs than adjusting it for other stores, and the first inner pack quantity is adjusted for the particular store based on the supply difference amount. | 08-07-2014 |
20140188918 | TECHNIQUES TO PERFORM IN-DATABASE COMPUTATIONAL PROGRAMMING - Various embodiments are generally directed to an apparatus and method for generating a general request having structures and information to perform an analytical calculation on data stored in a distributed database system and converting the structures and information of the general request to a compute request having a request format conforming to a query language used by the distributed database system. Various embodiments may also include sending the compute request to a node of the distributed database system and receiving a compute response from the node of the distributed database system, the compute response including a result set of the analytical calculation performed on data local to the node from an analytic container implemented by the node, the analytic container including an embedded process to replicate an execution environment hosted within the distributed database system used by a client application. | 07-03-2014 |
20140188830 | Social Community Identification for Automatic Document Classification - Systems and methods for identifying data files that have a common characteristic are provided. A plurality of data files are received. The plurality of data files include one or more data files having the common characteristic. A list of key terms is generated from the plurality of data files. Data files from the plurality of data files that have an association with a social community are identified, where the social community is defined by one or more features. The list of key terms is updated based on an analysis of the identified features. The updated list of key terms is used to identify other data files that have the common characteristic. | 07-03-2014 |
20140181002 | Systems and Methods for Implementing Virtual Cubes for Data Processing - Systems and methods are provided for processing a multi-dimensional data structure represented as multi-dimensional cubes. A first multi-dimensional cube and a second multi-dimensional cube are received, the first multi-dimensional cube including first cube property data and first user data, the second multi-dimensional cube including second cube property data and second user data. A virtual multi-dimensional cube including virtual cube property data for accessing and performing computer-based operations upon the first user data and the second user data is generated, the virtual cube property data including a first mapping from the first cube property data to the virtual cube property data and a second mapping from the second cube property data to the virtual cube property data. | 06-26-2014 |
20140172705 | SYSTEMS AND METHODS FOR EXTENDING SIGNATURE TECHNOLOGY - Systems and methods for extending signatures are provided. Some of the disclosed systems and methods can include receiving, on a computing device, transaction data associated with one or more entities, wherein the transaction data is associated with one or more keys; determining one or more keys associated with the one or more entities, wherein each entity has corresponding signature data, and wherein signature data includes one or more associated keys; matching the one or more keys associated with the transaction data to one or more keys associated with the signature data corresponding to the one or more entities; and retrieving the signature data corresponding to the one or more entities. The systems and methods may further comprise updating the signature data with the transaction data; using a scoring engine to score the updated signature data; and storing the updated signature data. | 06-19-2014 |
20140172551 | Using Transaction Data and Platform for Mobile Devices - Systems and methods for using historical and current financial transaction data in implementing a marketing strategy are provided. A system and method can include updating stored signature data using current data associated with an entity. The signature data includes historic data including credit card transactions or debit card transactions associated with the entity. One or more model variables are generated using the updated signature data associated with the entity. A marketing score for the entity is determined by applying one or more model variables to a marketing model. The marketing score indicates a likelihood that the entity will respond to an offer. Whether the marketing score exceeds a predetermined marketing threshold is determined. Based upon determining that the marketing score exceeds the predetermined marketing threshold and determining that the entity is within the geographic area, an indication for triggering transmission of the offer to the entity is generated. | 06-19-2014 |
20140172547 | Scoring Online Data for Advertising Servers - Systems and methods for using online activity data in implementing a marketing strategy are provided. A system and method can include generating, on a computing device, variables using signature data that includes historic clickstream data and current clickstream data associated with an entity. A subset of the variables can be identified using a covariance matrix for the variables. Scores can be generated by applying the subset of the variables to models. Weighted scores can be generated by associating weights with the scores. The weighted scores can be used for selecting online advertisements. Target data can be received that includes online advertisement click data associated with the entity. New scores of the current data can be generated using the models. The weights associated with the new scores can be modified using the target data. | 06-19-2014 |
20140156382 | Systems and Methods for Optimizing Distribution of Advertisement Information - In accordance with the teachings described herein, systems and methods are provided for optimizing distribution of advertisement information. In one example, call tracking data may be generated from a plurality of telephone calls made to a business entity, where the call tracking data includes geographical information to identify locations from which the plurality of telephone calls originated. A call distribution may be determined from the call tracking data, where the call distribution groups the call tracking data based at least in part on distances between the business entity and the locations from which the plurality of telephone calls originated. A probability density function may be generated from the call distribution, where the probability density function is for determining a probability that a telephone call will be received by the business entity in response to advertisement information delivered to a call location, and wherein the probability density function expresses the probability as a function of distance between the call location and the business entity. The probability density function may then be used in the generation of the advertisement distribution plan. | 06-05-2014 |
20140122401 | System and Method for Combining Segmentation Data - Systems and methods are provided for combining multiple segmentations into a single unique segmentation that contains attributes of the original segmentations. This new segmentation forms an ensemble or combination segmentation that has a unique set of attributes from the original segmentations without enumerating every possible set of combinations. In one example, two or more segments are combined into a single segmentation using a technique such as k-means clustering or Self-Organizing Map Neural Networks. After the first combination phase is performed, a Bayesian technique is then applied in a second phase to adjust or further alter the ensemble combination of segments. | 05-01-2014 |
20140122390 | Systems and Methods for Conflict Resolution and Stabilizing Cut Generation in a Mixed Integer Program Solver - Systems and methods for conflict resolution and stabilizing cut generation in a mixed integer linear program (MILP) solver are disclosed. One disclosed method includes receiving a mixed integer linear problem (MILP), the MILP having a root node and one or more global bounds; pre-processing the MILP, the MILP being associated with nodes; establishing a first threshold for a learning phase branch-and-cut process; performing, by one or more processors, the learning phase branch-and-cut process for nodes associated with the MILP, wherein performing the learning phase branch-and-cut process includes: evaluating the nodes associated with the MILP, collecting conflict information about the MILP, and determining whether the first threshold has been reached; responsive to reaching the first threshold, removing all of the nodes and restoring a root node of the MILP; and solving, with the one or more processors, the MILP using the restored root node and the collected conflict information. | 05-01-2014 |
20140089247 | Fast Binary Rule Extraction for Large Scale Text Data - Systems and methods for identifying data files that have a common characteristic are provided. A plurality of data files including one or more data files having a common characteristic are received. A potential rule is generated by selecting key terms from a list that satisfy a term evaluation metric, and the potential rule is evaluated using a rule evaluation metric. The potential rule is added to the rule set if the rule evaluation metric is satisfied. Based upon the potential rule being added to the rule set, data files covered by the potential rule are removed from the plurality of data files. The potential rule generation and evaluation steps are repeated until a stopping criterion is met. After the stopping criterion has been met, the rule set is used to identify other data files having the common characteristic. | 03-27-2014 |
20140067887 | Grid Computing System Alongside A Distributed File System Architecture - Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed file system. A grid-enabled software component at a control node is configured to invoke the distributed file system software at the control node to provide block locations for a plurality of the data blocks; determine a configuration of data blocks to read at the plurality of worker nodes; instruct the grid-enabled software components at the plurality of worker nodes to retrieve locally stored data, perform an analytical calculation on the retrieved data, and send the results of the analytical calculation on the retrieved data to the grid-enabled software component at the control node; and assemble the results of the analytical calculations performed by the grid-enabled software components from the plurality of worker nodes. | 03-06-2014 |
20140059073 | Systems and Methods for Providing a Unified Variable Selection Approach Based on Variance Preservation - This disclosure describes a method, system and computer-program product for parallelized feature selection. The method, system and computer-program product may be used to access a first set of features, wherein the first set of features includes multiple features, wherein the features are characterized by a variance measure, and wherein accessing the first set of features includes using a computing system to access the features, determine components of a covariance matrix, the components of the covariance matrix indicating a covariance with respect to pairs of features in the first set, and select multiple features from the first set, wherein selecting is based on the determined components of the covariance matrix and an amount of the variance measure attributable to the selected multiple features, and wherein selecting the multiple features includes executing a greedy search performed using parallelized computation. | 02-27-2014 |
20140020069 | Authorization Caching in a Multithreaded Object Server - Systems and methods are included for accessing resource objects in a multi-threaded environment. A request is received from a requester to perform an operation with respect to a resource object, where the requested resource object has multiple associations with other objects. A determination as to whether an authorization cache entry corresponding to the requested resource object contains sufficient permission data for granting or denying the request for access to the requested resource object is made. A grant or deny of access to the requested resource object is returned when the authorization cache entry corresponding to the requested resource object contains sufficient permission data. | 01-16-2014 |
20140013192 | TECHNIQUES FOR TOUCH-BASED DIGITAL DOCUMENT AUDIO AND USER INTERFACE ENHANCEMENT - Techniques for digital document audio and user interface enhancement are generally described herein. In one embodiment, for example, an apparatus may comprise a processor circuit and a digital document application operative on the processor circuit. The digital document application may comprise a document recorder component arranged for execution by the processor circuit to receive a source document file and generate an annotated document file, the document recorder component arranged to retrieve a text element from the source document file, generate a user interface view with the text element and an audio narration guide proximate to the text element for presentation on an output device, receive positions of an object on the audio narration guide from an input device, and generate an audio element for the text element based on the positions. Other embodiments are described and claimed. | 01-09-2014 |
20130346350 | COMPUTER-IMPLEMENTED SEMI-SUPERVISED LEARNING SYSTEMS AND METHODS - Computer-implemented systems and methods are provided for determining a subset of unknown targets to investigate. For example, a method can be configured to receive a target data set, wherein the target data set includes known targets and unknown targets. A supervised model such as a neural network model is generated using the known targets. The unknown targets are used with the neural network model to generate values for the unknown targets. Analysis with an unsupervised model is performed using the target data set in order to determine which of the unknown targets are outliers. The list of outlier unknown targets is compared with the values for the unknown targets that were generated by the neural network model. The subset of unknown targets to investigate is determined based upon the comparison. | 12-26-2013 |
20130339218 | Computer-Implemented Data Storage Systems and Methods for Use with Predictive Model Systems - Systems and methods for performing fraud detection. As an example, a system and method can be configured to contain a raw data repository for storing raw data related to financial transactions. A data store contains rules to indicate how many generations or to indicate a time period within which data items are to be stored in the raw data repository. Data items stored in the raw data repository are then accessed by a predictive model in order to perform fraud detection. | 12-19-2013 |
20130269027 | TECHNIQUES TO EXPLAIN AUTHORIZATION ORIGINS FOR PROTECTED RESOURCE OBJECTS IN A RESOURCE OBJECT DOMAIN - Techniques to explain authorization origins for protected objects in an object domain are disclosed. In one embodiment, for example, an apparatus may comprise a processor circuit, a request processor component operative on the processor circuit to receive and process a request for an authorization origin of a resource object, the authorization origin comprising an access control with a permission arranged to control access to the resource object based on an identity, and a resource origin component operative on the processor circuit to identify the authorization origin of the resource object from a set of interrelated resource objects and associated access controls, retrieve authorization origin information for the authorization origin, and present the authorization origin information in a user interface view. Other embodiments are described and claimed. | 10-10-2013 |
20130268318 | Systems and Methods for Temporal Reconciliation of Forecast Results - In accordance with the teachings described herein, systems and methods are provided for generating a forecast. A first forecast model of a first type is applied to generate first forecast results. A second forecast model of a second type is applied to generate second forecast results. The first and second forecast results are combined using an optimization model to generate combined forecast results, where the optimization model applies one or more constraints to preserve one or more attributes of the first or second forecast results in the combined forecast results. | 10-10-2013 |
20130262425 | TECHNIQUES TO PERFORM IN-DATABASE COMPUTATIONAL PROGRAMMING - Techniques to perform in-database computational programming. In one embodiment, for example, an apparatus may comprise a processor circuit and a client application operative on the processor circuit to generate a general request to perform an analytical calculation on data stored in a distributed database system based on a compute model, where the client application uses a threaded kernel service layer. The apparatus may also comprise a compute request component operative on the processor circuit to convert the general request to a compute request having a request format used by the distributed database system, and send the compute request to a node of the distributed database system having an analytic container. Other embodiments are described and claimed. | 10-03-2013 |
20130254780 | TECHNIQUES TO REMOTELY ACCESS OBJECT EVENTS - Techniques to remotely access object events are described. An apparatus may comprise a processor and a memory communicatively coupled to the processor. The memory may be operative to store a remote event bridge having a surrogate object that when executed by the processor is operative to allow an observer object for a first process to subscribe to an event of a subject object for a second process using the surrogate object. In this manner, the remote event bridge and the surrogate object operate as an interface between subject objects and observer objects without any modifications to either class of objects. Other embodiments are described and claimed. | 09-26-2013 |
20130238399 | Computer-Implemented Systems and Methods for Scenario Analysis - Computer-implemented systems and methods are provided for implementing a scenario analysis manager that performs multiple scenarios based upon time series data that is representative of transactional data. A system and method provides candidate predictive models for a first scenario for selection where the set of candidate predictive models includes an identification of variables associated with a model. Model selection data is received from a scenario analysis manager where a selected model is configured to predict a future value of a first variable based on values of a second variable. Time series data is received representative of past transaction activity of the first variable and the second variable, and data representative of a future value of the second variable is also received. The future value of the first variable is determined using the selected model, the time-series data and the future value of the second variable. | 09-12-2013 |
20130159348 | Computer-Implemented Systems and Methods for Taxonomy Development - Systems and methods are provided for generating a set of classifiers. A location is determined for each instance of a topic term in a collection of documents. One or more topic term phrases are identified, and one or more sentiment terms are identified within each topic term phrase. Candidate classifiers are identified by parsing words in the one or more topic term phrases, and a colocation matrix is generated. A seed row of the colocation matrix associated with a particular attribute is identified, and distance metrics are determined by comparing each row of the colocation matrix to the seed row. A set of classifiers is generated for the particular attribute, where classifiers in the set of classifiers are selected using the distance metrics. | 06-20-2013 |
20130117652 | TECHNIQUES TO GENERATE CUSTOM ELECTRONIC FORMS - Techniques to generate custom electronic forms are described. An apparatus may comprise a logic device and an application having a form manager component. The form manager component may be operative on the logic device to manage one or more forms for a user interface of the application during a run-time mode of the application. The form manager component may have a custom prompt module operative to determine whether an application context interface is available for a dynamic form prompt of a form. The form manager component may determine whether a custom language interface is available for the dynamic form prompt when the application context interface is available. The form manager component may retrieve custom content in a custom presentation language for the dynamic form prompt when the custom language interface is available. Other embodiments are described and claimed. | 05-09-2013 |
20130080978 | TECHNIQUES TO PRESENT HIERARCHICAL INFORMATION USING A MULTIVARIABLE DECOMPOSITION VISUALIZATION - Techniques to present hierarchical information as orthographic projections are described. An apparatus may comprise an orthographic projection application arranged to manage a three dimensional orthographic projection of hierarchical information. The orthographic projection application may comprise a hierarchical information component operative to receive hierarchical information representing multiple nodes at different hierarchical levels, and parse the hierarchical information into a tree data structure, an orthographic generator component operative to generate a graphical tile for each node, arrange graphical tiles for each hierarchical level into graphical layers, and arrange the graphical layers in a vertical stack, and an orthographic presentation component operative to present a three dimensional orthographic projection of the hierarchical information with the stack of graphical layers each having multiple graphical tiles. Other embodiments are described and claimed. | 03-28-2013 |
20130055060 | TECHNIQUES TO REMOTELY ACCESS FORM INFORMATION - Techniques to remotely access form information are described. An apparatus may comprise a logic device and an application having a form manager component operative on the logic device to manage one or more forms for a user interface of the application. The form manager component may be arranged to generate a form with form information retrieved from a web service using a form information query. The form information query may comprise a data structure having static form configuration information, dynamic form configuration information, or a combination of static form configuration information and dynamic form configuration information. Other embodiments are described and claimed. | 02-28-2013 |
20120233573 | TECHNIQUES TO PRESENT HIERARCHICAL INFORMATION USING ORTHOGRAPHIC PROJECTIONS - Techniques to present hierarchical information as orthographic projections are described. An apparatus may comprise an orthographic projection application arranged to manage a three dimensional orthographic projection of hierarchical information. The orthographic projection application may comprise a hierarchical information component operative to receive hierarchical information representing multiple nodes at different hierarchical levels, and parse the hierarchical information into a tree data structure, an orthographic generator component operative to generate a graphical tile for each node, arrange graphical tiles for each hierarchical level into graphical layers, and arrange the graphical layers in a vertical stack, and an orthographic presentation component operative to present a three dimensional orthographic projection of the hierarchical information with the stack of graphical layers each having multiple graphical tiles. Other embodiments are described and claimed. | 09-13-2012 |
20120124100 | Grid Computing System Alongside a Distributed Database Architecture - Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed database system. A grid-enabled software component at a control node is configured to invoke database management software (DBMS) at the control node to cause the DBMS at a plurality of the worker nodes to make available data to the grid-enabled software component local to its node; instruct the grid-enabled software components at the plurality of worker nodes to perform an analytical calculation on the received data and to send the results of the data analysis to the grid-enabled software component at the control node; and assemble the results of the data analysis performed by the grid-enabled software components at the plurality of worker nodes. | 05-17-2012 |
20120047519 | TECHNIQUES TO REMOTELY ACCESS OBJECT EVENTS - Techniques to remotely access object events are described. An apparatus may comprise a processor and a memory communicatively coupled to the processor. The memory may be operative to store a remote event bridge having a surrogate object that when executed by the processor is operative to allow an observer object for a first process to subscribe to an event of a subject object for a second process using the surrogate object. In this manner, the remote event bridge and the surrogate object operate as an interface between subject objects and observer objects without any modifications to either class of objects. Other embodiments are described and claimed. | 02-23-2012 |
20110307475 | TECHNIQUES TO FIND PERCENTILES IN A DISTRIBUTED COMPUTING ENVIRONMENT - Techniques to search for data elements in a distributed computing environment are described. An apparatus may comprise a processor and a memory unit communicatively coupled to the processor. The memory unit may store a correlation module that when executed by the processor is operative to determine a target rank position at a target percentile rank within a total data set. The correlation module may determine a target data item at the target rank position for the total data set using candidate data items at candidate rank positions for each of multiple sorted data subsets of the total data set, and correlation values associated with each of the candidate data items. Other embodiments are described and claimed. | 12-15-2011 |
20110035205 | TECHNIQUES TO AUTOMATICALLY GENERATE SIMULATED INFORMATION - Techniques to automatically generate simulated information are described. A method comprises receiving by a processor a structured input file with definitions to generate simulated data for a simulation database, and producing by the processor a data generator program based on the structured input file, the data generator program arranged to generate the simulated data for the simulation database using multiple data generating sessions executed concurrently or sequentially. Other embodiments are described and claimed. | 02-10-2011 |