Class / Patent application number | Description | Number of patent applications / Date published |
707688000 | Statistics maintenance | 49 |
20100082555 | MANAGEMENT DEVICE AND COMPUTER SYSTEM - A management device is connected to a file server providing a computer with file data stored in a storage subsystem, and collects information about access to the file data. In the management device, a log of access to the file data stored in the storage subsystem is collected as access data, and the access data for the same file data is grouped. With such a configuration, a large amount of information about access to the file data can be easily used, and the amount of information can be compressed. | 04-01-2010 |
20100114840 | SYSTEMS AND ASSOCIATED COMPUTER PROGRAM PRODUCTS THAT DISGUISE PARTITIONED DATA STRUCTURES USING TRANSFORMATIONS HAVING TARGETED DISTRIBUTIONS - A data structure that includes at least one partition containing non-confidential quasi-identifier microdata and at least one other partition containing confidential microdata is formed. The partitioned confidential microdata is disguised by transforming the confidential microdata to conform to a target distribution. The disguised confidential microdata and the quasi-identifier microdata are combined to generate a disguised data structure. The disguised data structure is used to carry out statistical analysis and to respond to a statistical query that is directed to the use of confidential microdata. In this manner, the privacy of the confidential microdata is preserved. | 05-06-2010 |
20100131472 | DETECTION AND UTILIZATION OF INTER-MODULE DEPENDENCIES - Methods for detecting inter-module dependencies involve receiving by a software configuration control system check-in for each of a plurality of software components accompanied by check-in information consisting at least in part of defect information, which is utilized to identify coupling between any of the checked-in software components that were checked in together on a same defect and any of the checked-in software components that were checked in on a defect that was introduced by a defect in another software component. Warnings and reports are generated of a likely incidence of coupling between any of the software components identified as having been checked in together on a same defect, as well as between any of the software components identified as having been checked in on a defect that was introduced by a defect in another software component and such other software component. | 05-27-2010 |
20100169285 | AUTOMATED GROUPING OF MESSAGES PROVIDED TO AN APPLICATION USING STRING SIMILARITY ANALYSIS - Messages which are provided to an application are monitored. Similarities between the messages are determined based on a distance algorithm, in one approach, and messages which are similar are assigned to a common group. For example, the messages may be HTTP messages which include a URL, HTTP header parameters and/or HTTP post parameters. The messages are parsed to derive a string which is used in the distance calculations. Additionally, application runtime data such as response times is obtained and aggregated for the group. Further, a representative message can be determined for each group for comparison to subsequent messages. Results can be reported which include a group identifier, representative message, count and aggregated runtime data. | 07-01-2010 |
20100198796 | TREEMAP VISUALIZATIONS OF DATABASE TIME - Particular embodiments generally relate to displaying database time using a treemap. A set of database time values is determined for a set of dimensions. The database time values measure performance of one or more databases by the time the database takes to respond to a request. The set of database time values is correlated to a set of cells in the treemap. A size of the cell is determined based on the database time value associated with it. For example, the database time value is correlated to an area of the cell. A layout of the set of cells is determined and the treemap is displayed using the layout. For example, the largest values of database time may be shown with the largest sized cells. This makes it easier for an administrator to review and analyze the database performance across multiple dimensions and determine problem areas affecting the performance of the one or more databases. | 08-05-2010 |
20100274770 | TRANSDUCTIVE APPROACH TO CATEGORY-SPECIFIC RECORD ATTRIBUTE EXTRACTION - Disclosed are methods and apparatus for segmenting and labeling a collection of token sequences. A plurality of segments of one or more tokens in a token sequence collection are partially labeled with labels from a set of target labels using high precision domain-specific labelers so as to generate a partially labeled sequence collection having a plurality of labeled segments and a plurality of unlabeled segments. Any label conflicts in the partially labeled sequence collection are resolved. One or more of the labeled segments of the partially labeled sequence collection are expanded so as to cover one or more additional tokens of the partially labeled sequence collection. A statistical model, for labeling segments using local token and segment features of the sequence collection, is trained based on the partially labeled sequence collection. This trained model is then used to label the unlabeled segments and the labeled segments of the sequence collection so as to generate a labeled sequence collection. The labeled sequence collection is then stored as structured output records in a database. | 10-28-2010 |
20100293151 | ANALYZING INTERNAL HARD DRIVES AND CONTINUOUSLY AUTHENTICATING PROGRAM FILES WITHIN MICROSOFT OPERATING SYSTEM - A method of performing an analysis on internal hard drive(s), which includes an analysis of the file management system, individual files that exist on the hard drive, developing a Unique ID for each program (i.e., executable) file and continuously analyzing (i.e., scanning) the hard drive(s) to detect physical changes in the previously analyzed program files. The method may be implemented on a computer unit, which can be a 32/64-bit Microsoft PC O/S, or a 32/64-bit Microsoft Server O/S. | 11-18-2010 |
20100306179 | Using Information Usage Data to Detect Behavioral Patterns and Anomalies - Activity data is analyzed or evaluated to detect behavioral patterns and anomalies. When a particular pattern or anomaly is detected, a system may send a notification or perform a particular task. This activity data may be collected in an information management system, which may be policy based. Notification may be by way of e-mail, report, pop-up message, or system message. Some tasks to perform upon detection may include implementing a policy in the information management system, disallowing a user from connecting to the system, and restricting a user from being allowed to perform certain actions. To detect a pattern, activity data may be compared to a previously defined or generated activity profile. | 12-02-2010 |
20110035363 | REAL-TIME DATABASE PERFORMANCE AND AVAILABILITY MONITORING METHOD AND SYSTEM - Database performance and availability monitoring of changes impacting database performance, availability and continuity to the underlying business may be performed. A method for doing so may include analytical and visual real-time analysis engines to identify and provide alert notifications on changes in database performance statistics (such as CPU consumption, physical I/O, etc.) related to a sample period of time on a single database or across multiple databases. Result data may be displayed through a series of charts and/or summary tables that may indicate whether correlations exist between unexpected database performance and relative changes in database performance statistical parameters. | 02-10-2011 |
20110040733 | SYSTEMS AND METHODS FOR GENERATING STATISTICS FROM SEARCH ENGINE QUERY LOGS - A computer-implemented method includes calculating first statistics about a user-identified event within a first subset of a database of events; selecting a second subset of the database of events based on said first statistics; calculating second statistics about the user-identified event within the second subset of the database of events; merging the first and second statistics as statistics of the user-identified event within the entire database of events; and generating a result including at least a portion of the merged statistics of the user-identified event. | 02-17-2011 |
20110055168 | SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM TO FACILITATE APPLICATION OF ARRIVAL RATE QUALIFICATIONS TO MISSED THROUGHPUT SERVER LEVEL GOALS - A system, method, and computer-readable medium that provide mechanisms for tracking the number of queries received for processing for a workload to facilitate arrival rate qualifications to Throughput Service Level Goals are provided. A number of queries counter associated with a particular workload is incremented each time a query assigned to the particular workload is received thereby tracking the arrival rate of workload queries. When a system performance condition comprising a non-compliant system performance level with respect to a Throughput Service Level Goal is identified, the number of queries counter is compared with the Throughput Service Level Goal. If the arrival rate of queries for the workload is greater than the Throughput Service Level Goal of the workload, actions associated with non-compliance of the Throughput Service Level Goal may then be performed. If the number of queries counter is less than or equal to the Throughput Service Level Goal, the preliminary identification of the missed Service Level Goal as a system performance condition event is dismissed or otherwise ignored. | 03-03-2011 |
20110060726 | TECHNIQUE TO GATHER STATISTICS ON VERY LARGE HIERARCHICAL COLLECTIONS IN A SCALABLE AND EFFICIENT MANNER - Techniques are provided for efficiently collecting statistics for hierarchically-organized collections of data. A database system leverages container-level modification time stamps and stored subtree-level change information to gather statistical information from only those resources in a hierarchical collection for which the statistics have changed since the last time that statistics were gathered for the collection. A lockless data structure is also described for storing the subtree-level change information in which an identifier corresponding to each subtree in a collection containing a changed resource may be stored. This data structure may be a table that is distinct from one or more tables representing the collection. In one embodiment of the invention, the immediate parent resource of a particular modified resource may be omitted from the subtree table by leveraging modification time stamps while gathering statistics based on tracked subtree-level information. | 03-10-2011 |
20110082839 | GENERATING INTELLECTUAL PROPERTY INTELLIGENCE USING A PATENT SEARCH ENGINE - A search platform that can generate intellectual property intelligence within an organization using a patent search engine. The patent search engine can monitor and log activity of users in connection with patent-related activities, such as searching, commenting on, and reviewing patent documents associated with a shared workspace of the organization. Based on this captured activity, the search engine can provide the organization with statistical information in patent-related activities occurring within the organization. | 04-07-2011 |
20110137874 | Methods to Minimize Communication in a Cluster Database System - An ordering of operations in log records includes: performing update operations on a shared database object by a node; writing log records for the update operations into a local buffer by the node, the log records each including a local virtual timestamp; determining that a log flush to write the log records in the local buffer to a persistent storage is to be performed; in response, sending a request from the node to a log flush sequence server for a log flush sequence number; receiving the log flush sequence number by the node; inserting the log flush sequence number into the log records in the local buffer; and performing the log flush to write the log records in the local buffer to the persistent storage, where the log records written to the persistent storage comprises the local virtual timestamps and the log flush sequence number. | 06-09-2011 |
20110196846 | PSEUDO-VOLUME FOR CONTROL AND STATISTICS OF A STORAGE CONTROLLER - Exemplary method, system, and computer program embodiments for facilitating information between at least one host and a storage controller operational in a data storage subsystem are provided. In one embodiment, a pseudo-volume, mappable to the at least one host and mountable as a filesystem, is initialized. The pseudo-volume is adapted for performing at least one of providing diagnostic and statistical data representative of the data storage subsystem to the at least one host, and facilitating control of at least one parameter of the storage controller. | 08-11-2011 |
20110276542 | MANAGEMENT OF LATENCY AND THROUGHPUT IN A CLUSTER FILE SYSTEM - Some embodiments of a system and a method to detect contention for resources in a cluster file system have been presented. For instance, a processing device may measure time spent performing actual operations by each of a set of nodes in a cluster file system when a respective node holds a lock on a resource and time spent performing overhead operations by the set of nodes without synchronization of clocks across the cluster file system. Then the processing device can determine latency and throughput of the cluster file system based on the time spent performing actual operations and the time spent performing overhead operations. | 11-10-2011 |
20110282847 | Methods and Systems for Validating Queries in a Multi-Tenant Database Environment - In accordance with embodiments, there are provided mechanisms and methods for validating queries. These mechanisms and methods for validating queries can enable embodiments to provide more reliable and faster execution of queries both in development and in production. In an embodiment and by way of example, a method for validating queries is provided. The method embodiment includes capturing a query that is directed to a multi-tenant database. A plan is determined by which the query will be applied to the database. The plan is statically analyzed for performance. Then a performance measure is applied to the query. | 11-17-2011 |
20110313977 | Systems and Methods for Reservoir Sampling of Streaming Data and Stream Joins - Algorithms and concepts for maintaining uniform random samples of streaming data and stream joins. These algorithms and concepts are used in systems and methods, such as wireless sensor networks and methods for implementing such networks, that generate and handle such streaming data and/or stream joins. The algorithms and concepts directed to streaming data allow one or more sample reservoirs to change size during sampling. When multiple reservoirs are maintained, some of the algorithms and concepts periodically reallocate memory among the multiple reservoirs to effectively utilize limited memory. The algorithms and concepts directed to stream joins allow reservoir sampling to proceed as a function of the probability of a join sampling operation. In memory limited situations wherein memory contains the sample reservoir and a join buffer, some of the stream join algorithms and concepts progressively increase the size of the sampling reservoir and reallocate memory from the join buffer to the reservoir. | 12-22-2011 |
20120209817 | Methods, Systems, and Products for Maintaining Data Consistency in a Stream Warehouse - Methods, systems, and products characterize consistency of data in a stream warehouse. A warehouse table is derived from a continuously received stream of data. The warehouse table is stored in memory as a plurality of temporal partitions, with each temporal partition storing data within a contiguous range of time. A level of consistency is assigned to each temporal partition in the warehouse table. | 08-16-2012 |
20120271801 | DATA PROCESSING METHOD AND DEVICE - The invention concerns a method of processing data to provide output data based on a group of data samples having a time stamp falling within at least one rolling time period, the method comprising: receiving a new data sample (D | 10-25-2012 |
20130080401 | SYSTEM FOR HIERARCHICAL INFORMATION COLLECTION - According to one embodiment, an information collection apparatus includes an information collector, a database, a shift width estimator, and a collection controller. The collector collects time-series data from a start time with a period of collection. The database accumulates the time-series data. The shift width estimator detects a loss of data in the time-series data and estimates a shift width corresponding to a difference between a time the loss has occurred and a collection time of data of an outlier. The collection controller obtains a correction value of the start time and a correction value of the collection period to eliminate the shift width, and controls the collection of the time-series data. | 03-28-2013 |
20130124484 | Persistent flow apparatus to transform metrics packages received from wireless devices into a data store suitable for mobile communication network analysis by visualization - A persistent flow apparatus maintains a datamart store with up-to-date transformations of packages as the packages are received from wireless recording devices. Each flow apparatus generates measures in a format which can be interactively analyzed along certain dimensions. A persistent flow is stateful to incrementally process metrics packages over multiple collection periods which are not correlated with the times the metrics are recorded at the device. A persistent flow is data driven by the receipt of new packages received from wireless recording devices having selected attributes and ignores unqualified packages. | 05-16-2013 |
20130144844 | COMPUTER SYSTEM AND FILE SYSTEM MANAGEMENT METHOD USING THE SAME - The present invention provides a computer system capable of providing the latest statistics on a file system in real time. The computer system includes a memory having a program for executing statistics on a file system, and a controller for executing the program. When the file system is accessed, the controller compares first statistic information before access processing on the file system with second statistic information after the access processing; and if it is determined that any difference exists between the first statistic information and the second statistic information, the controller updates the statistic result of the file system based on the difference. | 06-06-2013 |
20130198147 | DETECTING STATISTICAL VARIATION FROM UNCLASSIFIED PROCESS LOG - A system and associated method for detecting a statistical variation of a process from a textual log of the process as performed by a process behavior analysis (PBA) system for monitoring the process operating in an Information Technology (IT) delivery system. The PBA system includes a PBA engine and a data storage storing exception rules used by the PBA engine. The PBA engine merges entities appearing in the textual log into one or more groups based on similarities of respective time series of the entities. Control charts are generated for merged entities and the PBA engine subsequently analyzes process behavior of the process by use of the control charts for exceptions defined in the stored exception rules. The PBA engine generates a PBA report for the process pursuant to the analysis result of the textual log with detailed information including what type of exceptions had or had not occurred. | 08-01-2013 |
20130246369 | METHOD AND SYSTEM FOR STORING AND RETRIEVING DATA - A method of accessing data in a database management system includes: in a first storage structure, storing one or more datasets for each of plural devices, each dataset comprising unordered timestamped data elements indicating statuses of a particular device related condition at different points in time; in a second storage structure, storing and mapping between device identifiers identifying the devices, condition identifiers identifying the device related conditions, and timestamps identifying when two temporally consecutive data elements of a given dataset indicate different statuses; receiving a new data element indicating a status of one of the device related conditions; storing the new data element in the first storage structure; determining whether or not the status indicated by the new data element is different from the status indicated by a temporally preceding data element stored in the first storage structure; and updating the timestamps if the determination is positive. | 09-19-2013 |
20130325816 | APPARATUS AND METHOD FOR EVALUATING DATA POINTS AGAINST CADASTRAL REGULATIONS - A system for evaluating data points against cadastral regulations to include a plurality of software modules programmed into a computer system with software and hardware configured to store and update a cadastral rule database containing a plurality of rules for determining the validity of the cadastral data. The cadastral database obtained from a data source reference data that is indicative of a plurality of established reference data points wherein the received input data corresponds to a plurality of measured data points with steps to co-process the input data and the referenced data according to the plurality of cadastral rules to determine an indication for the plurality of data points. | 12-05-2013 |
20140012816 | EVALUATION APPARATUS, DISTRIBUTED STORAGE SYSTEM, EVALUATION METHOD, AND COMPUTER READABLE RECORDING MEDIUM HAVING STORED THEREIN EVALUATION PROGRAM - An evaluation apparatus includes: a calculation unit configured to calculate the evaluation value of the evaluation target content by using an evaluation value estimation algorithm, based on a count value for the evaluation target content and a sum value of respective count values for the plurality of contents; a verification unit configured to verify whether the sum value of the respective count values for the plurality of contents reaches a predetermined value; and a processing unit configured to reduce the respective count values of the plurality of contents, when the sum value of the respective count values for the plurality of contents reaches the predetermined value, and is capable of detecting a sudden data spike at high speed in the evaluation value estimating algorithm. | 01-09-2014 |
20140012817 | Statistics Mechanisms in Multitenant Database Environments - Statistics mechanisms in multitenant database environments. A master statistics file is maintained in a multitenant database system. The master statistics file has statistics corresponding to multiple tenants within the multitenant database system. Statistics for a selected table within the multitenant database system are generated. The selected table corresponds to a selected tenant of the multitenant database system. The master statistics file is updated based on the generated statistics for the selected table. | 01-09-2014 |
20140074797 | METHODS AND SYSTEMS FOR GENERATING A BUSINESS PROCESS CONTROL CHART FOR MONITORING BUILDING PROCESSES - Systems and methods generate building process summary data depicting a process over time. A method includes receiving a process value and attribute information. The method includes calculating statistical moments for the received data. The method includes retrieving a “where used” database list for a specific process. The method further includes determining if received attribute information matches database record attributes. Where there is a match, the method includes storing calculated statistical moments for the received data into a current database record. While the received attribute information matches additional database record attributes according to the “where used” database list, the method includes storing calculated statistical moments for the received data into additional database records as building process summary data. | 03-13-2014 |
20140122443 | Method, Apparatus and Computer Program for Detecting Deviations in Data Repositories - Techniques for detecting deviations in data repositories comprising a plurality of data posts, each data post comprising a number of data attribute values. A method comprises identifying comparable data post pairs, each pair comprising first and second data posts in first and second data repositories, respectively, wherein the first data post in a pair is comparable with the second data post of the same pair. Data attribute values of the first data post are compared with data attribute values of the second data post within a plurality of comparable data post pairs to determine quantified similarities between the data attribute values of the first and second data posts of each comparable data post pair. Statistical values of the quantified similarities are calculated by comparing comparable determined quantified similarities for each data post pair, and the calculated statistical values are used for detecting deviations for a first comparable data post pair. | 05-01-2014 |
20140207740 | Isolating Resources and Performance in a Database Management System - Techniques for tenant performance isolation in a multiple-tenant database management system are described. These techniques may include providing a reservation of server resources. The server resources reservation may include a reservation of a central processing unit (CPU), a reservation of Input/Output throughput, and/or a reservation of buffer pool memory or working memory. The techniques may also include a metering mechanism that determines whether the resource reservation is satisfied. The metering mechanism may be independent of an actual resource allocation mechanism associated with the server resource reservation. | 07-24-2014 |
20140214774 | SYSTEM AND METHOD FOR MANAGING CONTENT - A method, computer program product, and computer system for assigning an action to execute on content based upon, at least in part, an occurrence of a statistical event. Statistics associated with a corresponding portion of the content published on one or more websites is received. The occurrence of the statistical event with respect to the corresponding portion of the content is determined based upon, at least in part, receiving the statistics. The action on the content is executed based upon, at least in part, determining the occurrence of the statistical event with respect to the corresponding portion of the content. | 07-31-2014 |
20140297600 | Bayesian Sleep Fusion - Systems and methods to estimate a subject's sleep status over time by applying data-fusion algorithms to sleep data sets collected from multiple sleep data sources are disclosed. Embodiments employ Bayes' Theorem to combine sleep data from actigraphy, sleep diary, direct observation, sleep schedules, work schedules, performance tests, neurobehavioral tests and/or the like. Particular embodiments assign data error characteristics to each source, determine likelihoods of correct reporting of sleep status from each source, and apply Bayesian analysis to each source-specific likelihood to determine an overall sleep status estimate. Data error characteristics may account, without limitation, for data insertion errors, data deletion errors, and sleep timing errors. Heuristics may also be used to correct common errors found within collected sleep data and/or to infer sleep status from atypical sources of sleep data. Particular embodiments may also use the combined sleep status estimate for fatigue prediction utilizing various biomathematical fatigue models. | 10-02-2014 |
20140310249 | DATABASE COST TRACING AND ANALYSIS - Web services hosted at a data center may employ architectural patterns that tend to obfuscate the source of queries made against databases and other resources in the data center. The queries may be the source of performance, capacity or utilization problems, and may contribute to the cost of hosting the web service. Web service invocations may be associated with identifiers that can be included in modified queries sent to databases and other resources. Summarized cost information may be calculated based on recorded associations between the identifiers and query performance information. | 10-16-2014 |
20140317065 | REFERENCE COUNTER INTEGRITY CHECKING - Disclosed is a method for checking the integrity of a reference counter for objects in a file system. A unique identifier can be associated with the reference referring to the object. A reference check can be associated with the object and set to a predefined initial value before any references referring to the object are added. When a new reference referring to the object is added, the reference counter is increased by one and the identifier associated with the new reference is added to the reference check. When an existing reference referring to the object is about to be removed, the reference counter is decreased by one and the identifier associated with the existing reference is subtracted from the reference check. If the reference check is not equal to the initial value when the reference counter is zero, then an error message is sent to the file system. | 10-23-2014 |
20140337298 | METHOD AND SYSTEM FOR CONSTRUCTING AND PRESENTING A CONSUMPTION PROFILE FOR A MEDIA ITEM - A server and device for constructing and presenting a consumption profile for a media item are provided. In general, consumption of a media item by a number of first users is tracked. Thereafter, before and/or during playback of the media item by a second user, a consumption profile for the media item is constructed and presented to the second user. | 11-13-2014 |
20150058299 | METHODS AND APPARATUS FOR METERING PORTABLE MEDIA PLAYERS - Example methods, apparatus, and articles of manufacture to collect metering information associated with media presented by portable and computer media presentation devices are disclosed. A disclosed example method accesses a first data structure including metering information generated by at least one of first media presentation software executed by a personal computer or second media presentation software executed by a portable media presentation device. The metering information is associated with media presented by at least one of the first or second media presentation software. At least some of the metering information is then extracted from the first data structure and stored in a second data structure. The at least some of the metering information is then communicated from the second data structure to a data collection facility. | 02-26-2015 |
20150066869 | Module Database with Tracing Options - A database of module performance may be generated by adding tracing components to applications, as well as by adding tracing components to modules themselves. Modules may be reusable code that may be made available for reuse across multiple applications. When tracing is performed on an application level, the data collected from each module may be summarized in module-specific databases. The module-specific databases may be public databases that may assist application developers in selecting modules for various tasks. The module-specific databases may include usage and performance data, as well as stability and robustness metrics, error logs, and analyses of similar modules. The database may be accessed through links in module description pages and repositories, as well as through a website or other repository. | 03-05-2015 |
20150112949 | SYSTEMS AND METHODS FOR ANONYMIZING AND INTERPRETING INDUSTRIAL ACTIVITIES AS APPLIED TO DRILLING RIGS - Various systems and methods are disclosed for making and using an anonymized database for an industrial enterprise, such as oilfield operations. Providing statistical performance indicators for groupings of an activity in the oilfield allow for the information in confidential data sets to be shared without compromising the confidentiality of any one data entry. Comparisons may be made between or among oilfields with differing technologies, differing rig configurations, or even different crews when sufficient data are available. | 04-23-2015 |
20150149418 | ESTIMATION OF QUERY INPUT/OUTPUT (I/O) COST IN DATABASE - Disclosed herein are system, method, and computer program product embodiments for calibrating and using a stable storage model. An embodiment operates by generating, by a central computer, an access request for a stable storage, wherein the access request comprises a plurality of page accesses; measuring a cost to execute the access request on the stable storage; amortizing the cost over the plurality of page accesses; and calibrating, by the central computer, a stable storage model based on the amortized cost. | 05-28-2015 |
20150149419 | Techniques for Automatic Data Placement with Compression and Columnar Storage - For automatic data placement of database data, a plurality of access-tracking data is maintained. The plurality of access-tracking data respectively corresponds to a plurality of data rows that are managed by a database server. While the database server is executing normally, it is automatically determined whether a data row, which is stored in first one or more data blocks, has been recently accessed based on the access-tracking data that corresponds to that data row. After determining that the data row has been recently accessed, the data row is automatically moved from the first one or more data blocks to one or more hot data blocks that are designated for storing those data rows, from the plurality of data rows, that have been recently accessed. | 05-28-2015 |
20150379052 | PARALLEL MATCHING OF HIERARCHICAL RECORDS - Identifying matching transactions between two log files. First and second log files contain operation records of transactions in a transaction workload. The first and second log files are split into first and second corresponding partition files, based on distinct sequences of operation record types beginning operation records of the transactions in each of the log files. A record location in a first partition file, and a window of sequential record locations in a corresponding second partition file at a defined offset relative to the record location in the first file are advanced one record location at a time. If each operation record of a complete transaction at a record location in a first file has a matching record in the associated window of record locations in a second file, the corresponding transactions match. | 12-31-2015 |
20160004621 | PROACTIVE IMPACT MEASUREMENT OF DATABASE CHANGES ON PRODUCTION SYSTEMS - A database change test system that includes an SQL performance analyzer (SPA) to efficiently test execute a workload set of queries on a production or test database management system (DBMS) and report to a user the impact of database changes is provided. Techniques are described that limit the resource consumption of test execution of a workload set of queries, especially to enable such test execution on a production DBMS. A method and apparatus for storing in persistent storage a query test profile that specifies query test restrictions and execution plan parameters, which indicate how to generate execution plan operators for query execution plans; storing a workload set of queries in persistent storage; establishing a session with a database management system; retrieving the query test profile; configuring the session according to the test profile; receiving user input requesting to execute the workload set; and executing the queries according to the query test profile, is also provided. | 01-07-2016 |
20160034502 | Automatic Detection of Potential Data Quality Problems - Technical solutions for detecting potential data quality problems are provided. In some implementations, a method includes: automatically without human intervention, identifying a subset of side effect data associated with a set of enterprise data. The side effect data include a plurality of data fields. The method further includes: selecting a first set of data quality detection rules in accordance with a first data field in the plurality of data fields; identifying one or more candidate data quality problems in the set of side effect data by comparing the set of side effect data to the first set of data quality detection rules; and responsive to identifying the one or more candidate data quality problems: causing to be displayed to a user: information representing the one or more candidate data quality problems; and one or more candidate solutions for correcting the one or more candidate data quality problems. | 02-04-2016 |
20160034517 | WEB BASED DATA MANAGEMENT - Approaches are provided for assessing and displaying data. An approach includes determining one or more aggregate measures of data quality for data. The approach further includes assessing an overall data quality for the data based on the determined one or more aggregate measures of data quality. The approach further includes displaying the data, the determined one or more aggregate measures of data quality, and the assessed overall data quality. | 02-04-2016 |
20160092490 | STORAGE APPARATUS AND DATA MANAGEMENT METHOD - A storage apparatus and data management method capable of utilizing storage resources effectively are proposed. A storage apparatus storing primary data and analysis data obtained based on the primary data as a result of the execution of specified analysis processing by an external computing system is designed so that metadata of the analysis data includes regeneratable attribute information indicating whether or not the corresponding analysis data can be regenerated by means of the analysis processing by the external computing system; and a control unit regularly or irregularly selects the analysis data, which satisfies a specified condition and can be regenerated, based on the metadata for each piece of the analysis data and deletes the selected analysis data from one or more storage devices. | 03-31-2016 |
20160117336 | CONCURRENT ACCESS AND TRANSACTIONS IN A DISTRIBUTED FILE SYSTEM - Embodiments described herein provide techniques for maintaining consistency in a distributed system (e.g., a distributed secondary storage system). According to one embodiment of the present disclosure, a first set of file system objects included in performing the requested file system operation is identified in response to a request to perform a file system operation. An update intent corresponding to the requested file system operation is inserted into an inode associated with each identified file system object. Each file system object corresponding to the inode is modified as specified by the update intent in that inode. After modifying the file system object corresponding to the inode, the update intent is removed from that inode. | 04-28-2016 |
20160162536 | STATISTICS MECHANISMS IN MULTITENANT DATABASE ENVIRONMENTS - Statistics mechanisms in multitenant database environments. A master statistics file is maintained in a multitenant database system. The master statistics file has statistics corresponding to multiple tenants within the multitenant database system. Statistics for a selected table within the multitenant database system are generated. The selected table corresponds to a selected tenant of the multitenant database system. The master statistics file is updated based on the generated statistics for the selected table. | 06-09-2016 |
20160378634 | AUTOMATED VALIDATION OF DATABASE INDEX CREATION - Automated validation of the creation of indices in an environment that includes multiple and potentially many databases, such as perhaps a cloud computing environment. A validation module validates index impact of a created index by using a validation data store that contains validation data originating from a database collection. Index impact may be estimated by evaluating validation data generated prior to and after the creation of the index to thereby determine whether the created index results in overall improved query performance on the database collection for those queries that target the newly indexed database entity. Such validation data need not even contain private data that was contained within the query itself, and might be, for instance, query performance data, or execution plans associated with the query, with private data redacted. | 12-29-2016 |
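
The before-and-after comparison described in the final entry (20160378634) can be illustrated with a minimal sketch. The abstract does not say which metric or threshold is used, so the mean query duration, the 5% minimum gain, and the function name `index_improves_performance` are all assumptions made for illustration only.

```python
from statistics import mean


def index_improves_performance(before_ms, after_ms, min_gain=0.05):
    """Hypothetical check: did mean query duration drop by at least min_gain?

    before_ms / after_ms are durations (in milliseconds) of queries that
    target the newly indexed entity, sampled before and after index creation.
    The metric and the 5% default threshold are illustrative assumptions.
    """
    if not before_ms or not after_ms:
        raise ValueError("need samples from both before and after index creation")
    baseline = mean(before_ms)
    candidate = mean(after_ms)
    # The index counts as validated only if the post-index mean is at least
    # min_gain (as a fraction) lower than the pre-index mean.
    return candidate <= baseline * (1.0 - min_gain)


if __name__ == "__main__":
    before = [120.0, 135.0, 110.0]
    after = [40.0, 55.0, 48.0]
    print(index_improves_performance(before, after))
```

Only durations are passed in, which matches the abstract's point that validation data need not contain the private data carried in the queries themselves.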