Clustering and grouping

Subclass of:

707 - Data processing: database and file management or data structures

707705000 - DATABASE AND FILE ACCESS

707736000 - Preparing data for information retrieval

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Class / Patent application number	Description	Number of patent applications / Date published
707740000	Cataloging	647
707738000	Based on topic	314
707739000	Latent semantic index or analysis (LSI or LSA)	165

Document	Title	Date
Entries
20100049766	System, Method, and Computer Program for a Consumer Defined Information Architecture - A system, computer program, and method for organizing and managing data structures including based on input from a feedback agent is provided, the method including: (a) a method for faceted classification that is applicable to a domain of information, said method of faceted classification including: (i) a facet analysis of said domain or receiving the results of facet analysis of the domain; and (ii) applying a faceted classification synthesis of said domain; and (b) a complex-adaptive method for selecting and returning information, on one or more iterations, from said faceted classification synthesis, said complex-adaptive method varying the organizing and managing of data structures in response to said returned information. A system and method for faceted classification of a domain of information is also provided that includes: (a) providing a faceted data set including facet attributes with which to classify information, such facet attributes including optionally facet attribute hierarchies for the facet attributes; (b) providing a dimensional concept taxonomy in which the facet attributes are assigned to objects of the domain to be classified in accordance with concepts that associate meaning to the objects, said concepts being represented by concept definitions defined using said facet attributes and associated with the objects in the dimensional concept taxonomy, said dimensional concept taxonomy expressing dimensional concept relationships between the concept definitions in accordance with the faceted data set; and (c) providing or enabling a complex-adaptive system for selecting and returning dimensional concept taxonomy information to vary the faceted data set and dimensional concept taxonomy in response to the dimensional concept taxonomy information. In another aspect of the method of the present invention the method for faceted classification of the domain of information further includes performing faceted classification synthesis to relate a set of concepts represented by concept definitions defined in accordance with a faceted data set including facet attributes, and optionally facet attribute hierarchies. The invention also provides a computer system for enabling a user to manipulate dimensional concept relationships. A further aspect of the system is a system for organizing and managing data structures including based on input from a feedback agent, in which the system includes or is linked to a complex-adaptive system for selecting and returning dimensional concept taxonomy information to vary a faceted data set and a dimensional concept taxonomy in response to dimensional concept taxonomy information, the dimensional concept taxonomy expressing dimensional concept relationships between the concept definitions in accordance with the faceted data set.	02-25-2010
20100049767	SYSTEM AND METHOD FOR EVALUATING A COLLECTION OF PATENTS - A system and method evaluate a collection of patents. A set of citations between the patents in the collection is established, and a density associated with the collection of patents is determined as a function of the set of citations.	02-25-2010
20100057804	Method and Apparatus for partitioning high-dimension vectors for use in a massive index tree - A method and apparatus for partitioning high-dimension vectors for use in a massive index tree have been disclosed.	03-04-2010
20100057805	System, Method and Computer-Readable Medium for Providing Pattern Matching - A system, method and computer-readable medium are disclosed for identifying representative data using sketches. The method embodiment comprises generating a plurality of vectors from a data set, modifying each of the vectors of the plurality of vectors and selecting one of the plurality of generated vectors according to a comparison of a summed distance between a modified vector associated with the selected generated vector and remaining modified vectors. Modifying the generated vectors may involve reducing each generated vector to a lower dimensional vector. The summed distance then represents a summed distance between the lower dimensional vector and remaining lower dimensional vectors.	03-04-2010
20100070502	Collision Free Hash Table for Classifying Data - Methods, systems, and computer program products are used for classifying data using a collision free hash table. In an example, a respective category index for each of a set of categories is determined. A respective class counter for each of the categories based on the respective category index is generated. A respective event index for each of a set of events associated with captured data based on respective first event values are determined substantially simultaneously in parallel. Selected ones of the respective class counters based on the respective event indices are incremented substantially simultaneously in parallel.	03-18-2010
20100070503	IDENTIFYING PRODUCT ISSUES USING FORUM DATA - Product issues are identified through an analysis of forum data stored in a forum database. Forum threads are identified within the forum data and clustered together by grouping related forum threads. Once the forum threads have been clustered, the clustered forum threads can be analyzed to identify product issues. Once the product issues have been identified, steps may be taken in an attempt to resolve the identified issues.	03-18-2010
20100070504	GLOBAL INFORMATION NETWORK ARCHITECTURE - The present invention provides a Global Information Architecture (GINA) to create an object-oriented, software-based modeling environment for the modeling of various data sources and allowing queries and transactions across those sources. The modeling environment is described in itself. Introspection is achieved since the model is described in the model, and early validation that the infrastructure is correct is established in that the infrastructure must execute against itself. Object traversal is done via vectors that describe how an object can be reached from other objects. Objects are linked by describing what type of object (data source) is to be reached and on the basis of what possible attribute values of that object. GINA allows different users to have different views of these data sources depending upon their WorldSpace. A user's view of the data source is controlled by his WorldSpace, which are the attributes he has that makes him unique. These attributes can include (among others) his username, roles, language, locale, and organization. These WorldSpace views can also impact the behavior of the data sources. GINA allows for object to object event driven behavior and provides a configuration centric versus coding centric methodology for integrating those various data sources.	03-18-2010
20100076976	Method of Automatically Tagging Image Data - The present invention provides a method of automatic tagging of image data taken at geographical locations, which represent points of interest (POI). In addition to the image data, each image file contains image metadata, which is structured information about the image data resources. The metadata of a POI includes its title, description, associated keywords, geographical identification data, etc., and it describes the POI as a resource of information. Each POI has its unique identifier that connects it to a record in a database, which contains the metadata of a plurality of POIs. The auto-tagging method, subject to the present invention, identifies the POI where image data has been taken. It then retrieves POI metadata from the database or reads it directly from an information tag placed within the POI, and assigns this metadata to all the image files containing image data taken at the POI. In particular, this invention focuses on barcode representation of the unique POI identifier (UPIID) and of its metadata (if the metadata is read directly during the image taking process). The implementation of this method does not require additional devices for location detection (such as GPS), and is applicable to all digital cameras regardless of type and complexity.	03-25-2010
20100076977	System and Method for analyzing and reporting extensible data from multiple sources in multiple formats - A system and method for analyzing and reporting data from multiple sources is provided. The system is a foundation for an analytical platform that covers not only traditional relational data, but also a new generation of extensible data formats designed for the web, such as those based on XML (FIXML, FpML, ebXML, XBRL, ACORD, etc.), as well as HTML, E-mail, Excel, PDF, and others. In a preferred embodiment, the eXtensible on-line analytical processing (XOLAP), is a scalable client/server platform that allows the multi-dimensional analysis of modern data types, as well as traditional relational data, by bringing them all into an internal common XML-based model, without the time and expense of creating a data warehouse.	03-25-2010
20100082623	ITEM CLUSTERING - Methods and system for item clustering are described. In one embodiment, compatibility data may be accessed for an item. The compatibility data may include a plurality of parent items with which the item is compatible. A particular parent item within the compatibility data may be identified. An item cluster for the item and an additional item may be created based on compatibility of the item and the additional item with the particular parent item within the compatibility data. A compatibility identifier may be associated with the item cluster. The compatibility identifier may be associated with the parent item. Additional methods and systems are disclosed.	04-01-2010
20100082624	SYSTEM AND METHOD FOR CATEGORIZING DIGITAL MEDIA ACCORDING TO CALENDAR EVENTS - System and method for categorizing digital media based on correspondence between characteristics of individual digital media items and characteristics associated with one or more calendar events is disclosed. Data is acquired and processed for each of a plurality of digital media items that is representative of characteristics of each of the respective digital media items. Further, data is acquired and processed for each of a plurality of calendar events that is representative of characteristics of each of the respective calendar events. Then, a group of digital media items are related together based on matching characteristics of each digital media item in the group to characteristics of a calendar event.	04-01-2010
20100082625	METHOD FOR MERGING DOCUMENT CLUSTERS - A method for merging document clusters includes the following steps. An association graph among document clusters is established. The association graph is an oriented graph. Each document cluster is represented by one node in the association graph, and each node is searched in a pair-wise manner. An oriented edge is established between any two nodes having associated weights there-between reaching a preset value. An arrow of the oriented edge points to a node capable of serving as a descriptor for the other node. An associated weight is assigned to the oriented edge to represent an association degree between the two nodes. Any two document clusters that can serve as a descriptor for each other and have an association degree there-between reaching a preset threshold value are merged into a single document cluster.	04-01-2010
20100082626	METHOD FOR FILTERING OUT IDENTICAL OR SIMILAR DOCUMENTS - A method for filtering out identical or similar documents includes storing a plurality of documents to be filtered as a pat tree (PT) data structure profile based on a pat tree data structure, searching for all string nodes with a consecutive character length reaching a lower threshold in the PT profile and all documents to which the string nodes belong, and finding documents having identical consecutive characters with a length reaching a higher threshold from the documents. Another technical solution includes searching for all string nodes with a consecutive character length reaching a lower threshold in the PT profile and all documents to which the string nodes belong, and finding documents having identical consecutive characters with such a length that a ratio of the length of the identical consecutive characters to a total character length of the original document reaches a ratio threshold from the documents; these documents are deemed similar.	04-01-2010
20100088315	EFFICIENT LARGE-SCALE FILTERING AND/OR SORTING FOR QUERYING OF COLUMN BASED DATA ENCODED STRUCTURES - The subject disclosure relates to querying of column based data encoded structures enabling efficient query processing over large scale data storage, and more specifically with respect to complex queries implicating filter and/or sort operations for data over a defined window. In this regard, in various embodiments, a method is provided that avoids scenarios involving expensive sorting of a high percentage of, or all, rows, either by not sorting any rows at all, or by sorting only a very small number of rows consistent with or smaller than a number of rows associated with the size of the requested window over the data. In one embodiment, this is achieved by splitting an external query request into two different internal sub-requests, a first one that computes statistics about distribution of rows for any specified WHERE clauses and ORDER BY columns, and a second one that selects only the rows that match the window based on the statistics.	04-08-2010
20100088316	METHOD AND SYSTEM FOR MANAGING RECENT DATA IN A MOBILE DEVICE LINKED TO AN ON-DEMAND SERVICE - Systems and methods for managing recent data items in a database. A method typically includes determining whether a data object managed by an on demand service is designated as able to be accessed by a user at a mobile device and storing locally at a mobile device a plurality of most recently used items viewed for a data object designated as able to be accessed by a user at a mobile device. The method also typically includes determining a single most recently used set from among the stored plurality of most recently used items viewed for at least one data object designated as able to be accessed by a user at a mobile device.	04-08-2010
20100088317	METHOD AND APPARATUS FOR HARVESTING FILE SYSTEM METADATA - A harvester is disclosed for harvesting metadata of managed objects (files and directories) across file systems which are generally not interoperable in an enterprise environment. Harvested metadata may include 1) file system attributes such as size, owner, recency; 2) content-specific attributes such as the presence or absence of various keywords (or combinations of keywords) within documents as well as concepts comprised of natural language entities; 3) synthetic attributes such as mathematical checksums or hashes of file contents; and 4) high-level semantic attributes that serve to classify and categorize files and documents. The classification itself can trigger an action in compliance with a policy rule. Harvested metadata are stored in a metadata repository to facilitate the automated or semi-automated application of policies.	04-08-2010
20100094871	SYSTEM AND METHOD FOR PROVIDING GLOBAL INFORMATION ON RISKS AND RELATED HEDGING STRATEGIES - The present invention provides a system and method for information and data aggregation and analysis which provides risk managers, benefits managers, brokers, insurers and other insurance professionals with access to information resources, knowledge management tools, and powerful analytical models needed to increase their value and productivity. In accordance with one embodiment of the invention, the system and method provided is designed for information and data aggregation that allows for the compilation of data for mining and categorization by a knowledge management system, which stores all retrieved information in accordance with categories provided by a categorization engine referred to as a Taxonomy module. A contextualization module is configured to retrieve relevant information, based on various factors, including the user's profile, and the user's particular task. The system dynamically provides relevant information as the user interacts and conducts various tasks. The stored information is analyzed by a concept clustering module, so that various concepts relating to a particular topic can be uncovered and stored. In accordance with another embodiment of the invention, the system provides for various analytical tools that allow users to carry on with highly complex analysis of insurance related topics. The range of available analytical tools dynamically varies based on the user's needs and research topics. In accordance with yet another embodiment of the invention, the system provides for a unique interactive workspace that combines the features explained above in a logical manner. To this end, the system interface provides for various job templates, so as to enable users to carry out various projects through template-driven task assignments. As the user navigates through the workspace, the range of information available to the user changes, based on the user's profile and navigation pattern.	04-15-2010
20100094872	METHOD AND SYSTEM FOR IDENTIFICATION OF OBJECTS - Objects are identified on the basis of location and one or more characteristics by a user device and a service product. The service product provides a service, with which objects can be identified. The object to be identified is positioned, the user device is connected to the service and the service fetches information on the basis of the position of the object and its characteristic(s) from a database. The fetched information is presented to the user device. The user device presents the characteristic(s) of the object to be identified by sending a picture of the object to be identified to the service product, in which picture the service product reads the characteristics of the object and conducts additional searches in the database based on the characteristics read from the picture.	04-15-2010
20100094873	SYSTEM AND METHOD FOR AUGMENTING CONTENT IN ELECTRONIC DOCUMENTS WITH LINKS TO CONTEXTUALLY RELEVANT INFORMATION - An electronic document and associated system, methods and apparatus is described. The electronic document is loaded in a user device configured to communicate with an external device that generates instructions for augmenting content contained in the electronic document with links to contextually relevant information. The content can be augmented with one or more user interface elements, and the augmented content can be displayed with one or more attributes which can be selected by a document author. The document author can mark or otherwise designate one or more portions of the electronic document to be excluded from the augmenting process.	04-15-2010
20100106723	METHOD AND SYSTEM OF CLUSTERING FOR MULTI-DIMENSIONAL DATA STREAMS - A method for clustering multi-dimensional data streams includes: (a) when data elements are input, determining 1-D subclusters and assigning identifiers to the determined 1-D subclusters; (b) generating a matching set that is a set of identifiers of the 1-D subclusters where each dimensional value of the data elements belongs to the range of the 1-D subclusters of the corresponding dimensions; and (c) determining subclusters by finding a set of frequently co-occurring 1-D subclusters among a set of 1-D subclusters that belong to the generated matching set. With the present invention, the processing time required to find the subclusters can be improved and the performance of the memory is further improved.	04-29-2010
20100106724	Fuzzy Data Operations - A method for clustering data elements stored in a data storage system includes reading data elements from the data storage system. Clusters of data elements are formed with each data element being a member of at least one cluster. At least one data element is associated with two or more clusters. Membership of the data element belonging to respective ones of the two or more clusters is represented by a measure of ambiguity. Information is stored in the data storage system to represent the formed clusters.	04-29-2010
20100106725	STORAGE APPLIANCE OBJECT ORIENTED SYSTEM AND METHOD - The present invention involves a storage device system and method which receives and stores complex data. The storage device includes a bulk storage for storing the complex data, a descriptor storage for storing descriptive data relating to the complex data, and a service module with a processor and software. The software enables the processor to receive the complex data and derive descriptive data relating to the complex data. Further, the software also enables the processor to organize and store descriptive data in the descriptor storage. The storage device thus may receive the complex data, derive descriptive data relating to the complex data from the complex data, and organize and store the descriptive data.	04-29-2010
20100114886	RELEVANCE CONTENT SEARCHING FOR KNOWLEDGE BASES - Embodiments of the present invention provide a novel and non-obvious method, server and computer program product for finding relevant content in a knowledge base. A method for finding items that are related to a user selected item in a knowledge base is provided. The method can include generating a first list of knowledge base items with a defined relationship to the user selected item and generating a second list of knowledge base items that belong to the same category as a category of the user selected item. The method can further include generating a third list of knowledge base items having one or more tags identical to one or more tags of the user selected item and selecting a first set of knowledge base items that are present in the first, second or third lists. The method can further include displaying the first set of knowledge base items as most relevant.	05-06-2010
20100114887	Textual Disambiguation Using Social Connections - The subject matter of this specification can be embodied in, among other things, a computer-implemented method that includes receiving a request to provide a dictionary for a computing device associated with a user; identifying word usage information for members of a social network for the user; and generating, with the word usage information for members of the social network, a dictionary for the user.	05-06-2010
20100114888	DIGITAL IMAGE RETRIEVAL BY AGGREGATING SEARCH RESULTS BASED ON VISUAL ANNOTATIONS - An approach for responding to a text-based query for a digital image is provided. A request that identifies one or more keywords is received. A number of annotated digital images are selected. Each selected annotated digital image has a bounded region, on its appearance, that has an annotation associated with at least one of the keywords. A set of candidate digital images is selected for each annotated digital image. The set of candidate images, for a particular annotated digital image, are the digital images, of a set of digital images, which have an appearance that is most similar to the particular annotated digital image. The sets of candidate images are aggregated into a single set of digital images. A response is generated that identifies those digital images in the single set of digital images which are most responsive to the one or more keywords.	05-06-2010
20100114889	REMOTE VOLUME ACCESS AND MIGRATION VIA A CLUSTERED SERVER NAMESPACE - A system and method that provides users of network data storage systems with the ability to gain the advantages of a clustered storage server system, in which volumes stored on multiple server nodes are linked into a virtual global hierarchical namespace, without first having to migrate their data to the clustered storage server system. The system employs an extended virtual global hierarchical namespace that allows client systems to access, via the extended global namespace, volumes stored on the clustered storage server system and on one or more storage servers that are remote from and do not constitute a part of the clustered system. The extended global namespace can also be employed to perform migration of volume data among the multiple nodes of the clustered storage server system and the remote storage servers.	05-06-2010
20100114890	System and Method for Discovering Latent Relationships in Data - A computerized method of querying an array of vectors includes receiving a first matrix, partitioning the first matrix into a plurality of subset matrices, and processing each subset matrix with a natural language analysis process to create a plurality of processed subset matrices. The first matrix includes a first plurality of terms and represents one or more data objects to be queried, each subset matrix includes similar vectors from the first matrix, and each processed subset matrix relates terms in each subset matrix to each other.	05-06-2010
20100114891	PHOTOGRAPH GROUPING DEVICE, PHOTOGRAPH GROUPING METHOD AND PHOTOGRAPH GROUPING PROGRAM - Even when a character of an event to be photographed or a user's photographing disposition varies, photographs can be grouped with high precision.	05-06-2010
20100114892	INTRODUCING SYSTEM, INTRODUCING METHOD, INFORMATION RECORDING MEDIUM, AND PROGRAM - Provided is an introducing system in which a server device introduces users of terminal devices to each other while motivating the users to join the system by presenting appropriate information to them during a wait time before they receive introduction. When a terminal device requests introduction of another terminal device during a time slot (	05-06-2010
20100114893	EVENT SEARCHING - Events can be searched by identifying a query that includes a time interval and a search component, determining a time increment associated with the time interval, and partitioning the time interval into partitions based on the time increment. For each partition, a relevance of each event in a collection of events that occur at a time in the partition is determined based on the query. A pre-determined number of the relevant events are displayed.	05-06-2010
20100121850	METHODS AND SYSTEMS FOR MINING WEBSITES - Mining of websites that in one embodiment includes obtaining web usage data of user sessions of a website, wherein the website has a hierarchical structure with granular levels and has mapping from each webpage of the website into the hierarchical structure, mapping the user sessions to the hierarchical structure of the website resulting in hierarchical user sessions, initiating an edit distance metrics to determine similarity in the hierarchical user sessions, and clustering similar hierarchical user sessions into groups.	05-13-2010
20100121851	Method for clustering of large high-dimensional datasets - The present invention is a method for clustering data points. The method represents data-points as vertices of a graph (a well-known mathematical construct) with distance-weighted arcs (lines joining each pair of points). The method then involves sorting the arcs in increasing order of their weights and adding them in ascending order, at each stage determining the number of connected components in the graph and the length of the longest added edge. The longest edge is a measure of the quality of the clustering (low values are good), and the connected components are the clusters.	05-13-2010
20100121852	APPARATUS AND METHOD OF ALBUMING CONTENT - A content albuming apparatus to automatically album content includes a storage unit to store user information and content, a clustering information generation unit to generate event-based clustering information to cluster the content using the user information, and a clustering unit to cluster the content according to the event-based clustering information. Accordingly, it is possible to automatically album content with greater precision and user satisfaction.	05-13-2010
20100121853	QUERY GENERATION FOR A CAPTURE SYSTEM - A document accessible over a network can be registered. A registered document, and the content contained therein, is not transmitted undetected over and off of the network. In one embodiment, the invention includes a manager agent to maintain signatures of registered documents and a match agent to detect the unauthorized transmission of the content of registered documents.	05-13-2010
20100121854	Creation of Electronically Processable Signature Files - Systems and methods can automatically generate and process signature files for an electronic signature list. Data records can be periodically searched for signature-relevant status changes. A multiplicity of documents in paper form can be provided. Each document can contain a predefined blank region for receiving a personal signature and also control information items assigned to the signature. The multiplicity of documents that have received the personal signatures can be scanned-in in a batch processing operation. At least one signature file containing the personal signature in electronically processable form and a representation of the assigned control information items can be generated for each document. The assigned control information items of each document can be independent of their corresponding personal signature in its electronically processable form. The signature files can be dispatched via a communications network controlled by the control information items.	05-13-2010
20100125580	AUTOMATIC BUDDY MANAGEMENT - Exemplary embodiments of methods and apparatuses to provide automatic buddy management are described. One or more tags associated with a user on an instant messaging (IM) network are determined. One or more groups are organized based on the one or more tags associated with the user. The one or more tags associated with the user are communicated to the IM network. The one or more groups associated with the user may be displayed on a display. One or more tags associated with one or more other users on the IM network may be received. The one or more other users may be included into the one or more groups. One or more new groups may be created based on the one or more tags associated with one or more other users.	05-20-2010
20100125581	METHODS AND SYSTEMS FOR PRODUCING A VIDEO SYNOPSIS USING CLUSTERING - Computer-implemented method, system, and techniques for summarization, searching, and indexing of video are provided, wherein data related to objects detected in the video in a selected time interval is received and the objects are clustered into clusters such that each cluster includes objects that are similar in respect to a selected feature or a combination of features. A video summary is generated based on the computed clusters.	05-20-2010
20100131506	ASSOCIATION RULE EXTRACTION METHOD AND SYSTEM - An association rule is extracted by processing a database partitioned into record units in which the same attribute is missing, from a database including missing values. The association rule is extracted from the database including the missing values by means for partitioning a database so that a database including a missing analysis object becomes record blocks in which the same attribute is missing, and means for estimating an upper threshold of a support value in the entire database from local support counts in partitioned databases and thereby limiting records for which the support count is counted.	05-27-2010
20100131507	PERSONALIZATION ENGINE FOR BUILDING A DYNAMIC CLASSIFICATION DICTIONARY - A dynamic classification dictionary is built for use in profiling and targeting users for additional relevant content. Behavioral data is gathered from user activity, and user documents and actions are categorized. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to the manner in which the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun. The taxonomic nouns are aggregated into a composite set of taxonomic nouns, and the dynamic classification dictionary is built by storing the composite set of taxonomic nouns.	05-27-2010
20100131508	METHOD AND APPARATUS FOR ADDING A DATABASE PARTITION - A data repository system and method are provided. A method in accordance with an embodiment includes an operation that can be used to port data from one or more existing database partitions to new database partitions according to a minimally progressive hash. The method can be used to increase the overall size of databases while a system runs hot, with little or no downtime.	05-27-2010
20100138419	Method of Providing Moving Picture Search Service and Apparatus Thereof - A method of providing a moving picture search service and an apparatus therefor are disclosed. An embodiment of the present invention provides a method of providing a moving picture search service that includes: obtaining a search keyword from a client terminal; and providing as search results a moving picture cluster list displaying information about a moving picture cluster matching with the search keyword. The moving picture cluster includes a plurality of moving pictures determined to have identity, and the moving picture cluster list includes a cluster unit display area displaying information about the moving picture cluster differently from information about another moving picture cluster included in the moving picture cluster list. By providing moving picture search results related to a search keyword in groups of cluster units, users can understand the search results more easily.	06-03-2010
20100138420	VISUALIZING RELATIONSHIPS BETWEEN DATA ELEMENTS - In general, a specification of multiple contexts that are related according to a hierarchy is received. Relationships are determined among three or more metadata objects, and at least some of the metadata objects are grouped into one or more respective groups. Each of at least some of the groups is based on a selected one of the contexts and is represented by a node in a diagram. Relationships among the nodes are determined based on the relationships among the metadata objects in the groups represented by the nodes, and a visual representation is generated of the diagram including the nodes and the relationships among the nodes.	06-03-2010
20100138421	IDENTIFYING INADEQUATE SEARCH CONTENT - Systems and methods for identifying inadequate search content are provided. Inadequate search content, for example, can be identified based on statistics associated with the search queries related to the content.	06-03-2010
20100145948	METHOD AND DEVICE FOR SEARCHING CONTENTS - Disclosed are a method and a device for searching contents by using time information or spatial information. The device for contents search includes a memory unit configured to store contents having spatial information and time information as search information and to further store groups into which the contents are classified by the spatial information or the time information. The device further includes a display unit configured to display a time information search tool and a spatial information search tool in response to receipt of a request for a contents search, and to further display the contents belonging to a searched group. The device also includes an input unit configured to receive an input of search information and a control unit configured to search a group having the selected search information.	06-10-2010
20100145949	METHODS AND SYSTEMS FOR MANAGING DATA - Systems and methods for managing data, such as metadata or indexes of content of files. In one exemplary method, notifications to update a metadata database or an index database are combined into a combined notification. According to other aspects, an order among logical locations on a storage device is determined in order to specify a sequence for scanning for files to be indexed. According to another aspect, a method includes determining whether to index a file based on a path name of the file relative to a plurality of predetermined path names.	06-10-2010
20100153393	Constructing album data using discrete track data from multiple sources - A method and a system are provided for constructing album data using discrete track data from multiple sources. In one example, the system identifies one or more sets of tracks having a similar album title, wherein the one or more sets of tracks are obtained from one or more client devices. The system then searches across the one or more sets of tracks for tracks having a matching fingerprint and a matching album title. The system groups tracks that match according to an original album title in metadata to obtain grouped tracks. The system mines across the grouped tracks to generate a juxtaposition of track data from the one or more client devices. The system then generates album data for one or more albums based on the juxtaposition of track data.	06-17-2010
20100153394	Method and Apparatus for Reclassifying E-Mail or Modifying a Spam Filter Based on Users' Input - A method is disclosed including passing a plurality of e-mails through a spam filter and classifying at least one of the plurality of e-mails as not spam. Thereafter, the plurality of e-mails are received at each of a plurality of user computers. The method may further include receiving a plurality of reports, the plurality of reports including one report from each of the plurality of user computers that one or more of the plurality of e-mails are spam that was not classified as spam by the spam filter. Based on the plurality of reports, one or more of the plurality of e-mails is reclassified as spam and/or the spam filter is modified.	06-17-2010
20100153395	Method and Apparatus For Track and Track Subset Grouping - A method comprises storing real-time multimedia data in a plurality of tracks and/or track subsets; and identifying one or more multi-track groups, each multi-track group being associated with a relationship among one or more of the plurality of tracks and/or track subsets.	06-17-2010
20100153396	NAME INDEXING FOR NAME MATCHING SYSTEMS - Methods, systems and computer software program code products enabling the matching of a large number of names across any of a range of different languages comprise: receiving incoming names in any of a set of languages or scripts; generating high-recall keys based on the received incoming names; executing a full-text index process based on the generated high-recall keys; and looking up candidates for matching.	06-17-2010
20100153397	MAINTAINING A RELATIONSHIP BETWEEN TWO DIFFERENT ITEMS OF DATA - Data is stored persistently. At least two different items of the data are stored in two different non-conflicting regions or two different physical clusters. A relationship is maintained between the two different items of data. The relationship enables a process to reach any one of the data items from the other data item. Consistency of the relationship is maintained notwithstanding updates of either or both of the items.	06-17-2010
20100153398	LEVERAGING CONCEPTS WITH INFORMATION RETRIEVAL TECHNIQUES AND KNOWLEDGE BASES - Various embodiments are described which leverage techniques for breaking down critical ideas from an inputted phrase into concepts in order to provide a response that is more relevant to the inputted phrase. In this regard, concepts and/or concept patterns are utilized with information retrieval searching to provide more relevant and concise documents in response to an inputted phrase. In addition, concepts and/or concept patterns are utilized with respect to assessing information (e.g., documents) available in a knowledge base and building appropriate pre-defined responses to an inputted phrase.	06-17-2010
20100153399	WINDOW GROUPING - A framework is provided for obtaining window information. The window information can be applied to different assignment models to assign windows to different groups. A group may correspond to a task being performed by a user. The window information can be semantic or temporal information captured as window events and properties of windows whose events are captured. Temporal information can be information about switches between windows. Semantic information can be window titles. Temporal information, semantic information, or both, can be used to assign windows to groups.	06-17-2010
20100153400	SYSTEMS AND METHODS FOR RATIONAL SELECTION OF CONTEXT SEQUENCES AND SEQUENCE TEMPLATES - Provided are systems and methods for rational selection of context sequences and sequence templates including a computer implemented method for obtaining a repository of attributes sets where the attributes sets are statistically associated with a sequence template representing two or more context sequences.	06-17-2010
20100161607	SYSTEM AND METHOD FOR ANALYZING GENOME DATA - A system and method for analyzing genome data includes receiving genome analysis data generated by a genome analysis device, such as a microarray scanner, reducing the genome analysis data, and transmitting the reduced genome analysis data over a wide area network to a client computer. The reduced genome analysis data may provide a summary of the unreduced genome analysis data. One of several methods may be used to reduce the genome analysis data for transmittal over the wide area network.	06-24-2010
20100161608	METHODS AND APPARATUS FOR CONTENT-AWARE DATA DE-DUPLICATION - The systems and methods partition digital data units in a content aware fashion without relying on any ancestry information, which enables one to find duplicate chunks in unrelated units of digital data even across millions of documents spread across thousands of computer systems.	06-24-2010
20100161609	Method and device for clustering categorical data and identifying anomalies, outliers, and exemplars - One aspect of the invention is a method for assigning categorical data to a plurality of clusters. An example of the method includes identifying a plurality of categories associated with the data. This example also includes, for each category in the plurality of categories, identifying at least one element associated with the category. This example also includes specifying a number of clusters to which the data may be assigned. This example additionally includes assigning at least some of the data, wherein each assigned datum is assigned to a respective one of the clusters. This example further includes, for at least one of the clusters, determining, for at least one category, the frequency in data assigned to the cluster of at least one element associated with the category. Further, some examples of the invention provide for detecting outliers, anomalies, and exemplars in the categorical data.	06-24-2010
20100161610	QUERY RESTRICTION FOR TIMELY AND EFFICIENT PAGING - Systems and methods are presented for retrieving records from a database and presenting them to a user through a timely and efficient query restricting process. The query request is then modified through the use of a determined partitioning field and a modified query which partitions the field relative to a partitioning value. Records are retrieved from the database. A small set of records is presented to the user, as is a prompt to retrieve more records. An application which receives query requests determines: restricting fields, partition size and whether or not the partition is within a predetermined range. The application returns a data set and receives requests for more records. These systems and methods provide a storage efficient solution that is particularly useful for maintaining a time efficient user response for a dynamic database.	06-24-2010
20100169318	CONTEXTUAL REPRESENTATIONS FROM DATA STREAMS - A user's experience with internet content may be given semantic meaning based upon extracting features of the content and creating kind classifications from the features. Kind classifications may be used to enrich a user's experience with internet content by providing meaningful navigation and discovery of information. As provided herein, a data stream (e.g., HTML, audio, video, unstructured data, etc.) is received, and features (e.g., text, phrases, titles, paragraphs, image data, etc.) may be extracted from the data stream. Kind classifications may be created based upon the extracted features. For example, a shirt image kind classification may be created based upon a button image feature, a collar image feature, and a sleeve image feature. The user's experience may be enriched by a presentation of actions allowing the user to view similar shirts, purchase the shirt, and/or discover other information relating to the shirt, for example.	07-01-2010
20100169319	Verification of Data Categorization - Verification and categorization of data in a system that interfaces with common knowledge repositories having different application programming interfaces. The system inputs a data tree structure with categories of information. The relationships in the data tree are queried against common knowledge repositories. A report of potentially erroneous categorizations in the data tree may be output for further review.	07-01-2010
20100169320	METHOD AND SYSTEM FOR EMAIL SEARCH - A method and system for performing email search, the method comprising enabling the user to find relations between emails, build network relations, and retrieve groups based on the relations (and intersections of relations) as per the user's choice; the system comprising presenting predetermined options for the user to select for a search, with a further ability to “drill down” into the results with the aid of filters to view further mails/results, to search within search results, and to store user searches.	07-01-2010
20100174714	DATA MINING USING AN INDEX TREE CREATED BY RECURSIVE PROJECTION OF DATA POINTS ON RANDOM LINES - The present invention relates to a method and computer program product for data mining with constant search time, the method and computer program product comprising the steps of: traversing a search tree to a leaf, retrieving one or more data store identifiers from said leaf, reading data pointed to by said data store identifier, locating one or more values in said data, referencing one or more data descriptors, retrieving the n-nearest data descriptor neighbors, and terminating said search.	07-08-2010
20100174715	GENERATING DOCUMENT TEMPLATES THAT ARE ROBUST TO STRUCTURAL VARIATIONS - A template or wrapper tree for a document such as a web page is generalized from the bottom up (from leaf toward root of a logical tree structure of the template). At a given level in the tree, sub-trees are clustered and the clustered sub-trees are generalized, and the process is repeated at a next higher level in the tree, resulting in a generalized template or wrapper tree. This can be done by generating a nested pattern regular expression based on the sub-tree clusters, merging sub-trees based on the nested pattern regular expression, and then replacing sub-trees in a tree-based regular expression of the template or wrapper at the given level with the merged sub-trees. This process is repeated at a next higher level of the tree (progressing from leaf towards root) until the wrapper or tree-based regular expression that represents the template is fully generalized.	07-08-2010
20100174716	METHODS AND SYSTEMS FOR IMPROVING TEXT SEGMENTATION - Methods and systems for improving text segmentation are disclosed. In one embodiment, at least a first segmented result and a second segmented result are determined from a string of characters, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable segmented result is identified from the first segmented result and the second segmented result based at least in part on the first frequency of occurrence and the second frequency of occurrence.	07-08-2010
20100185618	Techniques For Specifying And Collecting Data Aggregations - Data records containing one or more fields, which can be considered keys and/or values, are received, and processed such that data values of records that contain key values of interest are aggregated together. The keys of the resultant aggregations or “resultant keys” are created under the control of simple parameters to an aggregation framework. Similarly, the particular aggregations performed are also under the control of a similar set of simple parameters to the aggregation framework. Mapping of keys to reduce originality is one of the important features of resultant key creation. Finally, the structure of the parameters used to control aggregation is simple, flexible, and powerful.	07-22-2010
20100191731	METHODS AND SYSTEMS FOR AUTOMATIC CLUSTERING OF DEFECT REPORTS - One embodiment of the invention provides a method of grouping defects. The method includes the steps of obtaining a plurality of defect reports, preprocessing the defect reports, and applying a clustering algorithm, thereby grouping the defect reports. Another embodiment of the invention provides a computer-readable medium whose contents cause a computer to perform a method comprising: obtaining a plurality of defect reports; preprocessing the defect reports; and applying a clustering algorithm, thereby grouping the defect reports. Another aspect of the invention provides a system for grouping defect reports. The system includes: a preprocessing module, a representation module in communication with the preprocessing module, and a clustering module in communication with the representation module.	07-29-2010
20100191732	DATABASE FOR A CAPTURE SYSTEM - A tag database storing tags indexing captured objects can be searched efficiently. In one embodiment, such a search begins by receiving a query for one or more objects captured by a capture system, and determining whether a query time range exceeds a time range of a set of fast tables. In one embodiment, the invention further includes searching the set of fast tables if the query time range does not exceed the time range of the fast tables, the set of fast tables containing tags having meta-data related to captured objects. In one embodiment, the invention further includes searching a set of hourly tables if the query time range does exceed the time range of the fast tables. In one embodiment, the present invention further includes searching a set of daily tables if the query time range also exceeds the time range of the hourly tables.	07-29-2010
20100191733	MUSIC LINKED PHOTOCASTING SERVICE SYSTEM AND METHOD - A music linked photocasting service system and method are provided. The music linked photocasting service method includes reproducing music at the request of a user, analyzing a mood of the reproduced music at prescribed times until music reproduction is completed, searching for photographs suitable for the analyzed mood of the music, and displaying the retrieved photographs.	07-29-2010
20100198826	MAINTAINING A HISTORICAL RECORD OF ANONYMIZED USER PROFILE DATA BY LOCATION FOR USERS IN A MOBILE ENVIRONMENT - A system and method are provided for maintaining a historical record of anonymized user profile data for mobile device users. In one embodiment, a central system, which includes one or more servers, operates to obtain current locations and user profiles for users of mobile devices. The central system processes the current locations and the user profiles of the users over time to maintain a historical record of anonymized user profile data by location. By anonymizing the user data, privacy of the users of the mobile devices is maintained. The central system may then use the historical record of anonymized user profile data to respond to historical requests. The historical requests may be made by users of the mobile devices, subscribers, and/or third-party services.	08-05-2010
20100205176	Discovering City Landmarks from Online Journals - A blog-based city landmark discovery framework is described to discover and summarize popular scenes and their representative views from blog photos to provide online personalized tourist suggestions. First, a location extraction algorithm is implemented to infer geographical associations of blog photos from their contextual descriptors, thus providing the ability to harvest city scene photos from web blogs. Second, a visual-textual hierarchical clustering scheme is adopted to organize crawled photos into a scene-view structure, and a PhotoRank algorithm is presented to discover representative views within each scene by viewing the representative photo selection problem as a popularity ranking problem in a visual correlation environment. Third, author, context and content issues are evaluated in a unified Landmark-HITS model to discover representative scenes as well as build author correlations. The author correlations further facilitate a collaborative filtering process for online personalized tourist suggestions based on an author's previous travel logs.	08-12-2010
20100205177	OBJECT IDENTIFICATION APPARATUS AND METHOD FOR IDENTIFYING OBJECT - An object identification apparatus includes an image data input unit configured to input captured image data including an object, an object identification data generation unit configured to generate data for identifying the object by extracting a feature vector from a partial area of the input image data to convert the feature vector according to the partial area, an object dictionary data storage unit configured to store object dictionary data generated from previously recorded image data, and an object identification unit configured to identify a class to which the object belongs, which is included in the image data input by the image data input unit, based on the data for identifying the object and the object dictionary data.	08-12-2010
20100205178	DATA MANAGEMENT SYSTEM AND METHOD TO HOST APPLICATIONS AND MANAGE STORAGE, FINDING AND RETRIEVAL OF TYPED ITEMS WITH SUPPORT FOR TAGGING, CONNECTIONS, AND SITUATED QUERIES - A data management method to host applications and manage storage, finding and retrieval of typed items with support for tagging, connections, and situated queries is provided.	08-12-2010
20100217763	METHOD FOR AUTOMATIC CLUSTERING AND METHOD AND APPARATUS FOR MULTIPATH CLUSTERING IN WIRELESS COMMUNICATION USING THE SAME - An automatic clustering method using an Average-linkage algorithm and a KPower Means algorithm, and a method and apparatus for multi-path clustering required for a spatial channel modeling (SCM) in a wireless communication environment are provided. The automatic clustering method, including: a first step of obtaining an initial cluster centroid using a hierarchical clustering algorithm; a second step of moving the initial cluster centroid using a two dimensional clustering algorithm; a third step of clustering a data set according to the moved initial cluster centroid; and a fourth step of calculating a validation index with respect to the clustered data set and determining an optimal number of clusters.	08-26-2010
20100223264	REPORTING INCLUDING FILLING DATA GAPS AND HANDLING UNCATEGORIZED DATA - A reporting system is described herein that allows a report author to declare data reporting structures that specify to a reporting application how to dynamically categorize data with changing or potentially unknown characteristics. The reporting system may extend RDL and the data grouping provided by Microsoft SQL Server Reporting Services by adding new elements to the XML-based RDL schema. The reporting system allows the report author to specify for the system to fill gaps in the data, so that the report has a similar layout even as data changes from period to period. The reporting system also allows the report author to specify whether data that does not fit any predefined group bucket is displayed in a report. Thus, the reporting system allows unsophisticated database users to define reports that group data consistently regardless of missing values or other changes in the underlying data.	09-02-2010
20100228731	LARGE GRAPH MEASUREMENT - As provided herein, a pairwise distance between nodes in a large graph can be determined efficiently. URL-sketches are generated for respective nodes in an index by extracting labels from respective nodes, which provide a reference to a link between the nodes, aggregating the labels into sets for respective nodes, and storing the sets of labels as URL-sketches. Neighborhood-sketches are generated for the respective nodes in the index using the URL-sketches, by determining a neighborhood for a node and generating a sketch using labels that are associated with the respective neighboring nodes. A distance between two nodes is determined by computing an approximate number of paths and an approximate path length between the two nodes, using the neighborhood sketches for the two nodes.	09-09-2010
20100228732	INFORMATION OFFERING APPARATUS AND METHOD - An information offering apparatus and an information offering method are provided. The information offering apparatus is configured to arrange information, which is generated as a user uses services, according to a time period, group the arranged information, and then display the arranged information together with the time period.	09-09-2010
20100235356	ORGANIZATION OF SPATIAL SENSOR DATA - A measurement of an object from which data is collected may be determined. A scale of the object may be determined by determining the absolute or relative magnitude of the object in comparison to a magnitude of surrounding objects such as the total magnitude of the illustration. An appropriate container shape and size for the object may be determined by searching for a container size with a scale similar to the scale of the object. The object may be stored in a database with the appropriate container shape, size and the scale being attributes.	09-16-2010
20100235357	CAPACITY MANAGEMENT FOR DATA NETWORKS - A method of processing capacity information is disclosed. The capacity information relates to data capacity in a data network in which a consumer circuit is carried on, and consumes bandwidth made available by, a bearer circuit. The method comprises storing, in a network information database, an entity representing the bearer circuit, and associating capacity information with the bearer circuit entity specifying a first bandwidth quantity defining a quantity of bandwidth made available by the bearer circuit. Also stored is an entity representing the consumer circuit, and capacity information is associated with the consumer circuit entity specifying a second bandwidth quantity defining a quantity of bandwidth allocated to the consumer circuit. The consumer capacity information is then associated with the bearer capacity information in the database to indicate that the second bandwidth quantity allocated to the consumer circuit is to be consumed from the first bandwidth quantity made available by the bearer circuit. The resulting capacity model can be used to support service provisioning, service assurance and SLA management, network engineering and network planning processes.	09-16-2010
20100241627	INFORMATION RETRIEVAL APPARATUS - An information retrieval device includes an input operation unit that can input a character through an operation of a user, a database that stores data of a plurality of character strings beforehand, an information extraction unit that checks a character string inputted by the input operation unit with the data of the character strings stored in the database, and extracts the data of a character string corresponding to the character string inputted by the input operation unit from the database, and a display unit that displays the character string extracted by the information extraction unit, wherein the information extraction unit performs the check by replacing each character in at least one of the character string inputted by the input operation unit and the character strings stored in the database with a character contained in a classified character group predetermined in compliance with an attribute of each character.	09-23-2010
20100250536	Method for integrating road names and points of interest in source data - The present invention is directed to a method for integrating road names recorded in source data, particularly comprising steps of merging all interconnected segments having the same road name into a user road group and merging these user road groups if they belong to the same physical road entity. The present invention also relates to a method for integrating points of interest recorded in source data, particularly comprising steps of: beginning from the node with the highest priority in the tree and proceeding to the other nodes one at a time, finding nodes with the same point of interest, redirecting all child links of the lower priority node to the higher priority node among the same point of interest nodes, and deleting the link between the same point of interest nodes; and creating a geometry relationship based on the new arrangement.	09-30-2010
20100250537	METHOD AND APPARATUS FOR CLASSIFYING A CONTENT ITEM - Newly created personal classes can be incorporated into classification of a content item, step	09-30-2010
20100250538	ELECTRONIC DISCOVERY SYSTEM - Embodiments of the invention relate to systems, methods, and computer program products for improved electronic discovery and custodian management. Embodiments herein disclosed provide for an enterprise wide e-discovery system that provides for data to be identified, located, retrieved, preserved, searched, reviewed and produced in an efficient and cost-effective manner across the entire enterprise system. In addition, by structuring management of e-discovery based on case/matter, custodian and data and providing for linkage between the same, further efficiencies are realized in terms of identifying, locating and retrieving data and leveraging results of previous e-discoveries with current requests.	09-30-2010
20100250539	Shape based picture search - The present application relates to a method for implementing picture search and a website server thereof. A method for implementing picture search includes: classifying, according to keywords in advance in a picture database, corresponding pictures by shape of objects in the pictures, and determining a sample picture for each shape type; wherein, after a server receives a picture search request sent from a client, the method includes: searching, by the server, in the picture database for the sample picture of several shape types classified in advance corresponding to the keywords in said search request, and returning, to the client, the searched sample picture of the several shape types; receiving, by the server, the sample picture of a certain shape type determined by the client, and searching, in the picture database for the pictures which correspond to said keywords and satisfy a predetermined request with the characteristic value of said determined sample pictures; returning, by the server, said found pictures to the client. The present application enables the user to search pictures of similar shapes according to the shape types, thereby satisfying the user's search demands.	09-30-2010
20100250540	METHOD FOR MANAGING A RELATIONAL DATABASE OF THE SQL TYPE - A method is provided for managing a relational database of the SQL type for information technology and network infrastructure service information, including a method in which the following are created, in a system for managing a database of the MySQL type,	09-30-2010
20100250541	TARGETED DOCUMENT ASSIGNMENTS IN AN ELECTRONIC DISCOVERY SYSTEM - Embodiments of the invention relate to systems, methods, and computer program products for improved electronic discovery. More specifically, embodiments relate to computer program products for targeted document review assignments by determining concept-related data groupings within the overall corpus of data associated with a case and assembling the targeted document review assignments based on the concept-related data groupings. As such, document reviewers are presented with assignments that have highly conceptually-related documents, which results in further efficiency in the review process.	09-30-2010
20100250542	DATA CLASSIFICATION METHOD AND DATA CLASSIFICATION DEVICE - A separation surface set storage part stores information defining a plurality of separation surfaces which separate a feature space into at least one known class region respectively corresponding to at least one known class and an unknown class region. Each of the at least one known class region is separated from the outside region by more than one of the plurality of separation surfaces which do not intersect each other. A data classification apparatus determines a classification of a classification target data whose inner product in the feature space is calculable by calculating to which region of the at least one known class region and the unknown class region determined by the information stored in the separation surface set storage part the classification target data belongs. A method and apparatus for data classification which can simultaneously perform identification and outlying value classification with high reliability in the same procedure are provided.	09-30-2010
20100250543	EFFICIENT HANDLING OF MULTIPART QUERIES AGAINST RELATIONAL DATA - A query having multiple parts may be processed to form an intermediate results set. This intermediate results set may be partitioned into a plurality of groups. Thereafter, the groups may be sorted into a plurality of containers so that each container contains data sufficient to calculate one requested result in the multipart query. Related techniques, apparatuses, systems, and computer program products are also described.	09-30-2010
20100262604	DATABASE MESSAGE ANALYSIS SUPPORT TECHNIQUE - A method includes: collecting message sequences including a series of messages issued in response to one processing request; classifying the collected message sequences into groups of the message sequences whose simplified message sequences generated by excluding words other than reserved words from a database message that is a message including a SQL sentence are identical, wherein the database message is included in the series of messages; generating, for each group, a normalized expression including the reserved words in the database message as fixed character strings and arbitrary character strings replaced with portions other than the fixed character strings in the database message, for the database message included in the message sequence belonging to the group; and generating a rule for converting the database message considered to be identical with the normalized expression into a series of fixed character strings included in the normalized expression.	10-14-2010
20100268712	SYSTEM AND METHOD FOR AUTOMATICALLY GROUPING KEYWORDS INTO AD GROUPS - A system and method for automatically grouping keywords into a plurality of ad groups. The method includes receiving a list of keywords and calculating a frequency of occurrence associated with each keyword in the list of keywords. The keywords in the list of keywords are automatically grouped into a plurality of ad groups based on at least the frequency of occurrence or a predetermined ad grouping list. The plurality of ad groups are subsequently saved in an uploadable format and exported to an external application.	10-21-2010
20100268713	METHOD FOR AUTOMATED DOCUMENT SELECTION - Provided is a method for the automated selection of sample documents or pages from a large collection, and more particularly an application of the method in a proof presentment environment—where the method is employed for selection and review of representative or extreme pages from a large document, such as one scheduled for printing. The method characterizes pages or documents in a multi-dimensional vector space based upon a set of characteristics, and then uses clustering techniques to group the pages, enabling the selection of typical pages from the groups, outlier pages from extremes lying outside of the groups, or both typical and outlier pages.	10-21-2010
20100268714	SYSTEM AND METHOD FOR ANALYSIS OF INFORMATION - The present invention relates to an information analysis system comprising: a summary table creation unit for analyzing an input file if the file is inputted, extracting a field list corresponding to the field list information stored in a provided database, and creating a summary table including the extracted field list; a preprocessing module for performing a preprocess including at least one of field refinement, group creation, and sub-data set creation, for fields of the summary table created by the summary table creation unit; a matrix creation unit for creating a matrix based on matrix setting information inputted by a user, for the fields created by the summary table creation unit or the preprocessing module; a cluster analysis unit for analyzing a cluster of corresponding fields according to a cluster analysis method inputted by the user, for fields selected by the user among the fields created by the summary table creation unit or the preprocessing module; and a visualization data creation unit for creating visualization data according to a visualization method selected by the user, for data created by at least one of the matrix creation unit, the preprocessing module, and the cluster analysis unit, in which methods such as a matrix, preprocessing, cluster analysis, and the like are allowed to be used in analyzing files so that accuracy and efficiency of information analysis can be enhanced.	10-21-2010
20100274785	Database Analysis Using Clusters - A method for mapping relationships in a database results in a cluster graph. A representative sample of records in each of a plurality of tables in the database is analyzed for nearest neighbor join edges instantiated by the record. Records with corresponding nearest neighbor join edges are grouped into clusters. Cluster pairs which share a join relationship between two tables are identified. A weighting may be applied to cluster pairs based on the number of records for the cluster pair. Meaningful cluster pairs above a weighted threshold may be ordered according to table and displayed as a cluster graph. Analyses of the cluster graph may reveal important characteristics of the database.	10-28-2010
20100274786	System And Method For Performing Longest Common Prefix Strings Searches - A method and system for compressing and searching a plurality of strings. The method includes inputting a plurality of strings into a compression engine. The method also includes converting each of the plurality of strings into a new, prefix-preserving compressed string, using the compression engine. For every string P that is a strict prefix of a string S, P's resulting compressed string is a strict prefix of S's resulting compressed string.	10-28-2010
20100274787	SUMMARIZATION OF SHORT COMMENTS - A method and a system for summarization of short comments are provided. The system comprises a memory to store a comments collection. The comments collection stores a plurality of comments for later access. The comments respectively include an overall rating and at least one phrase. The system also includes one or more processors to implement an aspect module to identify a first head term and a second head term based on a first portion of the comments and to map the first head term and the second head term into an aspect cluster. The one or more processors also implement a rating module to predict an aspect rating corresponding to the aspect cluster based on the respective overall ratings of the portion of the comments.	10-28-2010
20100274788	METHOD OF ENCAPSULATING INFORMATION IN A DATABASE AND AN ENCAPSULATED DATABASE - In a method of encapsulating information in a database, a message is partitioned into a plurality of object class entries within the database. An object class pointer is generated for each of a first subset of the plurality of object class entries, the generating further including executing a pointer key algorithm, the algorithm additionally generating a random number for each object class entry and concatenating the randomly generated numbers to form a single parameter string adapted to obfuscate a path between a pointer and its corresponding object class entry. The plurality of object class entries are stored in non-adjacent storage locations within the database, with each of a second subset of the plurality of object class entries stored in association with one of the generated pointers.	10-28-2010
20100281027	METHOD AND SYSTEM FOR DATABASE PARTITION - The present invention provides a flexible, dynamic database partition method and system. The method includes the steps of acquiring a data partition rule, where the data partition rule is used to identify a first relationship between a data partition condition and a database partition; establishing a second relationship between the data partition condition and a data partition key based on the data partition rule and a third relationship between the database partition and the data partition key; adding the data partition key to a data item where the data item is stored in the database based on the second relationship between the data partition condition and the data partition key; and storing the data item in the database partition based on the data partition key of the data item.	11-04-2010
20100287160	METHOD AND SYSTEM FOR CLUSTERING DATASETS - A method and system for clustering a plurality of data elements is provided. According to embodiments of the present invention, a bit vector is generated based on each of the data elements. Bit operations are used to group each data element into a cluster. Clustering may be performed by partition clustering or hierarchical clustering. Embodiments of the present invention cluster data elements such as text documents, audio files, video files, photos, or other data files.	11-11-2010
20100293164	ACCESSING MEDICAL IMAGE DATABASES USING MEDICALLY RELEVANT TERMS - The invention relates to a system for accessing a database comprising a plurality of image data sets. The system comprises an acquisition unit for acquiring a query for searching the database for an image data set or an image data subset comprised in an image data set, the query comprising at least one medically relevant term defining search criteria; a determining unit for determining the image data set or the image data subset comprised in the image data set, based on the strength of semantic matches between the at least one medically relevant term and (a) corresponding medical annotation(s) describing the image data set; and a retrieving unit for retrieving the determined image data set or image data subset from the database. By enabling semantic matches between medical annotations describing the image data set and the medically relevant term comprised in the query, this invention enables searching for medical images with high-level medical information that is meaningful for medical diagnosis and treatments.	11-18-2010
20100293165	Subscriber Identification System - A subscriber identification system	11-18-2010
20100299329	Apparatus and Methods for Providing Answers to Queries Respective of a User Based on User Uniquifiers - A method for providing an answer to an input query respective of a user using a user device comprises collecting data respective of the user by using a plurality of sensors on the user device, wherein the plurality of sensors sense the user activity on the user device; generating a plurality of uniquifiers from the data, wherein each uniquifier of the plurality of uniquifiers characterizes the user; evaluating periodically the plurality of uniquifiers; storing at least the evaluated plurality of uniquifiers in a memory of the user device; and responsive to the input query, providing an answer based on at least one evaluated uniquifier of the evaluated plurality of uniquifiers stored in the memory.	11-25-2010
20100299330	ONTOLOGY-INTEGRATION-POSITION SPECIFYING APPARATUS, ONTOLOGY-INTEGRATION SUPPORTING METHOD, AND COMPUTER PROGRAM PRODUCT - Attribute mapping information is stored, in which a superclass of an associated class in an integration-source ontology already associated with an integration-destination ontology, an attribute of the superclass, and an integration destination attribute of a class in the integration destination are associated with each other. An integration target class in the integration source is specified to acquire an attribute of a superclass of the integration target class. An associated class having the shortest distance from the integration target class is specified, to specify an integration-destination-associated class associated with the specified associated class. An inheritance relation is followed from the integration-destination-associated class to specify a class having the integration destination attribute corresponding to the attribute in the mapping information as a position where the class associated with the integration target class is present.	11-25-2010
20100312765	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD AND PROGRAM THEREFOR - An information processing apparatus includes a first position acquiring unit adapted to acquire a position metadata piece from a target data piece, the position metadata piece indicating a position, a second position acquiring unit adapted to acquire position metadata pieces from a plurality of other data pieces different from the target data piece, a target acquiring unit adapted to acquire target metadata pieces other than the position metadata pieces from the other data pieces, an analysis unit adapted to analyze a distribution of the target metadata pieces based on positions indicated by the position metadata pieces acquired from the other data pieces, and an assignment unit adapted to assign to the target data piece a target metadata piece that has a value related to the target data piece, the target metadata piece being selected from among the analyzed target metadata pieces, based on the distribution and the position indicated by the position metadata piece acquired from the target data piece.	12-09-2010
20100312766	Computer system for automatic organization, indexing and viewing of information from multiple sources - A computer data processing system including a central processing unit configured with a novel integrated computer control software system for the management of data objects including dynamic and automatic organization, linking, finding, cross-referencing, viewing and retrieval of multiple objects regardless of nature or source. The inventive system provides underlying component architecture having an object-oriented database structure and a metadata database structure which is unique in storing only one instance of each object while linking the object to multiple collections and domains by unique metadata links for the grouping into and retrieval from any of the collections. The system employs configurable, extensible attribute/properties of data objects in metadata format, and a truly user-friendly configurable interface that facilitates faster, more unified, comprehensive, useful and meaningful information management. Additional features include a sticky path object hierarchy viewing system, key phrase linking, viewing by reference, and drag-and-drop relationship link creation.	12-09-2010
20100325109	KEYWORD CLASSIFICATION AND DETERMINATION IN LANGUAGE MODELLING - A computer-implemented method and apparatus defines a keyword class vector. A set of seed keywords is determined from a set of keywords and first and second most similar keywords from the set of seed keywords are then determined. A class vector is determined from first and second keyword vectors associated with the first and second most similar keywords. The method and apparatus also classifies a keyword in a keyword class. A similarity for a keyword vector associated with the keyword is determined with reference to a plurality of class vectors, each class vector having an associated class, and a most similar class vector of the plurality of class vectors is determined from the similarity determination. The keyword is then classified in a most similar class associated with the most similar class vector.	12-23-2010
20100325110	Efficient Method for Clustering Nodes - Methods and computer storage media for clustering nodes are provided. An input file is received that is comprised of primary nodes, secondary nodes and metrics that relate to the association between the primary nodes and the secondary nodes. Upon receiving the input file, the input file is abridged to reduce the number of nodes contained in the input file. The unique initial primary nodes are then clustered with their associated secondary node. The clusters containing the unique initial primary nodes are replaced if a subsequent related cluster satisfies a pre-defined condition. In some embodiments, multiple clusters are then merged until the cluster size reaches a pre-defined size. In some embodiments, the input file is cleaned and sorted prior to being abridged.	12-23-2010
20100325111	Methods and Systems for Selecting and Presenting Content Based on Context Sensitive User Preferences - A method of selecting and presenting content based on context-sensitive learned user preferences is provided. The method includes providing a set of content items having descriptive terms. The method includes receiving user input for identifying items and, in response thereto, presenting a subset of items. The method includes receiving user selections of said items and analyzing the descriptive terms of those items to learn the user's content preferences. The method includes determining the context in which the user performed the selections and associating those contexts with the user content preferences learned from the corresponding user selections. The method includes, in response to subsequent user input, determining a context of said subsequent input and selecting and ordering a collection of items based on comparing those items' descriptive terms with the user's learned content preferences associated with the determined context in which the user entered the subsequent input.	12-23-2010
20100332474	METHOD AND APPARATUS FOR PREDICTING OBJECT PROPERTIES AND EVENTS USING SIMILARITY-BASED INFORMATION RETRIEVAL AND MODEL - Method and apparatus for predicting properties of a target object comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method comprises application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods. By way of example, one of a process failure event, an intrusion event and a fire event and residual objects may be predicted and located and qualified such that, for example, properties of the residual objects may be qualified, for example, via black body radiation and micro-body databases including charcoal assemblages.	12-30-2010
20100332475	METHOD AND APPARATUS FOR PREDICTING OBJECT PROPERTIES AND EVENTS USING SIMILARITY-BASED INFORMATION RETRIEVAL AND MODELING - Method and apparatus for predicting properties of a target object comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method comprises application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods. By way of example, a fire event and residual objects may be located and qualified such that, for example, properties of the residual objects may be qualified, for example, via black body radiation and micro-body databases including charcoal assemblages.	12-30-2010
20100332476	WEB GRAPH COMPRESSION THROUGH SCALABLE PATTERN MINING - A method and a processing device are provided for compressing a web graph including multiple nodes and links between the multiple nodes. Nodes of the web graph may be clustered into groups including no more than a predetermined number of nodes. A list of links of the clustered nodes may be created and sorted based on a frequency of occurrence of each of the links. A prefix tree may be created based on the sorted list of links. The prefix tree may be walked to find candidate virtual nodes. The candidate virtual nodes may be analyzed according to selection criteria and a virtual node may be selected. The prefix tree may be adjusted to account for the selection of the virtual node and the virtual node may be added to the web graph.	12-30-2010
20110010368	SYSTEM AND METHOD FOR GROUPING CLAIM RECORDS ASSOCIATED WITH A PROCEDURE - A system and computer-implemented method for grouping medical records implements a multi-level analysis of the records. The level of analysis for each record is determined based upon the time proximity of each record to the defining medical procedure or service (anchor procedure) to be analyzed. Once an anchor procedure is identified, claim records are processed to determine whether any of the records should be grouped with the anchor procedure into a procedure episode group (PEG). First, the date of service for each claim record is identified to determine whether the claim record falls within a time window. The claim records falling within the window then are assessed to determine whether each claim record is sufficiently related to the anchor procedure (for example, by determining whether the diagnostic, procedure, or episode treatment group coding of each claim record is associated with the anchor procedure). The requisite level of relationship between the claim records and the anchor procedure depends upon the position of the records within the time window. Only those claim records having the requisite relationship level associated with the portion of the time window in which they fall are included in the PEG.	01-13-2011
20110016123	Scalable Real Time Event Stream Processing - Scalable systems and methods for near real time processing of a substantial volume of event data streams are disclosed. A concurrent throughput receiver is coupled to an input of a processor for receiving the substantial volume of event data streams, and implementing substantially concurrent throughput of the substantial volume of event data streams. The systems and methods may provide for real time application monitoring, such as by aggregating information to a distributed cache from a plurality of entities, where the entities are structured in a nodal hierarchy, and summarizing to a summary database the information from the nodal hierarchy into summary level statistics for each of the nodes of the nodal hierarchy.	01-20-2011
20110016124	Optimized Partitions For Grouping And Differentiating Files Of Data - Methods and apparatus teach a digital spectrum of a data file. The digital spectrum is used to map a file's position in multi-dimensional space. This position relative to another file's position reveals closest neighbors. Certain of the closest neighbors are grouped together to define a set. Overlapping members in the groups may be further differentiated from one another by partitioning. An optimized partition of set S of N overlapping groups yields a maximum strength for groups and members in that partition. Among other things, the optimized partition includes relative strengths of every individual member in every possible partition and weighting functions applied to the relative strengths and to subgroups of files within the partitions.	01-20-2011
20110016125	METHOD AND SYSTEM FOR USER CENTERED INFORMATION SEARCHING - Disclosed is a method and system for user-centered information search. The user-centered information search may include generating an object as a classification unit of an information search structure and a property of the object, generating a class and determining a property of the class using the object; and detecting a search result corresponding to an information request from a user using at least one of the object, property, and class.	01-20-2011
20110016126	Method of displaying adaptive album art for portable terminal and apparatus for providing the same - A method of displaying an adaptive album art for a portable terminal is provided, which includes confirming whether the album art exists by reading a meta data frame that corresponds to at least one sound source data, classifying and storing the sound data by meta data items commonly included in the corresponding meta data information if the album art exists, extracting an image file from the album art of the sound source data classified by meta data items, and matching the extracted image file to a changeable disc album art preset by meta data items to display the matched image.	01-20-2011
20110016127	Hierarchy of Servers for Query Processing of Column Chunks in a Distributed Column Chunk Data Store - An improved system and method for query processing in a distributed column chunk data store is provided. A distributed column chunk data store may be provided by multiple storage servers operably coupled to a network. A storage server provided may include a database engine for partitioning a data table into the column chunks for distributing across multiple storage servers, a storage shared memory for storing the column chunks during processing of semantic operations performed on the column chunks, and a storage services manager for striping column chunks of a partitioned data table across multiple storage servers. Query processing may be performed by storage servers or query processing servers operably coupled by a network to storage servers in the column chunk data store. To do so, a hierarchy of servers may be dynamically determined to process execution steps of a query transformed for distributed processing.	01-20-2011
20110016128	DISTRIBUTING CONTENT INDICES - A query-centric system and process for distributing reverse indices for a distributed content system. Relevance ranking techniques in organizing distributed system indices. Query-centric configuration subprocesses (1) analyze query data, partitioning terms for reverse index server(s) (RIS), (2) distribute each partitioned data set by generally localizing search terms for the RIS that have some query-centric correlation, and (3) generate and maintain a map for the partitioned reverse index system terms by mapping the terms for the reverse index to a plurality of different index server nodes. Indexing subprocess element builds distributed reverse indices from content host indices. Routines of the query execution use the map derived in the configuration to more efficiently return more relevant search results to the searcher.	01-20-2011
20110016129	METHOD AND SYSTEM FOR VARIABLE OR DYNAMIC CLASSIFICATION - A method, system and device for variable or dynamic classification of users, devices, computers, systems, or information are provided, including at least one of means for sensing one or more inputs, including at least one of an event, a parameter, and time; and means for generating a classification or policy for allowing access to information based on one or more of the sensed inputs.	01-20-2011
20110022594	CONTENTS REPRODUCING DEVICE, CONTENTS REPRODUCING METHOD, AND PROGRAM - Music content is reproduced in the order of the newest time information that has been registered. A music content database	01-27-2011
20110022595	ASPECT-LEVEL NEWS BROWSING SERVICE SYSTEM AND METHOD FOR MITIGATING EFFECTS OF MEDIA BIAS - The present invention relates to an aspect-level news browsing service system and method for mitigating effects of media bias, which group news articles having different aspects on the same event on the basis of aspects, and simultaneously provide grouped news articles to users. The aspect-level news browsing service system may include a user terminal for accessing a news service server over an Internet and receiving aspect-level news article information from the news service server. A news provision server may transmit news article information to the news service server over the Internet. The news service server may extract aspects from the received news article information, classify the news article information based on the extracted aspects, and may transmit the aspect-level news article information to the user terminal depending on the aspects to enable the news article information to be displayed. The Internet may be configured to connect the user terminal to the news service server.	01-27-2011
20110022596	Method and system for document indexing and data querying - Generating a document index comprises: obtaining a document to be indexed; performing a monadic partition operation on the document to obtain a plurality of monadic partitions; and for each monadic partition in the plurality of monadic partitions: determining whether said each monadic partition is a filter character; in the event said each monadic partition is a filter character, forming a polynary partition by combining the monadic partition with at least one other monadic partition adjacent to the monadic partition, and indexing the polynary partition; and in the event that the monadic partition is not a filter character, indexing the monadic partition. Querying data comprises: receiving a data query; performing a monadic partition operation on the data query to obtain a plurality of monadic partitions; and for each monadic partition in the plurality of monadic partitions: determining whether said each monadic partition is a filter character; in the event that the monadic partition is a filter character, forming a polynary partition by combining the monadic partition with at least one monadic partition adjacent to the monadic partition, and searching a preset index using the polynary partition to obtain a search result corresponding to the polynary partition; and in the event that the monadic partition is not a filter character, searching the preset index using the monadic partition to obtain a search result corresponding to the monadic partition; and combining the search results to form a final query search result.	01-27-2011
20110029519	Population clustering through density-based merging - A method and/or system for analyzing data using population clustering through density based merging.	02-03-2011
20110029520	DATA CURATION - A method of data curation and a data processing apparatus for performing the method are provided. The method comprises the steps of (i) identifying a first set of variables which represent predetermined characteristics of data stored in one or more of a number of data packages; (ii) identifying a second set of variables which represent different possible states of each said number of data packages; (iii) identifying a functional relationship between the first and second sets of variables so as to provide a functional representation based on said sets of variables; (iv) allocating different states to the data associated with each said number of data packages according to an iterative procedure, wherein the iterative procedure comprises iteratively calculating values of said variables and of the functional representation until the values satisfy predetermined convergence criteria, and the allocation of a state to one or more of the data packages is effected in dependence upon a comparison of the calculated values of said variables and of the functional representation; and (v) performing an action on the data associated with each said number of data packages corresponding to the allocation of states in step (iv).	02-03-2011
20110029521	SYSTEMS AND METHODS FOR COMMUNICATION AMONG COLLABORATING USERS - Embodiments relate to methods and systems for building representations of related subjects. The representations may include a plurality of nodes, each being associated with a subject. Users may be able to access records and/or source documents related to a plurality of subjects and add or modify node characteristics based thereon. Users may interact with (e.g., by adding to or modifying) documents, files, and/or records and may also make other changes or additions to nodes in the system. A process may then identify what other users may be interested in such interaction and why. For example, a score may be associated with the interaction and other users based on factors such as whether the users have linked to the document, file or record and/or the node the interaction may apply to. Interacting users, identified users, and interaction details may be stored in a database. Identified users may be notified of the interaction.	02-03-2011
20110029522	Photo-image Discovery Device Database Management - A system and method are provided for managing a database in a photo-image discovery device. A photo-image discovery device acquires photo-images from a photo-capable device, and automatically classifies the acquired photo-images, without user intervention, in a database management directory. Then, the acquired photo-images are managed in a photo-image discovery device storage-in-transit memory using database management directory rules cross-referenced to classification. For example, the method may uplink photo-images to a network-connected site in response to their classification. Photo-image classification may be based upon photo-capable device type, photo-capable device ID, image creation data, image creation location, or file format.	02-03-2011
20110029523	Identifying a test set of target objects - Methods and structures having and/or implementing integrated steps for use in a planning phase of experimentation, which can allow the researcher to explore the experimental space while reducing the number of experiments performed.	02-03-2011
20110029524	DISPERSED STORAGE NETWORK VIRTUAL ADDRESS FIELDS - A dispersed storage network includes a dispersed storage device to store data. The dispersed storage device includes a processing module operable to slice a data segment of a data object into data slices, in which the number of data slices corresponds to a number of pillars for storing the data object. The processing module further creates a slice name for each of the data slices. The slice name includes routing information containing a vault identifier that identifies at least one user of the data object and a slice index based on the vault identifier and a pillar identifier that identifies a pillar associated with the data slice. In addition, the slice name includes a source data name containing an identifier of the data object.	02-03-2011
20110029525	System And Method For Providing A Classification Suggestion For Electronically Stored Information - A system and method for providing a classification suggestion for electronically stored information is provided. A corpus of electronically stored information including reference electronically stored information items each associated with a classification and uncoded electronically stored information items are maintained. A cluster of uncoded electronically stored information items and reference electronically stored information items is provided. A neighborhood of reference electronically stored information items in the cluster is determined for at least one of the uncoded electronically stored information items. A classification of the neighborhood is determined using a classifier. The classification of the neighborhood is suggested as a classification for the at least one uncoded electronically stored information item.	02-03-2011
20110029526	System And Method For Displaying Relationships Between Electronically Stored Information To Provide Classification Suggestions Via Inclusion - A system and method for providing reference documents as a suggestion for classifying uncoded documents is provided. A set of reference electronically stored information items, each associated with a classification code, is designated. One or more of the reference electronically stored information items is combined with a set of uncoded electronically stored information items. Clusters of the uncoded electronically stored information items and the one or more reference electronically stored information items are generated. Relationships between the uncoded electronically stored information items and the one or more reference electronically stored information items in at least one cluster are visually depicted as suggestions for classifying the uncoded electronically stored information items in that cluster.	02-03-2011
20110029527	System And Method For Displaying Relationships Between Electronically Stored Information To Provide Classification Suggestions Via Nearest Neighbor - A system and method for providing reference documents as a suggestion for classifying uncoded documents is provided. Reference electronically stored information items and a set of uncoded electronically stored information items are designated. Each of the reference information items is previously classified. At least one uncoded electronically stored information item is compared with the reference electronically stored information items. One or more of the reference electronically stored information items similar to the at least one uncoded electronically stored information item are identified. Relationships are depicted between the at least one uncoded electronically stored information item and the similar reference electronically stored information items for classifying the at least one uncoded electronically stored information item.	02-03-2011
20110029528	CITATION RECORD EXTRACTION SYSTEM AND METHOD, AND PROGRAM PRODUCT - A citation record extraction system is provided. An HTML rendering engine receives a publication list web page and parses it to obtain layout information of the web page. A web page sequence builder generates a web page characteristic sequence for the web page according to the layout information. A web page repeated pattern analyzer analyzes repeated patterns present in the web page characteristic sequence, screens out non-citation records therefrom, and obtains a citation record of the publication list web page.	02-03-2011
20110029529	System And Method For Providing A Classification Suggestion For Concepts - A system and method for providing a classification suggestion for concepts is provided. A corpus of concepts including reference concepts each associated with a classification and uncoded concepts are maintained. A cluster of uncoded concepts and reference concepts is provided. A neighborhood of reference concepts in the cluster is determined for at least one of the uncoded concepts. A classification of the neighborhood is determined using a classifier. The classification of the neighborhood is suggested as a classification for the at least one uncoded concept.	02-03-2011
20110029530	System And Method For Displaying Relationships Between Concepts To Provide Classification Suggestions Via Injection - A system and method for displaying relationships between concepts to provide classification suggestions via injection is provided. A reference set of concepts each associated with a classification code is designated. Clusters of uncoded concepts are designated. One or more of the uncoded concepts from at least one cluster are compared to the reference set. At least one of the concepts in the reference set that is similar to the one or more uncoded concepts is identified. The similar concepts are injected into the at least one cluster. Relationships between the uncoded concepts and the similar concepts in the at least one cluster are visually depicted as suggestions for classifying the uncoded concepts.	02-03-2011
20110029531	System And Method For Displaying Relationships Between Concepts to Provide Classification Suggestions Via Inclusion - A system and method for displaying relationships between concepts to provide classification suggestions via inclusion is provided. A set of reference concepts each associated with a classification code is designated. One or more of the reference concepts are combined with a set of uncoded concepts. Clusters of the uncoded concepts and the one or more reference concepts are generated. Relationships between the uncoded concepts and the one or more reference concepts in at least one cluster are visually depicted as suggestions for classifying the uncoded concepts in that cluster.	02-03-2011
20110029532	System And Method For Displaying Relationships Between Concepts To Provide Classification Suggestions Via Nearest Neighbor - A system and method for displaying relationships between concepts to provide classification suggestions via nearest neighbor is provided. Reference concepts previously classified and a set of uncoded concepts are provided. At least one uncoded concept is compared with the reference concepts. One or more of the reference concepts that are similar to the at least one uncoded concept are identified. Relationships between the at least one uncoded concept and the similar reference concept are depicted on a display for classifying the at least one uncoded concept.	02-03-2011
20110035376	STORING NODES REPRESENTING RESPECTIVE CHUNKS OF FILES IN A DATA STORE - To provide a data store, nodes representing respective chunks of files are stored in a predefined structure that defines relationships among the nodes, where the files are divided into the chunks. The nodes are collected into plural groups stored in persistent storage, where some of the nodes are collected into a particular one of the groups according to a locality relationship of the some of the nodes.	02-10-2011
20110035377	METHOD - A method of automatically analysing online posts, such that they may then be responded to appropriately. The method may comprise: extracting a list of keywords from each of a plurality of posts; generating one or more keyword clusters based on the keywords extracted from each of the plurality of posts; and classifying new posts in accordance with the one or more keyword clusters.	02-10-2011
20110040756	System and Method for Providing Recommendations - Methods, systems, and computer program products for providing recommendations for an activity to a user are provided. In one method, the method tracks status information of a plurality of users, and detects a trigger for providing recommendations for an activity. In response to the trigger, the method identifies a cluster of users based on the status information of the users. The method further retrieves profiles and behavioral characteristics of the users in the identified cluster, and provides one or more recommendations for the activity to the user based, at least in part, upon the behavioral characteristics and the profiles.	02-17-2011
20110040757	METHOD AND APPARATUS FOR ENHANCING OBJECTS WITH TAG-BASED CONTENT - An approach is provided for enhancing objects with tag-based content. One or more memory tags associated with one or more objects are detected within proximity of a mobile device. The memory tag contains supplemental information related to the one or more objects. One of the detected memory tags is selected by receiving an input signal or by applying one or more selection criteria. Selection of one of the detected memory tags initiates reading of the supplemental information from the selected memory tag. The supplemental information includes recognition information to associate the supplemental information with a specific section or portion of a respective one of the objects.	02-17-2011
20110040758	GRID-BASED DATA CLUSTERING METHOD - A grid-based data clustering method comprises: a parameter setting step, a partition step, a searching step, a seed-classifying step, an extension step, and a termination step. Through the above-mentioned steps, data in a data set are disposed in a plurality of grids, and the grids are classified into dense grids and uncrowded grids for a cluster to extend from one of the dense grids to gradually combine data in other dense grids nearby. Consequently, convenience in parameter setting, efficiency and accuracy in data clustering, and performance in noise filtering are achieved.	02-17-2011
20110040759	METHOD AND SYSTEM FOR AUTOMATICALLY RANKING PRODUCT REVIEWS ACCORDING TO REVIEW HELPFULNESS - A method and system for automatically ranking product reviews according to review helpfulness. Given a collection of reviews, the method employs an algorithm that identifies dominant terms and uses them to define a feature vector representation. Reviews are then converted to this representation and ranked according to their distance from a ‘locally optimal’ review vector. The algorithm is fully unsupervised and thus avoids costly and error-prone manual training annotations. In one embodiment a Multi Layer Lexical Model (MLLM) approach partitions the dominant lexical terms in a review into layers, creates a compact unified layers lexicon, and ranks the reviews according to their weight with respect to unified lexicon, all in a fully unsupervised manner. When used to rank book reviews, it was found that the invention significantly outperforms the user votes-based ranking employed by Amazon.	02-17-2011
20110040760	Estimating Social Interest in Time-based Media - Social media content items are mapped to relevant time-based media events. These mappings may be used as the basis for multiple applications, such as ranking of search results for time-based media, automatic recommendations for time-based media, prediction of audience interest for media purchasing/planning, and estimating social interest in the time-based media. Social interest in time-based media (e.g., video and audio streams and recordings) segments is estimated through a process of data ingestion and integration. The estimation process determines social interest in specific events represented as segments in time-based media, such as particular plays in a sporting event, scenes in a television show, or advertisements in an advertising block. The resulting estimates of social interest also can be graphically displayed.	02-17-2011
20110040761	ESTIMATION OF POSTINGS LIST LENGTH IN A SEARCH SYSTEM USING AN APPROXIMATION TABLE - The present invention provides a method of minimizing accesses to secondary storage when searching an inverted index for a search term. The method comprises automatically obtaining a predetermined size of a posting list for the search term, the predetermined size based on document frequency for the search term, wherein the posting list is stored in secondary storage, and reading at least a portion of the posting list into memory based on the predetermined size. Corresponding computer system and program products are also provided.	02-17-2011
20110040762	SEGMENTING POSTINGS LIST READER - A size of a posting list is determined as part of searching an inverted index. The posting list is segmented for reading into a plurality of segments based on the size. For example, the segmenting may be performed if the size is larger than a predetermined size. Finally, each of the plurality of segments is read into memory.	02-17-2011
20110040763	DATA PROCESSING APPARATUS AND METHOD OF PROCESSING DATA - One embodiment is a data processing apparatus that has a chunk store containing specimen data chunks, a manifest store containing a plurality of manifests, each of which represents at least a part of previously processed data and includes at least one reference to at least one of the specimen data chunks, and a sparse chunk index containing information on only some specimen data chunks. Input data is processed into a plurality of input data segments. Each manifest of the first set has at least one reference to one of said specimen data chunks that corresponds to one of the input data chunks of a first input data segment. Specimen data chunks corresponding to other input data chunks of the first input data segment are identified by using the identified first set of manifests and at least one manifest identified when processing previous data.	02-17-2011
20110047156	System And Method For Generating A Reference Set For Use During Document Review - A system and method for generating reference sets for use during document review is provided. A collection of unclassified documents is obtained. Selection criteria are applied to the document collection and those unclassified documents that satisfy the selection criteria are selected as reference set candidates. A classification code is assigned to each reference set candidate. A reference set is formed from the classified reference set candidates. The reference set is quality controlled and shared between one or more users.	02-24-2011
20110047157	SYSTEM AND METHOD FOR PROCESSING DATA - To reduce the trouble required for creating and editing configuration data composed of pairs of element names and element values. The system includes a file storage unit	02-24-2011
20110047158	METHOD AND A SYSTEM FOR DATA VERIFICATION AND/OR AUTHENTICATION - A method for rearranging a data segment. The method comprises providing a data segment containing digital content, generating a set of human dependent variables according to a plurality of human related activities, rearranging the data segment according to the set of human dependent variables, and updating a log according to the rearranging. The digital content may be retrieved from the rearranged data segment according to the log.	02-24-2011
20110055209	SYSTEM AND METHOD FOR DELIVERING CONTENT AND ADVERTISMENTS - A processing system operable with a computing device, comprising one or more of a converter component for converting input data into a desired format for further processing, a parsing component for parsing input data into clusters having one or more desired characteristics, a notes component for receiving user inputs for insertion at desired locations within an input, an autosummary component for summarising input data, an ad component for adding advertisements to input data, a renderer component for displaying the resulting processed input data in various forms, and configurable settings to alter operation of the processing system.	03-03-2011
20110055210	Robust Adaptive Data Clustering in Evolving Environments - A computer-implemented method for automated data clustering and analysis. A computer takes a database having multiple entries and transforms the entries in the database into a set of intrinsic attributes for each entry. The computer then receives data defining one or more clustering trials to be run on the attributes from the entries in the database, each clustering trial being defined by a set of relevant intrinsic and extrinsic attributes. The computer automatically identifies the most significant intrinsic and/or extrinsic attributes of the entries being clustered for each clustering trial, and runs a clustering script to cluster the attributes in accordance with the significant attributes. The computer forms hierarchical linkages of the profiles and automatically calculates the cophenetic correlation coefficient for the linkages in each clustering trial. The invention then automatically calculates linkage threshold values for the linkages in each trial, creates cluster groups based on the threshold values, and outputs dendrograms and maps showing the results.	03-03-2011
20110055211	SYSTEM FOR SORTING AND CLASSIFYING USERS OF AN IMAGE INFORMATION MANAGEMENT SYSTEM - A system for sorting and classifying users of an image information management system is disclosed. The system for sorting and classifying users of an image information management system according to the present invention comprises some identical sub-systems, and every two sub-systems are interconnected. The sub-system comprises a user information encoding module, a user information decoding and authority identifying module, a user sorting module, a user classifying module, a command performing module, an authorized user collection database and a resource information database. The resource information database comprises real-time images, history images and control right commands of cradle heads and lens of cameras. The present invention resolves the problem of ordered accessing and utilizing of image information in a super-large-scale advanced real-time monitoring information management system, and realizes the object that local failures do not affect the normal work of the other parts by connecting every two sub-systems to each other and arranging the user identification entrance in each one of sub-systems.	03-03-2011
20110055212	DENSITY-BASED DATA CLUSTERING METHOD - A density-based data clustering method, comprising a parameter-setting step for setting a scanning radius and a minimum threshold value, a dividing step for dividing a space of a plurality of data points according to the scanning radius, a data-retrieving step for retrieving one data point out of the plurality of data points as a core data point, a searching step for calculating a distance between the core data point and each of the query points, a grouping determination step for determining whether a number of the neighboring points is smaller than the minimum threshold value.	03-03-2011
20110055213	QUERY EXTRACTING APPARATUS, QUERY EXTRACTING METHOD AND QUERY EXTRACTING PROGRAM - To provide a query extracting apparatus, query extracting method and query extracting program capable of retrieving an image that is suited to a part of lyrics while also having suitability for the other parts, a query extracting apparatus	03-03-2011
20110055214	Method and System for Pivoting a Multidimensional Dataset - A computer-implemented method for visualizing a multi-dimensional dataset at a client device is disclosed. The client device displays a first view of a subset of the multi-dimensional dataset, including displaying dimension data of a first reference dimension attribute and metric data of a first metric attribute that corresponds to the respective first reference dimension data along a first axis. After receiving a user request to partition the metric data of the first metric attribute by a first pivot dimension attribute, the client device requests and receives dimension data of the first pivot dimension attribute and the corresponding partitioned metric data of the first metric attribute from a server system and displays a second view of the subset of the multi-dimensional dataset, including displaying the first pivot dimension data and the corresponding partitioned metric data of the first metric attribute along the second axis.	03-03-2011
20110055215	Hierarchy of Servers for Query Processing of Column Chunks in a Distributed Column Chunk Data Store - An improved system and method for query processing in a distributed column chunk data store is provided. A distributed column chunk data store may be provided by multiple storage servers operably coupled to a network. A storage server provided may include a database engine for partitioning a data table into the column chunks for distributing across multiple storage servers, a storage shared memory for storing the column chunks during processing of semantic operations performed on the column chunks, and a storage services manager for striping column chunks of a partitioned data table across multiple storage servers. Query processing may be performed by storage servers or query processing servers operably coupled by a network to storage servers in the column chunk data store. To do so, a hierarchy of servers may be dynamically determined to process execution steps of a query transformed for distributed processing.	03-03-2011
20110060738	MEDIA ITEM CLUSTERING BASED ON SIMILARITY DATA - Methods and arrangements for facilitating generation of media mixes for a program participant based at least in part on media library inventory information provided by a number of program participants. Those individuals that decide to be program participants are interested in organizing, maintaining and playing their music, based at least in part, on data derived from a population of other participants in the program. A program participant must send, and the system must receive, data representative of that program participant's media inventory. The system or program determines a relative similarity of each item from the collection of program participants as compared to each other item, and from the similarity information, clusters of similar items are identified. The clusters can be used to identify clusters of similar items in an individual program participant's media library and therefrom mixes of similar media items can be created.	03-10-2011
20110060739	SYSTEM AND METHOD TO RESEARCH DOCUMENTS IN ONLINE LIBRARIES - A method and system for storing and searching digital documents, such as digital catalogs, are described. The method in one embodiment comprises inputting digital documents, extracting content from the digital documents, and storing the extracted content in a database so that the content is searchable. The method can include generating a hierarchy of unique database and CMS objects from document covers and pages plus meta data. The method can further include receiving a search query from a user and, in response, identifying content extracted from one of the digital documents and stored in the database, which satisfies the query. The method can further include causing a result set to be output to the user, where the result set includes the identified content which satisfies the search query and an image of a particular page of the digital document from which the identified content was extracted.	03-10-2011
20110060740	System and Method for Automatic Anthology Creation Using Document Aspects - A generic and expandable document aspect system and method for searching, browsing, presenting, and interacting with data assembled from document contents and related external data is provided. New varieties of document aspects are added to existing installations and can be accessed by users without requiring upgrades to server or clients, for example by using plug-in technology.	03-10-2011
20110066615	PERSONALIZATION ENGINE FOR BUILDING A USER PROFILE - Users of electronic documents are classified for profiling and targeting of additional relevant content. Behavioral data is gathered from user registration information and user activity, and user documents and actions are categorized. Registration information is combined with collaborative and editorial data to provide user profile information. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to how the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun. The taxonomic nouns are aggregated to determine a composite set of taxonomic nouns, and the user is categorized using the taxonomic nouns, and/or the author-generated classification.	03-17-2011
20110066616	Systems, Methods, and Software for Presenting Legal Case Histories - Systems and methods for automatically processing a textual document by identifying occurrences of a piece of text having a predetermined format in the textual document; determining a depth-of-treatment value for each piece of formatted text in the textual document, the depth-of-treatment value indicating a depth of treatment in the textual document afforded to the particular piece of formatted text; associating an abstract with each piece of formatted text in the textual document; and generating a data record containing each identified piece of formatted text from the textual document, the depth-of-treatment value and the abstract associated with each piece of formatted text in the textual document.	03-17-2011
20110066617	SPATIAL QUERYING IN A DATA WAREHOUSE - A data warehouse that operates to receive a spatial query and return a spatial result for the spatial query, the data warehouse comprises a regular database operating to receive and process a regular query and return a query result in response to the regular query. The data warehouse also comprises an interface layer implemented external to the regular database and operating to intercept the spatial query and translate the spatial query into the regular query for processing by the regular database. The regular database includes at least one spatial index that is accessed by the interface layer to translate the spatial query into the regular query for processing by the regular database.	03-17-2011
20110072015	TAGGING CONTENT WITH METADATA PRE-FILTERED BY CONTEXT - Generate tags for content from metadata pre-filtered based on context. A plurality of data items is accessed. Each of the data items has metadata. A context for a user is determined (e.g., at a moment of content capture). One or more of the data items are selected based on the determined context. Upon receipt of content, the received content is compared with the selected data items to identify matches. Metadata is selected from the metadata associated with the matching data items. The selected metadata is associated with the captured content.	03-24-2011
20110072016	DENSITY-BASED DATA CLUSTERING METHOD - A density-based data clustering method, comprising a parameter-setting step, a first retrieving step, a first determination step, a second determination step, a second retrieving step, a third determination step and first and second termination determination steps. The parameter-setting step sets parameters. The first retrieving step retrieves one data point and defines neighboring points. The first determination step determines whether the number of the data points exceeds the minimum threshold value. The second determination step arranges a plurality of first border symbols. The second retrieving step retrieves one seed data point from the seed list, arranges a plurality of second border symbols and defines seed neighboring points. The third determination step determines whether a data point density of searching ranges of the seed neighboring points is the same. The first termination determination step determines whether the clustering is finished. The second termination determination step determines whether to finish the method steps.	03-24-2011
20110072017	DOMAIN INDEPENDENT SYSTEM AND METHOD OF AUTOMATING DATA AGGREGATION - A computer automated method of aggregating data includes the steps of inputting a set of user-defined instructions into a computer database system, inputting a user query into the computer database system, mining the computer database system for data relevant to the user query, creating a data set comprising said data relevant to the user query, and aggregating data in the data set using domain metrics selected based on any of predefined and configurable rules and past user usage.	03-24-2011
20110072018	HIERARCHICAL ADMINISTRATION OF RESOURCES - A method and system for administering assets in a hierarchical manner is provided. A plurality of assets (e.g., computing resources, servers) are provided. A system administrator can create asset groups and administrative groups. One or more assets can be assigned to an asset group. One or more asset groups can be assigned to an administrative group. Accordingly, a user that is assigned to an administrative group has the capability to manage the assets assigned to the user's administrative group.	03-24-2011
20110072019	DOCUMENT MANAGING APPARATUS, DOCUMENT MANAGING METHOD, AND STORAGE MEDIUM - An object list LO in which information of each object included in a structured document has been collected in a list format is formed. Objects in which a distance in the vertical direction of a document is equal to a threshold value or less are included in one object group and the objects in the object group G are grouped as one group. After that, in the case where a length in the horizontal direction of each of circumscribed rectangles of two or more objects included in the object group G is equal to a length in the vertical direction or more and a length in the horizontal direction of at least one of the two or more objects is smaller than a threshold value, a block reforming process is executed. In the block reforming process, among the objects in the object group G, the objects in which the distance in the horizontal direction is equal to the threshold value or less are grouped as one object group GC.	03-24-2011
20110078143	Mechanisms for Privately Sharing Semi-Structured Data - Mechanisms are provided for anonymizing data comprising a plurality of graph data sets. The mechanisms receive input data comprising a plurality of graph data sets. Each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets. The mechanisms perform clustering on the graph data sets to generate a plurality of clusters. At least one cluster of the plurality of clusters comprises a plurality of graph data sets. Other clusters in the plurality of clusters comprise one or more graph data sets. The mechanisms also determine, for each cluster in the plurality of clusters, aggregate properties of the cluster. Moreover, the mechanisms generate, for each cluster in the plurality of clusters, pseudo-synthetic data representing the cluster, from the determined aggregate properties of the clusters.	03-31-2011
20110078144	HIERARCHICAL SEQUENTIAL CLUSTERING - Embodiments of the invention provide systems and methods for analyzing sequential data. Analyzing the sequential data can include grouping or clustering data that are similar in some way, e.g., similar ranges of quantities, similar categories, etc. More specifically, a method for hierarchical clustering of sequential data can comprise creating a dotplot of the sequential data. The dotplot can represent a plurality of sequences within the sequential data. A number of clusters represented by the plurality of sequences can be initialized, e.g., one cluster per sequence. A pair of sequences of the plurality of sequences having a longest sequential match can be identified, e.g., based on a line fitting technique, and merged into a single cluster. Identifying a pair of sequences of the plurality of sequences having a longest sequential match and merging the identified pair of sequences into a single cluster can be repeated until a single cluster remains.	03-31-2011
20110078145	Automated Patient/Document Identification and Categorization For Medical Data - A method, including receiving a data source selection from a user or software application, the data source including medical information of a plurality of patients, receiving, from the user or software application, a data pattern that is related to a concept to be explored in the data source, querying the data source to find information that approximately matches the data pattern; and receiving the information from the data source, wherein the information includes unstructured data, assigning a classification to individual parts of the information based on the part's relationship to the data pattern, and outputting the classified information to the user or software application.	03-31-2011
20110082861	MEDIA ASSET USAGE BY GEOGRAPHIC REGION - Media asset usage by geographic region is described. In embodiments, media asset interaction data is received from user devices, where the media asset interaction data corresponds to a media asset and identifies the media asset when recently played at any of the user devices. Geographic location data that corresponds to each of the user devices is also received. The geographic location data and the media asset interaction data that corresponds to the media asset are aggregated, and a geographic density map is generated as a visual indication of aggregated locations and interactions with the media asset in a geographic region.	04-07-2011
20110082862	Identification Disambiguation in Databases - Various systems and methods are provided for identification disambiguation in databases. In one embodiment, a system includes an approximate structural equivalence (ASE) analyzer including logic that obtains a set of records from a database; logic that determines a knowledge homogeneity score (KHS) for a pair of records in the set of records; and logic that determines a condition of ASE for the pair of records based upon the KHS and a predefined KHS threshold. In another embodiment, a method includes determining a plurality of references shared by at least two records in a set of records; determining a weighting value for each shared reference; and determining a KHS for each pair of records in the set of records based upon at least one reference shared by the pair of records and the weighting value corresponding to the at least one shared reference.	04-07-2011
20110087665	CLIENT PLAYLIST GENERATION - Client playlist generation is described. In embodiments, relationships between media assets are determined to identify similar media assets that can be included in an automatic playlist of the similar media assets. Projection vectors of the asset-to-asset relationships can be generated for each of the media assets, where a projection vector for a media asset identifies the similar media assets. The projection vectors are then communicated to a client device that utilizes the projection vectors to generate the automatic playlist for any one of the media assets that is selected as a starting media asset of the automatic playlist.	04-14-2011
20110087666	SYSTEMS AND METHODS FOR SUMMARIZING PHOTOS BASED ON PHOTO INFORMATION AND USER PREFERENCE - Systems and methods for generating a summary of photos from a plurality of received photos are described. The received photos are classified according to predefined attributes. Two or more of the categories are selected, and a ratio value is received from a user relating to the two or more of the categories. Photos are selected from among the photos in the two or more categories based on the specified ratio and based on sorting the received photos according to time information. The selected photos comprising the summary of photos are displayed.	04-14-2011
20110087667	METHODS AND SYSTEMS FOR A GEOGRAPHICALLY DEFINED COMMUNICATION PLATFORM - Methods and systems are described for a geographically defined platform. In one embodiment, a block is divided into one or more partitioned blocks comprising geographically proximate street addresses. Residents whose street addresses are located within the same partitioned block may contribute and view resident-generated content through a spatial platform. Further, contiguous blocks may elect to combine with each other and a partitioned block may elect to separate from the larger block that comprises it.	04-14-2011
20110093463	METHOD AND SYSTEM FOR PROJECTING AND INJECTING INFORMATION SPACES - An approach is provided for managing projection and injection operations on information spaces with respect to their information content. An information space projection module receives a query to project a first information space from a second information space. In response to the query, the module extracts a subset of information content from the second information space by using a partitioning function. The module also extracts a subset of rules from the second information space by using the partitioning function. The module then creates the first information space using the extracted subset of information content and the extracted subset of rules while maintaining a link between the first and the second information spaces. An information space injection module enables further injection of the first information space back into the second information space.	04-21-2011
20110093464	SYSTEM AND METHOD FOR GROUPING MULTIPLE STREAMS OF DATA - A document clustering system and method of assigning a document to a cluster of documents containing related content are provided. Each cluster is associated with a cluster summary describing the content of the documents in the cluster. The method comprises: determining, at a document clustering system, whether the document should be grouped with one or more previously created cluster summaries, the previously created cluster summaries being stored in a memory in a B-tree data structure; and if it is determined that the document should not be grouped with the one or more previously created cluster summaries, then creating, at a document clustering system, a cluster summary based on the content of the document and storing the created cluster summary in the B-tree data structure.	04-21-2011
20110093465	PRODUCT CLASSIFICATION SYSTEM - Data is analyzed by a computer for the automated creation of a new data structure for information technology objects. The objects represent technical components from the mechanical engineering sector or the electrical industry and are assigned to a company. The objects to be structured are captured and then subjected to a parsing. Technical relationships are then created between the parsed objects to construct technical metrics. The data structure is derived from the technical metrics.	04-21-2011
20110093466	HEURISTIC EVENT CLUSTERING OF MEDIA USING METADATA - Event clusters are created based on a first metadata and second metadata of the electronic document. The event clusters are associated with an event identifier and each electronic document is associated with the event identifier of its corresponding event cluster. A user may then browse or otherwise access the electronic documents based on the event identifier.	04-21-2011
20110099168	Providing Increased Quality of Content to a User Over Time - A method for increasing quality of content provided to a user. Communities of practice a user is associated with are determined based on login data. A corresponding set of tags is retrieved for each of the communities of practice. All corresponding sets of tags are aggregated to define a role for the user. A personal set of tags associated with the user is retrieved. The personal set of tags is added to the aggregate of all corresponding sets of tags to create a new set of tags. A context of the user in the particular task is recorded. The new set of tags is filtered based on the context to create a sub-set of tags. A defined number of tag aware information sources are queried using the sub-set of tags. Content is received from the defined number of tag aware information sources based on the query. The content is outputted.	04-28-2011
20110099169	METHOD AND SYSTEM FOR CLUSTERING TRANSACTIONS IN A FRAUD DETECTION SYSTEM - A method of determining a clustering metric includes receiving a first set of transactions and a second set of transactions. For transaction i of the first set and transaction j of the second set, the method includes (a) determining an intersection set, (b) determining a union set; (c) computing a common linkage between transaction i and transaction j equal to the intersection set divided by the union set, and (d) incrementing index j and repeating steps (a)-(c). The method also includes (e) summing the common linkages between transaction i and the transactions of the second set, (f) normalizing the sum of the common linkages by a number of the second set, and (g) incrementing index i and repeating steps (a)-(f). The method further includes (h) summing the normalized common linkages and (i) normalizing the sum of the normalized common linkages by a number of the first set.	04-28-2011
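The steps (a) through (i) in the entry above can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the patented implementation: it assumes each transaction is represented as a set of attribute values, and it reads "the intersection set divided by the union set" as a ratio of set sizes. The names `common_linkage` and `clustering_metric` are hypothetical.

```python
def common_linkage(a, b):
    """Steps (a)-(c): size of the intersection divided by size of the union.

    Assumption: a transaction is modelled as a set of attribute values.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def clustering_metric(first, second):
    """Steps (d)-(i): average the common linkage over both transaction sets."""
    if not first or not second:
        return 0.0
    total = 0.0
    for ti in first:
        # (e) sum the linkages between transaction i and every transaction j
        row_sum = sum(common_linkage(ti, tj) for tj in second)
        # (f) normalize by the number of transactions in the second set
        total += row_sum / len(second)
    # (h)-(i) sum the normalized linkages and normalize by the first set size
    return total / len(first)


first = [{"ip1", "card1"}, {"ip2", "card2"}]
second = [{"ip1", "card1"}, {"ip1", "card3"}]
metric = clustering_metric(first, second)
```

Under these assumptions the metric lies between 0 (no shared attributes) and 1 (every pair of transactions identical).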
20110099170	DATABASE LOAD ENGINE - The invention described herein provides a load engine and method for efficiently accomplishing mass conversions of customer data into an existing customer database, such as an IBM® Websphere® Customer Center (WCC). In particular, the method incorporates existing business rules for validating new customer data and for creating tables for the new customer data, creates load files for the new customer data, and provides a means for running multi-threaded data loads of the new customer data tables onto an existing customer database.	04-28-2011
20110099171	METHOD FOR CONSTRUCTING AND REVISING ROAD MAPS IN A DATABASE FOR A VEHICLE - A method for constructing and revising road maps in a vehicle map database using vehicle location signals to provide traffic flow information for recognized vehicle patterns from past vehicle travel. The method includes identifying vehicle travel segments as a series of exemplar points from the location signals. Exemplar points in each travel segment are eliminated to define the travel segment by a beginning exemplar point and an ending exemplar point. A potential ending exemplar point may be redefined if an average location of the exemplar points from a line connecting the beginning point and the potential ending point is outside of a threshold distance. The travel segments are stored in a database, where each stored travel segment includes a travel time. The method compares new vehicle travel segments to the stored vehicle travel segments to identify a match, and then revises the vehicle travel time for the stored travel segments.	04-28-2011
20110106801	SYSTEMS AND METHODS FOR ORGANIZING DOCUMENTED PROCESSES - Business Process Management (BPM) to enterprises having business processes documented in multiple representations. Embodiments of the invention reconcile and organize documented information about processes into groups that convey inter-process similarity. The discovered knowledge can be used by embodiments of the invention for many applications to find process clusters that significantly boost performance.	05-05-2011
20110106802	Fixed content storage within a partitioned content platform using namespaces - Content platform management is enhanced by logically partitioning a physical cluster that comprises a redundant array of independent nodes. Using an interface, an administrator defines one or more “tenants” within the archive cluster, wherein a tenant has a set of attributes including, for example, namespaces, administrative accounts, data access accounts, and a permission mask. A namespace is a logical partition of the cluster that serves as a collection of objects typically associated with at least one defined application. Each namespace has a private file system such that access to one namespace (and its associated objects) does not enable a user to access objects in another namespace. A namespace has capabilities (e.g., read, write, delete, purge, and the like) that a namespace administrator can choose to enable or disable for a given data account. Using the interface, an administrator for the tenant creates and manages namespaces such that the cluster then is logically partitioned into a set of namespaces, wherein one or more namespaces are associated with a given tenant. This approach enables a user to segregate cluster data into logical partitions. Using the administrative interface, a namespace associated with a given tenant is selectively configured without affecting a configuration of at least one other namespace in the set of namespaces. This architecture enables support for many top level tenants, with multiple namespaces per tenant, and wherein configuration is effected at the level of a namespace.	05-05-2011
20110106803	COMPUTER METHOD AND SYSTEM PROVIDING ACCESS TO DATA OF A TARGET SYSTEM - A computer system and method provides access to Web (global computer network) services data of a target system. The target system exposes data through multiple web services. An application interface is adapted to interface with the target system re-using existing (predefined) web services among applications for the target system. The application interface queries the exposed data. A mapping member maps between application interface query of exposed data and syntax of objects useable in a subject application. The mapping member enables the subject application to access data of object instances generated in response to the query.	05-05-2011
20110106804	FILE MANAGEMENT SYSTEM FOR DEVICES CONTAINING SOLID-STATE MEDIA - A device comprising a file management system that includes a plurality of first entries and second entries. The first entries are configured to function as a logical block address mapping table for data searching operations on data files stored in data blocks of the device, and the second entries are configured to organize the plurality of data blocks into separate logical groups.	05-05-2011
20110106805	METHOD AND SYSTEM FOR SEARCHING MULTILINGUAL DOCUMENTS - A method, system and computer program product for searching multilingual documents. The method includes the steps of: receiving a search request based on at least one language; searching a first relevant document using the search request where the first relevant document (1) is written in a first language and (2) has a first image; finding a second relevant document having a second image which is similar to the first image and is written in a second language; and searching a second relevant document using the search request.	05-05-2011
20110106806	PROCESS FOR OPTIMIZING FILE STORAGE SYSTEMS - A system includes a selection module, a file module, a storage cache, and an access module. The selection module organizes small files into groups according to a selection function, which organizes the small files based on at least one of related content of the small files, related filenames of the small files, and related access patterns of the small files. The file module uses a predetermined block size and stores, for each group, a large file containing all the small files of the group. The access module receives an access request for one of the small files from a client device and determines a large file corresponding to the one of the small files based on input from the selection module. The access module selectively reads the corresponding large file from the file module into the storage cache, and accesses the one of the small files from the large file.	05-05-2011
20110113031	CONTINUOUS AGGREGATION ON A DATA GRID - A computer-readable medium, computer-implemented method, and apparatus are provided. In one embodiment, one or more events are received, a new intermediate state of a data partition is created based on the event, and the new intermediate state is stored. The new intermediate state is reduced into a form suitable for aggregation, and an aggregate value is created by aggregating the new intermediate state with other intermediate states of other data partitions.	05-12-2011
20110113032	GENERATING A CONCEPTUAL ASSOCIATION GRAPH FROM LARGE-SCALE LOOSELY-GROUPED CONTENT - A method for generating a conceptual association graph from structured content includes grouping content nodes into one or more topically biased clusters, the content nodes comprising structured digital content and unstructured digital content, the grouping based at least in part on the connectedness of each content node member to other content node members in the same cluster. The method also includes, responsive to the grouping, tagging the content nodes with one or more descriptive concepts. The method also includes, responsive to the tagging, establishing one or more associations between the one or more concepts, the one or more associations indicating a relevance of the one or more associations, the indicating based at least in part on patterns of co-occurrence of concepts in the tagged content nodes.	05-12-2011
20110113033	Substrate processing system and data retrieval method - An operation terminal, which includes an operation terminal, when connected to a group administration apparatus for administering a plurality of substrate processing apparatuses for processing substrates, generates a data acquisition request format that sets forth retrieval conditions and types of display items classified in individual tables for the substrate processing apparatuses, and then transmits it to the group administration apparatus.	05-12-2011
20110119267	METHOD AND SYSTEM FOR PROCESSING WEB ACTIVITY DATA - The present disclosure provides a computer-implemented method of processing Web activity data. The method includes obtaining a collection of Web activity data generated by a plurality of users at a plurality of Webpages, wherein the Webpages are from a plurality of unaffiliated Websites. The method also includes extracting a plurality of search terms from the Web activity data and associating each of the plurality of search terms with a corresponding Webpage. The method also includes generating statistical data from the Web activity data based, at least in part, on the search terms, the statistical data corresponding to the online activity at one or more Webpages.	05-19-2011
20110119268	METHOD AND SYSTEM FOR SEGMENTING QUERY URLS - A computer implemented method of grouping query URLs is provided. The method includes obtaining a plurality of query URLs generated at a plurality of Websites. The method also includes analyzing the query URLs to identify similarities between the URLs. The method also includes grouping the query URLs into cases based, at least in part, on the similarities, wherein each case comprises a plurality of instances, and each instance comprises a plurality of data field values corresponding to data fields with a same data field name.	05-19-2011
20110119269	Concept Discovery in Search Logs - Described is a search (e.g., web search) technology in which concepts are returned in response to a query in addition to (or instead of) search results in the form of traditional links. Each concept generally corresponds to a set of links to content that are more directed towards a possible user intention, or information need, with respect to that query. If a user selects a concept, that concept's links are exposed to facilitate selection of a document the user finds relevant. In this manner, much more than the top ten ranked links may be provided for a query, each set of other links arranged by the concepts. Also described is processing a query log or other data store to optionally find related queries and find the concepts, e.g., by clustering a relationship graph built from the query log to find dense subgraphs representative of the concepts.	05-19-2011
20110119270	APPARATUS AND METHOD FOR PROCESSING A DATA STREAM - An apparatus and method for processing a data stream using a cluster query, are provided. Collected queries are clustered into a predetermined vector space based on a feature vector of the collected queries. In response to a query received from a user, the received query is classified to a cluster and may be replaced with a centroid query of the cluster to which the received query belongs. The data stream processing apparatus processes the centroid query and provides an approximate result to the user.	05-19-2011
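The query-replacement step in the entry above can be illustrated with a short Python sketch. It assumes that clustering has already produced one centroid query per cluster, that each centroid is a numeric feature vector, and that Euclidean distance is the similarity measure. The abstract specifies none of these details, and the function name and sample queries are hypothetical.

```python
import math


def nearest_centroid_query(feature_vector, centroids):
    """Return the centroid query whose feature vector is closest to the input.

    `centroids` maps a centroid query string to its feature vector
    (an assumed representation for illustration only).
    """
    return min(centroids, key=lambda q: math.dist(feature_vector, centroids[q]))


centroids = {
    "SELECT avg(temp) FROM stream": (0.0, 0.0),
    "SELECT max(temp) FROM stream": (10.0, 10.0),
}
# An incoming query with features (1.0, 2.0) is replaced by its cluster's centroid query.
replacement = nearest_centroid_query((1.0, 2.0), centroids)
```

Processing the centroid query in place of the original is what makes the result approximate: every query in a cluster receives the same answer.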
20110125743	METHOD AND APPARATUS FOR PROVIDING A CONTEXTUAL MODEL BASED UPON USER CONTEXT DATA - An approach is provided for providing a contextual model based upon user context data. A context modeling platform collects context data of a user from a plurality of sources, and the sources include at least online activities of the user. The context modeling platform maps the collected user context data as context data points into a multidimensional contextual model.	05-26-2011
20110125744	METHOD AND APPARATUS FOR CREATING A CONTEXTUAL MODEL BASED ON OFFLINE USER CONTEXT DATA - An approach is provided for providing a contextual model based upon user context data. A context modeling platform collects context data on offline activities of a user. The context modeling platform maps the collected user context data as context data points into a multidimensional contextual model. The context modeling platform causes, at least in part, actions that result in reception of at least one multidimensional contextual model of another user. The context modeling platform compares the multidimensional contextual model of the user with the multidimensional contextual model of the another user.	05-26-2011
20110125745	Balancing Data Across Partitions of a Table Space During Load Processing - A balancing technique allows a database administrator to perform a mass data load into a relational database employing partitioned tablespaces. The technique automatically balances the usage of the partitions in a tablespace as the data is loaded. Previous definitions of the partitions are modified after the loading of the data into the tablespace to conform with the data loaded into the tablespace.	05-26-2011
20110125746	DYNAMIC MACHINE ASSISTED INFORMATICS - A method of identifying a network of actors within a data set, the method comprising: —importing data from one or more data sources; —normalising the data in one or more fields to create a consolidated data set; —identifying one or more networks based on identical or similar instances of one or more pieces of data in the consolidated data set; and —calculating a measure of influence of one or more of the actors in an identified network.	05-26-2011
20110125747	DATA CLASSIFICATION BASED ON POINT-OF-VIEW DEPENDENCY - Data classification is used to classify input items by associating the input items with one or more classes from a set of one or more classes in a data classification system, including identifying relevant features in an input item to form a feature vector for the input item, receiving at the data classification system an indication of a point-of-view, adjusting the feature vector according to the point-of-view indication or modifying a pattern discriminator (e.g., trainer and classifier) to inline-process feature vectors depending on the provided point-of-view (e.g., SVM custom kernels), and classifying the input item into the set of classes according to the point-of-view. The point-of-view data can be introduced either as a pre-process step prior to passing it off to the pattern discrimination algorithm, or can be incorporated directly into the pattern discrimination algorithm if applicable. The pattern discrimination algorithms can detect arbitrary patterns given a similarly prepared dataset during both training and subsequent classification of unclassified documents.	05-26-2011
20110125748	Method and Apparatus for Real Time Identification and Recording of Artifacts - Methods and a system for real-time identification and recording of artifacts are disclosed. In one embodiment, a method of network database maintenance includes designating a network packet data to be stored in one of a packet capture repository and a file system resident database to indicate an artifact type, a protocol type, an application, a user-definable attribute, and a temporal session duration based on a real-time packet inspection. The method includes grouping the designated packet data in a database including packet data having a similar one of the artifact type, the protocol type, the application, the user-definable attribute and the temporal session duration. In addition, the method of network database maintenance includes indexing the database to point to a memory location of the designated packet data grouped in the database in the packet capture repository.	05-26-2011
20110125749	Method and Apparatus for Storing and Indexing High-Speed Network Traffic Data - Storing and indexing of high-speed network traffic data is disclosed. In one embodiment, a method of network database maintenance includes sequentially recording in real-time packet header and/or packet content attributes derived from network packets captured and stored in one of a packet capture repository and a file system in database units ordered by arrival of the network packet data. In addition, the method includes indexing each database unit to point to a memory location of the network packet data in one of the packet capture repository and the file system. The method also includes computing a hash value on certain input data and creating index bitmaps on each database unit to facilitate grouping of similar attributes associated with the network packet data recorded in the database units. The resulting data may then be stored in compressed and/or encrypted formats on a file system for efficiency and security.	05-26-2011
20110125750	VISUAL STRUCTURING OF MULTIVARIABLE DATA - Providing visual structuring of multivariable data sets in records includes: defining a key field for sorting the records; sorting the records by the key field to find patterns; grouping equivalent field values of the key field in an equivalent field value group; forming corresponding blocks for each equivalent field value group, displaying only one field value for each block, and masking all other field values of the block.	05-26-2011
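The sort, group, and mask sequence in the entry above can be sketched as follows. This is a minimal illustration, assuming records are dictionaries and that "masking" means blanking the repeated key value. The function name `structure_by_key` and the sample data are hypothetical and do not come from the application.

```python
from itertools import groupby
from operator import itemgetter


def structure_by_key(records, key, mask=""):
    """Sort records by `key`, then show the key value once per block of equal values."""
    ordered = sorted(records, key=itemgetter(key))  # stable sort by the key field
    rows = []
    for _, block in groupby(ordered, key=itemgetter(key)):
        for position, record in enumerate(block):
            row = dict(record)  # copy so the input records stay unchanged
            if position > 0:
                row[key] = mask  # mask every repeated value within the block
            rows.append(row)
    return rows


records = [
    {"city": "Oslo", "n": 1},
    {"city": "Bern", "n": 2},
    {"city": "Oslo", "n": 3},
]
rows = structure_by_key(records, "city")
```

Here the two "Oslo" records form one block, and only the first of them keeps its visible key value.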
20110125751	System And Method For Generating Cluster Spines - A system and method for generating cluster spines is provided. Clusters of documents are maintained. Each document is associated with a document concept that is formed from one or more terms extracted from that document. At least one cluster concept is determined for each cluster. The document concepts are ranked and at least one of the document concepts that is highly ranked is selected as the cluster concept. One or more spines are formed. Each spine includes two or more clusters that share at least one of the cluster concepts. The shared cluster concept is identified as a spine concept. One or more of the remaining clusters is assigned to the spines based on a similarity between the cluster concepts for the remaining clusters and the spine concepts for the formed spines.	05-26-2011
20110131209	KNOWLEDGE DISCOVERY TOOL RELATIONSHIP GENERATION - A system for managing a knowledge model defining a plurality of entities is provided. The system includes an extraction tool for extracting data items from disparate data sources that determines if the data item has been previously integrated into the knowledge model. The system also includes an integration tool for integrating the data item into the knowledge model that integrates the data item into the knowledge model only if the data item has not been previously integrated into the knowledge model. Additionally, a relationship tool for identifying, automatically, a plurality of relationships between the plurality of entities may also be provided. The system may also include a data visualization tool for presenting the plurality of entities and the plurality of relationships.	06-02-2011
20110137898	UNSTRUCTURED DOCUMENT CLASSIFICATION - A document classification method comprises: (i) classifying pages of an input document to generate page classifications; (ii) aggregating the page classifications to generate an input document representation, the aggregating not being based on ordering of the pages; and (iii) classifying the input document based on the input document representation. A page classifier for use in the page classifying operation (i) is trained based on pages of a set of labeled training documents having document classification labels. In some such embodiments, the pages of the set of labeled training documents are not labeled, and the page classifier training comprises: clustering pages of the set of labeled training documents to generate page clusters; and generating the page classifier based on the page clusters.	06-09-2011
20110137899	PARTITIONED LIST - Initial items can be partitioned into a plurality of partitions. The partitions can be stored in a partitioned list in computer storage. An index to the partitions can be generated. One or more initial items can be invalidated, and additional items can be appended to the partitioned list in a storage space previously occupied by the invalidated initial items. The index can be updated to omit references to the invalidated items, and to include references to the additional items. Also, a slice of an application call tree can be generated from a partition loaded into memory from a log. A representation of the slice can be displayed on a computer display, even before the entire application call tree is generated from the log.	06-09-2011
20110137900	METHOD TO IDENTIFY COMMON STRUCTURES IN FORMATTED TEXT DOCUMENTS - A computer implemented method, computer program product and data processing system, for identifying common structures shared across a plurality of formatted text documents. The common structure is presented as a sequence of landmarks, each of which has a starting and ending marker to describe the borders of text. The common structure is identified by counting the occurrences of repeating text segments across documents. Frequently co-occurring adjacent segments become candidates for markers of landmarks. In addition, styling information of textual content within a landmark is extracted and mapped to rules. The rules are used to merge and summarize content from multiple documents, which gives an advantage over current practice of content concatenation.	06-09-2011
20110137901	METHOD FOR ENHANCING FAST BACKWARD PERFORMANCE AND ASSOCIATED ELECTRONIC DEVICE - A method for enhancing fast backward performance includes: with regard to a plurality of offsets of a multimedia file, respectively storing corresponding cluster numbers into a first buffering region/buffer, where the offsets respectively correspond to different playback time points, and the cluster numbers respectively represent a plurality of clusters belonging to the multimedia file; and utilizing at least one portion of the offsets and the cluster numbers to perform a fast backward operation of the multimedia file. An associated electronic device is further provided.	06-09-2011
20110137902	Search and Retrieval of Objects in a Social Networking System - A social networking system receives a query associated with a user and, in response, provides a combined result set comprising objects stored by a social networking system that match the query. The combined result set comprises multiple result sets obtained from different search algorithms. The various objects stored by the social networking system may be of different types representing different concepts, such as user objects, application objects, event objects, location objects, group objects, and hub/page objects, any of which may be included in the result set. The objects of the result set may be further filtered, ordered, and/or grouped based at least in part on known relationships of the user with the objects, such as geographic distances between locations associated with the user and the objects.	06-09-2011
20110137903	OPTIMIZATION AND VISUAL CONTROLS FOR REGIONALIZATION - In accordance with certain embodiments of the present disclosure, a regionalization method is disclosed. The method includes inputting a data set into a computer. The method further includes utilizing the computer to perform contiguity-constrained hierarchical clustering on the data set to generate two regions and performing a fine-tuning procedure on the two regions with the computer to iteratively modify the boundaries between the two regions.	06-09-2011
20110145237	METHOD AND SYSTEM FOR MERCHANDISE HIERARCHY REFINEMENT BY INCORPORATION OF PRODUCT CORRELATION - System, method and computer program product for adjusting a representation of a merchandise hierarchy associated with an entity such as a retailer or wholesaler of products. Product correlation information discovered in that entity's customers' shopping records are obtained and incorporated into an existing merchandise hierarchy with a constraint on the consistency with the existing hierarchy.	06-16-2011
20110145238	Technique for Fast and Efficient Hierarchical Clustering - A fast and efficient technique for hierarchical clustering of samples in a dataset includes compressing the dataset to reduce a number of variables within each of the samples of the dataset. A nearest neighbor matrix is generated to identify nearest neighbor pairs between the samples based on differences between the variables of the samples. The samples are arranged into a hierarchy that groups the samples based on the nearest neighbor matrix. The hierarchy is rendered to a display to graphically illustrate similarities or differences between the samples.	06-16-2011
20110145239	GROUPING OF COMPUTERS IN A COMPUTER INFORMATION DATABASE SYSTEM - A computer information database system manages computer profile data for a set of computers. A profile group managing server coupled to the database manages the database such that there is a multiple node tree structure of groups for the set of computers in which each node is a group level and a top level is a root, based upon primary grouping criteria that correspond to selected computer profile data. Included in a database mapping table are fields that correspond to ranges of values for computer profile data of interest corresponding to primary grouping criteria including ranges that extend between a selected high and a selected low value. The ranges for any or all of the grouping criteria may be altered. The data in the database can be manipulated to produce summaries and reports of attributes of the computers in a given group.	06-16-2011
20110145240	Organizing Annotations - A method, a system and a computer program of organizing annotations are disclosed. The method includes receiving an annotation, accessing an annotation repository and accessing a reference repository. The annotation repository includes stored annotation units. The reference repository includes stored references corresponding to the stored annotation units. The method further includes generating a reference corresponding to the annotation and initializing the reference. The method further includes recursively parsing the annotation into annotation units and comparing the parsed annotation units with the stored annotation units. The method further includes populating the reference with appropriate stored references and generating new reference in response to the comparison. The method also includes updating the annotation repository in response to the comparison. Also disclosed are a system and a computer program for organizing annotations.	06-16-2011
20110145241	Reducing Overheads in Application Processing - A method, a system and a computer program of reducing overheads in multiple applications processing are disclosed. The method includes identifying resources interacting with each of the applications from a set of applications and grouping the applications from the set of applications, resulting in at least one application cluster, in response to the identified resources. The method further includes assigning an agent corresponding to each of the identified resources and initializing the agent corresponding to each of the identified resources. The method further includes identifying parameters associated with the identified resources, pre-processing the identified parameters for each of the identified resources, and also includes selecting a clustering means for the clustering. The method further includes computing the application clusters using the selected clustering means and the identified parameters, and also includes sharing the agents corresponding to each of the identified resources interacting with the applications in the at least one application cluster. Also disclosed are a system and a computer program for reducing overheads in multiple applications processing.	06-16-2011
20110145242	Intelligent Redistribution of Data in a Database - Embodiments of the present invention include methods, systems and computer program products. The embodiments of the present invention intelligently distribute data files within a database based upon predetermined conditions. In one embodiment, the present invention includes a computer-implemented method including, classifying a data set in response to metadata corresponding to one or more data files located on a single database; and creating a data file topology comprising a data file identifier, a data file location and a data file type. The method may also include receiving a predetermined rule directory comprising a set of features corresponding to one or more file systems; and in response to the data file topology and the predetermined rule directory, reorganizing the data set such that at least a portion of the data set is moved to one of a set of new file systems having a predetermined optimized characteristic.	06-16-2011
20110145243	Sharing of Data Across Disjoint Clusters - Methods and devices are provided for sharing data across two or more different clusters. An operating system (OS) in a cluster checks a metadata record of a file system of a shared device to retrieve path group identifiers (PGIDs). A control unit list of the shared device is checked to retrieve PGIDs that are active on the shared device. An input/output supervisor (IOS) record in a couple dataset is checked to retrieve PGIDs in the cluster. The metadata record, control unit list, and IOS record are compared, and if a PGID is found in the metadata record that is not in the IOS record and if the found PGID is not in the control unit list, the found PGID is not active on the shared device. The found PGID of the different cluster is removed from the metadata record, and members of the cluster can read from and write to the file system.	06-16-2011
20110145244	MULTI-DIMENSIONAL HISTOGRAM METHOD USING MINIMAL DATA-SKEW COVER IN SPACE-PARTITIONING TREE AND RECORDING MEDIUM STORING PROGRAM FOR EXECUTING THE SAME - The present disclosure relates to a multi-dimensional histogram method using a minimal data-skew cover in a space-partitioning tree, which is used to estimate the selectivity of queries, that is, the sizes of query results, and a recording medium storing a program for executing the multi-dimensional histogram method. In the multi-dimensional histogram method, a Database (DB) system receives information required to generate a histogram from an outside of the DB system, and then constructs a space-partitioning tree based on the information required to generate a histogram. The DB system constructs a multi-dimensional histogram based on a minimal data-skew cover in the space-partitioning tree. When the DB system receives a query from the outside, the DB system calculates the estimate of the selectivity for the query by using the multi-dimensional histogram. Further, the present disclosure includes a recording medium storing a program for executing the multi-dimensional histogram method.	06-16-2011
20110145245	ELECTRONIC DEVICE AND METHOD FOR PROVIDING INFORMATION USING THE SAME - An electronic device and method for providing information using the same are provided. According to an embodiment, the electronic device acquires relationships between or among people or assigns relationships between or among people and manages data regarding the people based on the relationships.	06-16-2011
20110145246	DESK-TOP, STREAM-BASED, INFORMATION MANAGEMENT SYSTEM - A stream-based document storage and retrieval system accepts documents that are in diverse formats and come from diverse applications, automatically creates document model objects describing these documents in a consistent format and associating time stamps with the documents to automatically create a main stream in chronological order. The stream, or sub-streams meeting selected search criteria, are displayed in a variety of forms, including a receding, partly overlapping stack with aids that facilitate user interaction.	06-16-2011
20110145247	INTERPRETING LOCAL SEARCH QUERIES - A search query may be interpreted as a number of possible interpretations, and each interpretation may be explored before the results of the search are sent to a user. In one embodiment, a device may split the search query into partitions. Each of the partitions may be submitted, as a search query, to search repositories. Confidence scores based on the results returned from the repositories may be used to determine a measure of confidence of the repository in the search query interpretation.	06-16-2011
20110145248	System and Method to Determine the Validity of an Interaction on a Network - A computer implemented method classifies a user interaction on a network. User interaction data relating to a user interaction on a network is accessed. The user interaction data comprises an aggregate measure data or a unique feature data. The user interaction data is processed to generate a score for the user interaction and determines a classification of the user interaction based on the score.	06-16-2011
20110153603	TIME SERIES STORAGE FOR LARGE-SCALE MONITORING SYSTEM - Methods and apparatus are described for collecting and storing large volumes of time series data. For example, such data may comprise metrics gathered from one or more large-scale computing clusters over time. Data are gathered from resources which define aspects of interest in the clusters, such as nodes serving web traffic. The time series data are aggregated into sampling intervals, which measure data points from a resource at successive periods of time. These data points are organized in a database according to the resource and sampling interval. Profiles may also be used to further organize data by the types of metrics gathered. Data are kept in the database during a retention period, after which they may be purged. Each sampling interval may define a different retention period, allowing operating records to stretch far back in time while respecting storage constraints.	06-23-2011
20110153604	EVENT-LEVEL PARALLEL METHODS AND APPARATUS FOR XML PARSING - Embodiments of techniques and systems for parallel XML parsing are described. An event-level XML parser may include a lightweight events partitioning stage, parallel events parsing stages, and a post-processing stage. The events partition may pick out event boundaries using single-instruction, multiple-data instructions to find occurrences of the “<” character, marking event boundaries. Subsequent checking may be performed to help identify other event boundaries, as well as non-boundary instances of the “<” character. During events parsing, unresolved items, such as namespace resolution or matching of start and end elements, may be recorded in structure metadata. This structure metadata may be used during the subsequent post-processing to perform a check of the XML data. If the XML data is well-formed, individual sub-event streams formed by the events parsing processes may be assembled into a flat result event stream structure. Other embodiments may be described and claimed.	06-23-2011
20110153605	System and method for aggregating and curating media content - A method and system for aggregating and curating a plurality of media content pieces. The method including the steps of aggregating a plurality of media content pieces from various sources and curating each of the plurality of media content pieces to determine metadata associated with each of the plurality of media content pieces including an ideology level for each of the media content pieces. Each of the media content pieces and determined metadata are stored in a database. The stored/aggregated media content pieces are then made accessible to a user using a search feature that selects at least one of the plurality of media content pieces based upon metadata stored in the database.	06-23-2011
20110153606	APPARATUS AND METHOD OF MANAGING METADATA IN ASYMMETRIC DISTRIBUTED FILE SYSTEM - Provided are an apparatus and a method which can be easily implemented with flexibility enabling distributing all metadata of trees and files in an asymmetric distributed file system. The apparatus includes: a metadata storage unit storing metadata corresponding to a part of partitions of a virtual metadata address space storing metadata for directories and/or files for each of the partitions; and a metadata storage management unit controlling the metadata so that the metadata are stored in the metadata storage unit and manages a master map including information on the part of the partitions. Since all directories and files can be distributed to a plurality of metadata servers without a limitation, it is possible to prevent a load from being concentrated on a predetermined metadata server. Metadata roles of the metadata servers are very simply readjusted and as a result, the load can be easily distributed in a partition level.	06-23-2011
20110153607	COMMUNITY-DRIVEN APPROACH FOR SOLVING THE TAG SPACE LITTERING PROBLEM - In order to provide an improved tagging-based search method, the method includes one-click actions for searching as well as a “one view” indicator telling the user which search would be the most effective or most important search—by displaying a “search cloud”, called search bag.	06-23-2011
20110161323	Information Processing Device, Method of Evaluating Degree of Association, and Program - There is provided an information processing device including: a storage unit that stores information element data defining a plurality of information elements; an information acquisition unit that acquires an information set having a referential relationship with each other from an information source accessible through a communication network; a classification unit that classifies information included in the information set acquired by the information acquisition unit into information of a first class corresponding to an information element defined by the information element data and information of a second class other than the information of the first class; and an evaluation unit that evaluates a degree of association between information elements respectively corresponding to two or more information of the first class based on a referential relationship between the information of the first class and the information of the second class in the information set.	06-30-2011
20110161324	COMMUNITY-BASED PARENTAL CONTROLS - According to a general aspect, a method includes maintaining rating groups, each rating group providing a rating for content compiled based on information received from a user evaluating the content. The method also includes receiving, from a first user, a selection of a first rating group, from among the rating groups, to be applied to a set of users associated with the first user. The method also includes receiving, from a user, a request for a piece of content from the content. The method also includes determining that the user from which the request was received belongs to the set of users associated with the first user. The method also includes, based upon the determination that the user belonged to the set of users associated with the first user, accessing information associated with the first rating group and determining whether the first rating group includes a rating for the requested piece of content. The method also includes determining whether or not to provide information to the requesting user conditioned on the indication or absence of a rating for the requested piece of content within the first rating group.	06-30-2011
20110167063	TECHNIQUES FOR CATEGORIZING WEB PAGES - Web pages are efficiently categorized in a data processor without analyzing the content of the web pages. According to at least one embodiment, data is maintained that represents sample URLs grouped into a plurality of clusters. The sample URLs of a cluster are used to produce a URL regular expression pattern (“URL-regex”) that differentiates the sample URLs of the cluster from the sample URLs of other clusters and that covers at least a specified percentage of the sample URLs in the cluster. The process of producing a URL-regex is repeated for each of the clusters producing a URL-regex for each cluster. Web pages are then categorized into one of the clusters by determining which of the URL-regex patterns produced for the clusters match URLs that refer to the web pages. Thus, a web page may be categorized based on a URL that refers to the web page without having to obtain and analyze the content of the web page.	07-07-2011
20110167064	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.	07-07-2011
20110167065	DATA GENERATING APPARATUS, INFORMATION PROCESSING APPARATUS, DATA GENERATING METHOD, INFORMATION PROCESSING METHOD, DATA GENERATING PROGRAM INFORMATION PROCESSING PROGRAM AND RECORDING MEDIUM - A data generating apparatus includes an acquiring unit that acquires text data (name data) related to a name associated with position information; a classifying unit that using the acquired position data, classifies the name data according to given regions; an integrating unit that integrates neighboring regions such that the total data size of the name data included in regions to be integrated does not exceed a predetermined given data size; a storage unit that groups the name data according to integrated regions and stores the grouped name data as a name dictionary to be used in both a facility search process and a map display process; and an extracting unit that from the classified name data, extracts the name data common to regions of a given number or more, where the storage unit groups and stores the common name data as a common name dictionary different from the name dictionary.	07-07-2011
20110167066	CONTENT ITEM REVIEW MANAGEMENT - A content item review management apparatus (	07-07-2011
20110173197	METHODS AND APPARATUSES FOR CLUSTERING ELECTRONIC DOCUMENTS BASED ON STRUCTURAL FEATURES AND STATIC CONTENT FEATURES - Exemplary methods and apparatuses are provided which may be implemented using one or more computing devices to allow for super clustering of clusters of electronic documents based, at least in part, on structural and static content features.	07-14-2011
20110173198	RECOMMENDATIONS BASED ON RELEVANT FRIEND BEHAVIORS - Embodiments are directed towards determining dependent interest affinity values between users to identify users that may mirror interests and thereby have an increased probability of becoming friends. A plurality of tracked online activities are classified into a plurality of interest categories, and used to determine weighted scores for each interest based on a quantity and quality of related activities for the interest. A proportional score for each interest is also determined and used with the weighted scores to generate dependent interest affinities between pairs of users. Interest indices are obtained and rank ordered for a given user and another user based on relevant dependent interest affinities. The resulting interest indices may be filtered based on a variety of criteria. At least some information about the related other users may be displayed to the given user based on the rank ordering, as possible mirrored friends.	07-14-2011
20110173199	COMPUTER SYSTEM PERFORMANCE ANALYSIS - This invention relates to a method and device for computer system performance analysis. All instructions are split into clusters based on significant offset gaps in top-down processing steps. Comments on instruction clusters can be generated automatically or can be edited manually. The comments can be shared among users for the achievement of portability. Significant clusters can be recognized as hotspots based on predetermined metrics.	07-14-2011
20110179028	AGGREGATING DATA FROM A WORK QUEUE - One or more techniques and/or systems are disclosed herein for aggregating web-based data stored in a distributed data store so that it can be retrieved in a first-in, first-out (FIFO) manner. A unique aggregation key is generated for respective one or more data generated from a web-based event, where the one or more data are added to the distributed data store, and the aggregation key corresponds merely to the data generated from the web-based event. The one or more data from the web based event is aggregated in a FIFO queue and stored in a same partition of the distributed data store, based on the aggregation key.	07-21-2011
20110179029	EXPERIENCE INFORMATION PROCESSING APPARATUS AND METHOD FOR SOCIAL NETWORKING SERVICE - An experience information processing apparatus for a social networking service, includes an ontology unit for providing a social ontology including social connection information and location information of a user and a service ontology including web service information, service location information and tag information. Further, the experience information processing apparatus includes an experience information management unit for extracting experience information content having location information from a plurality of mobile devices, classifying the extracted experience information content using the ontology unit to establish an experience information database, and searching the established experience information database based on the location information in response to a request from the mobile device to provide a social media service by linking the social connections information, the location information, and the tag information. Furthermore, the experience information processing apparatus includes an experience information storage unit for storing the experience information database.	07-21-2011
20110179030	METHOD AND APPARATUS FOR INDEXING SUFFIX TREE IN SOCIAL NETWORK - A method for indexing a suffix tree in a social network includes: scanning an input string and dividing the string into partitions each having a common prefix; performing no-merge suffix tree indexing on the divided partitions; storing information on the partitions on which no-merge suffix tree indexing is performed; storing suffix nodes of the no-merge suffix tree; and establishing a prefix tree. The performing no-merge suffix tree indexing includes: generating a set of suffixes having the common prefix in the input string; generating a suffix set from the set of suffixes and storing the suffix set; and building the suffix set as a sub-tree.	07-21-2011
20110179031	CONFIGURATION INFORMATION MANAGEMENT DEVICE, DISTRIBUTED INFORMATION MANAGEMENT SYSTEM, AND DISTRIBUTED INFORMATION MANAGEMENT METHOD - A configuration information management device includes a configuration information storage unit for storing a configuration item indicative of information about a target of management, and an item relationship indicative of information about a connection between configuration items independently of a different configuration information management device. When a request to enter a cluster is accepted that is a group of a configuration item and an item relationship connected together, the configuration information management device determines a destination to store the cluster, and controls to cause the configuration information storage unit or the different configuration information management device to store the cluster. When a search request to search for a configuration item or an item relationship is accepted, the configuration information management device specifies a place where a cluster containing the target of the search is stored, and retrieves the configuration item or the item relationship targeted for the search from the storage place of the cluster.	07-21-2011
20110179032	CONCEPTUAL WORLD REPRESENTATION NATURAL LANGUAGE UNDERSTANDING SYSTEM AND METHOD - A Natural Language Understanding system is provided for indexing of free text documents. The system according to the invention utilizes typographical and functional segmentation of text to identify those portions of free text that carry meaning. The system then uses words and multi-word terms and phrases identified in the free text to identify concepts in the free text. The system uses a lexicon of terms linked to a formal ontology that is independent of a specific language to extract concepts from the free text based on the words and multi-word terms in the free text. The formal ontology contains both language independent domain knowledge concepts and language dependent linguistic concepts that govern the relationships between concepts and contain the rules about how language works. The system according to the current invention may preferably be used to index medical documents and assign codes from independent coding systems, such as, SNOMED, ICD-9 and ICD-10. The system according to the current invention may also preferably make use of syntactic parsing to improve the efficiency of the method.	07-21-2011
20110179033	MULTI-PASS DATA ORGANIZATION AND AUTOMATIC NAMING - A method and a system to organize a data set into groups of data subsets in multiple passes using different parameters and to automatically name the groups is disclosed. For example, a data set is retrieved in accordance with a search query submitted by a user. The data set is organized into clusters based on a statistic(s) of the data set. The data set is then organized into groups of data subsets based on an attribute(s) indicated by the data set. Each of the groups is automatically named based on a property shared by data units of the group. The name(s) of a group may be mined from the data units of the group, retrieved from a structure that maps to attribute values indicated by the data units of the group, etc.	07-21-2011
20110184948	MUSIC RECOMMENDATION METHOD AND COMPUTER READABLE RECORDING MEDIUM STORING COMPUTER PROGRAM PERFORMING THE METHOD - A music recommendation method and a computer readable recording medium storing a computer program performing the method are provided. In the music recommendation method, music items and a rating data matrix comprising ratings and user IDs are first provided. Then, the ratings of each music item are classified into positive ratings and negative ratings. Thereafter, a pre-processing phase comprising a frame-based clustering step and a sequence-based clustering step is performed to transform the music items into perceptual patterns. Then, a prediction phase is performed to determine an interest value of a plurality of target music items for an active user. Thereafter, the target music items are arranged into a music recommendation list in accordance with the first interest value and the second interest values, wherein the music recommendation list is a reference for the active user to select one of the target items.	07-28-2011
20110184949	RECOMMENDING PLACES TO VISIT - A method for recommending places to visit, including using a processor to provide the following steps: assembling a collection of images, wherein each image has first and second tags with the first tag corresponding to the location where the image was taken, and the second tag corresponding to subject matter of the image; clustering the images in response to the first tags into a plurality of locations; using the images in each location to produce at least one representative image of the location; using the second tags of images of each location to produce a list of representative keywords for each location; providing a query in the form of an image or subject matter, or both; and using the query in the form of an image to search among the representative images to recommend a location to visit, or using the query in the form of subject matter to search among the keywords to recommend a location to visit.	07-28-2011
20110184950	SYSTEM FOR CREATIVE IMAGE NAVIGATION AND EXPLORATION - A system and method for assisting a user in navigation of an image dataset are disclosed. The method includes receiving a user's text query, retrieving images responsive to the query from an image dataset, providing for receiving the user's selection of a first feature selected from a set of available first features via a graphical user interface, providing for receiving the user's selection of a second feature selected from a set of available second features different from the first features via the graphical user interface, and displaying at least some of the retrieved images on the graphical user interface. The displayed images are arranged, e.g., grouped, according to levels and/or combinations of levels of the user-selected first and second features.	07-28-2011
20110184951	PROVIDING QUERY SUGGESTIONS - Methods and computer-readable media are provided for determining suggested queries. A user enters a search website, and the user is identified based on a user identification. Suggested queries are determined based on a group associated with the user. This association is created by extracting queries from data logs, categorizing the queries into groups based on their respective subject matter, associating the user with one or more groups, and determining suggested queries for each group. The suggested queries are communicated for display.	07-28-2011
20110184952	Method And Apparatus For Fast Audio Search - According to embodiments of the subject matter disclosed in this application, a large audio database in a multiprocessor system may be searched for a target audio clip using a robust and parallel search method. The large audio database may be partitioned into a number of smaller groups, which are dynamically scheduled to available processors in the system. Processors may process the scheduled groups in parallel by partitioning each group into smaller segments, extracting acoustic features from the segments; and modeling the segments using a common component Gaussian Mixture model (“CCGMM”). One processor may also extract acoustic features from the target audio clip and model it using the CCGMM. Kullback-Leibler (KL) distance may be further computed between the target audio clip and each segment. Based on the KL distance, a segment may be determined to match the target audio clip; and/or a number of following segments may be skipped.	07-28-2011
20110191342	URL Reputation System - A URL reputation system may have a reputation server and a client device with a cache of reputation information. A URL reputation query from the client to the server may return reputation data along with probabilistic set membership information for several variants of the requested URL. The client may use the probabilistic set membership information to determine if the reputation server has additional information for another related URL as well as whether the classifications are inheritable from one of the variants. If the probabilistic set membership determines that the reputation server may have additional information, a query may be made to the reputation server, otherwise the reputation may be inferred from the data stored in the cache.	08-04-2011
20110191343	Computer Research Tool For The Organization, Visualization And Analysis Of Metabolic-Related Clinical Data And Method Thereof - A computer research tool for inputting, searching, displaying, and analyzing metabolic-related clinical data utilizing a novel graphical user interface (GUI) for visual-statistical data analysis and insight generation and method thereof are disclosed.	08-04-2011
20110196866	SMALL TABLE: MULTITENANCY FOR LOTS OF SMALL TABLES ON A CLOUD DATABASE - Methods and apparatus are described for partitioning native tables in a database cluster into logical tables. Each logical table is mapped into a unique portion of the native table by an intermediary server. Clients access a logical table as an ordinary, full-fledged database table through the intermediary server, which translates queries on the logical table into queries on the corresponding portion of the native table. The mapping may use the application name, logical table name, and a version number to create a native table key for each key in the logical table. A data structure storing these mappings may be stored at the intermediary server or in a native table in the database. This approach affords clients quick and flexible access to the database with better data integrity and security than native tables allow.	08-11-2011
20110196867	SYSTEM AND METHOD OF GENERATING A PLAYLIST BASED ON A FREQUENCY RATIO - Several methods and systems to generate a playlist based on a frequency ratio are disclosed. In one aspect, a method includes presenting a list of seed data to a user of a music device, selecting a portion of the seed data based on a preference of a user, and determining an identity of a primary song based on a match between the primary song and the preference of the user. The method also includes providing the user streaming access to the primary song in a database and determining a secondary song based on the primary song. A correlation between the secondary song and the primary song is determined based on an algorithm and the user is provided streaming access to the secondary song. A frequency ratio of the primary song and the secondary song is automatically adjusted in response to a selection through a selection tool.	08-11-2011
20110196868	METHODS AND APPARATUS FOR CONTACT INFORMATION REPRESENTATION - Methods and apparatus for the convenient arrangement of a user's address book according to intelligent algorithms. These intelligent algorithms, in one embodiment, take advantage of one or more of: (i) stored contact information associated with one or more users, (ii) stored geographic location information associated with the users and one or more contact entries in the user's address book, and/or (iii) stored voice and data communication information associated with the user. This algorithm arranges the entries in the user's address book, using the stored information as an input, in an intelligent manner. In other embodiments, additional information is used as an input to the contact entry arranging algorithms such as, for example, entries in a user's digital calendar. Business methods utilizing the aforementioned methods and apparatus are also disclosed.	08-11-2011
20110196869	CLUSTER STORAGE USING DELTA COMPRESSION - Storage of data segments is disclosed. For each segment, a similar segment to the segment is identified, wherein the similar segment is already managed by a cluster node. In the event the similar segment is identified, a reference to the similar segment and a delta between the similar segment and the segment are caused to be stored instead of the segment.	08-11-2011
20110196870	DATA CLASSIFICATION USING MACHINE LEARNING TECHNIQUES - Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g., associated with legal discovery are also presented. Systems, methods and computer program products for cleaning up data are also presented. Systems, methods and computer program products for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. Systems, methods and computer program products for face recognition are presented.	08-11-2011
20110202528	SYSTEM AND METHOD FOR IDENTIFYING FRESH INFORMATION IN A DOCUMENT SET - A method of identifying a fresh document in a document set is provided. The method may include obtaining a query document that is included in a document set comprising a plurality of documents. The method may also include grouping the plurality of documents into a plurality of fine clusters based on a textual similarity between the plurality of documents. The method may also include identifying a target fine cluster within the plurality of fine clusters, the target fine cluster including the query document. The method may also include ordering the documents included in the target fine cluster by time to identify the fresh document. The method may also include generating a query response that includes the fresh document.	08-18-2011
20110202529	CONFIGURATION ITEM RECONCILIATION - This document discusses, among other things, a method for reconciliation of a configuration item with a configuration management database. Properties of the configuration item are divided into a plurality of classes. Different classes correspond to properties having a different relationship with a corresponding configuration item. At least one property of the configuration item is compared to properties of configuration items in a configuration management database. Different actions are taken with respect to the configuration item based on the class of the property being compared.	08-18-2011
20110202530	Information processing device, method and program - An information processing device includes an obtaining unit that obtains a plurality of contents to which labels indicating users' subjective evaluation of the contents are assigned as metadata, a selection unit that selects labels having a high reliability in regards to evaluation of the contents among the labels assigned to the plurality of contents obtained by the obtaining unit, a calculation unit that calculates a degree of similarity between the labels selected by the selection unit, a clustering unit that clusters the labels based on the degree of similarity calculated by the calculation unit, and a storage unit that stores a cluster obtained as a result of the clustering in the clustering unit, as one label.	08-18-2011
20110202531	Tagging Digital Media - A method for tagging digital media is described. The method includes selecting a digital media and selecting a region within the digital media. The method may further include associating a person or entity with the selected region and sending a notification of the association to the person or entity or a different person or entity. The method may further include sending advertising with the notification.	08-18-2011
20110208737	APPARATUS AND METHOD FOR INCREMENTAL PHYSICAL DATA CLUSTERING - In a data storage and retrieval system wherein data arranged in nodes is stored and retrieved in pages, each page comprising a cluster of nodes, a method comprising: monitoring ongoing data retrieval to find retrieval patterns of nodes which are retrieved together and to identify changes in said retrieval patterns over time; and periodically reclustering the data nodes among said pages dynamically during usage of the data to reflect said changes, so that nodes more often retrieved together are migrated to cluster together and nodes more often required separately are migrated to cluster separately, thereby to keep small an overall number of page accesses of said data storage and retrieval system during data retrieval despite dynamic changes in patterns of data retrieval.	08-25-2011
20110208738	Method for Determining an Enhanced Value to Keywords Having Sparse Data - A method for associating sparse keywords with non-sparse keywords. The method comprises determining from metrics of a plurality of keywords a list of sparse keywords and non-sparse keywords; generating a similarity score for each sparse keyword with respect to each non-sparse keyword; associating a sparse keyword with a non-sparse keyword; and storing the association between the non-sparse keyword and the sparse keyword in a database.	08-25-2011
20110208739	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR CONDITIONALLY PERFORMING A QUERY INCLUDING AN AGGREGATE FUNCTION - In accordance with embodiments, there are provided mechanisms and methods for conditionally performing a query including an aggregate function. These mechanisms and methods for conditionally performing a query including an aggregate function can limit performance of queries including aggregate functions based on a number of records associated with such performance of such aggregate functions. The ability to limit performance of queries including aggregate functions can enable performance quality of a computer system to be maintained.	08-25-2011
20110208740	ASSOCIATING DATA WITH R-SMART CRITERIA - Organizing information and/or contacts using emotient properties are provided. A method can include associating respective emotient attributes with contacts utilizing respective index values, and grouping at least two contacts of the contacts into a group based on at least one index value of the index values. The method can further include defining an event by at least one of a time interval or a scenario related to the time interval; receiving, during the event, a message from a contact of the contacts including information associated with the contact; and determining, based on the information, a membership of the contact in the group. Another method can include receiving an emotient property of a contact associated with information, and associating the emotient property with the information utilizing an index value. Further, the method can include providing at least a portion of the information based on the index value.	08-25-2011
20110213775	Database Table Look-up - Techniques for database table look-up are provided. The techniques include storing one or more column attributes of a database table in a data structure, wherein the data structure also comprises a record identification (RID) column of a table, one or more predicate columns corresponding to the RID column, and a sequence number column that is associated with one or more updated records, generating a key using one or more portions from one or more of the one or more predicate columns, using the key to partition the data structure, wherein partitioning the data structure comprises partitioning the one or more predicate columns for evaluation, and evaluating the one or more predicate columns against the data structure for each matching predicate column-data structure partition.	09-01-2011
20110213776	SYSTEM FOR BROWSING THROUGH A MUSIC CATALOG USING CORRELATION METRICS OF A KNOWLEDGE BASE OF MEDIASETS - A system and method to navigate through a media item catalog and generate recommendations using behavioral metrics such as correlation metrics (FIGS.	09-01-2011
20110218994	KEYWORD AUTOMATION OF VIDEO CONTENT - A system and associated method for automatically processing keywords for video content. The video content contains image frames and an audio stream. An image pattern table for image patterns from the image frames and a word pattern table for word patterns from the audio stream are generated by use of respective pattern names provided by pattern recognition tools. Each pattern is associated with a respective count indicating a number of appearances of each pattern. A respective weight of each pattern is calculated as a relative frequency of each pattern. The image pattern table and the word pattern table are merged to generate a keyword list. A predefined number of the most frequently appearing patterns are selected by examining the respective weight of each pattern and metadata associated with the video content are updated to utilize pattern names of the selected patterns as keywords for web searches.	09-08-2011
20110218995	DEFERRING CLASSIFICATION OF A DECLARED RECORD - A records management system classifies records according to a file plan. Records are declared, and then classified. Some records have an initially indeterminate classification and classification is deferred, either by request or due to a lack of sufficient information to classify the record according to the file plan. Unclassified records are placed into a temporary container. At some time while in the temporary container a classification event occurs with a given record which allows the records management system to classify the record and place it into a container corresponding to its classification.	09-08-2011
20110218996	APPARATUSES AND METHODS FOR SHARING CONTENTS - An apparatus and method for sharing contents are provided. The apparatus and method may store contents; receive a selection signal for content selected from among the contents; classify the selected content into groups; and generate a service code for each respective group.	09-08-2011
20110218997	METHOD AND SYSTEM FOR BROWSING, SEARCHING AND SHARING OF PERSONAL VIDEO BY A NON-PARAMETRIC APPROACH - A method for determining a predictability of a media entity portion, the method includes: receiving or generating (a) reference media descriptors, and (b) probability estimations of descriptor space representatives given the reference media descriptors; wherein the descriptor space representatives are representative of a set of media entities; and calculating a predictability score of the media entity portion based on at least (a) the probability estimations of the descriptor space representatives given the reference media descriptors, and (b) relationships between the media entity portion descriptors and the descriptor space representatives. A method for processing media streams, the method may include: applying probabilistic non-parametric process on the media stream to locate media portions of interest; and generating metadata indicative of the media portions of interest.	09-08-2011
20110218998	NAVIGATING MEDIA CONTENT BY GROUPS - Grouping media files via playlists on a computer-readable medium. One or more media files are selected according to a grouping criterion to define one or more playlists from the media files. A container group is associated with the playlists and stores values identifying each of the playlists associated with the container group along with references to each of the playlists.	09-08-2011
20110218999	SYSTEM, METHOD AND PROGRAM FOR INFORMATION PROCESSING - The index update unit analyses the information stored in a document repository to create an index for search and stores the index in a time-series divisional index storage unit and creates, from an ACL repository, an access control entry ACE in association with the index for search, which is correlation of information to be searched with access right of at least a group to which the user belongs. The ACL cache generation unit creates ACL cache data that correlates the user with access right to the information to be searched, from the ACE, and registers the ACL cache data created in an ACL cache. A search processing unit searches for an index for search in response to a request for search from said user. In case the ACL cache data correlating the user with the index for search is registered in the ACL cache, the search processing unit takes, from among the information searched, the information, reference to which is allowed for the user as a search result, based on information in the ACL cache.	09-08-2011
20110219000	SEARCH APPARATUS, SEARCH METHOD, AND RECORDING MEDIUM STORING PROGRAM - Provided is a search apparatus, a search method, and a program that can improve search speed for a document set even when an object to be searched is a large-scale document set. A search apparatus is used which includes an abstract matrix storage unit	09-08-2011
20110225154	HARVESTING RELEVANCY DATA, INCLUDING DYNAMIC RELEVANCY AGENT BASED ON UNDERLYING GROUPED AND DIFFERENTIATED FILES - Methods and apparatus teach a digital spectrum of a file representing underlying original data. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are grouped together they can be used to indicate relevance in current data under consideration on a same or different computing device. Also, relevance can be found without traditional notions of needing structured data or users initiating searching for relevance or by examining metadata/administrative information associated with the files. A dynamic relevancy agent is configured for installation on the same or different computing device to monitor events regarding the current data and to initiate the examination of relevancy. It also presents to a user suggestions of data closest to the current data. Various triggering events to undertake a relevancy examination are also described as are predetermined criteria to define relative closeness.	09-15-2011
20110225155	SYSTEM AND METHOD FOR GUIDING ENTITY-BASED SEARCHING - A system and method are provided for refining a user's query. An entity index, generated from a corpus of text documents, is provided. The entity index includes a set of entity structures, each including a plurality of terms. Each of the terms of an entity structure is a feature of the same entity. Entity structures can be retrieved from the entity index which match at least a portion of the user's query. Clusters of the retrieved entity structures are identified which have at least one of their terms in common. A cluster hierarchy is generated from the identified clusters in which nodes of the hierarchy are defined by one or more of the terms of the retrieved entity structures. At least a portion of the cluster hierarchy is presented to the user for facilitating refinement of the user's query through user selection of a node which, when formulated as a search, retrieves one or more responsive documents from the corpus of documents.	09-15-2011
20110225156	SEARCHING TWO OR MORE MEDIA SOURCES FOR MEDIA - Presented is a method for obtaining a single set of media search results from a search of media sources. The method includes providing a search query, executing a search of each of the media sources for media based on the provided search query, generating results of the searches, and consolidating the results of the searches into the single set of search results that include a list of media items with associated metadata. The method further includes organizing the media items into groups. Each of the groups includes media items that have similar content. The similarity of the content is defined by the media items' associated metadata meeting a metadata matching threshold. The method further includes filtering the groups, sorting the groups of media items with respect to each other, sorting the media items within the groups, and displaying the groups of media items.	09-15-2011
20110225157	METHOD AND SYSTEM FOR PROVIDING WEBSITE CONTENT - An exemplary embodiment of the present invention provides a method of generating Website content. The method includes generating a client profile comprising a cluster type obtained from a list of cluster types and information received from a user ID, wherein the list of cluster types is generated by processing a database of computer usage. The method includes utilizing the relevant cluster types included in the client profile to a selected Website, wherein the cluster type is used by the Website at least in part to determine the content provided by the Website.	09-15-2011
20110225158	Method and System for Abstracting Information for Use in Link Analysis - Observable data points are collected and organized into a link-oriented data set comprising nodes and links. Information is abstracted for use in link analysis by generating links between the collected data points, including deriving links and inducing links. A link can be induced by linking together a pair of nodes that satisfy a distance function. Exemplary distance functions that can be used to induce links include geospatial proximity, attribute nearness, and name similarity. Paths can be identified between selected nodes of interest through a data set operation, and nodes and/or links can be selectively included or excluded from the data set operation. The data set can be augmented with pedigree information or one or more association nodes. Link information, including a trajectory and a connected path that selectively produces or excludes one or more intermediate nodes, can be displayed and/or produced in a specified format.	09-15-2011
20110231399	Clustering Method and System - The present disclosure discloses a method and system for clustering. The method includes: vectorizing a plurality of readable files to obtain a plurality of file vectors corresponding to the multiple readable files; extracting a total characteristic vector based on the file vectors; and clustering the readable files based on a ranking result of a respective similarity degree between the total characteristic vector and each of the file vectors. The present disclosure also provides a method and system for clustering webpages. An application of the methods or systems described in the present disclosure reduces the number of times of comparison of similarity degrees between file vectors, and further reduces the resulting burden on system resources. This advantageously results in reduced usage of CPU and memory, reduced run time of clustering and improved performance of clustering.	09-22-2011
20110238664	Region Based Information Retrieval System - A region based information retrieval system improves on conventional information retrieval systems by breaking down documents into one or more region(s) and processing the additional information available at a region level of analysis. When looking at regions, it becomes possible to quickly distinguish between groups of related documents, quickly ignore or focus on certain information, track recent evolutions of documents, as well as understand the historical relationships, heritage, and versions of these documents. This is all possible whether or not the document publishers specify where the content originally came from.	09-29-2011
20110246463	SUMMARIZING STREAMS OF INFORMATION - Concepts and technologies are described herein for summarizing streams of information. A stream of information is obtained and analyzed. One or more entities are identified in the stream. The data in the stream is grouped into one or more clusters corresponding to the identified entities. The data in the clusters is summarized, and a timeline corresponding to the data in the cluster is determined. In some embodiments, a format can be selected for presentation of the summarized stream data. The data in the stream can be formatted in the selected format, and the summarized data can be presented in the selected format. In some embodiments, an update feature can be used to update the data in the summarized stream. The data in the stream can be updated, and the updated summarized stream can be formatted and presented.	10-06-2011
20110246464	KEYWORD PRESENTING DEVICE - In one embodiment, there is provided a keyword presenting device including: an extraction unit configured to extract a plurality of keywords from a browsing document; a determination unit configured to arrange keywords with spellings similar to each other among the plurality of keywords to obtain a plurality of groups of similar keywords; an integration unit configured to classify the keywords into main keywords that are titles and the other sub-keywords for each group of similar keywords, and to integrate the sub-keywords into the main keywords; and a presentation unit configured to present the main keywords to a user.	10-06-2011
20110246465	METHODS AND SYSTEMS FOR PERFORMING REAL-TIME RECOMMENDATION PROCESSING - Methods and systems are presented for recommending similar questions to one that a user has entered into a search engine. Previously-entered questions are subject to a clustering algorithm and placed into a hierarchy of clusters, with clusters set within clusters. For each cluster within the hierarchy, a representative vector, based on feature vectors of the items within the cluster, is calculated. A feature vector for the user's question is calculated and used, along with the representative vectors at each level in the hierarchy, to traverse and navigate the cluster hierarchy. When a leaf cluster is found, the items in the leaf cluster, such as the previously-entered questions, are returned to the user. A subset of items in the leaf cluster, or items from other leaf clusters within a branch cluster, can be selected based on the number of items desired to be returned.	10-06-2011
20110246466	RELEVANCE FEEDBACK ON A SEGMENT OF A DATA OBJECT - The invention relates to a method, a system (	10-06-2011
20110252032	ANALYSIS OF COMPUTER NETWORK ACTIVITY BY SUCCESSIVELY REMOVING ACCEPTED TYPES OF ACCESS EVENTS - An analysis system is described for identifying potentially malicious activity within a computer network. It performs this task by interacting with a user to successively remove known instances of non-malicious activity, to eventually reveal potentially malicious activity. The analysis system interacts with the user by inviting the user to apply labels to identified examples of network behavior; upon response by the user, the analysis system supplies new examples of network behavior to the user. In one implementation, the analysis system generates such examples using a combination of feature-based analysis and graph-based analysis. The graph-based analysis relies on analysis of graph structure associated with access events, such as by identifying entropy scores for respective portions of the graph structure.	10-13-2011
20110252033	SYSTEM AND METHOD FOR MULTITHREADED TEXT INDEXING FOR NEXT GENERATION MULTI-CORE ARCHITECTURES - A system and method for indexing documents in a data storage system includes generating a single document hash table in storage memory for a single document using an index construction in a multithreaded and scalable configuration wherein multiple threads are each assigned work to reduce synchronization between threads. The single document hash table includes partitioning the single document and indexing strings of partitioned portions of the single document to create a minor hash table for each document sub-part; generating a document level hash table from the minor hash tables; updating a stream level hash table for the strings which maps every string to a global identifier; and generating a term reordered array from the document level hash table.	10-13-2011
20110252034	MEASURING ENTITY EXTRACTION COMPLEXITY - A named entity input is received and a target sense for which the named entity input is to be extracted from a set of documents is identified. An extraction complexity feature is generated based on the named entity input, the target sense, and the set of documents. The extraction complexity feature indicates how difficult or complex it is deemed to be to identify the named entity input for the target sense in the set of documents.	10-13-2011
20110252035	IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND PROGRAM - An image processing apparatus includes: a database configured to, out of images extracted with a first frequency from the images making up a moving image content, register first layer summary data of a first size; a distance calculation block configured to calculate the distance between the first layer summary data based on the distance between vectors of which the elements are formed by the first layer summary data registered in the database; and a classification section configured to cluster into the same class the first layer summary data between which the distance calculated by the distance calculation block falls within a predetermined distance, the classification section further clustering moving image contents into a plurality of classes based on the classes into which the first layer summary data have been clustered.	10-13-2011
20110258190	Spectral Neighborhood Blocking for Entity Resolution - A processing device of an information processing system is operative to obtain a plurality of records, documents, web pages or other data objects, and to construct a binary tree using a bipartition procedure in which subsets of the data objects are associated with respective nodes of the tree. Evaluation of a designated modularity for a given one of the nodes of the tree is used as a stopping criterion to prevent further partitioning of that node and to indicate designation of that node as a leaf node of the tree. The resulting leaf nodes of the tree provide a non-overlapping partitioning of the plurality of data objects. The processing device is further operative to perform a neighborhood search on the tree to identify pairs of the plurality of data objects that match the same entity, and to store an indication of the matching pairs of data objects.	10-20-2011
20110258191	SYSTEM AND METHOD FOR PROVIDING SEARCH RESULTS BASED ON REGISTRATION OF EXTENDED KEYWORDS - Provided is a system and method providing a search result by registering an extended keyword. A search result providing system may include a registration keyword determining unit to determine whether a registration keyword is required to be additionally registered based at least on information associated with a registration of an input keyword, and a registration keyword registration unit to additionally register the registration keyword associated with the input keyword.	10-20-2011
20110258192	PROVIDING QUESTION AND ANSWER SERVICES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for question and answer services. In one aspect, a method includes receiving a plurality of questions from a plurality of different servers according to a protocol that defines services for submitting questions and obtaining answers to questions. Each received question is analyzed and associated with one or more labels based on the analysis. A request from a server is received according to the protocol to obtain questions related to one or more labels. Questions associated with one or more of the labels are identified and provided in response to the request.	10-20-2011
20110264661	GRAPHICALLY DISPLAYING A FILE SYSTEM - The contents of a computer file system are displayed on a graphical user interface. File system metadata descriptive of the computer file system and file metadata descriptive of each of a plurality of files are gathered. A file selection is received indicating a file accessed by the user. A user context is determined by the file metadata. The files are clustered using the file system metadata, a set of file metadata, and the user context. The set of file clusters are mapped onto a visualization model and graphically displayed on the graphical user interface using the visualization model.	10-27-2011
20110264662	CONTEXT COLLECTION DEVICES, CONTEXT COLLECTION PROGRAMS, AND CONTEXT COLLECTION METHODS - A content storage means stores content information that correlates available pieces of content with tags that are assigned thereto and that represent contexts in advance. A usage log storage means stores information that represents a piece of content that a user used as a usage log. A user context determination means obtains a tag that is assigned to a piece of content and that is contained in the usage log from the content information and determines a context of the user based on the obtained tag.	10-27-2011
20110270834	Data Classifier - A document classifier may analyze documents for a search engine and tag the documents. A document classifier system may have several different classifiers, each with a separate algorithm for classification. Some of the data classifiers may learn or change the classification over time with a feedback loop. As those classifiers are modified, updated, replaced, or added, the documents that have already been classified by the classifier may be re-examined to update their classification. The document classifier system may maintain a database of documents with a timestamp that the document was classified that may be used to identify those documents whose classifications may be out of date.	11-03-2011
20110270835	COMPUTER INFORMATION RETRIEVAL USING LATENT SEMANTIC STRUCTURE VIA SKETCHES - A method, system and program product for computer information retrieval is disclosed. A matrix A is received. Random sign matrices S and R are generated. Matrix products of S^T A, AR, and S^T AR are computed. A Moore-Penrose pseudoinverse C of S^T AR is computed. A singular value decomposition is computed of the pseudoinverse C. Three matrices ARU, Sigma, and V^T S^T A are outputted as factorization in applications.	11-03-2011
20110270836	METHOD AND APPARATUS FOR PROVIDING AN ACTIONABLE ELECTRONIC JOURNAL - An approach is provided for creating an actionable electronic journal. The journal creator classifies context data associated with a user according to a plurality of dimensions, granularities, or a combination thereof. Next, the journal creator recognizes one or more events in the context data based, at least in part, on the classification. Then, the journal creator creates one or more hierarchies of the recognized events based, at least in part, on the dimensions and the granularities, and presents, communicates or publishes them in a visually friendly format, along with additional information such as metadata and advertisements, and further details associated with each event of the journal.	11-03-2011
20110270837	METHOD AND SYSTEM FOR LOGICAL DATA MASKING - A system and method for logically masking data by implementing masking algorithms is provided. The method includes receiving one or more inputs from a user regarding the type of data masking to be implemented depending on the type of data entry. Data entries include alphabetical data, data comprising unique codes, data comprising dates and numerical data. Based on the inputs received, the data entries are classified and appropriate masking algorithms are executed. For masking numerical data entries, the data entries are first grouped using clustering algorithms and are then shuffled using shuffling algorithms. For a low level of data masking selected by a user, numerical data entries are shuffled within groups, and for a high level of data masking selected by a user, numerical data entries are shuffled across groups.	11-03-2011
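The numerical masking step described in entry 20110270837 can be illustrated with a minimal sketch. It is not the patented method: the abstract names no specific clustering or shuffling algorithm, so the gap-based grouping, the `max_gap` parameter, and the function names `group_by_gap` and `mask_numeric` are all assumptions made for illustration.

```python
import random


def group_by_gap(values, max_gap):
    """Toy 1-D clustering (an assumed stand-in for 'clustering algorithms'):
    walking the values in sorted order, a new group starts whenever the gap
    to the previous value exceeds max_gap. Returns one group id per position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    group = 0
    for rank, i in enumerate(order):
        if rank > 0 and values[i] - values[order[rank - 1]] > max_gap:
            group += 1
        labels[i] = group
    return labels


def mask_numeric(values, level, max_gap=10, seed=None):
    """Shuffle values within groups ('low') or across all groups ('high')."""
    rng = random.Random(seed)
    masked = list(values)
    if level == "high":
        rng.shuffle(masked)  # values may move to any position
        return masked
    if level != "low":
        raise ValueError("level must be 'low' or 'high'")
    positions = {}
    for i, g in enumerate(group_by_gap(values, max_gap)):
        positions.setdefault(g, []).append(i)
    for idxs in positions.values():
        shuffled = [values[i] for i in idxs]
        rng.shuffle(shuffled)  # values only move among positions of their own group
        for i, v in zip(idxs, shuffled):
            masked[i] = v
    return masked


salaries = [101, 5, 98, 7, 103, 6]
print(mask_numeric(salaries, "low", seed=1))
print(mask_numeric(salaries, "high", seed=1))
```

In both modes the multiset of values is preserved, so aggregate statistics survive masking; the low level additionally keeps each position's value within its original group.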
20110270838	UPDATING GROUPS OF ITEMS - Updating a set of items is disclosed. A set of items is received. The set of items is partitioned into groups. Group dependency information for the groups is calculated. Optionally, a dependency report is produced. Optionally, groups are updated. Optionally, change impact analysis is performed.	11-03-2011
20110270839	ASSIGNING DATA FOR STORAGE BASED ON A FREQUENCY WITH WHICH THE DATA IS ACCESSED - A method, system, and apparatus for improving performance when retrieving data from one or more storage media. Files to be stored on the one or more storage media are classified into a ranking of different sets. Differences in retrieval value of different regions of the one or more storage media are exploited by selecting which files to store in which regions. For example, files that have a higher classification are stored in regions with faster retrieval values. The files can be classified based on frequency of access. Thus, files that are more frequently accessed are stored in regions that have a faster retrieval value. The files can be classified by another measure such as priority. For example, the classification for some or all of the files can be based on user-assigned priority. The classification may be based on events or data grouping.	11-03-2011
20110276571	COMMUNICATION TERMINAL, INFORMATION MANAGEMENT APPARATUS, AND PROGRAM - A terminal device in which basic information and detailed information of each of a plurality of application programs are stored in a different storage area for each application program, and that prohibits access to each storage area by other application programs, is caused to execute the processes of displaying a window that includes the basic information of each application program, and, if a cursor is moved to the display position of one of the pieces of basic information, reading out, from the appropriate storage areas, the detailed information of a first application program corresponding to the basic information where the cursor is positioned and the detailed information of a second application program whose basic information is displayed adjacent to the basic information of the first application program, and displaying the detailed information of the first application program.	11-10-2011
20110276572	CONFIGURATION MANAGEMENT DEVICE, MEDIUM AND METHOD - A configuration management device includes: an update processing unit	11-10-2011
20110282873	Mind Map with Data Feed Linkage and Social Network Interaction - Embodiments of the disclosed technology comprise a method of augmenting a mind map of a plurality of objects based on at least one data feed. The method comprises providing an interface which contains visual representations of objects and associates semantic data with these objects. The interface allows for a user to access data from a data feed, and it analyzes these data in order to identify additional objects which may be semantically related to the object. A visual representation of the additional object is then augmented with a connector to the original object. Information about the relationships of the objects may be certified automatically or manually.	11-17-2011
20110282874	EFFICIENT LEXICAL TRENDING TOPIC DETECTION OVER STREAMS OF DATA USING A MODIFIED SEQUITUR ALGORITHM - Embodiments are directed towards a Modified Sequitur algorithm (MSA) using pipelining and indexed arrays to identify trending topics within a plurality of documents having user generated content (UGC). The documents are parallelized and distributed across a plurality of network devices, which place at least some of the received documents into a buffer for which the MSA may then be applied to the documents within the buffer to identify n-grams or phrases within the documents' contents. The identified phrases are further analyzed to remove extraneous co-occurrences of phrases, and/or words based on a part of speech analysis. A weighting of the remaining phrases is used to identify trending topic phrases. Links to content in the plurality of UGC documents that is associated with the trending topic phrases may then be displayed to a client device.	11-17-2011
20110282875	LOCALIZED DATA AFFINITY SYSTEM AND HYBRID METHOD - A method, system, and computer program for processing records is disclosed. The records are associated with record sets. Record sets are associated with processor sets, which include one or more processors. Records are routed to associated processor sets for processing, based on the record set associated with the record. Records are processed on processors in the processor sets. Furthermore, various localized affinities can be established. Process affinity can link server processes with processor sets. Cache affinity can link database caches with processor sets. Data affinity can link incoming data to processor sets.	11-17-2011
20110282876	ORDER-PRESERVING CLUSTERING DATA ANALYSIS SYSTEM AND METHOD - Clustering data analysis that is robust to noise and is able to extract the most reliable information from sequential data comprises ranking all of the measurement values across a third dimension of a 3D dataset in a selected one of an increasing order or a decreasing order and producing a three dimensional array of ranked values therefrom. It further comprises identifying coherent 3D patterns from the 3D array of ranked values, and counting the number of identified coherent 3D patterns. Each coherent 3D pattern parameters with a similar ranking and across subsets of the set of elements to a same group defines a cluster.	11-17-2011
20110282877	METHOD AND SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM WEB SITES - In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.	11-17-2011
20110289083	INTERFACE FOR CLUSTERING DATA OBJECTS USING COMMON ATTRIBUTES - Data objects are correlated by comparing attributes of two data objects and determining that the data objects are a match based on the comparison. Data elements corresponding to assignments of an identifier are generated, and the data elements are stored in a cluster.	11-24-2011
20110289084	INTERFACE FOR RELATING CLUSTERS OF DATA OBJECTS - Data objects are related by comparing attributes of data objects that belong to different clusters and determining that the data objects are an approximate match based on the comparison. Data elements corresponding to assignments of an identifier are generated, and the data elements are stored in a grouping.	11-24-2011
20110289085	RECORDING METHOD - In a recording method, in particular, for recording still pictures, while classifying the still pictures and managing the still pictures classified, wherein still picture data and management information for managing that still picture data are recorded therein. The management information includes information for classifying the still picture data into a classifying unit, each, and also includes a file format of the still picture data classified into the classifying unit mentioned above.	11-24-2011
20110289086	SYSTEM, METHOD AND APPARATUS FOR DATA ANALYSIS - A system and method for searching a database for multiple entries in the database that contain similar data, in which some embodiments of the method include collating data on physical sites from at least one database source to form a collation of site data, assigning a unique entry identifier to each entry of the site data in the collation, performing a lexical analysis of the site data and assigning a similarity metric(s) to each entry of the site data, sorting site data into at least one group with similar lexical content based on a metric threshold difference analysis of the similarity metric(s), to thereby provide at least one group, having at least one site data entry therein, and wherein where there are two or more site data entries in the at least one group, preferably they refer to the same site or to sites having a similar physical address.	11-24-2011
20110289087	METHOD OF VISUALIZING CONSUMPTION INFORMATION, CORRESPONDING DEVICE, STORAGE MEANS, AND SOFTWARE PROGRAM THEREFOR - The invention concerns a method of visualizing consumption information, the method comprising the steps of presenting (103) the consumption information to a consumer and presenting (104) to the consumer a benchmark for assessing the consumption information, wherein the method further comprises the steps of, prior to presenting (104) the benchmark, determining (101) a peer group of consumers based on a characteristic of the consumer and determining (102) the benchmark based on the peer group. The invention further concerns a corresponding device, storage means, and software program therefor.	11-24-2011
20110295854	AUTOMATIC REFINEMENT OF INFORMATION EXTRACTION RULES - A method and system for automatically refining information extraction (IE) rules. A provenance graph for IE rules on a set of test documents is determined. The provenance graph indicates a sequence of evaluations of the IE rules that generates an output of each operator of the IE rules. Based on the provenance graph, high-level rule changes (HLCs) of the IE rules are determined. Low-level rule changes (LLCs) of the IE rules are determined to specify how to implement the HLCs. Each LLC specifies changing an operator's structure or inserting a new operator in between two operators. Based on how the LLCs affect the IE rules and previously received correct results of applying the rules on the test documents, a ranked list of the LLCs is determined. The IE rules are refined based on the ranked list.	12-01-2011
20110295855	Graph-Processing Techniques for a MapReduce Engine - Systems, methods, and devices for sorting and processing various types of graph data are described herein. Partitioning graph data into master data and associated slave data allows for sorting of the graph data by sorting the master data. In another embodiment, promoting a data bucket having a first data bucket size to a data bucket having a second data bucket size greater than the first data bucket size upon reaching a memory limit allows for the reduction of temporary files output by the data bucket.	12-01-2011
20110295856	IDENTIFYING RELATED OBJECTS USING QUANTUM CLUSTERING - Techniques for grouping related objects such as documents and files using quantum clustering are disclosed. A method may include constructing a feature-object database of multiple objects. The feature-object database may have quantized selected features as keys. A connected objects database may be built. Clusters of connected objects may be identified in the connected objects database. The clusters of identified objects may be evaluated to determine groups of related objects. The method may be implemented on a computing device.	12-01-2011
20110302163	SYSTEM AND METHOD FOR CLUSTERING CONTENT ACCORDING TO SIMILARITY - Systems and methods for clustering content according to similarity are provided that identify and group similar content using a set of tags associated with the content. A topic model of a group of content is built, producing a probability distribution of topic membership for the content. Individual items of content are then clustered using a clustering algorithm, and a distance matrix from the probability distribution is built. Based on the distance matrix, individual items of content are labeled as “must-link” or “cannot-link” pairs with the group of content. The topic model is then embedded into successively smaller dimensions using a kernel method, until the clustering is stable with respect to both the behavioral and content domains.	12-08-2011
20110302164	Order-Independent Stream Query Processing - In a system and method for order-independent stream query processing, one or more input streams of data are received, and the one or more input streams are analyzed to determine data which is older than an already emitted progress indicator. The data which is older than the already emitted progress indicator is partitioned into one or more partitions, and each of the one or more partitions is independently processed using out-of-order processing techniques. A query is received, rewritten and decomposed into one or more sub-queries that produce partial results for each of the one or more partitions, where each of the one or more sub-queries corresponds to a partition. A view is also produced that consolidates the partial results for each partition. The partial results are consolidated at a consolidation time specified by the query to produce final results, and the final results are provided.	12-08-2011
20110302165	CONTENT RECOMMENDATION DEVICE AND CONTENT RECOMMENDATION METHOD - A content recommendation device deciding content to be recommended to a user among a plurality of content items includes: a clustering section creating a cluster set including clusters by clustering use statuses of content of users on the basis of a predetermined index; an effectiveness determining section determining effectiveness of the clustering by evaluating a correlation between the content and the cluster in the cluster set; a popular content deciding section selecting the cluster to which the user who becomes a recommendation partner belongs from the cluster set and deciding the popularity degree of each content item in accordance with the use status of each content item by the users in the cluster; and a recommended content deciding section evaluating the popularity degree of each content item in the cluster to which the user who becomes the recommendation partner belongs by taking into account and estimating the effectiveness of the cluster set therein and deciding the relatively popular content item among the content items as the content item to be recommended.	12-08-2011
20110302166	SEARCH SYSTEM, SEARCH METHOD, AND PROGRAM - The present invention provides a search system and a search method to make it easy to find out a document required truly among documents of a search result. This search system includes a division unit that divides a document to be searched into a plurality of blocks in accordance with designated division information, a calculation unit that calculates a hash value of each block by applying a hash function to a character string included in each block, a storage unit that stores the calculated hash value together with positional information on the block in the document, and a document grouping unit that fetches, for each document obtained by searching based on the search word, a corresponding hash value from the storage unit	12-08-2011
20110302167	Systems, Methods and Computer Program Products for Processing Accessory Information - A computer-implemented method according to one embodiment includes, for each of a plurality of accessories: determining a compatibility of an accessory; determining a type of the accessory; and determining features of the accessory. The accessories are associated into logical groups based on the compatibility, type and features thereof. A computer-implemented method according to one embodiment includes obtaining information about accessories; parsing out individual offers corresponding to the accessories; extracting meaningful phrases from the offers; classifying new offers based on the phrases; and outputting a result of the classification. Additional systems, methods and computer program products are also presented.	12-08-2011
20110307485	EXTRACTING TOPICALLY RELATED KEYWORDS FROM RELATED DOCUMENTS - Keyword extraction technique embodiments are presented which extract topically related keywords from a set of topically related documents. In one general embodiment, this keyword extraction involves first accessing a set of topically related documents. A number of candidate keywords are then identified from the set of related documents. A weighted keyword candidate-document matrix is formed using these candidate keywords, and it is partitioned into multiple groups of keyword candidates. Dense clusters of keyword candidates whose density exceeds a prescribed density threshold are then identified in each of the groups of keyword candidates. Finally, the keyword candidates associated with each dense cluster are designated as topically related keywords.	12-15-2011
20110307486	Managing Sensitive Data in Cloud Computing Environments - The illustrative embodiments provide a method, computer program product, and apparatus for managing collectively sensitive data. Collectively sensitive data is divided into a first partition for reassembly data, a second partition of the collectively sensitive data, and a third partition of the collectively sensitive data. Each of the second partition and the third partition are collectively nonsensitive in isolation. The first partition is stored in a translation table in a secure database. The translation table is configured for use in assembling collectively sensitive data from the second partition and the third partition. The second partition of the collectively sensitive data is stored in a first database associated with a first cloud computing environment. The third partition of collectively sensitive data is stored in a second database associated with a second cloud computing environment.	12-15-2011
20110307487	SYSTEM FOR MULTI-MODAL DATA MINING AND ORGANIZATION VIA ELEMENTS CLUSTERING AND REFINEMENT - A system for obtaining data from various sources. The data may be organized into cluster sets of related items. Elements of various kinds may be pulled from the data. The elements may be put together into sets of clusters for each kind of elements. The clusters may be refined relative to one another and in view of integrated properties of the cluster sets. Elements may be added or removed from the clusters during refinement. Examples of the elements may be people and events. Examples of cluster sets of such elements may be groups and goals, respectively.	12-15-2011
20110314017	TECHNIQUES TO AUTOMATICALLY MANAGE SOCIAL CONNECTIONS - Techniques to manage social connections are described. An apparatus may comprise a processor communicatively coupled to a memory. The memory may be arranged to store a social analysis component that when executed by the processor is operative to receive a list of members in a social network, receive at least one relationship indicator derived from multiple member attributes of a member, and generate a social identifier based on the relationship indicator, the social identifier representing a social connection type for a social connection or potential social connection between two or more members of the list of members in the social network. Other embodiments are described and claimed.	12-22-2011
20110314018	ENTITY CATEGORY DETERMINATION - Summaries of entities (e.g., people, places, things, concepts, etc.) may provide additional useful information to a user. For example, a search engine may provide a summary of an entity within search results. A category (e.g., “writer”, “politician”, etc.) of the entity that is short and concise may be advantageous to provide within a summary of the entity. The category may allow a user to quickly determine whether the information of the entity relates to the intended entity (e.g., search results of an entity as “a writer” vs. search results of an entity as “a politician”). Potential categories and summary text may be extracted from pre-labeled data. The potential categories and summary text may be intersected to determine a set of candidate categories that may be ranked. An entity category having a desired rank may be determined as the entity category that describes the entity in a desired way.	12-22-2011
20110314019	PARALLEL PROCESSING OF CONTINUOUS QUERIES ON DATA STREAMS - A continuous query parallel engine on data streams provides scalability and increases the throughput by the addition of new nodes. The parallel processing can be applied to data stream processing and complex events processing. The continuous query parallel engine receives the query to be deployed and splits the original query into subqueries, obtaining at least one subquery; each subquery is executed in at least one node. Tuples produced by each operator of each subquery are labeled with timestamps. A load balancer is interposed at the output of each node that executes each one of the instances of the source subquery and an input merger is interposed in each node that executes each one of the instances of a destination subquery. After checks are performed, further load balancers or input managers may be added.	12-22-2011
20110314020	CHEMICAL ADDITIVE INGREDIENT PALETTE - Disclosed are methods for developing authorized chemical palettes for formulating products with reduced adverse environmental and/or health concerns, and advising the public to a greater extent regarding the ingredients of products formulated using these palettes. Also disclosed are computer systems to implement such methods.	12-22-2011
20110314021	Displaying Autocompletion of Partial Search Query with Predicted Search Results - A set of ordered predicted completion strings are presented to a user as the user enters text in a text entry box (e.g., a browser or a toolbar). The predicted completion strings can be in the form of URLs or query strings. The ordering may be based on any number of factors (e.g., a query's frequency of submission from a community of users). URLs can be ranked based on an importance value of the URL. Privacy is taken into account in a number of ways, such as using a previously submitted query only when more than a certain number of unique requestors have made the query. The set of ordered predicted completion strings is obtained by matching a fingerprint value of the user's entry string to a fingerprint-to-table map which contains the set of ordered predicted completion strings.	12-22-2011
20110320445	Computer-Implemented Method for Clustering Data and Computer-Readable Medium Encoded with Computer Program to Execute Thereof - Inferences acquired by applying clustering analysis cannot be reliably assessed before data-originated errors are quantified, an exacting task that is often not performed. This invention presents a clustering method suited for this purpose. Designed for systems with normally distributed error, a common trait to many data systems, and built on a framework of agglomerative hierarchical clustering, this invention treats each observation as a Gaussian distribution function, uses an exact mathematical relation to track error, and gives results from which quantitative statistics are easily extracted.	12-29-2011
20110320446	Pushing Search Query Constraints Into Information Retrieval Processing - This patent application relates to interval-based information retrieval (IR) search techniques for efficiently and correctly answering keyword search queries. In some embodiments, a range of information-containing blocks for a search query can be identified. Each of these blocks, and thus the range, can include document identifiers that identify individual corresponding documents that contain a term found in the search query. From the range, a subrange(s) having a smaller number of blocks than the range can be selected. This can be accomplished without decompressing the blocks by partitioning the range into intervals and evaluating the intervals. The smaller number of blocks in the subrange(s) can then be decompressed and processed to identify a doc ID(s) and thus document(s) that satisfies the query.	12-29-2011
20110320447	High-Dimensional Stratified Sampling - In one aspect, a processing device of an information processing system is operative to perform high-dimensional stratified sampling of a database comprising a plurality of records arranged in overlapping sub-groups. For a given record, the processing device determines which of the sub-groups the given record is associated with, and for each of the sub-groups associated with the given record, checks if a sampling rate of the sub-group is less than a specified sampling rate. If the sampling rate of each of the sub-groups is less than the specified sampling rate, the processing device samples the given record, and otherwise does not sample the given record. The determine, check and sample operations are repeated for additional records, and samples resulting from the sample operations are processed to generate information characterizing the database. Other aspects of the invention relate to determining which records to sample through iterative optimization of an objective function that may be based, for example, on a likelihood function of the sampled records.	12-29-2011
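The determine, check and sample loop in entry 20110320447 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not define how a sub-group's sampling rate is measured, so the sketch assumes it is the fraction of records seen so far in that sub-group that were sampled. The names `stratified_sample`, `subgroups_of`, and `target_rate` are hypothetical.

```python
from collections import defaultdict


def stratified_sample(records, subgroups_of, target_rate):
    """Sample a record only if every sub-group it belongs to is currently
    below target_rate (assumed definition: sampled so far / seen so far)."""
    seen = defaultdict(int)     # records observed per sub-group
    sampled = defaultdict(int)  # records sampled per sub-group
    sample = []
    for record in records:
        groups = subgroups_of(record)  # determine: overlapping sub-groups
        for g in groups:
            seen[g] += 1
        # check: every associated sub-group must be under the target rate
        if all(sampled[g] / seen[g] < target_rate for g in groups):
            sample.append(record)      # sample
            for g in groups:
                sampled[g] += 1
    return sample


# Each record belongs to two overlapping sub-groups: its region and its age band.
records = [
    ("east", "young"), ("east", "old"), ("west", "young"),
    ("west", "old"), ("east", "young"), ("west", "old"),
]
picked = stratified_sample(records, lambda r: list(r), target_rate=0.5)
print(picked)
```

Because a record is skipped as soon as any one of its sub-groups has reached the target rate, no sub-group is oversampled at the expense of the others, which is the property the overlapping-strata check is meant to provide.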
20110320448	TELEPHONE NUMBERS WITH ALPHABETIC PATTERNS - An exemplary process includes storing telephone numbers in a telephone number database, identifying which telephone numbers in the telephone number database have digits that occur in an alphabetic pattern as defined by a reference list, and designating the telephone numbers with digits occurring in the alphabetic pattern as patterned telephone numbers. An exemplary process of identifying alphabetic patterns includes assigning at least one letter in an alphabet to a first digit, assigning at least one other letter in an alphabet to a second digit, accessing a telephone number database storing telephone numbers having a plurality of digits, identifying a letter combination created by at least two digits of at least one of the telephone numbers in the telephone number database, and determining whether the letter combination forms an alphabetic pattern as defined by a reference list.	12-29-2011
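One way to picture the pattern check in entry 20110320448 is the sketch below. It assumes the standard telephone keypad letter assignment and a reference list of plain alphabetic words. Rather than enumerating every letter combination a number can spell, it converts each reference word to digits and looks for that digit run in the number, which finds the same matches. The names `KEYPAD`, `word_to_digits`, `find_patterns`, and `patterned_numbers` are hypothetical.

```python
# Assumed letter assignment: the conventional telephone keypad.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_digits(word):
    """Translate an alphabetic word into the digits that spell it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


def find_patterns(number, reference_list):
    """Return the reference words whose digit spelling occurs in the number."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return [word for word in reference_list if word_to_digits(word) in digits]


def patterned_numbers(numbers, reference_list):
    """Designate numbers containing at least one reference pattern."""
    result = {}
    for number in numbers:
        found = find_patterns(number, reference_list)
        if found:
            result[number] = found
    return result


reference = ["FLOWERS", "TAXI"]
print(patterned_numbers(["1-800-356-9377", "555-0100"], reference))
```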
20110320449	TELEPHONE NUMBER GROUPS - A method includes receiving a list of sequential telephone numbers, and iteratively: identifying a first number and a last number of the list, selecting a group size, creating a group of sequential telephone numbers from the list of sequential telephone numbers based on the selected group size, and removing the created group from the list of sequential telephone numbers. Creating the group may include determining whether a first number in the list of sequential telephone numbers ends with a predetermined digit, assigning the first number as a start number of the group, and identifying an end number in the group based on the start number and the selected group size.	12-29-2011
20110320450	LOCATION BASED GROUPING OF BROWSING HISTORIES - Methods and systems that present URLs from a history of records organized by locations are described. Each record may be stored to represent a URL accessed for retrieving a web page by a browser hosted in a device at a certain point in time. Additionally, the record may include a location data indicating a physical location of the device at the certain point in time. Optionally, a timestamp indicating the certain point in time may be included in the record. Groups of the records may be clustered according to the locations. In one embodiment, at least one of the groups may be selected for presentation on a display according to where the display is currently located.	12-29-2011
20110320451	APPARATUS AND METHOD FOR SORTING DATA - A method and system for sorting data of an input file containing multiple records associated with multiple tables of a database. The multiple records include key values. The key values are segmented into ranges of key values for each table. Each range of key values for each table is a segment having a segment value. Multiple key values are selected for the multiple records. A block number, which contains a unique permutation of the segment values of the segments, is generated. The segment values denote the ranges of key values encompassing the multiple key values in each record. A sort key value for each record is ascertained, based on the generated block number for each record, and added to each record. The multiple records are sorted according to the sort key values in the multiple records. The sorted multiple records are stored in an output file.	12-29-2011
20110320452	INFORMATION ESTIMATION APPARATUS, INFORMATION ESTIMATION METHOD, AND COMPUTER-READABLE RECORDING MEDIUM - An information estimation apparatus	12-29-2011
20120005206	APPARATUS AND METHOD FOR ANALYSIS OF DATA TRAFFIC - An apparatus for defining an index in an index file representing a volume of traffic in a computer system comprises a data processing module. The data processing module defines an index corresponding to a traffic data sequence and a first parameter of the traffic data sequence in a first record of the index file. An apparatus for evaluating a candidate signature representing a pre-determined class of traffic in a computing system compares a signature data sequence with entries in an index file and determines whether the candidate signature satisfies an evaluation criterion.	01-05-2012
20120005207	METHOD AND SYSTEM FOR WEB EXTRACTION - A method includes generating a plurality of sets of pairs of records from a set of records, for each attribute-position pair in the set of records, each attribute-position pair being indicative of a position of an attribute in a record. Further, the method includes forming, electronically, a plurality of groups, each group comprising two attribute-position pairs having different attributes. Further, the method also includes determining, electronically for each group, the number of pairs of records that are common in the two attribute-position pairs of that group. Furthermore, the method includes extracting results based on a first group of the plurality of groups if the number of pairs of records that are common in the two attribute-position pairs of the first group is greater than a second threshold, is highest among the plurality of groups, and no group having three or more attribute-position pairs with different attributes is possible.	01-05-2012
20120005208	SYSTEM FOR INFORMATION DISCOVERY IN VIDEO-BASED DATA - A system for information discovery of items, such as individuals or objects, from video-based tracks. The system may compute similarities of characteristics of the items and present the results in a matrix form. A similarity portrayal may have nodes representing the items with edges between the nodes. The edges may have weights in the form of vectors indicating similarities of the characteristics between the nodes situated at the ends of the edges. The edges may be augmented with temporal and spatial properties from the tracks which cover the items. These properties may play a part in a multi-objective presentation of information about the items in terms of a negative or supportive basis. The presentation may be partitioned into clusters which may lead to a merger of items or tracks. The system may pave a way for higher-level information discovery such as video-based social networks.	01-05-2012
20120005209	SYSTEMS AND METHODS FOR IDENTIFYING INTERSECTIONS USING CONTENT METADATA - User-submitted content (e.g., stories) may be associated with descriptive metadata (intersection metadata), such as a timeframe, location, tags, and so on. The user-submitted content may be browsed and/or searched using the descriptive metadata. Intersection criteria comprising a prevailing timeframe, a location, and/or other metadata criteria may be used to identify an intersection space comprising one or more stories. The stories may be ordered according to relative importance, which may be determined (at least in part) by comparing story metadata to the intersection criteria.	01-05-2012
20120005210	Method of Structuring a Database of Objects - A method of structuring a database of objects, the objects each comprising one or more attributes, the attributes being ordered, the method being executed by at least one computer processor connected to a memory, the method classifying in memory the objects in a structure composed of a list CL of sets of formal concepts C	01-05-2012
20120005211	DOCUMENT OBJECT MODEL (DOM) BASED PAGE UNIQUENESS DETECTION - DOM based unique ID generation, including receiving a hypertext markup language (HTML) page at a computer, and identifying HTML page elements in response to the receiving, the HTML page elements comprising parent nodes, the parent nodes comprising child nodes. The method further comprising processing each of the HTML page elements, the processing comprising: grouping the child nodes by parent node into a group of child nodes, detecting patterns in the group of child nodes in response to the grouping, reducing the group of child nodes to text strings in response to the detecting, storing the text strings as text values in the parent nodes, and generating a unique identifier (ID) of the HTML page in response to the processing.	01-05-2012
20120011119	OBJECT RECOGNITION SYSTEM WITH DATABASE PRUNING AND QUERYING - A database for object recognition is generated by performing at least one of intra-object pruning and inter-object pruning, as well as keypoint clustering and selection. Intra-object pruning removes similar and redundant keypoints within an object and different views of the same object, and may be used to generate and associate a significance value, such as a weight, with respect to remaining keypoint descriptors. Inter-object pruning retains the most informative set of descriptors across different objects, by characterizing the discriminability of the keypoint descriptors for all of the objects and removing keypoint descriptors with a discriminability that is less than a threshold. Additionally, a mobile platform may download a geographically relevant portion of the database and perform object recognition by extracting features from the query image and using determined confidence levels for each query feature during outlier removal.	01-12-2012
20120011120	Conceptual Tagging with Conceptual Message Matching System and Method - A conceptual tagging and message matching system and method are provided. In one example, the system and method generate web pages or third party web pages with pieces of content combined with the message.	01-12-2012
20120011121	Data analysis using multiple systems - Data analysis is disclosed, including: receiving data to be analyzed, wherein the data includes one or more data identifiers (IDs) and one or more preset key-value pairs, wherein each preset key-value pair includes a preset key and a preset value; acquiring data to be analyzed based at least in part on the data IDs; segmenting the acquired data into one or more data elements; classifying the one or more data elements based at least in part on one preset key of the one or more preset key-value pairs; and analyzing the classified one or more data elements based at least in part on one preset value of the one or more preset key-value pairs.	01-12-2012
20120011122	APPARATUS FOR CONNECTING GETTING-IN RECORD AND GETTING-OFF RECORD OF VEHICLE, AND METHOD OF THE SAME - An apparatus for connecting getting-in and off records of a vehicle includes: a driving information memory unit for storing getting-in and off records having a pair of getting-in and getting-off records, each of which represents a getting-in or off time and place; a grouping unit for grouping the getting-in or off records representing a same getting-in or off time zone and place, so that segmentalized getting-in or off groups are generated; a connecting unit for searching a segmentalized getting-in group connected to one segmentalized getting-off group, and for storing connection information between a searched segmentalized getting-in group and the one segmentalized getting-off group; and an estimating unit for specifying a getting-off time and a getting-off point when the getting-off action of the driver is detected, and estimating a getting-in time for a next driving according to the connection information.	01-12-2012
20120016877	CLUSTERING OF SEARCH RESULTS - One particular embodiment clusters a plurality of documents using one or more clustering algorithms to obtain one or more first sets of clusters, wherein: each first set of clusters results from clustering the documents using one of the clustering algorithms; and with respect to each first set of clusters, each of the documents belongs to one of the clusters from the first set of clusters; accesses a search query; identifies a search result in response to the search query, wherein the search result comprises two or more of the documents; and clusters the search result to obtain a second set of clusters, wherein each document of the search result belongs to one of the clusters from the second set of clusters.	01-19-2012
20120016878	CONSTRAINED NONNEGATIVE TENSOR FACTORIZATION FOR CLUSTERING - Methods and systems for clustering information items using nonnegative tensor factorization are disclosed. A processing device receives one or more class labels, each corresponding to an information item, a selection for a nonnegative tensor factorization model having an associated objective function and one or more parameter values, each corresponding to one of one or more penalty constraints. The processing device determines a constrained objective function based on the objective function associated with the selected nonnegative tensor factorization model, the one or more parameter values and the one or more class labels and including the one or more penalty constraints. The processing device determines clusters for the plurality of information items by evaluating the constrained objective function. Pairwise constraints may be received in addition to or instead of the class labels.	01-19-2012
20120016879	SYSTEMS AND METHODS OF USER INTERFACE FOR IMAGE DISPLAY - Approaches to displaying image search results, and image content of computer readable media, include providing a matrix display of images, with an interface to insert and remove floating date dividers, each indicative of a day on which one or more of the images was created. Available images can be abstracted according to a respective month in which the images were created, up to a determined maximum number of months, after which images are abstracted according to a year in which they were created. Selecting a month causes display of a matrix of images created during that month, while selecting a year causes display of a list of months. A selected thumbnail can be displayed for each month or year of a displayed list. Search results can be grouped according to how each result satisfied the search criteria, such as a separate group for images that had names matching a search criterion, and one or more separate groups for images that satisfied a date range criterion.	01-19-2012
20120023101	SMART DEFAULTS FOR DATA VISUALIZATIONS - Smart defaults are provided for data visualization by creating a default layout of rows, columns, filters, and comparable elements that improve a user's experience in finding relevant answers within the data. Usage history of the ways that users look at data in various data sources, user specific information, and inferred relationships between a current user and similar users are used to determine elements relevant to visualization of data for a particular user such that the visualization process may be automatically started, and a relevance model is formed/adjusted based on these factors. Queries may also be executed in a preemptive fashion based on the relevance model and results provided to a requesting user more rapidly enhancing user experience with networked data visualization.	01-26-2012
20120023102	METHODS AND SYSTEMS FOR DYNAMICALLY REARRANGING SEARCH RESULTS INTO HIERARCHICALLY ORGANIZED CONCEPT CLUSTERS - Methods of and systems for dynamically rearranging search results into hierarchically organized concept clusters are provided. A method of searching for and presenting content items as an arrangement of conceptual clusters to facilitate further search and navigation on a display-constrained device includes providing a set of content items and receiving incremental input to incrementally identify search terms for content items. Content items are selected and grouped into sets based on how the incremental input matches various metadata associated with the content items. The selected content items are grouped into explicit conceptual clusters and user-implied conceptual clusters based on metadata in common to the selected content items. The clustered content items are presented according to the conceptual clusters into which they are grouped.	01-26-2012
20120030202	TECHNIQUES FOR ANALYZING DATA FROM MULTIPLE SOURCES - Techniques, including systems and methods, for analyzing data from multiple sources are disclosed and suggested herein. In an embodiment, external information from one or more external information sources and internal information from one or more internal information sources is received. The received external information and internal information are stored in one or more data stores that collectively implement one or more ontologies. One or more conditions are applied to the external information and internal information in the one or more data stores to determine a conclusion and the conclusion is provided to a user.	02-02-2012
20120030203	METHOD FOR ESTABLISHING MULTIPLE LOOK-UP TABLES AND DATA ACQUISITION METHOD USING MULTIPLE LOOK-UP TABLES - A method for establishing multiple look-up tables and a data acquisition method using multiple look-up tables are provided. In the present method, a plurality of input data is classified into a plurality of groups, and a plurality of input data and a plurality of output data corresponding to the input data are respectively provided to the groups to establish a plurality of corresponding look-up tables. At least one bit is selectively removed from the input data in each of the look-up tables corresponding to at least one of the groups, and the resulting input data and the corresponding output data are recorded in the look-up table corresponding to the group.	02-02-2012
20120030204	TEXT DATA PROCESSING DEVICE AND PROGRAM - Provided are categorizing unit (	02-02-2012
20120030205	PRODUCT NORMALIZATION - A computer-implemented approach for organizing input listings from various sources of input listings. Input listings are organized by mapping the input listings to consolidated listings that correspond to the input listings. The mapping of the input listings is based on various techniques such as a Stock Keeping Unit item-listing-to-consolidated-listing matching technique, a name/title item-listing-to-consolidated-listing matching technique, and a model item-listing-to-consolidated-listing matching technique.	02-02-2012
20120041951	METHOD AND DEVICE FOR TRANSMITTING GEOGRAPHICAL DATA ON AN AIRCRAFT - The invention relates to a device (	02-16-2012
20120041952	DISPLAY CONTROL APPARATUS, CONTROL METHOD THEREOF, PROGRAM, AND RECORDING MEDIUM - In the display of a search result using a virtual space, the display and operation in the space are associated with the addition and change of search instructions, making it easier to grasp the content and to operate a search. A plurality of contents, each having a keyword, is arranged in the virtual space and displayed on a display screen. When one key is set, a content to be a search target is selected from among the plurality of contents based on a position at which the key is set, a search is performed on the selected search target with the set key, and an arrangement of the contents is changed and displayed based on a relationship between the key and the plurality of contents.	02-16-2012
20120047139	REPOSITORY INFRASTRUCTURE FOR ON DEMAND PLATFORMS - In an aspect there is provided a method. The method may include providing, at a repository, storage for a plurality of tenants, providing a plurality of layers, and providing a plurality of versions; and separating, based on the plurality of layers and the plurality of versions, data for each of the plurality of tenants, wherein during runtime one of the plurality of tenants corresponds to the plurality of layers and one of the plurality of versions. Related apparatus, systems, techniques and articles are also described.	02-23-2012
20120047140	Cluster-Wide Read-Copy Update System And Method - A system, method and computer program product for synchronizing updates to shared mutable data in a clustered data processing system. A data element update operation is performed at each node of the cluster while preserving a pre-update view of the shared mutable data, or an associated operational mode, on behalf of readers that may be utilizing the pre-update view. A request is made for detection of a grace period, and grace period detection processing is performed for detecting when the cluster-wide grace period has occurred. When it does, a deferred action associated with the update operation is taken, such as removal of a pre-update view of the data element or termination of an associated mode of operation.	02-23-2012
20120047141	System and Method for Automatic Anthology Creation Using Document Aspects - A generic and expandable document aspect system and method for searching, browsing, presenting, and interacting with data assembled from document contents and related external data is provided. New varieties of document aspects are added to existing installations and can be accessed by users without requiring upgrades to server or clients, for example by using plug-in technology.	02-23-2012
20120047142	NETWORK CODING WITH LAST MODIFIED DATES FOR P2P WEB CACHING - A method may include obtaining a source file at a node in a peer-to-peer network and dividing the source file into a plurality of pieces. The pieces of the source file may be encoded using network coding principles. A last-modified-date (LMD) value may be appended to each of the encoded pieces, the LMD value being the same for each of the encoded pieces of the source file. The encoded pieces with the LMD values may be sent to one or more other nodes in the peer-to-peer network.	02-23-2012
20120059823	INDEX PARTITION MAINTENANCE OVER MONOTONICALLY ADDRESSED DOCUMENT SEQUENCES - Provided are techniques for partitioning a physical index into one or more physical partitions; assigning each of the one or more physical partitions to a node in a cluster of nodes; for each received document, assigning an assigned-doc-ID comprising an integer document identifier; and, in response to assigning the assigned-doc-ID to a document, determining a cut-off of assignment of new documents to a current virtual-index-epoch comprising a first set of physical partitions and placing the new documents into a new virtual-index-epoch comprising a second set of physical partitions by inserting each new document to a specific one of the physical partitions in the second set using one or more functions that direct the placement based on one of the assigned-doc-id, a field value derived from a set of fields obtained from the document, and a combination of the assigned-doc-id and the field value.	03-08-2012
20120059824	ALLOCATING AND MANAGING RANDOM IDENTIFIERS USING A SHARED INDEX SET ACROSS PRODUCTS - Provided are techniques for selecting row identifiers from an initial index structure storing rows of randomized indexes. The row identifiers are randomized. Groups are formed with the randomized row identifiers so that each group has a predetermined number of row identifiers. At least one group is selected from the groups. Indexes are retrieved from the initial index structure that correspond to the row identifiers in the selected at least one group. The retrieved indexes are encoded by adding product information to form new identifiers.	03-08-2012
20120059825	COLLECTING DATA FROM DIFFERENT SOURCES - A system for collecting data from different sources is described. In one example embodiment, the system obtains content-related data from a plurality of source computer systems, automatically identifies, based on the content-related data, content items having respective popularity values greater than a predetermined threshold value as popular content items, and automatically generates a list of popular content items based on the popular content items.	03-08-2012
20120059826	METHOD AND APPARATUS FOR VIDEO SYNTHESIS - An approach is provided for generating a compilation of media items. A plurality of media items is received. Respective context vectors for the media items are determined. The context vectors include, at least in part, orientation information, tilt information, altitude information, geo-location information, timing information, or a combination thereof associated with the creation of the respective media items. A compilation of at least a portion of the media items is generated based, at least in part, on the context vectors.	03-08-2012
20120066222	Web architecture for green design and construction - A method and computer programming	03-15-2012
20120066223	METHOD AND COMPUTING DEVICE FOR CREATING DISTINCT USER SPACES - A method and computing device for creating distinct user spaces are described. Concerning the method, in a platform originally designed as a single user platform, user data associated with a plurality of users can be stored and segmented. In addition, links to point to user data that is associated with a current user can be generated in which the link creation can exploit a predefined path associated with storing data in the single user platform. The method can also include the step of preventing the current user from accessing user data associated with non-active users.	03-15-2012
20120066224	CLUSTERING OF ANALYTIC FUNCTIONS - A method, system, and computer program product for improved clustering of analytic functions in a data processing environment are described. A set of instances of an analytic function receiving data input from a set of data sources is identified. A first subset of instances is configured to receive input from a first subset of data sources, and a second subset of instances is configured to receive input from a second subset of data sources. The set of instances is assigned to a cluster. The cluster begins executing in a computer in the data processing environment, when the first subset of data sources begins transmitting time series data input to the first subset of instances in the cluster.	03-15-2012
20120066225	PROFILING METHOD AND SYSTEM - The invention relates to a method and system for profiling recipients into recipient categories on the basis of responses to content items provided to users. The profiling is based on rankings that are assigned to the content items, recipient categories, links between the content items and links between the content items and recipient categories. In one embodiment the ranking of a given content item is calculated on the basis of rankings of other content items having a link to the given content item, together with the ranking of the link between the content items, while the ranking of a given respondent in respect of a given recipient category is calculated on the basis of rankings of content items and/or categories that have a link to that recipient category. The links between content items and to the recipient categories indicate a particular response, by the respondent, in respect of content items. The recipients are profiled with respect to the recipient categories on the basis of the rankings assigned to the recipient categories.	03-15-2012
20120072419	METHOD AND APPARATUS FOR AUTOMATICALLY TAGGING CONTENT - A content tagging and management capability is provided for enabling automatic tagging of content and management of tagged content. A method includes receiving content including an object, and automatically associating an information structure with the object included within the content to form thereby tagged content. The content may be received locally at a content capture device, and the information structure may be automatically associated with the object by the content capture device. The automatic tagging may be performed at the content capture device when the content is captured by the content capture device. The content may be received at a computer, and the information structure may be automatically associated with the object by the computer. The information structure may be available locally or retrieved from one or more remote devices.	03-22-2012
20120072420	CONTENT CAPTURE DEVICE AND METHODS FOR AUTOMATICALLY TAGGING CONTENT - A content tagging and management capability is provided for enabling automatic tagging of content and management of tagged content. An apparatus includes a content capture mechanism configured for capturing content including an object, and a processor configured for automatically associating an information structure with the object included within the content to form thereby tagged content. A method for using a user device for automatically tagging content includes capturing content including an object, receiving object information associated with the object when a sensor associated with the object is detected, and automatically associating an information structure with the object included within the content to form thereby tagged content, where the information structure includes at least a portion of the object information associated with the object.	03-22-2012
20120072421	SYSTEMS AND METHODS FOR INTERACTIVE CLUSTERING - Systems and associated methods provide a cluster-level semi-supervision model for interactive clustering. Embodiments accept user provided semi-supervision for updating cluster descriptions and assignment of data items to clusters. Assignment feedback re-assigns data items among existing clusters, while cluster description feedback helps to position existing cluster centers more meaningfully. The feedback can continue until the user is satisfied with the clustering achieved or one or more predetermined stopping criteria have been reached.	03-22-2012
20120072422	SYSTEM AND METHOD FOR CITATION PROCESSING, PRESENTATION AND TRANSPORT AND FOR VALIDATING REFERENCES - The present invention comprises a system and method for automatically processing one or more citations contained within a document while the document is presented by a document rendering application. The method of the present invention comprises scanning the document to identify an unformatted citation and parsing the unformatted citation to determine one or more citation terms. One or more citation libraries are queried to find citations comprising the one or more citation terms. A citation falling within the scope of the query is selected and inserted into the document. The present invention may further provide enhanced workflow solutions for authors and publishers in preparing documents in structured format for facilitating efficient and accurate validation of references cited or included in papers and other submissions for publication or for review. An author prepares a document containing a set of cited references using a formatting structure. A system includes a processor to process the document to extract embedded metadata associated with the set of cited references. The processor executes code associated with a reference validation software module and automatically recognizes the formatting structure and the embedded metadata. The processor automatically extracts the embedded metadata and compares the extracted metadata against an authority database to determine the validity of the set of cited references.	03-22-2012
20120078903	IDENTIFYING CORRELATED OPERATION MANAGEMENT EVENTS - A technique includes receiving data indicative of operation management events, where each event occurs at an associated time. The technique includes processing the data to selectively group the events in episodes based on the associated times and identifying which events are correlated based at least in part on the episodes.	03-29-2012
20120078904	Approximate Index in Relational Databases - A database table is provided. The database table includes several column tuples. A column is selected in the database table. The column tuples of the selected column are partitioned into several bins. Each bin includes a range of tuples and associated metadata. The associated metadata includes at least one of: a minimum tuple value for the tuples in the bin, a maximum tuple value for the tuples in the bin, a minimum tuple identifier for the bin and a maximum tuple identifier for the bin. The bins are sorted based on the tuple values to provide an approximate index for the database.	03-29-2012
20120078905	MANIPULATING NON-SCHEMA ATTRIBUTES FOR OBJECTS IN A SCHEMA BASED DIRECTORY - Systems and methods for defining attributes for one or more entries in a computer implemented directory structure. The method comprises grouping a set of non-schema attributes associated with a directory entry into a multivalue schema attribute, wherein the multivalue attribute comprises values associated with each of the corresponding non-schema attributes grouped into the multivalue attribute; encoding at least one of the non-schema attributes into a string having one or more parts; and performing computing operations on the non-schema attributes in the directory entries based on content of the encoded strings defined in the multivalue attribute in which the respective non-schema attributes are grouped.	03-29-2012
20120078906	AUTOMATED GENERATION AND DISCOVERY OF USER PROFILES - A robust knowledge-based management and sharing system organized by context for expertise-based or context-based searching and retrieval of relevant information is disclosed. The various embodiments and techniques described herein are used to organize a user's data and communications around the user's expertise or one or more contexts the user is associated with such as the user's projects, products, and customers. The organization of user data is derived from the user's competencies and interactions with others and is used to build and index user profiles in a manner that facilitates retrieval in search results for relevant search criteria. A linguistic processing pipeline is used to parse and index the user's data to generate the complete and partial profiles organized by context. Complete and partial profiles are generated, indexed, ranked, and stored by the system. Once a profile is built and indexed into the proper expertise or context(s), it can yield highly relevant results in searches for persons with a desired set of competencies, knowledge, experience, or connections in a particular context.	03-29-2012
20120078907	KEYWORD PRESENTATION APPARATUS AND METHOD - According to one embodiment, a keyword presentation apparatus includes an extraction unit, a selection unit and a clustering unit. The extraction unit is configured to extract, as technical terms, morpheme strings, which are not defined in a general concept dictionary, from a document set. The selection unit is configured to evaluate relevancies between each of basic term candidates and the technical terms, and to preferentially select basic term candidates having high relevancies as basic terms. The clustering unit is configured to calculate weighted sums of statistical degrees of correlation between the basic terms based on the document set, to calculate conceptual degrees of correlation between the basic terms based on the general concept dictionary, and to cluster the basic terms based on the weighted sums.	03-29-2012
20120078908	PROCESSING A REUSABLE GRAPHIC IN A DOCUMENT - A method and apparatus are provided for processing a graphic in a document so that the graphic may be reused in a different application than the one it was originally used in. For a given document, a graphic may be identified from within the document and extracted from the document. The extracted graphic may be stored in a suitable storage medium, such as a reusable graphic repository. A structural feature associated with the extracted graphic may also be extracted. The extracted graphic may then be classified based on the extracted structural feature. Furthermore, a method and apparatus are provided for generating a reusable graphic from a document.	03-29-2012
20120078909	INFORMATION SELECT APPARATUS AND INFORMATION SELECT METHOD - According to one embodiment, an information select apparatus includes a storage, an acquisition module, and a selector. The storage is configured to store a script in which at least first information indicative of a search condition of articles, second information indicative of a select condition of articles, and third information indicative of an output order of articles are described, in order to select data which is to be provided to a user. The acquisition module is configured to acquire a data group from a network according to the first information of the script. The selector is configured to select data items from the data group according to the second information of the script, and to orderly arrange the selected data items according to the third information.	03-29-2012
20120078910	Using an ID Domain to Improve Searching - Methods which use an ID domain to improve searching are described. An embodiment describes an index phase in which an image of a document is converted into the ID domain. This is achieved by dividing the text in the image into elements and mapping each element to an identifier. Similar elements are mapped to the same identifier. Each element in the text is then replaced by the appropriate identifier to create a version of the document in the ID domain. This version may be indexed and searched. Another embodiment describes a query phase in which a query is converted into the ID domain and then used to search an index of identifiers which has been created from collections of documents which have been converted into the ID domain. The conversion of the query may use mappings which were created during the index phase or alternatively may use pre-existing mappings.	03-29-2012
20120084286	METHOD AND APPARATUS FOR GROUP COORDINATION OF CALENDAR EVENTS - An approach for managing calendar information received from a plurality of data sources is described. Calendar information associated respectively with a plurality of data sources is retrieved by a calendar management platform. For each of the data sources, metadata specifying a contributor of the corresponding calendar information and for relating distribution of the calendar information is determined. Based on the first and second metadata, a data view for the calendar information is generated.	04-05-2012
20120084287	ESTIMATION OF UNIQUE DATABASE VALUES - Estimation of unique values in a database can be performed where a data field having multiple information values is provided in the database. The data field can be partitioned into multiple intervals such that each interval includes a range of information values. An interval specific Bloom filter can be calculated for each of the multiple intervals. A binary Bloom filter value can be calculated for an information value within an interval specific Bloom filter. The binary Bloom filter value can represent whether the information value is unique. A number of unique values in the database can be determined based on calculated binary Bloom filter values.	04-05-2012
20120084288	CRIMINAL RELATIONSHIP ANALYSIS AND VISUALIZATION - Systems and methods of organizing a set of information associated with a record in a centralized database are disclosed. The record may be associated with a criminal investigation and/or a person of interest (POI). In one embodiment, the method includes creating a profile for the record and a corresponding set of data associated with the profile. The method also includes graphically clustering the set of information associated with the profile. In another embodiment, the method includes linking a data associated with a particular profile to another data associated with another profile based on a set of predetermined association factors. The method also includes generating a set of links and connections between a particular profile and a set of other profiles in the database. The method further includes visually representing the set of connections to a user of the system.	04-05-2012
20120084289	METHODS AND SYSTEMS FOR A GEOGRAPHICALLY DEFINED COMMUNICATION PLATFORM - Methods and systems are described for a geographically defined platform. In one embodiment, a block is divided into one or more partitioned blocks comprising geographically proximate street addresses. Residents whose street addresses are located within the same partitioned block may contribute and view resident-generated content through a spatial platform. Further, contiguous blocks may elect to combine with each other and a partitioned block may elect to separate from the larger block that comprises it.	04-05-2012
20120089604	Computer-Implemented Systems And Methods For Matching Records Using Matchcodes With Scores - Systems and methods are provided for assigning a record to one or more record clusters. A record including a plurality of fields is received. A field in the record is identified to have a likelihood of including an input error. One or more alternative fields are generated with alternative inputs. The identified field and the one or more alternative fields are compared with a plurality of record clusters to identify a cluster with a matching field. The record is assigned to the identified cluster based at least in part on the matching field.	04-12-2012
20120089605	USER PROFILE AND ITS LOCATION IN A CLUSTERED PROFILE LANDSCAPE - Delivering targeted content includes collecting, via at least one tangible processor, user activity data for users during a specified time period. Questions asked by the users during the specified time period are extracted from the user activity data, via the at least one tangible processor, and stored in user profiles for the users. The user profiles are clustered, via the at least one tangible processor, based on the questions asked. Targeted content is delivered, via the at least one tangible processor, to a subset of the users based on the clustering.	04-12-2012
20120089606	GROUPING IDENTITY RECORDS TO GENERATE CANDIDATE LISTS TO USE IN AN ENTITY AND RELATIONSHIP RESOLUTION PROCESS - Provided are a method, system, and computer program product for grouping identity records to generate candidate lists to use in an entity and relationship resolution process. A plurality of identity records are received, wherein the identity records provide attributes of entities, wherein the identity records may provide different or same values for the attributes. The received identity records are grouped into a group of identity records. A composite query on values for selected attributes of the identity records in the group is generated and applied to an entity database to obtain composite results of entity records in the entity database matching the attribute values of the composite query. For the identity records in the group, an individual query on attributes of one of the identity records is performed against the composite results of the entity records to determine a candidate list of entity records from the entity database for the identity record. For the identity records in the group, resolution rules are applied to determine entity records in the determined candidate list that are related to one of the identity records in the group according to the resolution rules. Entity relationship information on the determined entity records that are related to the identity records is stored.	04-12-2012
20120089607	METHOD AND SYSTEMS FOR PROCESSING POLYMERIC SEQUENCE DATA AND RELATED INFORMATION - Methods and systems for organizing, representing and processing polymeric sequence information, including biopolymeric sequence information such as DNA sequence information and related information are disclosed herein. Polymeric sequence and associated information may be represented using a plurality of data units, each of which includes one or more headers and a payload containing a representation of a segment of the polymeric sequence. Each header may include or be linked to a portion of the associated information.	04-12-2012
20120089608	METHOD AND SYSTEMS FOR PROCESSING POLYMERIC SEQUENCE DATA AND RELATED INFORMATION - Methods and systems for organizing, representing and processing polymeric sequence information, including biopolymeric sequence information such as DNA sequence information and related information are disclosed herein. Polymeric sequence and associated information may be represented using a plurality of data units, each of which includes one or more headers and a payload containing a representation of a segment of the polymeric sequence. Each header may include or be linked to a portion of the associated information.	04-12-2012
20120096001	AFFINITIZING DATASETS BASED ON EFFICIENT QUERY PROCESSING - Embodiments of the present invention relate to systems, methods, and computer-storage media for affinitizing datasets based on efficient query processing. In one embodiment, a plurality of datasets within a data stream is received. The data stream is partitioned based on efficient query processing. Once the data stream is partitioned, an affinity identifier is assigned to datasets based on the partitioning of the dataset. Further, when datasets are broken into extents, the affinity identifier of the parent dataset is retained in the resulting extent. The affinity identifier of each extent is then referenced to preferentially store extents having common affinity identifiers within close proximity of one another across a data center.	04-19-2012
20120096002	SYSTEMS AND METHODS FOR GENERATING AND MANAGING A UNIVERSAL SOCIAL GRAPH DATABASE - A computer-implemented method for determining connections between entities includes receiving private information from a user, retrieving public information from publicly available sources, and matching the public information with the private information. The method also includes generating a graph database with the public and private information, determining connections between entities in the graph database, and determining strength of connectivity between entities in the graph database.	04-19-2012
20120096003	INFORMATION CLASSIFICATION DEVICE, INFORMATION CLASSIFICATION METHOD, AND INFORMATION CLASSIFICATION PROGRAM - It is an object of the present invention to provide an information classification device capable of classifying retrieved pieces of information into appropriate groups even if these pieces of information are the same kind of information. The information classification device according to the present invention includes spatial arrangement means and classification means. The spatial arrangement means performs processing for spatially arranging an information group of a first information type and an information group of a second information type based on relation between the information group of the first information type and the information group of the second information type. The classification means classifies the information group of the first information type based on the processing results of the spatial arrangement means.	04-19-2012
20120102031	APPARATUS AND METHOD FOR ENTITY EXPANSION AND GROUPING - A computer readable storage medium includes executable instructions to convert an entity to a standard form including normalized attributes, a tag reference and a feature. The entity is expanded with corresponding variants. The standard form and corresponding variants are combined to form an annotated entity in a first processing step. The entity is assigned to a group in a second processing step that accesses the annotated entity. The entity is processed in a single pass comprising the first processing step and the second processing step.	04-26-2012
20120102032	METHOD TO PERFORM MAPPINGS ACROSS MULTIPLE MODELS OR ONTOLOGIES - Computer-implemented methods for mapping an element of a source information model to an element of a target information model, forming a cluster of elements for mapping across information models, and evaluating a mapping of elements across information models, and a system and computer program product thereof. The method of mapping an element of a source information model to an element of a target information model includes: receiving information for mapping a first element in a source cluster to an element in the target information model; mapping the first element to the target element using the received information for mapping the first element to the target element; and mapping all other elements in the source cluster to the target element.	04-26-2012
20120102033	SYSTEMS AND METHODS FOR BUILDING A UNIVERSAL MULTIMEDIA LEARNER - The present disclosure describes a method and system called “Universal Learner (UL),” which provides a unified framework to understand multimedia signals. The UL utilizes the loosely annotated multimedia data on the Web, analyses it in various signal domains, such as text, image, audio and combinations thereof, and builds an association graph called the “Multimedia Brain,” which basically comprises visual signals, audio signals, text phrases and the like that capture a multitude of objects, experiences and their attributes and the links among them that capture similar intent or functional and contextual relationships.	04-26-2012
20120102034	SYSTEM AND METHOD FOR RECOMMENDING LOCATION-BASED KEYWORD - According to exemplary embodiments of the invention, a location-based keyword recommending system and method are provided. The location-based keyword recommending system may include a keyword collecting unit to store location information regarding a location where a keyword is input, a region setting unit to set a virtual region by performing clustering of the location information with reference to the keyword, a region combining unit to combine virtual regions overlapping each other into one virtual region, and a keyword recommending unit to provide a location-based keyword based on the keyword related to the location information of the virtual region.	04-26-2012
20120102035	Data Embedding Methods, Embedded Data Extraction Methods, Truncation Methods, Data Embedding Devices, Embedded Data Extraction Devices And Truncation Devices - In an embodiment, a data embedding method may be provided. The data embedding method may include inputting data to be encoded and data to be embedded; grouping the data to be encoded into a first set and a second set, based on an entropy of the data to be encoded; and embedding the data to be embedded into the data to be encoded by replacing a pre-determined part of the second set with the data to be encoded so that the first set remains free of data to be embedded.	04-26-2012
20120109954	UBIQUITOUS BOOKMARKING - Data for items of information are marked, stored and retrieved. A method for marking, storing and retrieving data for items of information. Markable data is received at an electronic device. The received markable data is marked using an input of the electronic device dedicated to selectively marking received markable data at the discretion of a user of the electronic device. The received marked data is stored as marked data in response to marking the received information. The stored marked data is retrieved using the electronic device. The stored marked data is obtained and presented.	05-03-2012
20120109955	ORGANIZING NEARBY PICTURE HOTSPOTS - A method of accessing an image database containing location data and determining one or more clusters of the digital images based on their location data. A hotspot location is determined for representing the cluster of the digital images and the results are stored for later access. The computer is connected to a network and receives data from a device including data identifying a current location. After determining that the device is within a selected notification distance from the hotspot location, a notification is transmitted over the network.	05-03-2012
20120109956	PROFILE PREDICTION FOR SHARED COMPUTERS - A profile prediction system may identify one of multiple user profiles for a single computer. For example, a home computer may have multiple users that may not be targeted unless the user on the home computer can be identified. The system's user identification may be based on a clustering model that considers various browsing characteristics to identify different clusters that each correspond to a particular user or user profile. The model may be generated and refined by tracking web browsing over multiple sessions. Future activity on the computer may be used to identify which user is the source of the activity and the user may receive targeted content including advertisements.	05-03-2012
20120109957	GENERATING A SUBSET AGGREGATE DOCUMENT FROM AN EXISTING AGGREGATE DOCUMENT - Embodiments described herein are directed to generating a subset aggregate document from an existing aggregate document. Data pages in an existing aggregate document that satisfy node selection criteria are identified. An aggregate document slice is created that includes the data pages that satisfy the node selection criteria. Connections between the data pages from the existing aggregate document to the aggregate document slice are imported to form at least one continuous path with the data pages.	05-03-2012
20120109958	System and Method for Managing Data Policies on Application Objects - Described herein are systems and methods for providing data policy management over application objects in a storage system environment. An application object may comprise non-virtual or virtual objects (e.g., non-virtual-based applications, virtual-based applications, or virtual storage components). An application object manager may represent application objects by producing mapping graphs and/or application object data that represent application objects in a standardized manner. A mapping graph for an application object may describe a mapping between the application object and its underlying storage objects on a storage system. Application object data may describe a mapping graph in a standardized format. Application object data representing application objects may be received by an application policy manager that manages data policies on the application objects (including virtual applications and virtual storage components) based on the received application object data. Data policies may include policies for backup, service level objectives, recovery, monitoring and/or reporting.	05-03-2012
20120109959	METHOD AND SYSTEM FOR CLUSTERING DATA ARISING FROM A DATABASE - A method for clustering data or objects taking the form of an array, each of the elements of said array corresponding to a value of similarity existing between said objects is implemented within a computer linked with a database containing the data or objects to be classified comprising a work memory, and a processor. The method includes steps to determine a number of classes of objects by taking account of the values of the relationships computed between an object and a previously established previous class, for each of the classes found, determine the value of each of the relationships between a class and the other classes, and merge certain classes, and take each object of each class one by one, determine the value of the relationship of each object with each of the classes other than the class into which it was classed in the initial step, if the value of the relationship is greater then transfer the object from its class to the new class, this is continued until all the values of the relationships are negative.	05-03-2012
20120109960	GENERATING RULES FOR CLASSIFYING STRUCTURED DOCUMENTS - Techniques are disclosed for generating rules for classifying structured documents, and for classifying, retrieving, or checking structured documents, using generated rules. In one example, a method for generating rules for classifying a plurality of electronic structured documents to which a same schema is applied comprises a computer performing the following steps: determining one or more variable portions defined by the schema by scanning the schema; acquiring respective feature values of the determined variable portions from each of the plurality of structured documents and associating the structured document, from which the feature values are acquired, with the acquired feature values; and generating the rules on the basis of the feature values associated with the structured document.	05-03-2012
20120109961	Signature Based System and Methods for Generation of Personalized Multimedia Channels - A system for generating personalized channels of multimedia content. The system comprises an interface to one or more multimedia sources, wherein the multimedia sources provide multimedia content to the personalized channels of multimedia content; and a server for receiving multimedia content from the one or more multimedia sources through the interface and for serving selected multimedia content to users of the system over one or more of the personalized channels; wherein a user of the system receives personalized multimedia content gathered by the server into the one or more of the personalized channels responsive of preferences of the user as observed by the system for the user.	05-03-2012
20120109962	Taxonomy-Based Object Classification - Objects, such as documents, are classified according to a taxonomy. The taxonomy includes nodes, corresponding to object classes, arranged in a hierarchy. Class keywords are associated with the nodes. Search strings are formed for the classes by traversing the taxonomic branches and concatenating the keywords associated with the classes. For each object to be classified, a search engine is used to perform searches on the object using the search strings. The searches produce search scores for each search string. Each object is classified by identifying the class(es) corresponding to the highest search score(s) for the object, and classifying the object into the identified class(es).	05-03-2012
20120109963	CLASSIFICATION HIERARCHY REGENERATION SYSTEM, CLASSIFICATION HIERARCHY REGENERATION METHOD, AND CLASSIFICATION HIERARCHY REGENERATION PROGRAM - A classification hierarchy regeneration system is provided, wherein when a new classification hierarchy is generated by restructuring an existing classification hierarchy, a classification hierarchy in view of hierarchical relationship of classifications and a classification hierarchy integrating classifications of the same meaning can be efficiently generated. The clustering means clusters a data group associated with a hierarchical classification, and generating a classification group, i.e., a group obtained by extracting a classification satisfying a condition defined in advance from classifications corresponding to respective data in a cluster. The cooccurrence degree calculation means calculates a degree of cooccurrence of two classifications selected from the classification group. The classification hierarchy regeneration means regenerates the hierarchy of classification based on the classification group and the degree of cooccurrence.	05-03-2012
20120117064	ADAPTIVE CELL-SPECIFIC DICTIONARIES FOR FREQUENCY-PARTITIONED MULTI-DIMENSIONAL DATA - A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the distinct values that actually appear in that cell are much smaller than the size of the column dictionary.	05-10-2012
20120117065	AUTOMATED PARTITIONING IN PARALLEL DATABASE SYSTEMS - Embodiments are directed to determining optimal partition configurations for distributed database data and to implementing parallel query optimization memo data structure to improve partition configuration cost estimation efficiency. In an embodiment, a computer system accesses a portion of database data and various database queries for a given database. The computer system determines, based on the accessed database data and database queries, a partition configuration search space which includes multiple feasible partition configurations for the database data and a workload of queries expected to be executed on that data. The computer system performs a branch and bound search in the partition configuration search space to determine which data partitioning path has the lowest partitioning cost. The branch and bound search is performed according to branch and bound search policies. The computer system also outputs the partition configuration with the determined lowest partitioning cost.	05-10-2012
20120117066	Computer implemented method for processing data on an internet-accessible data processing unit - Computer implemented method for processing data on a data processing unit accessible through the Internet, in particular for evaluating and/or updating and/or adapting of data sets which are stored on an Internet-accessible database equipment (	05-10-2012
20120117067	METHOD AND APPARATUS FOR PROVIDING A RANGE ORDERED TREE STRUCTURE - An approach is provided for creating a range ordered tree structure. A tree index platform determines one or more ranges for grouping one or more data objects of a key-value store. Next, the tree index platform determines to specify the one or more ranges in one or more respective index objects of a data structure. Then, the tree index platform determines to associate the data structure with the key-value store.	05-10-2012
20120117068	TEXT MINING DEVICE - The text mining device 300 includes a clustering section 301. The clustering section 301 performs clustering on a plurality of characteristic expressions extracted from a document set such that characteristic expressions, in which sentences to be referred to as original sentences are the same, are compiled in one cluster, based on the similarity in original document sets which are sets of documents including the respective characteristic expressions, the documents being of the document set. Consequently, the probability of repeatedly viewing the same original document by a user can be reduced reliably.	05-10-2012
20120124044	SYSTEMS AND METHODS FOR PHRASE CLUSTERING - Systems and associated methods for enhanced concept understanding in large document collections through phrase clustering are described. Embodiments take as input an initial set of phrases and estimate centroids using a clustering process. Embodiments then generate new phrases around each of the current centroids using the current phrases. These new phrases are added to the current set, and the clustering process is iterated. Upon convergence, embodiments finalize clusters based on phrases of any given length.	05-17-2012
20120124045	Parallel Partitioning Index Scan - System, methods and articles of manufacture for joining data in the database tables comprising, performing an index scan on a global index of a first database table, determining rows in the first database table that may be joined with a second database table based on a needed partitioning, wherein the needed partitioning is determined using an index scan, determining a number of partitions in the second database table, and joining each of the corresponding partitions in the first database table with a corresponding partition in the second database table.	05-17-2012
20120124046	SYSTEM AND METHOD FOR MANAGING DEDUPLICATED COPIES OF DATA USING TEMPORAL RELATIONSHIPS AMONG COPIES - Systems and methods are disclosed for managing deduplicated images of data objects that change over time. The method includes: organizing unique content of each data object as a plurality of content segments and storing the content segments in a data store; for each data object, creating an organized arrangement of hash structures, wherein each structure, for a subset of the hash structures, includes a hash signature for a corresponding content segment and is associated with a reference to the corresponding content segment, and for each data object, maintaining an organized arrangement of temporal structures to represent a corresponding data object over time, wherein each structure is associated with a temporal state of the data object, and wherein each temporal state is associated with the hash structures representing the content of the data object during that temporal state.	05-17-2012
20120124047	MANAGING LOG ENTRIES - Example methods, apparatus, and articles of manufacture to manage log entries are disclosed. A disclosed example method involves grouping first log entries into a first group based on a matching portion among the first log entries. The example method also involves identifying a non-matching portion of the first log entries and associating an identifier with the non-matching portion. A processor is operated to generate a text string template comprising the identifier and the at least one matching portion in a human-readable format. The identifier replaces the non-matching portion in the template.	05-17-2012
20120124048	CLUSTERING APPARATUS, AND CLUSTERING METHOD - A technique extracts an object that is characteristic although the number of appearances is less demanded. A clustering apparatus includes: a similarity degree calculating section calculating a similarity degree of a combination of optional two of objects to store the calculated similarity degree in a similarity degree table, excluding a combination of one of the optional two and itself; a merging object selecting section selecting as merging objects, two objects related to the similarity degree which satisfies a predetermined reference; a new object generating section generating a new object from the merging objects; a merging object removing section removing from the similarity degree table, a similarity degree between each of the two objects selected as the merging objects and each of the objects; and a new object adding section calculating a similarity degree between the new object and each of the plurality of objects other than the new object.	05-17-2012
20120124049	PROFILE ANALYSIS SYSTEM - To recommend information useful for a user regardless of domains and services, items are defined by basic desires as action objectives of the user, a user profile is expressed by basic desire strengths, constant strengths of the desires of the user calculated from an action history of the user are compared with current desire strengths, current desire degrees are calculated, and recommended items are presented.	05-17-2012
20120131005	File Kinship for Multimedia Data Tracking - Kinship between electronic files among personal networked devices may be ascertained between the files by determining an operational relationship between the files and with a similarity measurement.	05-24-2012
20120131006	SYSTEMS AND METHODS FOR ROBUST PATTERN CLASSIFICATION - Certain embodiments relate to systems and methods for performing data discrimination using dimensionality reduction techniques. Particularly the Sparse Matrix Transform (SMT) is applied to more efficiently determine the Fisher Discrimination vector for a given dataset. Adjustments to the SMT facilitate more robust identification of the Fisher Discrimination vector in view of various resource constraints.	05-24-2012
20120136858	Method to Coordinate Data Collection Among Multiple System Components - A method, computer program product and computer system for coordinating data collection from a component of a data processing system is disclosed. The component registers with a dispatcher, wherein the component is a computer resource of the data processing system and is configured to accept at least one query, and the registration comprising data types handled by the at least one component, wherein the dispatcher is allocated computer resources of the data processing system. The component receives from the dispatcher a notification to perform the query against specified data structures, wherein the query comprises an action. The component, responsive to receiving notification, determines whether data structures of a data type specified in the query are handled. The data processing system runs the query to determine whether the query is satisfied. The data processing system executes the action.	05-31-2012
20120136859	Entity Type Assignment - A computer system creates a plurality of objects using facts derived from electronic documents, each object including one or more facts describing an entity associated with the object. The system generates a value for an object of an unknown entity type, of the plurality of objects, by using an entity type model for a known entity type. The entity type model is based on a set of features of a plurality of objects of the known entity type, and the value indicates whether the object of an unknown entity type is of the known entity type. The system assigns the known entity type to the object of an unknown entity type in response to a determination that the value indicates the object of an unknown entity type is of the known entity type, and stores the object with the assigned entity type.	05-31-2012
20120136860	MULTI-SCALE SEGMENTATION AND PARTIAL MATCHING 3D MODELS - A scale-space feature extraction technique is based on recursive decomposition of polyhedral surfaces into surface patches. The experimental results show that this technique can be used to perform matching based on local model structure. Scale-space techniques can be parameterized to generate decompositions that correspond to manufacturing, assembly or surface features relevant to mechanical design. One application of these techniques is to support matching and content-based retrieval of solid models. Scale-space technique can extract features that are invariant with respect to the global structure of the model as well as small perturbations that 3D laser scanning may introduce. A new distance function defined on triangles instead of points is introduced. This technique offers a new way to control the feature decomposition process, which results in extraction of features that are more meaningful from an engineering viewpoint. The technique is computationally practical for use in indexing large models.	05-31-2012
20120136861	CONTENT-PROVIDING METHOD AND SYSTEM - A content-providing method and system, including identifying a representative type cluster by clustering content related to behavioral data which represents a use history of a user, according to type of the content, mapping the representative type cluster to a time interval, and storing the representative type cluster and the time interval.	05-31-2012
20120136862	SYSTEM AND METHOD FOR PRESENTING COMPARISONS OF ELECTRONIC DOCUMENTS - Changes identified between two electronic documents are grouped according to the category of the identified change, and the categorized changes are grouped and displayed to a user. Additionally, each change is assigned a review state that can be updated to reflect the status of each change as the user reviews the changes in a comparison document.	05-31-2012
20120136863	TECHNIQUES FOR KNOWLEDGE DISCOVERY BY CONSTRUCTING KNOWLEDGE CORRELATIONS USING CONCEPTS OR TERMS - Techniques for identifying knowledge use a graphical user interface for inputting one or more terms to be explored for additional knowledge. Then a search is conducted across one or more sources of information to identify resources containing information about or information associated with said terms. The resources are decomposed into elemental units of information and stored in data structures called nodes. A group of nodes is stored in a node pool and, from the node pool, correlations of nodes are constructed that represent knowledge using information about relation types. Information about relation types is determined using a relation classifier.	05-31-2012
20120143865	FUZZY CLUSTERING OF OCEANIC PROFILES - System and method to partition littoral regions by profiles of specific parameters using fuzzy c-means clustering. Fuzzy cluster partitions assign each datum to a set of data clusters such that the sum of the cluster membership probabilities of the datum is equal to unity. Partial memberships can supply information about transition areas from one cluster to another.	06-07-2012
20120143866	Client Performance Optimization by Delay-Loading Application Files with Cache - Systems and methods for minimizing start-up and implementation latency of a web application hosted in a computing system environment. Latency mitigation is accomplished via a programmatic approach to reduce the number of files or script needed for an initial boot of the web application. Remaining files are loaded either as needed or in a background process.	06-07-2012
20120143867	Facilitating Extraction and Discovery of Enterprise Services - Implementations of the present disclosure include methods for annotating an enterprise service that is electronically stored in an enterprise service repository. In some implementations, methods include generating one or more graphs based on one or more artifacts, the one or more artifacts resulting from a development process of the enterprise service, generating one or more metadata repositories based on the one or more artifacts, each metadata repository comprising instance data corresponding to one of the one or more graphs, storing the one or more graphs and the one or more metadata repositories to a knowledge base provided in a computer-readable medium, determining one or more annotations based on the one or more graphs and the one or more metadata repositories, associating the one or more annotations to the enterprise service, and storing the one or more annotations in the enterprise service repository.	06-07-2012
20120143868	COMPUTER READABLE ELECTRONIC RECORDS AUTOMATED CLASSIFICATION SYSTEM - Classifying an electronic document in a computer-based system is disclosed. For each classification instance in a plurality of classification instances, a confidence data indicating a degree of confidence that the electronic document is associated with that classification instance is determined. A classification, based on a first classification instance in the plurality of classification instances, is assigned without human intervention to the electronic document if the confidence data associated with the first classification instance exceeds a first threshold.	06-07-2012
20120143869	MEASURING ENTITY EXTRACTION COMPLEXITY - A named entity input is received and a target sense for which the named entity input is to be extracted from a set of documents is identified. An extraction complexity feature is generated based on the named entity input, the target sense, and the set of documents. The extraction complexity feature indicates how difficult or complex it is deemed to be to identify the named entity input for the target sense in the set of documents.	06-07-2012
20120143870	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR AGGREGATING ON-DEMAND DATABASE SERVICE DATA - In accordance with embodiments, there are provided mechanisms and methods for aggregating on-demand database service data. These mechanisms and methods for aggregating on-demand database service data can enable embodiments to more flexibly summarize data. The ability of embodiments to provide such feature may lead to enhanced aggregation features which may be used for providing more effective ways of summarizing data.	06-07-2012
20120150858	PARTITIONING MANAGEMENT OF SYSTEM RESOURCES ACROSS MULTIPLE USERS - Exemplary method, system, and computer program embodiments for partitioning management of storage resources in a computing storage environment across multiple users are provided. In one embodiment, a resource group attribute is assigned to a storage resource object representing at least one of the plurality of storage resources in a system configuration of the computing storage environment. The resource group attribute includes a selectable value indicating a resource group object to which the storage resource object is associated. A resource group label is provided in the resource group object and is a string having no wildcards. A user resource scope is assigned to a user ID and a value of the user resource scope provides a mechanism to match to the resource group label. The user ID is authorized to perform one of creating, deleting, modifying, controlling, and managing storage resources with an association to a resource group.	06-14-2012
20120150859	Task-Based Tagging and Classification of Enterprise Resources - Embodiments of the present invention relate to systems and methods for task-based tagging and resource classification, which allow tags or metadata to emerge from execution of work-related tasks and activities. In certain embodiments, tags can be automatically extracted from activities performed, for example utilizing a textual description of tasks carried out by an employee. Accumulated tags can then be utilized to describe enterprise resources. Automatic tagging or metadata annotation can be integrated with everyday work utilizing one or more techniques. Candidate tags can be extracted from a task written description utilizing an algorithm that analyzes keywords. Candidate tags can be refined, for example by clustering utilizing a K-means approach. Candidate tags can be ranked based on an overall frequency adjusted against time, with the importance of a tag declining with time.	06-14-2012
20120158722	DATABASE PARTITION MANAGEMENT - Apparatus, systems, and methods may operate to receive a request to move at least a portion of a database table stored on a tangible medium from a current partition to a history partition, wherein the database table is partitioned into physical partitions according to a selected mapping update frequency. In response to receiving the request, activities may include modifying a logical partitioning of the database table by updating a mapping of the physical partitions to logical partitions. Other apparatus, systems, and methods are disclosed.	06-21-2012
20120158723	Data Grid Advisor - A system and method to generate an improved layout of a data grid in a database environment is provided. The data grid is a clustered in-memory database cache comprising one or more data fabrics, where each data fabric includes multiple in-memory database cache nodes. A data grid advisor capability can be used by application developers and database administrators to evaluate and design the data grid layout so as to optimize performance based on resource constraints and the needs of particular database applications.	06-21-2012
20120158724	AUTOMATED WEB PAGE CLASSIFICATION - Described herein are methods and systems implementing a web page classification system for automatically generating at least one context feature for a web page and classifying the web page based on the at least one context feature. In one implementation, a context feature generating module of the web page classification system is configured to automatically generate at least one context feature based on at least two of uniform resource locator (URL) features, title features, and meta tags features of a web page and a classifying module is configured to classify the web page based on the at least one context feature.	06-21-2012
20120158725	Dynamic hierarchical tagging system and method - A dynamical hierarchical tagging system connected to a user site through a remote communications network. The system may comprise a master controller, a job management server connected to the master controller, one or more scanners in communication with the job management server, wherein the one or more scanners are configured to scan for one or more user assets located at the user site, resulting in scan results, a scan logic processor connected to the master controller, wherein the scan logic processor is configured to store the scan results in a user database, a tagging logic engine connected to the master controller, wherein the tagging logic engine is configured to tag the scan results stored in the user database, and an indexing logic processor connected to the master controller, wherein the indexing logic processor is configured to search and index the tagged scan results stored in the user database.	06-21-2012
20120158726	Method and Apparatus For Classifying Digital Content Based on Ideological Bias of Authors - A method and apparatus for classifying a collection of digital documents based on ideological bias of authors. At least a portion of text of a digital document is received and parsed. Pairs of specific features text having specified relationships are detected. The pairs are then mapped to an ideological bias, based on an ideological bias ontology for example. Various actions can be taken on the digital documents based on the determined ideological bias.	06-21-2012
20120158727	NEGOTIABLE INSTRUMENT ELECTRONIC CLEARANCE SYSTEMS AND METHODS - Methods, devices, and systems for analyzing negotiated negotiable instruments for unlawful activity are described. A computer system, including a computer readable storage device and a processor may be provided. A plurality of electronic files may be received. Each of these electronic files of the plurality of electronic files may include an electronic image of at least a portion of a negotiable instrument and include a plurality of data fields. The plurality of electronic files may be divided into subsets based on whether data is available in particular data fields of the electronic files. Based upon the subset an electronic file is made a member of, various selection criteria may be applied to determine if the electronic file is a candidate for suspicious and/or illegal activity. Also, a listing of candidates for suspicious and/or illegal activity may be presented to a user.	06-21-2012
20120158728	SYSTEMS AND METHODS FOR TAGGING EMAILS BY DISCUSSIONS - The invention provides for techniques to process and produce email documents. The techniques provide for organizing a first plurality of email documents into a plurality of document groups, reviewing a document group from the plurality of document groups, and associating a review content with the document group. The techniques provide for ways to propagate the review content to one or more email documents associated with the document group and producing a second plurality of email documents. The techniques provide for annotating one or more email documents in accordance with the review content. Depending on the embodiment, review content may include text, graphics, audio, tag, and multimedia information. Produced documents can be searched and browsed in accordance with information in the review content. Email documents can be grouped by information in meta information and/or header information associated with the email documents into various groups, including threads or conversations, for example.	06-21-2012
20120166438	SYSTEM AND METHOD FOR RECOMMENDING QUERIES RELATED TO TRENDING TOPICS BASED ON A RECEIVED QUERY - Systems and methods for identifying candidate queries related to a trending topic based on a user query are described. A trending topic identification module identifies topics trending in one or more real-time content sources. The real-time content source(s) may include, for example, a source of microblog posts or other user-generated data, a news feed, or the like. A query recommendation module suggests at least one candidate query in response to receiving a user query. The query recommendation module obtains the at least one candidate query by comparing words and named entities of the user query with words and named entities associated with the trending topics identified by the trending topic identification module.	06-28-2012
20120166439	METHOD AND SYSTEM FOR CLASSIFYING WEB SITES USING QUERY-BASED WEB SITE MODELS - Web sites are grouped by generating feature space representations of documents, and aggregating the feature space representations into web site vectors. A document vector may be generated for each document of a plurality of documents associated with a set of web sites according to a query-based feature space model. The query-based feature space model defines features of the documents. Each document vector includes weights determined for features associated with the corresponding document. A web site vector is generated for each of the web sites using the plurality of document vectors. The web sites are grouped according to the web site vectors.	06-28-2012
20120166440	SYSTEM AND METHOD FOR PARALLEL SEARCHING OF A DOCUMENT STREAM - A system and method for searching a document for a query pattern. A plurality of streams may be stored each including a linear sequence of nodes. Each stream may be associated with nodes having a common label in a data tree of the document. A query pattern may be searched for in the streams by executing a plurality of threads. Each of two or more of the threads may be used to search different sub-streams of the plurality of streams. Each of the different sub-streams searched for by each thread in each stream may be uniquely correlated with one or more disjoint sub-trees of a partition of the tree into a plurality of sub-trees. The two or more of the plurality of threads may be executed in parallel. A result of the query pattern search may be generated using at least one of the threads.	06-28-2012
20120173526	SELECTIVELY ORGANIZING A RECIPIENT LIST BASED ON EXTERNAL GROUP DATA - An ungrouped list associated with a communication artifact can be identified. The information can be associated with a recipient, which can be a user registered within a computing system. Group data information associated with an external source can be received. The source can be a data source not associated with the artifact. The group data can include a group and/or a contact identifier. Recipient information from the recipient list can be associated with a group identifier if the recipient identifier is equivalent to the contact identifier within the group data. A grouped list can be generated from the associated data. A grouped list can be presented within an interface. The presenting can present recipient information within a logical grouping for at least a portion of the recipients of the ungrouped list. The grouping can be an organization of recipient information associated with the group identifier.	07-05-2012
20120173527	Variational Mode Seeking - A mode-seeking clustering mechanism identifies clusters within a data set based on the location of individual data points according to modes in a kernel density estimate. For large-scale applications the clustering mechanism may utilize rough hierarchical kernel and data partitions in a computationally efficient manner. A variational approach to the clustering mechanism may take into account variational probabilities, which are restricted in certain ways according to hierarchical kernel and data partition trees, and the mechanism may store certain statistics within these trees in order to compute the variational probabilities in a computationally efficient way. The clustering mechanism may use a two-step variational expectation and maximization algorithm and generalizations thereof, where the maximization step may be performed in different ways in order to accommodate different mode-seeking algorithms, such as the mean shift, medoid shift, and quick shift algorithms.	07-05-2012
20120173528	SYSTEM AND METHOD FOR PROVIDING JOB SEARCH ACTIVITY DATA - One aspect of the invention provides a system for facilitating a job search comprising a database, a display for displaying a user interface with past, present and future sections and one or more processors. The system stores job search data in the database, schedules one or more activities associated with the job search data and displays the job search data or one or more activities in one of the past, present and future sections. Also provided, in another aspect, is a reporting module operably connected to the database and configured to provide statistical data derived from the job search data and the one or more activities. A further aspect of the subject system is directed towards providing such data to third parties to facilitate coaching a job seeker and monitoring job search related activities.	07-05-2012
20120173529	GRAPHICALLY DISPLAYING A FILE SYSTEM - The contents of a computer file system are displayed on a graphical user interface. File system metadata descriptive of the computer file system and file metadata descriptive of each of a plurality of files are gathered. A file selection is received indicating a file accessed by the user. A user context is determined by the file metadata. The files are clustered using the file system metadata, a set of file metadata, and the user context. The set of file clusters are mapped onto a visualization model and graphically displayed on the graphical user interface using the visualization model.	07-05-2012
20120179680	SEMANTIC ASSOCIATIONS IN DATA - Methods and apparatus teach providing semantic associations between data available on one or more computing devices, including grouping together related files and creating an association between the related grouped files and at least one anchor file to provide a semantic association for the grouped files. Also is taught configuring an agent on the one or more computing devices to undertake the grouping and to create the association without a user request. Also is taught triggering an evaluation of current files against related grouped files, and creating an association between the current files and at least one of the related grouped files and the at least one anchor file. Information may be added to the created association to create additional semantic associations for one or more of the grouped files and the current files. In turn, computer program products and computing systems for accomplishing the foregoing are provided.	07-12-2012
20120179681	DATA CLASSIFICATION - A method for managing data in an enterprise by identifying data of interest from among a multiplicity of data elements in an enterprise, the method including characterizing data of interest at least by at least one non-content based data identifier thereof and at least one access metric thereof, the at least one access metric being selected from data access permissions and actual data access history and selecting data of interest by considering only data elements from among the multiplicity of data elements which have the at least one non-content based data identifier thereof and the at least one access metric thereof.	07-12-2012
20120179682	WORD PAIR ACQUISITION APPARATUS, WORD PAIR ACQUISITION METHOD, AND PROGRAM - Conventionally, it has been impossible to appropriately acquire word pairs having a prescribed relationship. Such word pairs can be appropriately acquired with a word pair acquisition apparatus including: a word class information storage unit in which word class information can be stored; a class pair favorableness degree storage unit in which a class pair favorableness can be stored; a seed pattern storage unit in which can be stored one or more seed patterns; a word pair acquisition unit that acquires one or more word pairs co-occurring with the seed pattern from sentence groups; a class pair favorableness degree acquisition unit that acquires a class pair favorableness degree; a score determination unit that uses the class pair favorableness degree to determine a score of each of the word pairs; and a word pair selection unit that acquires one or more word pairs having a high score.	07-12-2012
20120179683	Method and System for Attribute Management in a Namespace - A computer-based method and system for managing attributes of objects in a namespace and for allowing multiple views into the namespace. The namespace system allows the objects identified by the names of the namespace to be hierarchically organized. The namespace system allows for attributes of various objects, including directory objects and data objects, to be dynamically defined after creation of an object. The namespace system also allows for the querying of objects based on their dynamically defined attributes. When the namespace system receives a query specification that includes a newly defined attribute, it identifies the objects that match that query specification.	07-12-2012
20120185477	System and method for supplying missing impact factors in a database - Systems and methods of supplying missing impact factors in a database. An example of a method is carried out by program code stored on non-transient computer-readable medium and executed by a processor. The method includes providing a matrix of each tree in the database in computer-readable medium, with each dimension in the matrix representing a node of the tree, and each of another dimension in the matrix representing an impact factor for a component of a system under consideration. The method also includes identifying at least one missing impact factor in the matrix. The method also includes estimating the missing impact factor and populating the estimated impact factor in the matrix in the computer-readable medium.	07-19-2012
20120185478	Extracting And Normalizing Organization Names From Text - A method, apparatus and article of manufacture for extracting and normalizing organization names from text. The method uses regular expressions, certain rules and dictionaries to identify potential organization names in text, then uses word similarity metrics, clustering, and other considerations to group normalized organization names.	07-19-2012
20120185479	SYSTEM AND METHOD FOR ORGANIZING AND MANAGING CONTENT TO FACILITATE DECISION-MAKING - A system is configured to organize content. The system may constitute a decision tool that provides a user with a decision template that enables the user to create a decision record that organizes aspects of a decision that the user considered, the user's reasoning with respect to these aspects in arriving at an ultimate outcome, and/or other information related to the decision. The decision template may include one or more fields into which the user may enter content manually, or the user may search for, and import content related to the decision into the template from, one or more content sources that include relevant content.	07-19-2012
20120185480	METHOD TO IMPROVE THE NAMED ENTITY CLASSIFICATION - A method is described for providing a named entity classification in a computing system having a processor, comprising the steps of the processor reading, from an LOD (Linking Open Data) set, an LOD node corresponding to a to-be-classified named entity. The processor also determines a type attribute of the LOD node corresponding to the to-be-classified named entity as a tagged type of the to-be-classified named entity and further reads a candidate type. Finally, the processor computes, based on the tagged type, a possibility of the to-be-classified named entity belonging to the candidate type.	07-19-2012
20120185481	Method and Apparatus for Executing a Recommendation - A method, apparatus and system for generating recommendations of items to users. Ratings of items made by users are collected (	07-19-2012
20120191708	Document Classification and Characterization - Data is received that characterizes each of a plurality of documents within a document set. Based on this data, the plurality of documents are grouped into a plurality of stacks using one or more grouping algorithms. A prime document is identified for each stack that includes attributes representative of the entire stack. Subsequently, data that characterizes documents for each stack, including at least the identified prime document, is provided to at least one human reviewer. User-generated input from the human reviewer is later received that categorizes each provided document, and data characterizing the user-generated input can then be provided. Related apparatus, systems, techniques and articles are also described.	07-26-2012
20120191709	AUTOMATIC SHARING OF SUPERLATIVE DIGITAL IMAGES - Described herein are techniques related to automatic sharing of superlative digital images. Such techniques include an automatic selection of one or more superlative digital images from a set of digital images based, at least in part, upon weighted criteria regarding properties (e.g., metadata or content) of the digital images. Instead, interested parties (e.g., subscribers and/or persons with an association with a particular image) are notified automatically. This Abstract is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.	07-26-2012
20120191710	DIRECTED PLACEMENT OF DATA IN A REDUNDANT DATA STORAGE SYSTEM - A data processing apparatus, comprising a metadata store storing information about files that are stored in a distributed data storage system, and comprising a class database; one or more processing units; logic configured for receiving and storing in the class database a definition of a class of data storage servers comprising one or more subclasses each comprising one or more server selection criteria; associating the class with one or more directories of the data storage system; in response to a data client storing a data file in a directory, binding the class to the data file, determining and storing a set of identifiers of one or more data storage servers in the system that match the server selection criteria, and providing the set of identifiers to the data client.	07-26-2012
20120191711	Deferring Classification of a Declared Record - A records management system classifies records according to a file plan. Records are declared, and then classified. Some records have an initially indeterminate classification and classification is deferred, either by request or due to a lack of sufficient information to classify the record according to the file plan. Unclassified records are placed into a temporary container. At some time while in the temporary container a classification event occurs with a given record which allows the records management system to classify the record and place it into a container corresponding to its classification.	07-26-2012
20120191712	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A computer program product evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.	07-26-2012
20120191713	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A process for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.	07-26-2012
20120191714	SCALABLE USER CLUSTERING BASED ON SET SIMILARITY - Methods and apparatus, including systems and computer program products, to provide clustering of users in which users are each represented as a set of elements representing items, e.g., items selected by users using a system. In one aspect, a program operates to obtain a respective interest set for each of multiple users, each interest set representing items in which the respective user expressed interest; for each of the users, to determine k hash values of the respective interest set, wherein the i-th hash value is a minimum value under a corresponding i-th hash function; and to assign each of the multiple users to each of the respective k clusters established for the respective user, the i-th cluster being represented by the i-th hash value. The assignment of each of the users to k clusters is done without regard to the assignment of any of the other users to k clusters.	07-26-2012
20120197888	METHOD AND APPARATUS FOR SELECTING CLUSTERINGS TO CLASSIFY A PREDETERMINED DATA SET - A method for selecting clusterings to classify a predetermined data set of numerical data comprises five steps. First, a plurality of known clustering methods are applied, one at a time, to the data set to generate clusterings for each method. Second, a metric space of clusterings is generated using a metric that measures the similarity between two clusterings. Third, the metric space is projected to a lower dimensional representation useful for visualization. Fourth, a “local cluster ensemble” method generates a clustering for each point in the lower dimensional space. Fifth, an animated visualization method uses the output of the local cluster ensemble method to display the lower dimensional space and to allow a user to move around and explore the space of clustering.	08-02-2012
20120197889	INFORMATION MATCHING APPARATUS, INFORMATION MATCHING METHOD, AND COMPUTER READABLE STORAGE MEDIUM HAVING STORED INFORMATION MATCHING PROGRAM - An information matching apparatus includes a target DB corresponding to a check target that stores therein records; a narrow-down condition creating unit that combines, in accordance with values of check items in a check source record using AND, a search condition defined by a search definition indicating a condition for excluding candidates in check target records that are less likely to have a similarity to or a relationship with a name identification source record and each grouping condition defined by a grouping definition indicating a condition for limiting a checking area of the check target records to create a narrow-down condition for narrowing down the check target records; and a searching unit that searches the target DB corresponding to the check target for a check target record in accordance with the created narrow-down condition.	08-02-2012
20120197890	DISCOVERING AND SCORING RELATIONSHIPS EXTRACTED FROM HUMAN GENERATED LISTS - A computer-implemented system and method for extracting Human Generated Lists from an electronic database is described. The system searches for objects of the same class within a context window to identify Human Generated Lists and stores them to an archive. The archive may be used to generate a relationship network. The system generates variable length data vectors to represent the relationships between the objects within each Human Generated List. This relationship network can then be queried to discover relationships between the objects in the Human Generated Lists and to provide related objects as recommendations.	08-02-2012
20120197891	GENRE DISCOVERY ENGINES - Genre discovery engines are presented. A genre discovery engine can compare clusters of products falling within known genres to other clusters. Known genres can be defined in terms of correlated product properties. When a new cluster is identified falling outside the boundaries of known genres, the discovery engine can recommend that the new cluster might be a new genre.	08-02-2012
20120197892	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A computer system for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.	08-02-2012
20120197893	Methods and Systems for Selecting and Presenting Content Based on Dynamically Identifying Microgenres Associated with the Content - A method of selecting and presenting content based on learned user preferences is provided. The method includes providing a content system including a set of content items organized by genre characterizing the content items, and wherein the set of content items contains microgenre metadata further characterizing the content items. The method also includes receiving search input from the user for identifying desired content items and, in response, presenting a subset of content items to the user. The method further includes receiving content item selection actions from the user and analyzing the microgenre metadata within the selected content items to learn the preferred microgenres of the user. The method includes, in response to receiving subsequent user search input, selecting and presenting content items in an order that portrays as relatively more relevant those content items containing microgenre metadata that more closely match the learned microgenre preferences of the user.	08-02-2012
20120197894	APPARATUS AND METHOD FOR PROCESSING DOCUMENTS TO EXTRACT EXPRESSIONS AND DESCRIPTIONS - Disclosed is an apparatus and method for processing documents to extract expressions and descriptions. The apparatus for processing documents includes a document collection unit, which collects documents from websites and divides each of the collected documents into a script portion and a description portion to thus generate a script document and a description document, and an expression extraction unit, which extracts expression description sentences on the basis of the description document, and extracts expressions described by the expression description sentences from the script document. According to the invention, study material, including a pair that comprises an expression to be studied and a description thereof, can be automatically constructed.	08-02-2012
20120203782	METHOD AND SYSTEM FOR DATA PROVENANCE MANAGEMENT IN MULTI-LAYER SYSTEMS - Method, system, and programs for heterogeneous data management. Information from multiple data sources is first obtained. Data/metadata from each of the data sources is modeled based on the source and/or granularity information of the data/metadata to generate data/metadata models. The data/metadata from multiple data sources are integrated, by applying one or more processes to the data/metadata from different data sources based on the data/metadata models, to generate integrated data/metadata. A provenance representation for the integrated data/metadata is created tracing sources, granularities, and/or processes applied and archived for enabling a query associated with the integrated data/metadata.	08-09-2012
20120209847	METHODS AND SYSTEMS FOR AUTOMATICALLY GENERATING SEMANTIC/CONCEPT SEARCHES - In various embodiments, a semantic space associated with a corpus of electronically stored information (ESI) may be created and used for concept searches. Documents (and any other objects in the ESI, in general) may be represented as vectors in the semantic space. Vectors may correspond to identifiers, such as, for example, indexed terms. The semantic space for a corpus of ESI can be used in information filtering, information retrieval, indexing, and relevancy rankings.	08-16-2012
20120209848	SYSTEM AND METHOD FOR ADVERTISEMENT TRANSMISSION AND DISPLAY - The disclosure herein provides systems and methods for a media enhancement system configured to associate a secondary media signal (for example, the secondary media signal can comprise an advertisement) to a primary media signal (for example, a radio broadcast). The disclosure herein additionally provides systems and methods for a media enhancement system that enables the generating, transmitting, displaying, and/or responding to a plurality of associated and/or unassociated secondary media signals, based on a primary media content from a primary media signal, user characteristics (for example, demographic and/or geographic information), and/or third-party preferences (for example, the goals of advertisers). The secondary media signals can be used to enhance the primary media content already being provided to the user on a user device. The secondary media signals can also be used to create psychological associations or relationships with the primary media content already being provided to the user.	08-16-2012
20120209849	WEB-SCALE DATA PROCESSING SYSTEM AND METHOD - A web-scale data processing system and method are provided herein. More particularly, a web-scale data processing system and method for crawling, storing, processing, encoding, and/or serving web-scale data are disclosed.	08-16-2012
20120215777	ASSOCIATION SIGNIFICANCE - Systems and techniques for determining significance between entities are disclosed. The systems and techniques identify a first entity having an association with a second entity, apply a plurality of association criteria to the association, weight each of the criteria based on defined weight values, and compute a significance score for the first entity with respect to the second entity based on a sum of a plurality of weighted criteria values. The systems and techniques utilize information from disparate sources to create a uniquely powerful signal. The systems and techniques can be used to identify the significance of relationships (e.g., associations) among various entities including, but not limited to, organizations, people, products, industries, geographies, commodities, financial indicators, economic indicators, events, topics, subject codes, unique identifiers, social tags, industry terms, general term/s, metadata elements, classification codes, and combinations thereof.	08-23-2012
20120215778	APPARATUS AND METHOD FOR MANAGING CONTENT DATA IN PORTABLE TERMINAL - An apparatus and method for managing content data of a portable terminal. The method allows the portable terminal to assign a user's preference on content data by assigning the preference on the content data to data information such as meta-data. The apparatus includes a preference manager configured to assign a preference on acquired content data, and a content manager configured to determine a transmission priority and sorting rule for the content data by confirming the assigned preference.	08-23-2012
20120215779	ANALYTICS MANAGEMENT - Example embodiments herein include a system having one or more edge servers disposed in an edge site of a content delivery network (CDN). The system can include a collector for collecting analytics associated with requests for content in the CDN. One or more additional collectors can be instantiated in the system, for example, in response to an increase in recordable events detected in the CDN. The system can include an aggregator for aggregating the collected analytics with analytics collected from other edge stages of the CDN. The system can also include a data store that stores the aggregated analytics according to a configurable data model.	08-23-2012
20120215780	ENTERPRISE LEVEL DATA MANAGEMENT - A system for identifying data of interest from among a multiplicity of data elements residing on multiple platforms in an enterprise, the system including background data characterization functionality characterizing the data of interest at least by at least one content characteristic thereof and at least one access metric thereof, the at least one access metric being selected from data access permissions and actual data access history and near real time data matching functionality selecting the data of interest by considering only data elements which have the at least one content characteristic thereof and the at least one access metric thereof from among the multiplicity of data elements.	08-23-2012
20120215781	COMPUTER SYSTEM PERFORMANCE ANALYSIS - This invention relates to a method and device for computer system performance analysis. All instructions are split into clusters based on significant offset gaps in top-down processing steps. Comments on instruction clusters can be generated automatically or can be edited manually. The comments can be shared among users for the achievement of portability. Significant clusters can be recognized as hotspots based on predetermined metrics.	08-23-2012
20120221571	Efficient presentation of computer object names based on attribute clustering - A method for discovering and presenting ordered groups of names of objects that are commonly used together by an individual user of a computer system. The invention tracks usages of computer objects and computes a measure of importance (a “weight”) based on attributes such as time of use and other application dependent data. The objects that are commonly used at the same time are called a cluster, and clusters with the highest cumulative weights are the ones a user is most likely to use again in conjunction with one another. A user can select an entire cluster or a subset. The objects with the highest weights in the cluster are presented first when the user, having selected a cluster, needs to select a subset of the objects in the cluster. The invention uses space saving techniques to represent clusters in computer memory.	08-30-2012
20120221572	CONTEXTUAL WEIGHTING AND EFFICIENT RE-RANKING FOR VOCABULARY TREE BASED IMAGE RETRIEVAL - Systems and methods are disclosed to search for a query image, by detecting local invariant features and local descriptors; retrieving best matching images by quantizing the local descriptors with a vocabulary tree; and reordering retrieved images with results from the vocabulary tree quantization.	08-30-2012
20120221573	Methods and systems for biclustering algorithm - Methods and systems for improved unsupervised learning are described. The unsupervised learning can consist of biclustering a data set, e.g., by biclustering subsets of the entire data set. In an example, the biclustering does not include feeding known and proven results into the biclustering methodology or system. A hierarchical approach can be used that feeds proven clusters back into the biclustering methodology or system as the input. Data that does not cluster may be discarded. Thus, a very large unknown data set can be acted on to learn about the data. The system is also amenable to parallelization.	08-30-2012
20120226690	OPTIMIZATION OF OUTPUT DATA ASSOCIATED WITH A POPULATION - Embodiments of the present invention relate to systems, methods and computer program products for allowing a user to define output data to be optimized for a population, define one or more input constraints based on which the output data is to be optimized, and optimizing the output data for the population by selecting an object based on a selection routine. Embodiments of the present invention allow fast and accurate optimization of output data associated with large populations.	09-06-2012
20120226691	SYSTEM FOR AUTONOMOUS DETECTION AND SEPARATION OF COMMON ELEMENTS WITHIN DATA, AND METHODS AND DEVICES ASSOCIATED THEREWITH - A data interpretation and separation system for identifying data elements within a data set that have common features, and separating those data elements from other data elements not sharing such common features. Commonalities relative to methods and/or rates of change within a data set may be used to determine which elements share common features. Determining the commonalities may be performed autonomously by referencing data elements within the data set, and need not be matched against algorithmic or predetermined definitions. Interpreted and separated data may be used to reconstruct an output that includes only separated data. Such reconstruction may be non-destructive. Interpreted and separated data may also be used to retroactively build on existing element sets associated with a particular source.	09-06-2012
20120226692	SYSTEM AND METHOD FOR MATCHING AND ASSEMBLING RECORDS - A system and method for matching and assembling records is provided. One embodiment of the invention assembles records by applying a method for grouping records based on matching fields, assembling a new record as a composite of the matched records, and then repeating the grouping, matching and assembly steps in a cascade where the matching, grouping and assembly steps are modified as a function of the cascade step and the assembled records created in earlier steps. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules that allow a reader to quickly ascertain the subject matter of the disclosure contained herein. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.	09-06-2012
20120226693	DYNAMIC SELECTION OF OPTIMAL GROUPING SEQUENCE AT RUNTIME FOR GROUPING SETS, ROLLUP AND CUBE OPERATIONS IN SQL QUERY PROCESSING - A method, apparatus, and article of manufacture for optimizing a query in a computer system. During compilation of the query, a GROUP BY clause with one or more GROUPING SETS, ROLLUP or CUBE operations is maintained in its original form until after query rewrite. The GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations is then translated into a plurality of levels having one or more grouping sets. After compilation of the query, a grouping sets sequence is dynamically determined for the GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations based on intermediate grouping sets, in order to optimize the grouping sets sequence. The execution of the grouping sets sequence is optimized by selecting a smallest grouping set from a previous one of the levels as an input to a grouping set on a next one of the levels. Finally, a UNION ALL operation is performed on the grouping sets.	09-06-2012
20120226694	Systems and Methods for Gathering and/or Presenting Information - The present invention provides systems and methods for presenting a quantity of information in a single tool. Such a tool includes a map of various objects, the objects having themes relating to a given overall concept, wherein at least one object contains information relating to other objects that have a relationship with that object.	09-06-2012
20120226695	CLASSIFYING DOCUMENTS ACCORDING TO READERSHIP - A system for classifying documents in a collection of documents according to their intended readerships includes: a computer configured to select a document in the collection of documents; and a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. A computer classifies the selected document as misleading, commercial, or personal according to its determined characteristic; and a computer repeats the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection. At least some documents are classified as misleading, some as commercial, and at least some as personal.	09-06-2012
20120233163	DETECTING APPLICATION SIMILARITY - The subject matter of this disclosure can be implemented in, among other things, a method. In these examples, the method includes selecting, for analysis by a computing device, an executable application, and identifying a group of application programming interfaces (APIs) utilized by the application when the application is executed. The method may also include identifying a group of related applications that are each related to the application based on the group of APIs utilized by the application, wherein each related application of the group of related applications utilizes one or more APIs of the group of APIs utilized by the application.	09-13-2012
20120233164	MUSIC CLASSIFICATION SYSTEM AND METHOD - The present invention identifies collections of digital music and sound that effectively elicit particular emotional responses as a function of analytical features from the audio signal and information concerning the background and preferences of the subject. The invention can change emotional classifications along with variations in the audio signal over time. Interacting with a listener, the invention locates music with desired emotional characteristics from a central repository, assembles these into an effective and engaging “playlist” (sequence of songs), and plays the music files in the calculated order to the listener.	09-13-2012
20120233165	DETECTING APPLICATION SIMILARITY - The subject matter of this disclosure can be implemented in, among other things, a method. In these examples, the method includes selecting, for analysis by a computing device, an executable application, and identifying a group of application programming interfaces (APIs) utilized by the application when the application is executed. The method may also include identifying a group of related applications that are each related to the application based on the group of APIs utilized by the application, wherein each related application of the group of related applications utilizes one or more APIs of the group of APIs utilized by the application.	09-13-2012
20120233166	TAG INFORMATION MANAGEMENT APPARATUS, TAG INFORMATION MANAGEMENT SYSTEM, CONTENT DATA MANAGEMENT PROGRAM, AND TAG INFORMATION MANAGEMENT METHOD - A content data management apparatus that manages tag data indicating attributes relating to content data, comprising: an extraction section that extracts positional information indicating geographic positions associated with the content data and time information indicating time points associated with the content data, the positional information and the time information being attached to the content data; a speed computation section that computes speeds associated with the content data, based on the positional information and the time information extracted by the extraction section; and a grouping section that groups the content data, based on the speeds computed by the speed computation section.	09-13-2012
20120233167	MEDIA ITEM CLUSTERING BASED ON SIMILARITY DATA - Methods and arrangements for facilitating generation of media mixes for a program participant based at least in part on media library inventory information provided by a number of program participants. Those individuals that decide to be program participants are interested in organizing, maintaining and playing their music, based at least in part, on data derived from a population of other participants in the program. A program participant must send, and the system receive, data representative of that program participant's media inventory. The system or program determines a relative similarity of each item from the collection of program participants as compared to each other item, and from the similarity information clusters of similar items are identified. The clusters can be used to identify clusters of similar items in an individual program participant's media library and therefrom mixes of similar media items can be created.	09-13-2012
20120239649	EXTENT VIRTUALIZATION - Files can be segmented into distinct groups and allocated storage units such as blocks. Files associated with parent and child files can be segmented into separate groups, for instance. Further, a group associated with parent files can be extended to include additional blocks reserved for subsequent update. Additionally, metadata can be merged across groups to provide a unified view of the distinct groups.	09-20-2012
20120239650	UNSUPERVISED MESSAGE CLUSTERING - Unsupervised clustering can be used for organization of micro-blog or other short length messages into message clusters. Messages can be compared with existing clusters to determine a similarity score. If at least one similarity score is greater than a threshold value, a message can be added to an existing message cluster. If a message is not similar to an existing cluster, the message can be compared against criteria for starting a new message cluster.	09-20-2012
20120239651	Data Collections on a Mobile Device - Data collections on a mobile device may be user-defined to include various types of objects including any combination of apps, contacts, email subscriptions, data feeds, and so on. A user interface associated with the data collection includes representations of the various objects associated with the data collection and representations of broadcast data received in association with the objects associated with the data collection.	09-20-2012
20120239652	Hardware Accelerated Application-Based Pattern Matching for Real Time Classification and Recording of Network Traffic - An indexing database utilizes a non-transitory storage medium. A pattern matching processing unit generates preclassification data for the network data packets utilizing pattern matching analysis. At least one processing unit implements a storage process that receives the network data packets, stores the network data packets in at least one of the slots, and transfers the network data packets to a packet capture repository when slots in a shared memory are full. A preclassification process requests from the pattern matching processing unit the preclassification data. An indexing process determines, based upon the preclassification data, whether to invoke or omit additional analysis of the network data packets, and performs at least one of aggregation, classification, or annotation of the network data packets in the shared memory to maintain one or more indices in the indexing database.	09-20-2012
20120239653	Machine Assisted Query Formulation - Architecture for completing search queries by using artificial intelligence based schemes to infer search intentions of users. Partial queries are completed dynamically in real time. Additionally, search aliasing can also be employed. Custom tuning can be performed based on at least query inputs in the form of text, graffiti, images, handwriting, voice, audio, and video signals. Natural language processing occurs, along with handwriting recognition and slang recognition. The system includes a classifier that receives a partial query as input, accesses a query database based on contents of the query input, and infers an intended search goal from query information stored on the query database. A query formulation engine receives search information associated with the intended search goal and generates a completed formal query for execution.	09-20-2012
20120239654	RELATED DOCUMENT SEARCH SYSTEM, DEVICE, METHOD AND PROGRAM - Provided is a related document search system which can provide supplementary information showing a related content together with a related document related to a predetermined document.	09-20-2012
20120246160	VARIABLE PAGE SIZING FOR IMPROVED PHYSICAL CLUSTERING - A data size characteristic of contents of a related unit of data to be written to a storage by an input/output module of a data storage application can be determined, and a storage page size consistent with the data size can be selected from a plurality of storage page sizes. The related unit of data can be assigned to a storage page having the selected storage page size, and the storage page can be passed to the input/output module so that the input/output module physically clusters the contents of the related unit of data when the input/output module writes the contents of the related unit of data to the storage. Related methods, systems, and articles of manufacture are also disclosed.	09-27-2012
20120246161	APPARATUS AND METHOD FOR RECOMMENDING INFORMATION, AND NON-TRANSITORY COMPUTER READABLE MEDIUM THEREOF - According to one embodiment, profile information of new user and items to be selected are inputted. Each item has an attribute value of a plurality of attributes. Profile information and preference information of a plurality of users are acquired. The preference information represents whether each user has selected each item. The plurality of users is classified into a plurality of clusters by the profile information and the preference information of the plurality of users. A parameter of each attribute of each cluster is calculated by the preference information of each cluster. A similar cluster to classify the new user is estimated from the plurality of clusters by the profile information of the new user. A preference degree of each item is calculated by the parameter of each attribute of the similar cluster and the attribute value of each item. An item to be recommended is decided by the preference degree.	09-27-2012
20120246162	METHOD AND DEVICE FOR GENERATING A SIMILAR MEANING TERM LIST AND SEARCH METHOD AND DEVICE USING THE SIMILAR MEANING TERM LIST - In a generation device, a term determiner, for reference terms and a similar meaning term that has similar meaning to any of the reference terms, determines if each of the reference terms and the similar meaning term are both included in a document data group. An extractor extracts a reference term and the similar meaning term of the reference term that were both determined to be included in the document data group. A priority determiner determines an output priority to the extracted similar meaning term on the basis of appearance of at least either of the similar meaning term and the reference term in the document data group. And a list generator generates the similar meaning term list in such a way that the extracted reference term, the similar meaning term of the extracted reference term, and the output priority are associated with one another.	09-27-2012
20120246163	HASH TABLE STORAGE AND SEARCH METHODS AND DEVICES - A hash table storage method includes: obtaining attribute information of at least two levels of hash tables; sequentially obtaining Key information from a received packet according to the attribute information of the at least two levels of hash tables; sequentially determining whether the Key information is stored in its corresponding hash table; and storing the Key information in its corresponding hash table if the Key information is not stored in its corresponding hash table.	09-27-2012
20120246164	HYPERBOLIC SMOOTHING CLUSTERING AND MINIMUM DISTANCE METHODS - The invention concerns four methodologies regarding the unsupervised clustering of a set of observations in multidimensional space, considering a defined number of clusters. The invention comprises a special procedure for calculating the minimum distance of a given point to a set of points in a multidimensional space, the main component of the first methodology.	09-27-2012
20120254171	Exploitation of Correlation Between Original and Desired Data Sequences During Run Generation - A computer executed method of exploiting correlations between original and desired data sequences during run generation comprises, with a processor, adding a number of data values from a data source to a first memory device, the first memory device defining a workspace, determining whether the data values within the workspace should be output in ascending or descending order for a number of runs, and writing a number of the data values as a run to a second memory device in the determined order.	10-04-2012
20120254172	APPARATUS, METHOD AND COMPUTER-READABLE STORAGE MEDIUMS FOR CLUSTERING AND RANKING A LIST OF MULTIMEDIA OBJECTS - An apparatus is provided that includes a processor and a memory storing executable instructions that in response to execution by the processor cause the apparatus to at least perform a number of functions. The apparatus is caused to direct presentation of a list for a plurality of patients and that is clustered by patient. The apparatus is caused to apply a keyword filter to identify a subset of the patient exams that match the keyword filter, and rank the respective exams by relevance to the keyword filter. The apparatus is caused to direct presentation of a filtered list of patient exams that is clustered by patient in the filtered list of patient exams. And for each patient having patient exams in the subset of the patient exams, the respective patient exams are in ranked order in the filtered list of patient exams according to the keyword filter.	10-04-2012
20120254173	GROUPING DATA - A computer-executed method for grouping data comprising, with a processor, generating a number of sorted runs from an unsorted input, storing the sorted runs in temporary storage, placing pages of data from the sorted runs, one at a time, into a portion of a buffer allocated to receive that page, and from the allocated portion of the buffer, merging each page of data, one at a time, into a number of aggregated records, the number of aggregated records also being stored in the buffer.	10-04-2012
20120254174	TIME-BASED DATA PARTITIONING - According to one embodiment, a file system (FS) of a storage system is partitioned into a plurality of FS partitions, where each FS partition stores segments of data files. In response to a request for writing a file to the storage system, the file is stored in a first of the FS partitions that is selected based on a time attribute of the file, such that files having similar time attributes are stored in an identical FS partition.	10-04-2012
20120254175	SYSTEM AND METHOD FOR OPTIMIZING DATA MIGRATION IN A PARTITIONED DATABASE - According to one aspect, provided is a horizontally scaled database architecture. Partitioning a database enables efficient distribution of data across a number of systems, reducing processing costs associated with multiple machines. According to some aspects, the partitioned database can be managed as a single source interface to handle client requests. Further, it is realized that by identifying and testing key properties, horizontal scaling architectures can be implemented and operated with minimal overhead. In one embodiment, databases can be partitioned in an order preserving manner such that the overhead associated with moving the data for a given partition can be minimized during management of the data and/or database. In one embodiment, split and migration operations prioritize zero cost partitions, thereby reducing the computational burden associated with managing a partitioned database.	10-04-2012
20120254176	SYSTEM AND METHOD FOR STREAK DISCOVERY AND PREDICTION - The disclosed embodiment relates to identifying performance regions in time-series data. An exemplary method comprises identifying, with a computing device, one or more streaks in the time-series data based on at least one streak parameter, ranking, with a computing device, the identified streaks based on at least one characteristic of the identified streaks, and predicting, with a computing device, a future occurrence of at least one streak based on the characteristics of the identified streaks. The steps of identifying and ranking may be carried out using at least one of a linear graph method, a statistical based approach, a curve-line intersection method, and a hypothesis-based method, and the step of predicting the future occurrence of at least one streak may comprise predicting at least one of how long a current streak will continue, when a current streak will end, and when a new streak will begin. The disclosed embodiment also relates to a system and computer-readable code that can be used to implement the exemplary methods.	10-04-2012
20120254177	Numbering System for Antecedents and Outcomes - A numbering system for antecedents and outcomes providing a method for numbering antecedents and outcomes that reveals underlying information of relationships. The numbering system for antecedents and outcomes utilizes a mathematical relationship and an antecedent's or outcome's existing characterizing information to assign a unique indexing number identifying each antecedent and outcome. In an antecedent numbering system, the unique indexing number is able to provide information about the contributor line number, the cohort, the combination of the preceding multiple antecedents, and the sequence number of the outcome. In an outcomes numbering system, the unique indexing number provides information about the sequence line number, the cohort, the combination of the antecedents, and the order number of the outcomes.	10-04-2012
20120254178	SYSTEM AND METHOD FOR PROCESSING AN SQL QUERY MADE AGAINST A RELATIONAL DATABASE - A system and method for processing an SQL query made against a relational database is disclosed. In one example embodiment, the method includes receiving the SQL query made against the relational database. Further, the received SQL query is parsed to obtain each operator and associated one or more operands and sequence of execution of the operators. Furthermore, a closure-friendly operator is dynamically generated for each operator and the associated one or more operands in the received SQL query. In addition, the dynamically generated closure-friendly operators are executed based on the obtained sequence of execution of the operators.	10-04-2012
20120254179	CLUSTERING CUSTOMERS - A computer implemented method for clustering customers includes receiving a source set of customer records, wherein each customer record represents one customer, and each customer record includes at least one data attribute, and each data attribute has an attribute value; pre-processing the source set of customer records to generate a pre-processed set of customer records; executing a clustering algorithm on the pre-processed set of customer records to group the pre-processed set of customer records into clusters of a pre-defined number. The pre-processing comprises: determining the type of a customer in the source set of customer records; using a type attribute value to indicate the type of the customer in its customer record; normalizing data attribute values and type attribute values; weighting to the data attribute values and the type attribute values respectively to obtain weighted attribute values of the data attribute and weighted attribute values of the type attribute.	10-04-2012
20120254180	INTELLIGENT IDENTIFICATION OF MULTIMEDIA CONTENT FOR SYNCHRONIZATION - An intelligent synchronization tool ensures access to desired content in a manner that automatically keeps the content current on the portable media device. A variation threshold or user-specified degree of content variation may be introduced among content downloaded to a user's mobile device to prevent the user from becoming bored. Furthermore, intelligent synchronization may automatically populate the portable media device with popular content to save a user time and/or use passive monitoring techniques to ascertain a user's preferences for subsequent population.	10-04-2012
20120254181	TEXT, CHARACTER ENCODING AND LANGUAGE RECOGNITION - A method is disclosed, for recognizing whether some electronic data is the digital representation of a piece of text and, if so, in which character encoding it has been encoded. A fingerprint is constructed from the data, wherein the fingerprint comprises, for each of a plurality of predetermined character encoding schemes, at least one confidence value, representing a confidence that the data was encoded using said character encoding scheme. The fingerprint also comprises a frequency value for each of a subset of byte values, each frequency value representing the frequency of occurrence of a respective byte value in the data. A statistical classification of the data is then performed based on the fingerprint.	10-04-2012
20120254182	PARTITIONING MANAGEMENT OF SYSTEM RESOURCES ACROSS MULTIPLE USERS - A resource group attribute is assigned to a storage resource object representing at least one of the plurality of storage resources in a system configuration of the computing storage environment. The resource group attribute includes a selectable value indicating a resource group object to which the storage resource object is associated. A resource group label is provided in the resource group object and is a string having no wildcards. A user resource scope is assigned to a user ID and a value of the user resource scope provides a mechanism to match to the resource group label. The user ID is authorized to perform one of creating, deleting, modifying, controlling, and managing storage resources with an association to a resource group.	10-04-2012
20120259849	DETERMINING FILE OWNERSHIP OF ACTIVE AND INACTIVE FILES BASED ON FILE ACCESS HISTORY - File management systems and methods are presented. In one embodiment, implementation of a method for determining the accurate ownership of a file within a data system includes: identifying a first plurality of access events for a file, wherein the file is associated with a directory of related files; identifying a second plurality of access events for the related files within the directory, wherein access events in the first and second plurality of access events occur within a period; determining a pool of users accessing files within the directory within the period; and selecting a user from the pool of users as an inferred owner of the file based on access metrics related to the plurality of access events.	10-11-2012
20120259850	EFFICIENT QUERY CLUSTERING USING MULTI-PARTITE GRAPHS - Efficient search query clustering using tripartite graphs may enable a search engine developer to model information needs of users while expending less computing resources. The efficient clustering of search queries may involve multiple computing devices receiving a subgraph of a multi-partite graph that encompasses search queries, as well as receiving a global center vector table that includes cluster center entries for query clusters. At each computing device, the received global center vector table may be filtered to eliminate one or more cluster center entries that are irrelevant to the search queries. Subsequently, the search queries may be clustered into the query clusters by at least using the filtered global center vector table at each of the computing devices. In some instances, one or more comparisons between search queries and the cluster center entries in the global center vector table during the clustering may be eliminated.	10-11-2012
20120259851	AGGREGATION OF CONVERSION PATHS UTILIZING USER INTERACTION GROUPING - Methods, systems, and apparatuses, including computer programs encoded on computer-readable media, for aggregating conversion paths utilizing user interaction grouping. In one aspect, information regarding a plurality of conversion paths is received. Each conversion path includes one or more user interactions that include a plurality of dimensional data. A sorted list of grouping definitions that includes one or more group rules is received and the conversion paths are converted into group paths based upon the one or more group rules. Each group path includes one or more group elements corresponding to each user interaction of a corresponding conversion path. The plurality of group paths are aggregated based upon the number and order of group elements within each group path. Information regarding the aggregated group paths can then be provided, for example, through a report.	10-11-2012
20120259852	METHOD AND APPARATUS FOR PUSHING SITUATIONALLY RELEVANT DATA - A computer-implemented method of providing users with contextually relevant data associates metadata tags with data items extracted from a variety of data sources that summarize the data items in searchable form using a common format. Contextual data is collected from the users indicative of their current situation. This data is then correlated with the metadata tags to identify data items of potential interest to the users taking into account their current situation. The identified data items are pushed to the relevant receiving devices in real time over a communications network to provide the identified users with information relevant to their current situation.	10-11-2012
20120265758	SYSTEM AND METHOD FOR GATHERING, FILTERING, AND DISPLAYING CONTENT CAPTURED AT AN EVENT - A method for generating event compilations during an event comprising: providing an event client designated to display event content captured at the event; identifying an event moderator to review event content captured by attendees of the event; receiving event content captured by one or more event attendees; transmitting the event content to the event moderator for review; receiving a response from the event moderator, the response indicating whether the event content is allowed or blocked; and displaying the event content from the event client at the event if the response from the moderator indicates that the event content is allowed.	10-18-2012
20120271826	DATA COLLECTING METHOD FOR DETECTION AND ON-TIME WARNING SYSTEM OF INDUSTRIAL PROCESS - A data collection method for a process margin monitoring system of industrial equipment includes preparing a learning data set based on data determined to be normal in an operation history of the industrial equipment so that the learning data set is sorted for each operation mode, in a case in which the industrial equipment includes equipment units performing the same functions, receiving data for each of the equipment units and processing the received data as data for the equipment units, sorting and grouping associated ones of the data in the learning data set, and sampling the collected data to reduce the amount of data.	10-25-2012
20120271827	METHODS AND SYSTEMS FOR IMPLEMENTING APPROXIMATE STRING MATCHING WITHIN A DATABASE - A computer-based method for character string matching of a candidate character string with a plurality of character string records stored in a database is described. The method includes performing a clustering operation on at least a portion of the plurality of character string records, the clustering operation generating a plurality of clusters, each cluster comprising a plurality of character strings from the plurality of character string records, the plurality of character strings in each cluster are determined to be similar with respect to each other based on at least one characteristic of the plurality of character strings. The method also includes generating a set of reference character strings that are selected from the plurality of character strings in each cluster, generating an n-gram representation for one of the reference character strings in the set of reference character strings, and generating an n-gram representation for the candidate character string.	10-25-2012
20120278322	Method, Apparatus and Program Product for Personalized Video Selection - Apparatus and program products in which image metadata identifying video objects in video data files is recorded; access to video data files by an end user of an access device is monitored; and a record of the concentration of personal video preferences is compiled from the monitored accesses and the image metadata of the video data files accessed by the end user. This record is then used to create summarizations of videos considered for prospective viewing by an end user and/or to select commercial messages such as advertisements to be delivered to the end user.	11-01-2012
20120278323	Joining Tables in a Mapreduce Procedure - Systems and techniques by which tables can be joined in a mapreduce procedure. In some implementations, when a large table of business data (e.g., having one billion transaction records or more) is to be joined with a large table of customer data (e.g., having hundreds of millions of customer records), then these two tables can be organized before the mapreduce procedure to speed up the table join. For example, the business data and the customer data can both be hash partitioned, based on the same key, into shards of business data and shards of customer data, respectively. The number of shards in these two groups has an integer relationship with each other: for example such that there are two business data shards for every customer data shard, or vice versa.	11-01-2012
20120278324	PARTICIPANT GROUPING FOR ENHANCED INTERACTIVE EXPERIENCE - Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.	11-01-2012
20120278325	Career Criminal and Habitual Violator (CCHV) Intelligence Tool - A computer implemented method, apparatus, and computer usable program product for ranking and categorizing criminal offenders in a jurisdiction. In one embodiment, external data associated with the offenders is processed in a set of data models to generate a ranking index of criminal offenders. The external data comprises offender data elements related to prior arrests. The computer software and web application enables officers, detectives, and supervisors to research the offenders in their jurisdiction. They can intentionally track and monitor the status of the offenders that are not currently incarcerated. They can deliberately increase lawful contacts with these high-rate and treacherous offenders.	11-01-2012
20120278326	Method to Dynamically Design and Configure Multimedia Fingerprint Databases - Techniques are provided for dynamic configuration of search parameters for multimedia fingerprint databases that use weak bits. A multimedia fingerprint database, which stores reference fingerprints and uses weak bits, is maintained. Maintaining the database includes dynamically configuring one or more of the following parameters: a fingerprint length of those portions of the reference fingerprints that are used to identify multimedia objects; an index length of the index used to index those portions of the reference fingerprints that are used to identify multimedia objects; a threshold that is used to determine whether multimedia objects are correctly identified; and a number of the weak bits in the reference fingerprints.	11-01-2012
20120278327	DOCUMENT ANALYSIS DEVICE, DOCUMENT ANALYSIS METHOD, AND COMPUTER READABLE RECORDING MEDIUM - A document analysis device (	11-01-2012
20120278328	DATA CLASSIFICATION METHODS AND APPARATUS FOR USE WITH DATA FUSION - Methods and apparatus for classifying data for use in data fusion processes are disclosed. An example method of classifying data selectively groups nodes of a classification tree so that each node is assigned to only one of a plurality of groups and so that at least one of the groups includes at least two of the nodes. Data is classified based on the classification tree and the selective grouping of the nodes, and the results displayed.	11-01-2012
20120284266	DYNAMICALLY DETERMINING THE RELATEDNESS OF WEB OBJECTS - A first cluster of web objects is identified from a click-through data structure. The click-through data structure can organize web objects into clusters based on query results of web objects selected by a user. Also, a second cluster of web objects can be identified from a metadata data structure. The metadata data structure can organize web objects into clusters based on metadata associated with the web objects. An output set of web objects is selected, in real time, from the identified clusters.	11-08-2012
20120284267	Item Randomization with Item Relational Dependencies - An embodiment of the invention provides a method and system for randomizing items within a storage device. A linkage module links a first linked item in the storage device to one or more second linked items in the storage device based on attributes of the items. The linking is performed without input from a user of the storage device. The second linked item has at least one attribute in common with the first linked item. A list generating module connected to the linkage module generates a list, wherein the list includes a random sequence of items. The generating of the list groups the first linked item and the second linked item(s) in the list such that there are no items between the first linked item and the second linked item(s) in the list.	11-08-2012
20120284268	SPACE-TIME-NODAL TYPE SIGNAL PROCESSING - Example methods, apparatuses, or articles of manufacture are disclosed that may be implemented using one or more computing devices or platforms to facilitate or otherwise support one or more processes or operations associated with a space-time-node engine signal processing.	11-08-2012
20120284269	HIERARCHICAL ANT CLUSTERING AND FORAGING - A clustering method yields a searchable hierarchy to speed retrieval, and can function dynamically with a changing document population. Nodes of the hierarchy climb up and down the emerging hierarchy based on locally sensed information. Like previous ant clustering algorithms, the inventive process is dynamic, decentralized, and anytime. Unlike them, it yields a hierarchical structure. For simplicity, and reflecting our initial application in the domain of textual information, the items being clustered are documents, but the principles may be applied to any collection of data items.	11-08-2012
20120284270	METHOD AND DEVICE TO DETECT SIMILAR DOCUMENTS - A method for detecting similar documents includes extracting an entity from each of a first web document and a second web document; determining an importance contribution element corresponding to each of the web documents; calculating, using the processor, weights for each entity based on the determined importance contribution elements; and determining whether the web documents are similar documents based on the calculated weights. A device to detect similar documents includes a storage device; an entity extractor stored on the storage device and configured to extract an entity from a first web document and a second web document and to determine an importance contribution element from each of the web documents; a weight calculator configured to calculate weights of each entity based on the determined importance contribution elements; and a similar document detection unit configured to determine whether the web documents are similar documents based on the calculated weights.	11-08-2012
20120284271	REQUIREMENT EXTRACTION SYSTEM, REQUIREMENT EXTRACTION METHOD AND REQUIREMENT EXTRACTION PROGRAM - Included are a candidate extraction unit	11-08-2012
20120284272	Automated Electronic Message Filing System - A sender selection is detected at a sender computer system within a user interface of at least one suggested folder name for a composed electronic message for a recipient receiving the electronic message to select as a folder name for filing the electronic message. The at least one suggested folder name is attached to the electronic message at the sender computer system for distribution to the recipient. The electronic message is sent with the suggested filing folder name from the sender computer system to a recipient, wherein a recipient receiving the electronic message receives the at least one suggested folder name specified by the sender in the electronic message for selecting a folder for filing the electronic message in a messaging filing directory for the recipient.	11-08-2012
20120284273	Automated Electronic Message Filing System - A recipient receives an electronic message from a sender, wherein said electronic message comprises at least one suggested folder name specified by the sender for the recipient to select as a folder name for filing the electronic message, wherein the at least one suggested folder name is detected by a sender computer system from a selection by the sender within a user interface of the sender computer system of the at least one suggested folder name for the electronic message and inserted into the electronic message. The electronic message is filtered to detect the at least one suggested folder name for filing the electronic message in a messaging filing directory. Responsive to the recipient selecting to file the electronic message, the electronic message is filed in at least one folder with the suggested folder name from among a plurality of folders.	11-08-2012
20120284274	METHOD AND DEVICE FOR SERVICE MANAGEMENT - A method and a device for service management are provided. The method includes: receiving a management instruction corresponding to a virtual linkage group, where the virtual linkage group is grouped according to a service requirement; determining an object of the virtual linkage group, where the object of the virtual linkage group is a group of processes, a group of components, a group of boards, or a group of frames satisfying the service requirement; and performing unified management on the object of the virtual linkage group according to the management instruction. By grouping a virtual linkage group according to a service requirement, when a management instruction corresponding to the virtual linkage group is received, effective service management is performed on the basis of satisfying the service requirement, and the flexibility of the service management is improved.	11-08-2012
20120290574	FINDING OPTIMIZED RELEVANCY GROUP KEY - Methods and apparatus filter out unused information in irrelevant patterns to find an optimized relevancy group key. Such an optimized key occupies a smaller mapping space and functions to identify relevancy groups while requiring fewer computations to perform thereby improving the overall speed and performance of the processing device.	11-15-2012
20120290575	MINING INTENT OF QUERIES FROM SEARCH LOG DATA - Architecture that mines intent of a query from search log data. For example, for a given query, the intent, the major URLs for the intent, and intent attributes, are found. The input is search log data and the output is a database that contains the intent of queries mined from the log data. Data mining techniques are employed to discover major intents of queries in the click-through log data of a search engine. For each query, its expanded queries are created and utilized, as well as co-clicks of the original query and expanded queries in the log data. For each query, clustering is performed on the co-click data of the query and expanded queries to find the major intents of the query.	11-15-2012
20120290576	Data Analysis System - A data analysis system for analyzing data from multiple devices has a database service module including a data storage subsystem storing data collected from different devices. The data is stored in a meta-structure using primitives to classify the data. An analysis engine analyzes the data to determine whether the data defined by the meta-structure meets certain criteria in accordance with a stored set of rules. The system is useful, for example, in the detection of faults in railway infrastructure.	11-15-2012
20120290577	IDENTIFYING VISUAL CONTEXTUAL SYNONYMS - Tools and techniques for identifying visual contextual synonyms are described herein. The described operations use visual words having similar contextual distributions as contextual synonyms to identify and describe visual objects that share semantic meaning. The contextual distribution of a visual word is described using the statistics of co-occurrence and spatial information averaged over image patches that share the visual word. In various implementations, the techniques are employed to construct a visual contextual synonym dictionary for a large visual vocabulary. In various implementations, the visual contextual synonym dictionary narrows the semantic gap for large-scale visual search.	11-15-2012
20120290578	FORENSIC SYSTEM, FORENSIC METHOD, AND FORENSIC PROGRAM - Embodiments of the inventive concept reduce the burden of creating litigant sources of evidence or other evidentiary materials in connection with litigation in a court of law. Designation of at least one document file included in digital document information is accepted and designation of a language into which the designated document file is translated is accepted. The document file, the designation of which is accepted, is translated into the language the designation of which is accepted. A common document file representing the same content as that of the designated document file is extracted from digital document information recorded in a recording unit. Translation-related information representing that the extracted common document file is translated by invoking a translated content of the translated document file is generated, and, based on the translation-related information, a litigant-related document file is output.	11-15-2012
20120290579	SERVER COMPUTER, COMPUTER SYSTEM, AND FILE MANAGEMENT METHOD - A server computer which determines the configuration of a file for configuring a plurality of virtual computers respectively is configured to comprise: an OS/AP file evaluation criteria table which stores evaluation criteria for judging whether to split and manage a file required for the configuration of the virtual computers; a user data evaluation criteria TBL; and a verification and splitting unit which judges whether the file conforms to the evaluation criteria, and determines a part of a file judged to conform to the evaluation criteria as a first file stored as an entity and determines the remaining part of the file as a second file for referencing an entity of a predetermined destination storage.	11-15-2012
20120290580	CLUSTERING CUSTOMERS - A computer implemented method for clustering customers includes receiving a source set of customer records, wherein each customer record represents one customer, and each customer record includes at least one data attribute, and each data attribute has an attribute value; pre-processing the source set of customer records to generate a pre-processed set of customer records; executing a clustering algorithm on the pre-processed set of customer records to group the pre-processed set of customer records into clusters of a pre-defined number. The pre-processing comprises: determining the type of a customer in the source set of customer records; using a type attribute value to indicate the type of the customer in its customer record; normalizing data attribute values and type attribute values; and weighting the data attribute values and the type attribute values, respectively, to obtain weighted attribute values of the data attribute and weighted attribute values of the type attribute.	11-15-2012
20120296900	ADAPTIVELY LEARNING A SIMILARITY MODEL - A method, system, and computer-readable storage medium for computing a representation of similarity among items in a set of items. Computing a representation of similarity among items may comprise generating a first similarity model that represents characteristics of the set of items, the characteristics being indicative of similarity among the items in the set of items. Additionally, computing the representation of similarity may comprise adaptively selecting a subset of the set of items for similarity evaluation based on the first similarity model, receiving a similarity evaluation for the adaptively-selected subset of items, and generating a second similarity model based on the first similarity model and the received similarity evaluation.	11-22-2012
20120296901	Information Management System - An information management system creates data structures based entirely on the content of source files, then compares these data structures to discover synergies and commonalities. In one embodiment, the system accepts a first collection of source files, and extracts text from each source file. The text is compared to tags in one or more dictionaries, which comprise hierarchical listing of tags. Tags matching the text are associated with each source file. The system then generates a virtual relational network in which each source file having matching tags is a node. Tags associated with two or more source files are links between the nodes. This virtual relational network may be compared with another virtual relational network to discover common nodes or links. Source files later added to a collection are massively linked by associating all tags from all source files with the newly added source file, and vice versa.	11-22-2012
20120296902	SYSTEM AND METHOD FOR IDENTIFYING THE PRINCIPAL DOCUMENTS IN A DOCUMENT SET	11-22-2012
20120296903	Methods And Systems For Eliminating Duplicate Events - Systems and methods for eliminating duplicate events are described. In one embodiment, an event is captured, wherein the event comprises a user interaction with an article on a client device and it is determined whether the event is a duplicate of a stored event.	11-22-2012
20120296904	GRID-BASED DATA CLUSTERING METHOD - A grid-based data clustering method is disclosed. A parameter setting step sets a grid parameter and a threshold parameter. A dividing step divides a space having a plurality of data points according to the grid parameter. A categorizing step determines whether a number of the data points contained in each grid is larger than or equal to a value of the threshold parameter. The grid is categorized as a valid grid if the number of the data points contained therein is larger than or equal to the value of the threshold parameter, and the grid is categorized as an invalid grid if the number of the data points contained therein is smaller than the value of the threshold parameter. The clustering step retrieves one of the valid grids. If the retrieved valid grid is not yet clustered, the clustering step performs horizontal and vertical searching/merging operations on the valid grid.	11-22-2012
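The steps named in entry 20120296904 (dividing, categorizing, clustering) can be illustrated with a minimal Python sketch. It is not the claimed method: the function name `grid_cluster`, the square cells of side `cell_size`, the breadth-first merge, and the `-1` label for points in invalid grids are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def grid_cluster(points, cell_size, threshold):
    """Cluster 2-D points by merging adjacent dense grid cells (illustrative only)."""
    # Dividing step: assign each point to a square cell of side `cell_size`.
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # Categorizing step: a cell is valid if it holds at least `threshold` points.
    valid = {cell for cell, pts in cells.items() if len(pts) >= threshold}

    # Clustering step: merge valid cells that touch horizontally or vertically.
    labels = {}
    next_label = 0
    for start in sorted(valid):
        if start in labels:
            continue  # already merged into an earlier cluster
        labels[start] = next_label
        queue = deque([start])
        while queue:
            i, j = queue.popleft()
            for neighbor in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if neighbor in valid and neighbor not in labels:
                    labels[neighbor] = next_label
                    queue.append(neighbor)
        next_label += 1

    # Assumption: points in invalid cells are reported with label -1.
    return {p: labels.get(cell, -1) for cell, pts in cells.items() for p in pts}
```

With `cell_size=1` and `threshold=2`, two neighboring dense cells end up in one cluster, while a cell holding a single point is left unclustered.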
20120296905	DENSITY-BASED DATA CLUSTERING METHOD - A density-based data clustering method executed by a computer system is disclosed. The method includes a setup step, a clustering step, an expansion step and a termination step. The setup step sets a radius and a threshold value. The clustering step defines a single cluster on a plurality of data points of a data set, and provides and adds a plurality of first boundary marks to a seed list as seeds. The expansion step expands the cluster from each seed of the seed list, and provides and adds at least one second boundary mark to the seed list as seeds. The termination step determines whether each of the data points is clustered, wherein the clustering step is re-performed if the determination is negative.	11-22-2012
20120296906	GRID-BASED DATA CLUSTERING METHOD - A grid-based data clustering method performed by a computer system includes a setup step, a dividing step, a categorizing step and an expanding/clustering step. The setup step sets a grid quantity and a threshold value. The dividing step divides a space containing a data set having a plurality of data points into a two-dimensional matrix. The matrix has a plurality of grids G(i,j) comprising a plurality of target sequences and a plurality of non-target sequences interlaced with the plurality of target sequences. The indices “i” and “j” of each grid G(i,j) represent the coordinate thereof. The categorizing step determines whether each of the grids is valid based on the threshold value. The expanding/clustering step respectively retrieves each of the grids of the target sequences, performs an expansion operation on each of the grids retrieved and clusters the plurality of grids G(i,j).	11-22-2012
20120296907	SPECTRAL CLUSTERING FOR MULTI-TYPE RELATIONAL DATA - A general model is provided which provides collective factorization on related matrices, for multi-type relational data clustering. The model is applicable to relational data with various structures. Under this model, a spectral relational clustering algorithm is provided to cluster multiple types of interrelated data objects simultaneously. The algorithm iteratively embeds each type of data objects into low dimensional spaces and benefits from the interactions among the hidden structures of different types of data objects.	11-22-2012
20120296908	APPARATUS AND METHOD FOR GENERATING A COLLECTION PROFILE AND FOR COMMUNICATING BASED ON THE COLLECTION PROFILE - An apparatus for generating a collection profile of a collection of different media data items has a feature extractor for extracting at least two different features describing a content of a media data item for a plurality of media data items of the collection, and a profile creator for creating the collection profile by combining the extracted features or weighted extracted features so that the collection profile represents a quantitative fingerprint of a content of the media data collection. This collection profile or music DNA can be used for transmitting information, which is based on this collection profile, to the entity itself or to a remote entity.	11-22-2012
20120296909	Method and Apparatus for Characterizing User Behavior Patterns from User Interaction History - An approach is provided for characterizing user behavior patterns. The behavior pattern platform receives a plurality of context records from a device. Next, the behavior pattern platform places one or more contexts from the context records. Then, the behavior pattern platform places the contexts into one or more context groups. Then, the behavior pattern platform receives interaction data from the device, associates the context groups with the interaction data, and determines a behavior pattern based, at least in part, on the association of the context groups and the interaction data.	11-22-2012
20120303618	Clustering-Based Resource Aggregation within a Data Center - Data representing capabilities of devices in a data center is aggregated on a cluster basis. Information representing capability attributes of devices in the data center is received. The information representing the capability attributes is analyzed to generate data that groups devices based on similarity of at least one capability attribute. Aggregation data is stored that represents the grouping of the devices based on similarity of the at least one capability attribute and identifies the devices in corresponding groups.	11-29-2012
20120303619	VIRTUAL SUB-METERING USING COMBINED CLASSIFIERS - A virtual sub-metering process using combined classifiers includes generating an electric power consumption signature database by receiving data from an electric power consumption measuring meter and auxiliary data from a building management system, and clustering the data. After the generation of the electric power consumption signature database, additional data from the electric power consumption measuring meter and the auxiliary data from the building management system is received, and this additional data is processed to generate a second steady-state load classification component. A second transient state component is extracted from the additional data. The steady-state load classification components and the transient shape components from the electric power consumption signature database, the second steady state load classification component, the second transient shape component, and control signals and status signals associated with the plurality of electric power consuming devices are correlated and combined.	11-29-2012
20120303620	METHOD OF CALCULATING CONNECTIVITY OF N-DIMENSIONAL SPACE - A method for calculating connectivity of a space having a number of objects using a computer system includes establishing a character matrix having a number of elements in accordance with the objects. A label matrix is established, and the character matrix is divided into blocks. Connectivity of the elements of each of the blocks is calculated, and then the elements are grouped into regions based on the values of the elements. Each of the regions is labeled and enhanced. Connectivity of the regions is calculated, and the regions are grouped into larger regions based on the values of the elements.	11-29-2012
20120303621	REAL-TIME ADAPTIVE BINNING THROUGH PARTITION MODIFICATION - In one embodiment, real-time adaptive binning may be performed through the modification of a set of partitions. More particularly, a set of partitions separating one or more bins from one another may be identified, each of the one or more bins having boundaries including a lower boundary and an upper boundary, wherein the boundaries of the one or more bins together define a contiguous range of data values capable of being stored in the one or more bins. A data value may be obtained and added to one of the one or more bins according to the boundaries of the one or more bins. It may be determined whether to modify the set of partitions. The set of partitions may be modified according to a result of the determining step.	11-29-2012
20120303622	Efficient Indexing of Documents with Similar Content - A computer system comprising one or more processors and memory groups a set of documents into a plurality of clusters. Each cluster includes one or more documents of the set of documents and a respective cluster of documents of the plurality of clusters includes respective cluster data corresponding to a plurality of documents including a first document and a second document. The computer system determines that the second document includes duplicate data that is duplicative of corresponding data in the first document, identifies a respective subset of the respective cluster data that excludes at least a subset of the duplicate data, and generates an index of the respective subset of the respective cluster data.	11-29-2012
20120310935	Integration and Combination of Random Sampling and Document Batching - Methods and systems of integrated batching and random sampling of documents for enhanced functionality and quality control, such as validation, within a document review process are provided herein. According to various embodiments, a batching request may be received and may include a population size that corresponds to a total amount of documents available for sampling. The batching request may also include an acceptable margin of error. A random sample size may be calculated based on the batching request, and then a subset of documents corresponding to the random sample size may be selected from the total amount of documents available for sampling. The subset of documents may be grouped into one or more batches, and the one or more batches may be assigned to one or more review nodes.	12-06-2012
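Entry 20120310935 derives a random sample size from a population size and an acceptable margin of error, then groups the sampled documents into batches. The abstract does not say which formula is used, so the Python sketch below substitutes Cochran's formula with a finite population correction, a standard choice. The names `sample_size` and `sample_into_batches`, the 95% confidence z-score of 1.96, and the worst-case proportion `p = 0.5` are assumptions, not details from the application.

```python
import math
import random


def sample_size(population, margin_of_error, z=1.96, p=0.5):
    """Cochran's sample size with finite population correction (assumed formula)."""
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2  # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)                # correct for finite population
    return min(population, math.ceil(n))                # never exceed the population


def sample_into_batches(doc_ids, margin_of_error, batch_size, seed=None):
    """Draw a random sample of documents and split it into review batches."""
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    chosen = rng.sample(doc_ids, sample_size(len(doc_ids), margin_of_error))
    return [chosen[i:i + batch_size] for i in range(0, len(chosen), batch_size)]
```

For 10,000 documents and a 5% margin of error this yields a sample of 370 documents, which `sample_into_batches` splits into four batches when `batch_size` is 100. Each batch could then be assigned to a review node.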
20120310936	METHOD FOR PROCESSING DUPLICATED DATA - A processing method for duplicated data includes the following steps. A stored file is partitioned into a plurality of raw tanks and a plurality of meta tanks, in which the raw tanks correspond to the meta tanks in a one to one manner, and each meta tank has a stored fingerprint value of the corresponding raw tank. A duplicated data determination request is received, in which the duplicated data determination request includes a requested fingerprint value. At least one of the meta tanks is read, and the requested fingerprint value is compared with the stored fingerprint value of the read meta tank. A referred counter value of the read meta tank is modified, and the modified meta tank is stored back, when the requested fingerprint value is the same as the stored fingerprint value of the read meta tank.	12-06-2012
20120310937	People Engine Optimization - Some embodiments promote website credibility and the optimization of websites for people by automatedly quantifying various elements of a website into component credibility scores. In some embodiments, a set of encoded credibility scoring rules are used to compute each of the component credibility scores, wherein the credibility scoring rules are derived based on factors that have been identified by a grouping of people that preferably represent a primary demographic of users that consume the content of a particular classified type of website. In some such embodiments, the credibility scoring rules are derived from commonality that is identified from a sample set of known credible and/or non-credible websites of a particular classification. Once the credibility scoring rules are defined, the system applies the rules to other websites having the same classification as those from which the rules are derived to automatically generate credibility scores for the other websites.	12-06-2012
20120310938	INFORMATION ORGANIZING SYSTEM AND INFORMATION ORGANIZING METHOD - An information organizing system includes a reference information database storing reference information, a generalized expression unit to map measurement data and non-measurement data in a space in such a manner that the more they resemble each other, the shorter a distance between them becomes, an extended reference database in which the reference information is expressed in an extended manner by using the generalized expression unit, extended log data in which log data is expressed in an extended manner by using the generalized expression unit, a relevance detection unit to detect extended reference information having high relevance with the extended log data, and a template creation unit to create a predetermined template in which the log data is summarized by using the detected extended reference information.	12-06-2012
20120317112	OPERATION LOG MANAGEMENT SYSTEM AND OPERATION LOG MANAGEMENT METHOD - In an example of operation log management system, a storage device stores a plurality of operation log records obtained from an operation log in a client computer. The plurality of operation log records each contains an operation type of a corresponding operation and a group identifier for identifying a group to which the corresponding operation belongs. Each of at least a part of the plurality of operation log records contains at least one of identifiers of input data and output data of a corresponding operation. A processor groups the plurality of operation log records into groups by the group identifiers, identifies operation log records which belong to different groups and whose output data identifier and input data identifier match, and associates the different groups to which the identified operation log records belong as components of one integrated group. A display device displays information representing the integrated group.	12-13-2012
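The grouping described in entry 20120317112 associates groups whose records share a data identifier, one as output and another as input. A union-find structure is one way to sketch that in Python. The record layout (dictionaries with `group`, `input`, and `output` keys) and the function name `integrate_groups` are hypothetical, and the application may implement the association differently.

```python
def integrate_groups(records):
    """Merge groups linked by a matching output/input data identifier."""
    parent = {}

    def find(group):
        parent.setdefault(group, group)
        while parent[group] != group:
            parent[group] = parent[parent[group]]  # path halving
            group = parent[group]
        return group

    # Index which groups produced each output identifier.
    producers = {}
    for rec in records:
        find(rec["group"])  # register every group, even unlinked ones
        if rec.get("output") is not None:
            producers.setdefault(rec["output"], set()).add(rec["group"])

    # Link a consuming group to every group that produced its input.
    for rec in records:
        for producer in producers.get(rec.get("input"), ()):
            root_a, root_b = find(producer), find(rec["group"])
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the members of each integrated group.
    merged = {}
    for group in list(parent):
        merged.setdefault(find(group), set()).add(group)
    return sorted(sorted(members) for members in merged.values())
```

If group `g1` writes file `f1`, group `g2` reads `f1` and writes `f2`, and group `g3` reads `f2`, all three fall into one integrated group, while a group with no matching identifier stays on its own.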
20120317113	COMPUTING DEVICE, STORAGE MEDIUM, AND METHOD FOR PROCESSING BILL OF MATERIAL OF ELECTRONIC PRODUCTS - In a method for processing a bill of material (BOM) of an electronic product using a computing device, each electronic component of the electronic product is identified according to a function description of the electronic component, and an identification code for each of the electronic components is generated. The method creates a functional BOM of the electronic components according to characteristic data of the electronic components, and classifies the electronic components into different groups of the functional BOM according to the identification code of each of the electronic components. The method further counts a total number of the electronic components in each of the classified groups, generates assembly information of the electronic product according to the functional BOM, and displays the assembly information of the electronic product on a display device that is connected to the computing device.	12-13-2012
20120317114	INFORMATION PROCESSING DEVICE, METHOD, AND COMPUTER PROGRAM PRODUCT - An information processing device, method and computer program product use a candidate attribute area identification portion configured to receive information associated with a cluster including at least one data item. The candidate attribute area identification portion identifies at least one named attribute area for each of the at least one data item. A relatedness assessment portion assesses a relatedness between the cluster and each of the at least one named attribute area. A cluster name generation portion generates a cluster name based on the assessed relatedness, wherein said cluster name includes at least a part of one of the at least one named attribute area.	12-13-2012
20120317115	INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, PROGRAM, AND INFORMATION PROCESSING SYSTEM - There is provided an information processing device including an event cluster creation unit configured to create an event cluster including, among a plurality of types of content, reference content serving as a reference and related content, the related content having a different type from the reference content and indicating the same event as the reference content, and a meta information appending unit configured to create meta information about the event on the basis of the event cluster and append the meta information to the event cluster.	12-13-2012
20120317116	APPARATUS AND METHOD FOR MANAGING SYSTEMS EACH INCLUDING A PLURALITY OF CONFIGURATION ITEMS - An apparatus generates configuration group information by classifying, based on first log information storing messages outputted by a first plurality of configuration items of a first system, into first configuration groups each including one or more configuration items that have outputted messages having a commonality. The apparatus generates relation class information that defines, in association with the first configuration groups, first one or more message propagation relations. The apparatus classifies a second plurality of configuration items of a second system into second configuration groups included in the first configuration groups, based on the configuration group information and second log information storing messages outputted by the second plurality of configuration items, and applies second one or more message propagation relations that are associated, by the relation class information, with third configuration groups included in the second configuration groups, to the second plurality of configuration items.	12-13-2012
20120317117	Information Visualization System - Provided is an information visualization system that can present information most suitable for the sensitivity and interest of a user. The information visualization system according to the present invention uses interest degrees of the user for items to calculate relevance between the items and generates an item map reflecting the relevance as coordinate values of the items.	12-13-2012
20120317118	DATABASE DATA DICTIONARY - Systems and methods are provided for manipulating data sets. In accordance with one implementation, a computerized system is provided for storing, managing, indexing, interrelating, and/or retrieving data sets in a manner independent of the data model. The system includes an element module configured to store and uniquely identify elements and an element relation module configured to store relationships between the elements in the element module. The computerized system may also comprise a class module configured to store attributes of elements in a class and a type definition module configured to define the class and the attributes related to the class. The computerized system may further comprise a state machine module, the state machine module including a state machine transition module and a status module.	12-13-2012
20120317119	Product Line Type Development Supporting Device - Product line type development is supported by analyzing inter-feature dependency relations based on the use state of features in existing products and using the analysis result.	12-13-2012
20120323915	INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM - There is provided an information processing device including a display control unit configured to display pieces of content at a first position of a screen, a condition setting unit configured to set a clustering condition for the pieces of content in accordance with a user operation, and a clustering unit configured to classify the pieces of content into a cluster in accordance with the clustering condition. The display control unit moves a display of the pieces of content from the first position toward a second position corresponding to the cluster.	12-20-2012
20120323916	METHOD AND SYSTEM FOR DOCUMENT CLUSTERING - A method and system for document clustering. The method includes: extracting text feature information of the documents, establishing a social network based on information related to the documents, performing graph clustering based on the social network to obtain a structural sub-set, extracting structural feature information of the structural sub-set, and performing clustering on the documents based on the text feature information and the structural feature information.	12-20-2012
20120323917	NAVIGATING MEDIA CONTENT BY GROUPS - Grouping media files via playlists on a computer-readable medium. One or more media files are selected according to a grouping criterion to define one or more playlists from the media files. A container group is associated with the playlists and stores values identifying each of the playlists associated with the container group along with references to each of the playlists.	12-20-2012
20120323918	METHOD AND SYSTEM FOR DOCUMENT CLUSTERING - A method and system for document clustering. The method includes: extracting text feature information of the documents, establishing a social network based on information related to the documents, performing graph clustering based on the social network to obtain a structural sub-set, extracting structural feature information of the structural sub-set, and performing clustering on the documents based on the text feature information and the structural feature information.	12-20-2012
20120330952	SCALABLE METADATA EXTRACTION FOR VIDEO SEARCH - Video entity templates defining common features that relate to various metadata types shared among a group of video Web pages are generated for target Web sites. Metadata associated with videos contained within Web pages belonging to a particular target Web site can then be automatically and accurately extracted using a video entity template generated for the particular target Web site. This metadata can then be indexed for use by video search applications in providing video search results.	12-27-2012
20120330953	DOCUMENT TAXONOMY GENERATION FROM TAG DATA USING USER GROUPINGS OF TAGS - Embodiments of the invention provide a novel and non-obvious method, system and computer program product for generating a document taxonomy based upon tag data in groupings of tags. In an embodiment of the invention, a method for generating a document taxonomy based upon tag data in groupings of tags has been claimed. The method includes retrieving into memory of a host computer different groupings of tags for correspondingly different documents providing a bottom-up view of the documents. The method further includes deriving a folksonomy from the groupings of tags for the documents and organizing the folksonomy into a hierarchy of nodes. Of note, each of the nodes can be associated with a different subject in the folksonomy. Finally, the method includes publishing the hierarchy of nodes as a taxonomy for the documents to provide a top-down view of the documents.	12-27-2012
20120330954	System And Method For Implementing A Scalable Data Storage Service - A system that implements a scalable data storage service may maintain tables in a non-relational data store on behalf of clients. The system may provide a Web services interface through which service requests are received, and an API usable to request that a table be created, deleted, or described; that an item be stored, retrieved, deleted, or its attributes modified; or that a table be queried (or scanned) with filtered items and/or their attributes returned. An asynchronous workflow may be invoked to create or delete a table. Items stored in tables may be partitioned and indexed using a simple or composite primary key. The system may not impose pre-defined limits on table size, and may employ a flexible schema. The service may provide a best-effort or committed throughput model. The system may automatically scale and/or re-partition tables in response to detecting workload changes, node failures, or other conditions or anomalies.	12-27-2012
20120330955	DOCUMENT SIMILARITY CALCULATION DEVICE - A document similarity calculation device, configured to calculate a similarity indicating a degree of how much a plurality of documents are similar, includes: an associative word group storage portion for storing an associative word group composed of words associated with one another, a word-in-document frequency matrix generation portion for generating a matrix of word frequency in document which is a matrix each element of which is the frequency of a word present in a document with respect to each combination of the word and the document, a word-in-document frequency matrix transformation portion for transforming the generated matrix of word frequency in document based on the stored associative word group so as to reduce the number of dimensions of the matrix of word frequency in document, and a similarity calculation portion for calculating the similarity based on the transformed matrix of word frequency in document.	12-27-2012
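The idea in entry 20120330955 is to count word frequencies per document, collapse associated words into shared dimensions, and compute similarity on the reduced representation. A short Python sketch follows. The whitespace tokenization, the cosine measure, and the names `reduced_vector`, `cosine`, and `word_groups` are assumptions; the abstract does not specify the similarity measure or how the matrix is transformed.

```python
import math
from collections import Counter


def reduced_vector(text, word_groups):
    """Count word frequencies, folding each associated word into its group label."""
    word_to_group = {w: name for name, words in word_groups.items() for w in words}
    return Counter(word_to_group.get(w, w) for w in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical associative word group: both words map to one dimension.
groups = {"vehicle": {"car", "automobile"}}
raw = cosine(reduced_vector("the car stopped", {}),
             reduced_vector("the automobile stopped", {}))
merged = cosine(reduced_vector("the car stopped", groups),
                reduced_vector("the automobile stopped", groups))
```

Here `merged` comes out higher than `raw`, because "car" and "automobile" occupy the same dimension once the group is applied.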
20120330956	SYSTEM AND METHOD FOR PRESENTING USER GENERATED GEO-LOCATED OBJECTS - A system and method for generating a virtual tour on a display device is described. The method comprises providing at least one map. The method further comprises providing a plurality of sequenced images, wherein each of the images is associated with at least one location by a geo-coding module configured to generate a geo-location object data sheet that associates sequential images with a corresponding location. The sequenced images are organized based on the location of each of the sequenced images and displayed on the map. The method is implemented by the system.	12-27-2012
20120330957	INFORMATION PROCESSING METHOD FOR DETERMINING WEIGHT OF EACH FEATURE IN SUBJECTIVE HIERARCHICAL CLUSTERING - An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.	12-27-2012
20130006986	Automatic Classification of Electronic Content Into Projects - Automatically classifying content into a given project workspace is provided. New electronic mail items, documents, meeting requests, tasks, calendar items, and the like are automatically classified into a project workspace. Thus, a user is not required to engage in a time-consuming task of identifying, collecting, and associating such content with a given project workspace. In addition, feedback may be provided to the user on the quality of automatic assignments of content items to the desired workspace for editing content associated with the desired workspace and for improving the automatic classification process.	01-03-2013
20130006987	EVENT PROCESSING BASED ON META-RELATIONSHIP DEFINITION - According to an example implementation, a non-transitory computer-readable storage medium is provided that includes computer-readable instructions stored thereon that, when executed, are configured to cause a processor to at least: store a relationship definition including one or more selectors identifying events participating in the relationship and one or more constraints between the events, at least one of the constraints expressed in terms of one or more relationship parameters. The instructions further cause the processor to receive one or more events, evaluate the received events against the one or more selectors, create a candidate relationship when the relationship parameters have been defined based on receiving one or more events that match one or more of the selectors, and convert the candidate relationship to a relationship instance when a minimum number of events matching each of the selectors are received.	01-03-2013
20130006988	PARALLELIZATION OF LARGE SCALE DATA CLUSTERING ANALYTICS - A cluster selector may determine a plurality of sample clusters, and may reproduce the plurality of sample clusters at each of a plurality of processing cores. A sample divider may divide a plurality of samples stored in a database with associated attributes into a number of sample subsets corresponding to a number of the plurality of processing cores, and may associate each of the number of sample subsets with a corresponding one of the plurality of processing cores. A joint operator may perform a comparison of each sample of each sample subset at each corresponding core of the plurality of processing cores with respect to each of the plurality of sample clusters reproduced at the corresponding processing core, based on associated attributes thereof.	01-03-2013
20130006989	Search Method for a Containment-Aware Discovery Service - In general, methods and apparatus, including computer program products, implementing and using techniques for providing a discovery service in a unique identifier network are described. Said discovery service is suitable for tracking and tracing a query item represented by a unique identifier in a unique identifier network. In particular, a search method for a containment-aware discovery service is described.	01-03-2013
20130006990	ENHANCING CLUSTER ANALYSIS USING DOCUMENT METADATA - A search query including search criteria can be received. The search criteria can be a text string. An enhanced search against an enhanced index can be executed. The enhanced index can be metadata associated with an enhanced cluster. The enhanced cluster can be a document cluster associated with the metadata. The enhanced cluster can be aggregated into a merged document. The merged document can be a document including the enhanced cluster contents. The ranking algorithm can be executed on the merged document to obtain a final ranking of content within the single document.	01-03-2013
20130006991	INFORMATION PROCESSING APPARATUS, METHOD AND PROGRAM FOR DETERMINING WEIGHT OF EACH FEATURE IN SUBJECTIVE HIERARCHICAL CLUSTERING - An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.	01-03-2013
20130006992	Adapting Data Quality Rules Based Upon User Application Requirements - During application of data quality rules to a data set obtained from a data source, data is retrieved from the data source along with a common set of rules configured to format the retrieved data in a manner in accordance with one or more predefined data quality rules of the common set of rules. At least one predefined data quality rule is adjusted utilizing at least one editable widget to form a modified set of data quality rules adapted for use with a specified application. The modified set of data quality rules is applied to the retrieved data.	01-03-2013
20130006993	PARALLEL DATA PROCESSING SYSTEM, PARALLEL DATA PROCESSING METHOD AND PROGRAM - A parallel data processing system comprises: a unit of processing that generates, reads out or updates an object or relevant information on objects; a consistency controller that returns to the unit of processing a consistency value for an object within each object cluster; and an object to cluster association resolving unit that receives an identifier of an object to return an identifier of a consistency controller for an object cluster including the object, wherein, in generating, reading out or updating an object or relevant information on objects, the unit of processing acquires an identifier of a consistency controller for an object cluster including the object from the object to cluster association resolving unit, and performs consistency control based on the consistency controller, while the unit of processing accesses an object storage unit.	01-03-2013
20130006994	SYSTEM FOR ORGANIZING COMPUTER DATA - A system for organizing computer data by the use of naming rules, grouping rules, and sequencing rules. These rules name and sort data in a consistent and convenient manner, which can, in part or whole, be employed by a human, computer, or both.	01-03-2013
20130013599	Identifying a candidate part of a map to be updated - A method for identifying a candidate part of a map to be updated. The method comprises receiving position data relating to a plurality of reroute points and determining one or more clusters of reroute points based on the position data. The method further comprises determining cluster features, determining a weight for each of the clusters and generating reroute cluster position data which is transmitted in a last step of the method. By determining clusters of reroute points and corresponding weights, a candidate part of a map to be updated may be identified in an efficient way. A corresponding device and computer program product are also provided.	01-10-2013
20130013600	ATTRIBUTE CHANGE COALESCING IN ORDER TO DEAL WITH COMPONENT MOVES ON A PAGE - Embodiments of the invention provide for applying multiple attribute changes to components of a dataset. According to one embodiment, coalescing changes can comprise reading a definition of the dataset. For example, the definition can comprise an identity and a context for each of the plurality of components. A component tree can be generated representing the data set and based on the context and identity. An indication of one or more changes to the components of the data set can be received and the changes can be classified based on a type of each of the changes. For example, the type of the changes can comprise one or more of a single component change, a cross-component change, and a cross-component change that affects the identity of at least one of the components. The changes can be coalesced based on the type of the changes.	01-10-2013
20130013601	SYSTEM AND METHOD FOR GROUPING OF USERS INTO OVERLAPPING CLUSTERS IN SOCIAL NETWORKS - Members of a social network user's social graph are automatically segregated into overlapping clusters according to patterns of their past communications. Each cluster within the social graph represents a group of members having a high degree of intra-cluster communication or other connection with one another. The clustering is performed according to a sorting or ranking in accordance with non-principal eigenvectors of connectivity matrices describing the intra-cluster communications/connections. The overlapping clusters exhibit maximum internal density and minimum external sparsity.	01-10-2013
20130013602	DATABASE SYSTEM - Operating a database system comprises: storing a database table comprising a plurality of rows, each row comprising a key value and one or more attributes; storing a primary index for the database table, the primary index comprising a plurality of leaf nodes, each leaf node comprising one or more key values and respective memory addresses, each memory address defining the storage location of the respective key value; creating a new leaf node comprising one or more key values and respective memory addresses; performing a memory allocation analysis based upon the lowest key value of the new leaf node to identify a non-full memory page storing a leaf node whose lowest key value is similar to the lowest key value of the new leaf node; and storing the new leaf node in the identified non-full memory page.	01-10-2013
20130013603	SEMIOTIC INDEXING OF DIGITAL RESOURCES - A method of classifying a plurality of documents. The method includes steps of providing a first set of classification terms and a second set of classification terms, the second set of classification terms being different from the first set of classification terms; generating a first frequency array of a number of occurrences of each term from the first set of classification terms in each document; generating a second frequency array of a number of occurrences of each term from the second set of classification terms in each document; generating a first similarity matrix from the first frequency array; generating a second similarity matrix from the second frequency array; determining an entrywise combination of the first similarity matrix and the second similarity matrix; and clustering the plurality of documents based on the result of the entrywise combination.	01-10-2013
20130013604	Method and System for Making Document Module - A document module can be automatically extracted from a plurality of documents, and a document module database can be created.	01-10-2013
20130013605	Managing Storage of Data for Range-Based Searching - In general, a value of a numerical attribute of a record stored in a data structure is received. A numerical range is generated that includes the value of the numerical attribute. An entry is stored, in an index associated with the data structure, that specifies a location of the record within the data structure and that includes a first index key and a second index key. The first index key corresponds to a value of an attribute of the record different from the numerical attribute, and the second index key corresponds to the generated numerical range.	01-10-2013
20130013606	Managing Storage of Data for Range-Based Searching - In general, a value of a numerical attribute of a record stored in a data structure is received. A numerical range is generated that includes the value of the numerical attribute. An entry is stored, in an index associated with the data structure, that specifies a location of the record within the data structure and that includes a first index key and a second index key. The first index key corresponds to a value of an attribute of the record different from the numerical attribute, and the second index key corresponds to the generated numerical range.	01-10-2013
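The range-based indexing described in the two entries above can be pictured with a minimal sketch. Everything named here (`range_for`, `RangeIndex`, the fixed bucket `width`) is a hypothetical illustration and is not taken from the applications: each index entry maps a first key (some other attribute of the record) and a second key (a numerical range containing the value) to the record's location.

```python
from collections import defaultdict


def range_for(value, width=10):
    """Return the half-open range [low, low + width) that contains value."""
    low = (value // width) * width
    return (low, low + width)


class RangeIndex:
    """Hypothetical index keyed by (other attribute, numerical range)."""

    def __init__(self, width=10):
        self.width = width
        # (first_key, second_key) -> list of record locations
        self.entries = defaultdict(list)

    def add(self, location, other_attr, numeric_value):
        # First key: a non-numerical attribute; second key: the generated range.
        key = (other_attr, range_for(numeric_value, self.width))
        self.entries[key].append(location)

    def lookup(self, other_attr, low, high):
        """Return locations of candidate records whose range overlaps [low, high)."""
        hits = []
        bucket = (low // self.width) * self.width
        while bucket < high:
            key = (other_attr, (bucket, bucket + self.width))
            hits.extend(self.entries.get(key, []))
            bucket += self.width
        return hits
```

A lookup returns candidates only; because whole ranges are matched, a caller would still compare each candidate's actual value against the query bounds.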
20130013607	SYSTEMS, METHODS, AND MEDIA FOR CORRELATING OBJECTS ACCORDING TO RELATIONSHIPS - Systems, methods, and media for correlating objects according to relationships are provided herein. According to some embodiments, methods may include the steps of for each object in a database, determining a static weight, the static weight representing a number of relational connections between each object and one or more connected entities, setting a delta weight for each object, the delta weight being equal to the static weight, determining which object in the database comprises a highest delta weight, propagating the highest delta weight of the object to each of the connected entities, adding the highest delta weight to a static weight and a delta weight for each of the connected entities, setting the delta weight for the object to zero, wherein the method terminates upon determining that a highest delta weight for at least one object is below a threshold value.	01-10-2013
20130013608	GENERATING A TAXONOMY FOR DOCUMENTS FROM TAG DATA - Tags on documents are clustered using tag weightings of the tags on the documents. Each cluster includes an identified subject. The identified subjects are compared to identify relationships between the identified subjects. A taxonomy of subjects is built using the identified relationships between the identified subjects programmatically without user intervention.	01-10-2013
20130013609	SYSTEMS AND METHODS FOR CLASSIFYING AND TRANSFERRING INFORMATION IN A STORAGE NETWORK - Systems and methods for data classification to facilitate and improve data management within an enterprise are described. The disclosed systems and methods evaluate and define data management operations based on data characteristics rather than data location, among other things. Also provided are methods for generating a data structure of metadata that describes system data and storage operations. This data structure may be consulted to determine changes in system data rather than scanning the data files themselves.	01-10-2013
20130013610	ALLOCATING AND MANAGING RANDOM IDENTIFIERS USING A SHARED INDEX SET ACROSS PRODUCTS - Provided are techniques for selecting row identifiers from an initial index structure storing rows of randomized indexes. The row identifiers are randomized. Groups are formed with the randomized row identifiers so that each group has a predetermined number of row identifiers. At least one group is selected from the groups. Indexes are retrieved from the initial index structure that correspond to the row identifiers in the selected at least one group. The retrieved indexes are encoded by adding product information to form new identifiers.	01-10-2013
20130013611	GROUPING APPARATUS, COMPUTER-READABLE RECORDING MEDIUM, AND GROUPING METHOD - A grouping apparatus (	01-10-2013
20130018884	Systems and Methods for Providing a Content Item Database and Identifying Content Items - Systems and methods are provided for identifying unsolicited or unwanted electronic communications, such as spam. The disclosed embodiments also encompass systems and methods for selecting content items from a content item database. Consistent with certain embodiments, computer-implemented systems and methods may use a clustering based statistical content matching anti-spam algorithm to identify and filter spam. Such an anti-spam algorithm may be implemented to determine a degree of similarity between an incoming e-mail and a collection of one or more spam e-mails stored in a database. If the degree of similarity exceeds a predetermined threshold, the incoming e-mail may be classified as spam. Further, in accordance with other embodiments, systems and methods may be provided to determine a degree of similarity between a query or search string from a user and content items stored in a database. If the degree of similarity exceeds a predetermined threshold, the content item from the database may be identified as a content item that matches the query or search string provided by the user.	01-17-2013
20130018885	INDICATING STATES IN A TELEMATIC SYSTEM - A status management system includes a computer-implemented method for delivering status information to a requester, comprising providing status codes, clustering the status codes in a number of status codes clusters, hierarchically sorting the status codes clusters and transmitting at least one of the status codes to the requester depending on the hierarchy of the sorted status codes clusters.	01-17-2013
20130018886	EFFECT MEASUREMENT DEVICE, EFFECT MEASUREMENT METHOD, AND EFFECT MEASUREMENT PROGRAM - Segment group generating means	01-17-2013
20130024452	SYSTEM AND METHOD FOR MANAGING PROJECTS - A method and system for managing a project. The method and system comprise accepting at least two project templates from a database, wherein the project database contains personal project templates and work project templates categorized by type of project. A start date and/or an end date for each project template may be accepted. Information related to each project template may be automatically generated. The information related to all project templates may be aggregated and a user may access the information related to all project templates from one user interface.	01-24-2013
20130024453	CONTEXT SYSTEM - The present invention provides an extended context map that can be used in the prioritisation of content which is displayed to a user. The extended context map contains a hierarchical arrangement of contexts, which are associated with a hierarchical arrangement of topics and sub-topics.	01-24-2013
20130031092	METHOD AND APPARATUS FOR COMPRESSING GENETIC DATA - A method of compressing sequence data in a text-based format, the method involving parsing text of the sequence data into a plurality of fields, identifying encoding algorithms that achieve greatest compression gains with respect to the plurality of fields based on collected statistics, and generating a bitstream, compressed from the sequence data, by encoding the sequence data using the identified encoding algorithms.	01-31-2013
20130031093	INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, PROGRAM, AND NON-TRANSITORY INFORMATION STORAGE MEDIUM - A difference in tendency of times associated with combinations of content clusters and user clusters among the user clusters is reflected in a result of correspondence between the content cluster and the user cluster. A data acquisition unit acquires association data indicating a combination of a content belonging to the content cluster, a user belonging to one of a plurality of user clusters, and a time relating to a combination of the content and the user. A dividing unit divides, under a condition that the tendency of the times associated with the users in the association data differs among the plurality of user clusters to which the users belong, the content cluster into a plurality of clusters each corresponding to at least one of the plurality of user clusters.	01-31-2013
20130031094	CLUSTERING OF FEEDBACK REPORTS - A computer-implemented method comprising: receiving a first report related to an application configured to run on one or more computing devices; identifying one or more terms included in the first report; identifying one or more second reports including at least one of the one or more terms; retrieving a term relevance value for a term included in at least one of the one or more second reports; determining that the term relevance value is less than a term relevance threshold value; identifying at least one of the one or more second reports for a clustering process, wherein at least one of the one or more second reports that include the term is excluded from the clustering process; implementing the clustering process using the identified at least one of the one or more second reports; and assigning the first report to a cluster.	01-31-2013
20130031095	ENTRY SUPPORT APPARATUS AND METHOD - A computer-readable recording medium has an entry support program embodied therein for causing a computer to perform detecting text being entered, extracting text examples corresponding to the detected text from a storage unit, the storage unit storing text examples and frequencies of use of the text examples such that the frequencies of use are associated with the respective text examples, classifying the extracted text examples into text-example groups each containing one or more text examples based on comparison of letters included in the extracted text examples, determining display order of the text-example groups, based on the frequencies of use that are associated in the storage unit with text examples belonging to the text-example groups, and displaying the extracted text examples in the determined display order.	01-31-2013
20130031096	SYSTEM AND METHOD FOR THE INTELLIGENT SUGGESTION AND EVALUATION OF CONTENT - A system and method for analyzing search sessions including recording a plurality of search sessions, the search sessions compiled into a plurality of search clusters based on a commonality of search terms used across sessions in each cluster, and calculating a rating for each of the plurality of search clusters based on the success of the search session in yielding usable search results.	01-31-2013
20130031097	SYSTEM AND METHOD FOR ASSIGNING SOURCE SENSITIVE SYNONYMS FOR SEARCH - A system and method for the content-sensitive tagging of a document including creating a content-specific domain on a database, creating a document associated with the content-specific domain, analyzing the document to identify a term, the term being associated with a content-sensitive synonym set assigned to the content-specific domain, and associating the document with a plurality of terms contained within said content-sensitive synonym set based on the term identified in the document.	01-31-2013
20130031098	MISMATCH DETECTION SYSTEM, METHOD, AND PROGRAM - A mismatch detection system includes: a statement unit extracting portion that extracts a set of statement units by dividing a given document, which is written in a natural language, into pieces; a statement constructing portion that constructs each statement as a combination of a context and specifics by sorting each of the statement units into the context, which indicates additional information of statements, and the specifics, which indicate information of the statements; and a data generating portion that generates a data set obtained by merging a set of predetermined check specifics and a set of the statements generated by the statement constructing portion. A clustering portion converts two most similar pieces of data in the generated data set into one new piece of data which is generated by linking the most similar two pieces of data, repeats the conversion to generate a new data set, and extracts, from the generated new data set, only pieces of data that contain the statements generated by the statement constructing portion, to thereby generate a clustering result set. The mismatch detection system further includes a detection portion that generates a check item for each combination of a predetermined check subject and check specifics, and detects a mismatch of the statements based on a degree of similarity between the generated check item and a clustering result.	01-31-2013
20130036119	Behavior Based Record Linkage - A computer implemented method for matching data records from multiple entities comprising providing respective transaction logs for the entities representing actions performed by or in respect of the entities, determining a matching score using the transaction logs for respective pairs of the entities and for predetermined combinations of merged entities by generating a measure representing a gain in behavior recognition for the entities before and after merging, and using the gain as a matching score.	02-07-2013
20130036120	FIELD-BASED SIMILARITY SEARCH SYSTEM AND METHOD - A field-based similarity search system includes an input device which inputs a query molecule, and a processor which partitions a conformational space of the query molecule into a fragment graph including an acyclic graph including plural fragment nodes connected by rotatable bond edges, computes a property field on fragment pairs of fragments of the query molecule from the fragment graph, the property field including a local approximation of a property field of the query molecule, constructs a set of features of the fragment pairs based on the property field, the features including a set of local, rotationally invariant, and moment-based descriptors generated from all conformations of the fragment graph of the query molecule, and weights the descriptors according to importance as perceived from a training set of descriptors to generate a context-adapted descriptor-to-key mapping which maps the set of descriptors to a set of feature keys.	02-07-2013
20130041900	Script Reuse and Duplicate Detection - A script repository (embodied, for instance, as a SQL server database) of re-usable scripts may be provided that houses scripts previously developed and/or executed as part of a prior testing effort. The script repository may be searched to view and/or download one or more test scripts based on one or more search criteria, and/or may be checked for duplicate scripts. A development team thus may be able to easily find appropriate scripts for re-use based on particular testing requirements. At the end of testing, the information on the portion (e.g., percentage) of scripts that were re-used by the developers may be collected and reported on a periodic basis.	02-14-2013
20130041901	NEWS FEED BY FILTER - Disclosed are electronic systems and techniques for filtering data comprising content requested from a variety of sources, and tracking which of the sources supplied the data and a category that classifies the content. The filtering may comprise implementing a variety of filtering techniques to acquire the content. In this regard, the content can be stored and grouped with other content related to the content in data storage, while discarding a subset of the data not containing the content.	02-14-2013
20130041902	TRAVEL SERVICES SEARCH - A system and method for searching travel services. A server computer receives a travel request from a client device operated by a user. The server computer identifies travel options according to the travel request. The server computer classifies the travel options into predefined groups, the classifying based on at least one of past transactions, input from domain experts, input from semantic analysts, analytics data, user preferences, and company policies. The server computer presents the options via presentation of the predefined groups.	02-14-2013
20130046762	SHAPE CLASSIFICATION METHOD BASED ON THE TOPOLOGICAL PERCEPTUAL ORGANIZATION THEORY - A shape classification method based on the topological perceptual organization (TPO) theory, comprising steps of: extracting boundary points of shapes (S1); constructing topological space and computing the representation of extracted boundary points (S2); extracting global features of shapes from the representation of boundary points in topological space (S3); extracting local features of shapes from the representation of boundary points in Euclidean space (S4); combining global features and local features through adjusting the weight of local features according to the performance of global features (S5); classifying shapes using the combination of global features and local features (S6). The invention is applicable to intelligent video surveillance, e.g., object classification and scene understanding. The invention can also be used for automatic driving systems, wherein robust recognition of traffic signs plays an important role in enhancing the intelligence of the system.	02-21-2013
20130046763	SYSTEMS AND METHODS FOR IDENTIFYING ASSOCIATIONS BETWEEN MALWARE SAMPLES - Systems and methods are disclosed for identifying associations between binary samples, such as e-mail files and their attachments or a document and an executable program associated with the document. In one implementation, the method includes receiving a plurality of binary samples, and extracting metadata from the plurality of binary samples. The metadata for a binary sample from the plurality of binary samples includes a set of attributes of the binary sample. The method further includes identifying a set of associations between the plurality of binary samples based on the extracted metadata. Each association is characterized by at least one attribute the associated binary samples have in common, and each association has a confidence level indicative of a strength of the association. The method also includes identifying associations with a confidence level that exceeds a predefined threshold.	02-21-2013
20130054597	CONSTRUCTING AN ASSOCIATION DATA STRUCTURE TO VISUALIZE ASSOCIATION AMONG CO-OCCURRING TERMS - Extended associations are determined based on binary associations. The extended associations are associations among three or more terms in input data, and the binary associations are between terms in the input data. An association data structure having a plurality of entries is constructed, where at least a particular one of the plurality of entries includes visual elements representing terms that are associated according to the binary associations and the extended associations, and where the association data structure provides a visualization of an association pattern among co-occurring terms in the input data.	02-28-2013
20130054598	ENTITY RESOLUTION BASED ON RELATIONSHIPS TO A COMMON ENTITY - Techniques are disclosed for resolving entities based on relationships to a common entity. In one embodiment, two entities are compared to determine that an entity resolution threshold is not satisfied. One or more entities commonly related to the two entities are determined. The two entities are determined to satisfy the entity resolution threshold on the basis of the one or more commonly-related entities. The two entities are then resolved into a single entity.	02-28-2013
20130054599	Dynamically Generated List Index - Dynamically generated lists are provided. Data elements may be sorted in a list according to a value. An index may be dynamically generated to divide the data elements into one or more index groupings, the index groupings containing an equal or nearly-equal number of data elements. The index may comprise index grouping entry points allowing for navigation to each of the one or more index groupings. If a list is modified, for example a data element is added, removed, or otherwise modified, the index may be automatically regenerated to preserve equal distribution of data elements across the index groupings.	02-28-2013
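The idea of index groupings with an equal or nearly-equal number of elements, as described in the entry above, can be sketched as follows. This is only an illustration under stated assumptions: `build_index` and the dictionary fields `entry_point`, `first`, and `last` are hypothetical names, and regeneration is modeled simply as calling the function again after the list changes.

```python
def build_index(sorted_items, num_groups):
    """Split an already-sorted list into contiguous groupings whose sizes
    differ by at most one, and return one entry point per grouping."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    # Never create more groupings than there are items.
    num_groups = min(num_groups, len(sorted_items)) or 1
    base, extra = divmod(len(sorted_items), num_groups)
    index = []
    start = 0
    for g in range(num_groups):
        # The first `extra` groupings take one additional item.
        size = base + (1 if g < extra else 0)
        if size == 0:
            break
        index.append({
            "entry_point": start,  # position to jump to in the list
            "first": sorted_items[start],
            "last": sorted_items[start + size - 1],
        })
        start += size
    return index
```

Rebuilding the index after every insertion or removal keeps the groupings balanced at the cost of an O(n) pass; an incremental scheme would be an alternative design, which the abstract does not specify.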
20130054600	SYSTEM AND METHOD FOR IMPROVING APPLICATION CONNECTIVITY IN A CLUSTERED DATABASE ENVIRONMENT - A clustered database environment (e.g. Oracle Real Application Cluster (RAC)) includes multiple database instances that appear as one server. An application server (e.g. WebLogic Server (WLS)) can use a data source (e.g. an Oracle GridLink data source) and connection pools to connect with the clustered database. In accordance with an embodiment, a data source configuration allows for specification of a preferred affinity policy, such as a data affinity, temporal affinity, and/or session or session-based affinity policy. In accordance with an embodiment, the system includes a number of features that improve application connectivity in the clustered database environment, including a select-only case for application continuity, wherein an application-independent infrastructure, e.g. implemented within a Java Database Connectivity (JDBC) driver, enables recovery of work from an application perspective and masks system communications, hardware failures and hangs.	02-28-2013
20130054601	MANAGING AND CLASSIFYING ASSETS IN AN INFORMATION TECHNOLOGY ENVIRONMENT USING TAGS - Disclosed below are representative embodiments of methods, apparatus, and systems for managing and classifying assets in an information technology (“IT”) environment using a tag-based approach. The disclosed tag-based classification techniques can be implemented through a graphical user interface. Embodiments of the disclosed tag-based classification techniques can be used to allow a user to easily and quickly select and perform actions on groups of one or more assets (e.g., monitor policies, perform upgrades, etc.). For example, the tag-based classification techniques can automatically classify assets into “tag sets” (or “tagged sets”) based on node properties or user-selected criteria or conditions (e.g., criteria or conditions that are established in a user-created tagging profile or rule). The tagged assets can then be further filtered to identify even deeper relationships between the assets.	02-28-2013
20130054602	CHARACTERISTIC POINT DETECTION SYSTEM, CHARACTERISTIC POINT DETECTION METHOD, AND PROGRAM - The characteristic point detection system of this invention includes: a staying area detection unit	02-28-2013
20130060774	METHOD FOR SEMANTIC CLASSIFICATION OF NUMERIC DATA SETS - A system and method for semantically classifying numerical data includes using semantic classification techniques on ‘nearby’ non-numerical data to identify a context whereby opaque data sets of numbers can be semantically classified inside of that context. An Electronic Knowledge Base is used to query against the context and determine the semantics of the opaque numeric data sets.	03-07-2013
20130060775	SPANNING-TREE PROGRESSION ANALYSIS OF DENSITY-NORMALIZED EVENTS (SPADE) - Methods and systems for determining progression and other characteristics of microarray expression levels and similar information, alternatively using a network or communications medium or tangible storage medium or logic processor.	03-07-2013
20130060776	DISJOINT PARTIAL-AREA BASED TAXONOMY ABSTRACTION NETWORK - A disjoint partial-area taxonomy abstraction network and methods of producing same for a hierarchy, which partitions overlapping concepts into singly-rooted disjoint groups that are more manageable to work with and comprehend. This provides abstract models for summarizing overlapping concepts which permit enhanced, high-level display for users at a user interface.	03-07-2013
20130060777	TIME ALIGNED TRANSMISSION OF CONCURRENTLY CODED DATA STREAMS - A method begins by a dispersed storage (DS) processing module receiving a first coded matrix that includes a first plurality of pairs of coded values corresponding to first data segments of a first data stream and a second data stream. The method continues with the DS processing module receiving a second coded matrix that includes a second plurality of pairs of coded values corresponding to first data segments of a third data stream and a fourth data stream. The method continues with the DS processing module generating a new coded matrix to include a plurality of groups of selected coded values. The method continues with the DS processing module outputting the plurality of groups of selected coded values to a requesting entity in a manner to maintain time alignment of the first data segments of the first, second, third, and fourth data streams.	03-07-2013
20130060778	DEVICE, METHOD, AND PROGRAM FOR DISPLAYING DOCUMENT LIST - Provided are a device, a method, and a program for displaying a document list with which a desired document can be effectively specified. The present invention groups documents in accordance with a displaying method of a document list, dynamically gives a group a name with which a range of the grouped documents can be seen, and organizes the document list.	03-07-2013
20130060779	COMMUNITY-BASED PARENTAL CONTROLS - According to a general aspect, a method includes maintaining rating groups, each rating group providing a rating for content compiled based on information received from a user evaluating the content. The method also includes receiving, from a first user, a selection of a first rating group to be applied to a set of users associated with the first user. The method also includes receiving, from a user, a request for a piece of content. The method also includes determining that the user from which the request was received belongs to the set of users associated with the first user. The method also includes, based upon the determination that the user belonged to the set of users associated with the first user, accessing information associated with the first rating group and determining whether the first rating group includes a rating for the requested piece of content.	03-07-2013
20130066867	MANAGEMENT OF METADATA FOR LIFE CYCLE ASSESSMENT DATA - Management of metadata applied to life cycle inventory (LCI) and life cycle assessment (LCA) data is provided. Life cycle inventory data may be provided through a secure framework to a data hub where it may be validated and audited and where the received life cycle inventory data may ultimately be used to generate a life cycle assessment score for a given product. Life cycle inventory data may be provided via a structured data template provided by a supplier or obtained by a supplier via a requested unit business process model. Metadata may be applied to each LCI data item to allow the data items to be stored, sorted, searched, retrieved and used for validating and auditing the data and for comparing the data to other data items of similar data types for eventual use in the generation of the aforementioned life cycle assessment score for a given product.	03-14-2013
20130066868	Systems and Methods for Version Chain Clustering - A system, a method and a computer program product for storing data, which include receiving a data stream having a plurality of transactions that include at least one portion of data, determining whether at least one portion of data within at least one transaction is substantially similar to at least another portion of data within at least one transaction, clustering together at least one portion of data and at least another portion of data within at least one transaction, selecting one of at least one portion of data and at least another portion of data as a representative of at least one portion of data and at least another portion of data in the received data stream, and storing each representative of a portion of data from each transaction in the plurality of transactions, wherein a plurality of representatives is configured to form a chain representing the received data stream.	03-14-2013
20130066869	COMPUTER SYSTEM, METHOD OF MANAGING A CLIENT COMPUTER, AND STORAGE MEDIUM - In an embodiment, a client acquires an operation log of operations in the client. A management system acquires a first operation log group consisting of operation log records including an operation log record of an operation in which a first problem is generated from the operation log. The management system stores in advance problem examples associated with operation log groups each consisting of operation log records and with solutions. The management system searches the operation log groups associated with the stored problem examples for an operation log group determined to be similar to the first operation log group based on the operation log records of the first operation log group. The management system determines a solution to one of the problem examples that is associated with the operation log group determined to be similar to the first operation log group, as a solution candidate to the first problem.	03-14-2013
20130066870	System for Generating a Medical Knowledge Base - A system generates medical knowledge base information by searching at least one repository of medical information to identify sentences including a received medical term. A data processor searches the identified sentences to identify sentences including a medical term different to the received term in response to a predetermined repository of medical terms and excludes sentences without a term different to the received term, to provide remaining multiple term sentences. The data processor groups different terms of individual sentences of the multiple term sentences to provide grouped terms, determines whether a medically valid relationship occurs between different terms of an individual group of terms of the grouped terms by using predetermined sentence structure and syntax rules and outputs data representing grouped terms having a medically valid relationship.	03-14-2013
20130066871	Enabling Identification of Online Identities Between Different Messaging Service - A method and system for populating identities in a message service involves registering a user of a first messaging service with a second messaging service. User identities for users other than the registered user may be identified. These user identities may be associated with the first messaging service and may be stored in a list associated with the registered user. It is determined if each identified user identity has a matching user identity associated with the second messaging service. If so, a database associated with the second messaging service is populated with the matching user identities. Determining whether a matching user identity exists may be performed, for example, by making character string comparisons between user identities or using a database that stores a mapping of first messaging service user identities to second messaging service user identities. The mapping database may be generated as corresponding user identities are discovered.	03-14-2013
20130066872	Method and Apparatus for Organizing Images - A method and apparatus are defined for organizing a plurality of digital photos. The method comprises the steps of identifying a group of digital photos, receiving a number defining how many clusters to be formed from the group, receiving profile information to be used for clustering the digital photos into the number of clusters, clustering the group of digital photos according to the profile information, and identifying representative digital photo(s) of the clusters from the clustered digital photos based on the profile information.	03-14-2013
20130073552	Data Center Capability Summarization - A method for summarizing capabilities in a hierarchically arranged data center includes receiving capabilities information, wherein the capabilities information is representative of capabilities of respective nodes at a first hierarchical level in the hierarchically arranged data center, clustering nodes based on groups of capabilities information, generating a histogram that represents individual node clusters, and sending the histogram to a next higher level in the hierarchically arranged data center. Relative rankings of capabilities may be used to order a sequence of clustering operations.	03-21-2013
20130073553	INFORMATION MANAGEMENT METHOD AND INFORMATION MANAGEMENT APPARATUS - An information management method to be executed by a computer, the information management method includes: accepting a registration request including information in which latitude and longitude are included, and correspondence information corresponding to the position information; generating one character string by alternately arraying one character of the latitude and another one character of the longitude, each of the one character and the other one character is in a same digit regarding all of the digits of each of the latitude and the longitude, or some digits from the least significant digit of each of the latitude and the longitude; and storing the correspondence information in a storage unit in a manner correlated with the character string as a key.	03-21-2013
20130080434	Systems and Methods for Contextual Analysis and Segmentation Using Dynamically-Derived Topics - Systems and methods are disclosed for contextual analysis and segmentation of information objects. According to one implementation, information objects, such as web pages and user profiles, may be analyzed to identify key terms. These key terms may be included in a contextual representation of an information object. By comparing the contextual representations of a plurality of information objects, one or more contextual segments (i.e., categories of information objects) may be created. Each contextual segment may also be associated with its own contextual representation. Once a contextual segment has been created, information objects may be assigned to the contextual segment. These contextual segments may be used to deliver targeted advertising, for example.	03-28-2013
20130080435	METHOD AND APPARATUS FOR MANAGING ONLINE CONTENT COLLECTIONS - An approach is presented for managing content collections at various hierarchical layers. A content management platform causes a specification of one or more hierarchical layers for managing content associated with one or more content stores. The content management platform then determines at least one content collection associated with the one or more content stores. It then determines to organize the at least one content collection into one or more bins associated with respective ones of the one or more hierarchical layers.	03-28-2013
20130080436	PHRASE CLUSTERING - Systems and associated methods for enhanced concept understanding in large document collections through phrase clustering are described. Embodiments take as input an initial set of phrases and estimate centroids using a clustering process. Embodiments then generate new phrases around each of the current centroids using the current phrases. These new phrases are added to the current set, and the clustering process is iterated. Upon convergence, embodiments finalize clusters based on phrases of any given length.	03-28-2013
20130080437	SYSTEM AND METHOD FOR PROVIDING STATISTICS FOR USER SUBMISSIONS - Systems and methods for providing an overview of user comments are disclosed. In one example, the method includes receiving the user comment, processing words in the user comments by comparing each of the words in the user comments with a plurality of words in a wordlist, determining whether each word from the wordlist is used in the user comments, calculating statistics on the user comments based on a number of times each word from the wordlist is used and transmitting the statistics for display to a user.	03-28-2013
20130086065	PRIVILEGED ACCOUNT MANAGER, DYNAMIC POLICY ENGINE - Techniques for managing accounts are provided. An access management system may check out credentials for accessing target systems. For example a user may receive a password for a period of time or until checked back in. Access to the target system may be logged during this time. Upon the password being checked in, a security account may modify the password so that the user may not log back in without checking out a new password. Additionally, in some examples, password policies for the security account may be managed. As such, when a password policy changes, the security account password may be dynamically updated. Additionally, in some examples, hierarchical viewing perspectives may be determined and/or selected for visualizing one or more managed accounts. Further, accounts may be organized into groups based on roles, and grants for the accounts may be dynamically updated as changes occur or new accounts are managed.	04-04-2013
20130086066	AUTOMATED DISCOVERY AND GENERATION OF HIERARCHIES FOR BUILDING AUTOMATION AND CONTROL NETWORK OBJECTS - Management systems, methods, and mediums. A method includes identifying a name of a first object in a plurality of objects. The plurality of objects is associated with one or more devices communicably connected to a building automation and controls network. The method includes parsing characters in the name of the first object to identify a plurality of name segments separated by one or more delimiters in the name in response to identifying the one or more delimiters. The method includes identifying a type of the first object and a location of a first device associated with the first object based on the plurality of name segments. Additionally, the method includes generating a hierarchical structure for the plurality of objects based on the type and the location, the hierarchical structure comprising a name for each of the plurality of objects.	04-04-2013
20130086067	CONTEXT-AWARE SUGGESTIONS FOR STRUCTURED QUERIES - A suggestion system for providing suggestions of features for inclusion in a clause of a structured query. The suggestion system receives a partial query that is being created by a user. The suggestion system analyzes a query log having queries submitted by one or more users to identify features to suggest to the user based on a likelihood that users who submitted queries similar to the partial query included that feature in the query. The query system then presents to the user the identified features as suggestions to include in the partial query.	04-04-2013
20130086068	METHOD OF VISUALIZING THE COLLECTIVE OPINION OF A GROUP - A computerized method of visualizing the collective opinion of a group regarding one or more qualitative issues. The group initially selects N issues from the universe of potential issues and often assigns the issues images and titles. The system presents each user with graphical user interface screens wherein individual users vote on the relative importance and degree of relationship between the N aspects (Data Points) and issues, often using drag and drop methods. The software computes N×N similarity matrices based on users' voting input and clusters various aspects into groups of greater and lesser similarity and importance, and presents results of users' qualitative ranking in easy-to-read relationship tree diagrams where the relative importance and qualitative relationship of the issues may be designated by size and other graphical markers. The software may reside on a network server and present display screens to web browsers running on users' computerized devices.	04-04-2013
20130086069	SYSTEMS, METHODS, AND APPARATUS FOR COMPUTER-ASSISTED FULL MEDICAL CODE SCHEME TO CODE SCHEME MAPPING - An example method for mapping of medical code schemes includes processing a plurality of coded concepts to determine a potential match between a code from a first code scheme in the plurality of coded concepts and a code from a second code scheme in the plurality of coded concepts. The method includes assigning a probability to each potential match of a code from the first code scheme and a code from the second code scheme. The method includes generating an alphanumeric indication of the probability of each potential match between the first code scheme and the second code scheme from the plurality of coded concepts and generating a graphical representation of the plurality of coded concepts. The method includes outputting the alphanumeric indication and the graphical representation to a user and accepting user input to select a match between the first code scheme and the second code scheme.	04-04-2013
20130091135	FILE AGGREGATION METHOD AND INFORMATION PROCESSING SYSTEM USING THE SAME - The performance of the analysis system is deteriorated because file content extraction processing is performed in the file aggregation server and in the analysis server and further because annotation data creation is performed in the file aggregation server. Therefore, the present invention solves the problem by providing a file aggregation server classifying files into analysis target contents, non analysis target contents, and content matched data, and providing only the analysis target contents to the analysis server. Since this method enables the analysis server to acquire the analysis target contents directly from the file aggregation server, the processing of extracting contents from the files becomes unnecessary, and the throughput of the entire analysis system is improved.	04-11-2013
20130091136	ELECTRONIC DISCOVERY SYSTEM - Embodiments of the invention relate to systems, methods, and computer program products for improved electronic discovery and custodian management. Embodiments herein disclosed provide for an enterprise wide e-discovery system that provides for data to be identified, located, retrieved, preserved, searched, reviewed and produced in an efficient and cost-effective manner across the entire enterprise system. In addition, by structuring management of e-discovery based on case/matter, custodian and data and providing for linkage between the same, further efficiencies are realized in terms of identifying, locating and retrieving data and leveraging results of previous e-discoveries with current requests.	04-11-2013
20130091137	INFORMATION CLASSIFICATION SYSTEM - An information classification system	04-11-2013
20130097166	Determining Demographic Information for a Document Author - According to one embodiment of the present invention, a system determines a demographic group associated with a document, and comprises a computer system including at least one processor. The system analyzes sample documents from one or more demographic groups to determine a demographic profile for each of the demographic groups based on one or more textual features within the sample documents. The one or more textual features within a document are evaluated and a document profile is generated based on the one or more textual features. The document profile is compared to the demographic profiles to identify the demographic group associated with the document. Embodiments of the present invention further include a method and computer program product for determining a demographic group associated with a document in substantially the same manner described above.	04-18-2013
20130097167	Method and system for creating ordered reading lists from unstructured document sets - Systems and methods for organizing a repository of unstructured documents into groups of ordered reading lists, i.e., document trails, comprising an ordered list of documents that relate to each other by subject matter. Text analytics and natural language processing are used to group documents, choose the most important/relevant documents from each group, and organize the documents into a suggested reading order. Additionally, documents within each document trail may be marked up or highlighted to indicate which paragraphs therein contain novel or useful information, or information that is not useful or redundant.	04-18-2013
20130097168	METHOD TO IDENTIFY COMMON STRUCTURES IN FORMATTED TEXT DOCUMENTS - A computer implemented method, computer program product and data processing system, for identifying common structures shared across a plurality of formatted text documents. The common structure is presented as a sequence of landmarks, each of which has a starting and ending marker to describe the borders of text. The common structure is identified by counting the occurrences of repeating text segments across documents. Frequently co-occurred adjacent segments become candidates for markers of landmarks. In addition, styling information of textual content within a landmark is extracted and mapped to rules. The rules are used to merge and summarize content from multiple documents, which gives an advantage over current practice of content concatenation.	04-18-2013
20130097169	Personal Achievement and Recognition System and Method - An electronic achievement and recognition system provides a software-based platform that allows users to receive, share, view and create achievements.	04-18-2013
20130097170	APPARATUS, SYSTEM AND METHOD FOR THE EFFICIENT STORAGE AND RETRIEVAL OF 3-DIMENSIONALLY ORGANIZED DATA IN CLOUD-BASED COMPUTING ARCHITECTURES - A cloud based storage system and methods for uploading and accessing 3-D data partitioned across distributed storage nodes of the system. The data cube is processed to identify discrete partitions thereof, which partitions may be organized according to the x (e.g., inline), y (e.g., crossline) and/or z (e.g., time) aspects of the cube. The partitions are stored in unique storage nodes associated with unique keys. Sub-keys may also be used as indexes to specific data values or collections of values (e.g., traces) within a partition. Upon receiving a request, the proper partitions and values within the partitions are accessed, and the response may be passed to a renderer that converts the values into an image displayable at a client device. The request may also facilitate data or image access at a local cache, a remote cache, or the storage partitions using location, data, retrieval, and/or rendering parameters.	04-18-2013
20130103688	PROVIDING AN AGGREGATE DISPLAY OF CONTACT DATA FROM INTERNAL AND EXTERNAL SOURCES - An aggregate display of contact data from internal and external sources is provided. Contact data associated with at least one contact is obtained from a plurality of sources, including at least an internal source and an external source. The obtained contact data is processed to generate an aggregated collection of contact data. The aggregated collection of contact data is stored. A display of the aggregated collection of contact data is displayed in a single, interactive interface.	04-25-2013
20130103689	MEDIA MEDIATOR SYSTEM AND METHOD FOR MANAGING CONTENTS OF VARIOUS FORMATS - Provided is a system and method that may encode various formats of contents to a single format and thereby manage the contents, and may transform the contents to a format corresponding to a request of a third party or an end user to distribute the content. A media mediator system of managing various formats of contents may include: a service manager to receive a content and metadata of the content from a content provider; a metadata manager to register the content using the metadata, and to store the metadata of the registered content; a database manager to store and manage information associated with the content; and an encoding manager to schedule an encoding sequence of the content, and to sequentially encode the content based on a scheduling result.	04-25-2013
20130110834	AGGREGATING CARDIAC RESYNCHRONIZATION THERAPY DATA	05-02-2013
20130110835	METHOD FOR CALCULATING PROXIMITIES BETWEEN NODES IN MULTIPLE SOCIAL GRAPHS	05-02-2013
20130110836	SYSTEM AND METHOD FOR COMMUNITY AND GROUP ORGANIZATION FOR ACTIVITIES AND EVENTS	05-02-2013
20130110837	DATA COLLECTING CONCENTRATOR AND DATA COLLECTING METHOD	05-02-2013
20130110838	METHOD AND SYSTEM TO ORGANIZE AND VISUALIZE MEDIA	05-02-2013
20130117266	GEO-FENCE BASED ON GEO-TAGGED MEDIA - Architecture that creates a geo-fence based on a geo-tagged item (e.g., a photo). The geo-tagged item can be used to share virtual boundaries, such as geo-fences, between users via conventional methods (e.g., email) for sharing digital media. An extraction component extracts geolocation information (e.g., latitude and longitude coordinates, altitude, bearing, distance, place names, etc.) of a geo-tagged item. The geolocation information can be related to a geographical location at which the item is geo-tagged. A boundary component then creates a virtual boundary (e.g., geo-fence) in association with the geographical location and based on the geolocation information. Thereafter, the virtual boundary is triggered when the user crosses (e.g., engages, intersects) the boundary and the attached action is triggered. The geo-tagged item can be shared with another user and, when processed, creates a virtual boundary for that other user.	05-09-2013
20130117267	CUSTOMER SUPPORT SOLUTION RECOMMENDATION SYSTEM - Methods, systems, and apparatus, including computer programs encoded on a computer-readable storage medium and a method for automatically providing support solutions in response to user feedback items. The method comprises receiving user feedback items and corresponding support solutions. The method further comprises identifying, using clustering techniques, associations between the user feedback items and the corresponding support solutions. The method further comprises storing the identified associations as an items-solutions model that correlates the user feedback items with the corresponding support solutions. The method further comprises receiving a new user feedback item. The method further comprises automatically determining, using the items-solutions model, at least one support solution that corresponds to the new user feedback item. The method further comprises providing the at least one support solution in response to the received new user feedback item.	05-09-2013
20130124522	METHOD AND APPARATUS FOR REPRESENTING MULTIDIMENSIONAL DATA - The present invention relates to methods for representing multidimensional data. The methods of the present invention are well suited but not limited to the representation of multidimensional data in such a way as to enable the comparison and differentiation of data sets. For example, the invention may be applied to the representation of flow cytometric data. The invention further relates to a program storage device having instructions for controlling a computer system to perform the methods, and to a program storage device containing data structures used in the practice of the methods.	05-16-2013
20130124523	SYSTEMS AND METHODS FOR MEDICAL INFORMATION ANALYSIS WITH DEIDENTIFICATION AND REIDENTIFICATION - A medical information navigation engine is useful in association with at least one electronic health record system. The engine decouples identifying information from clinical data from electronic health records. The clinical data includes clinical narrative having discrete data and textual data. The identifying information is stored. Additionally, the identifying information is associated with a token in the clinical data. The clinical data may then be indexed. The discrete data and the textual data in the clinical data may then be mined. Mining includes extracting at least one relevant event from the discrete data and the textual data. Next, the clinical data and identifying information may be reintegrated using the token. The event associated with the mined discrete data and textual data may then be exported. The system may also provide a validation tool for users, including clinicians, to search and view clinical data. The exported event may be used to alter treatment of a patient.	05-16-2013
20130124524	DATA CLUSTERING BASED ON VARIANT TOKEN NETWORKS - Received data records, each including one or more values in one or more fields, are processed to identify one or more data clusters. The processing includes: identifying tokens that each include at least one value or fragment of a value in a field or a combination of fields; generating a network representing the identified tokens, with nodes of the network representing tokens and edges of the network each representing a variant relationship between tokens; and generating a graphical representation of the network with different subsets of nodes distinguished based at least in part on values associated with nodes, where a value associated with a particular node quantifies a count of a number of instances of the token represented by that particular node appearing within the received data records.	05-16-2013
20130124525	DATA CLUSTERING BASED ON CANDIDATE QUERIES - Received data records, each including one or more values in one or more fields, are processed to identify a matched data cluster. The processing includes: for selected data records, generating a query from one or more values; identifying one or more candidate data records from the received data records using the query; determining whether or not the selected data record satisfies a cluster membership criterion for at least one candidate data cluster of one or more existing data clusters containing the candidate records; and selecting the matched data cluster from among one or more candidate data clusters based at least in part on a growth criterion for the candidate data clusters, or initializing the matched data cluster with the selected data record if the selected data record does not satisfy a cluster membership criterion for any of the existing data clusters or based on a result of the growth criterion.	05-16-2013
20130124526	METHOD AND APPARATUS FOR PREDICTING OBJECT PROPERTIES AND EVENTS USING SIMILARITY-BASED INFORMATION RETRIEVAL AND MODELING - Method and apparatus for predicting properties of a target object comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method comprises application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods. By way of example, one of a machine component or process failure event, an intrusion event and a fire event and residual objects may be predicted and located and qualified such that, for example, properties of the residual objects may be qualified, for example, via black body radiation and micro-body databases including charcoal assemblages.	05-16-2013
20130124527	REPORT AUTHORING - A system for aiding report authoring is disclosed. The system comprises a set of associations (	05-16-2013
20130132390	SYSTEM AND METHOD FOR SELECTIVELY PROVIDING AN AGGREGATED TREND - A method of selectively providing an aggregated trend obtained from at least a subset of a plurality of individual trends. The method comprises deciding whether to provide the aggregated trend by determining whether an individual trend in at least the subset of the plurality of individual trends can be at least partially identified from the aggregated trend.	05-23-2013
20130132391	PREDICTING CONTENT AND CONTEXT PERFORMANCE BASED ON PERFORMANCE HISTORY OF USERS - Systems and methods are provided for selecting contexts for new invitational content and invitational content for new contexts. In particular, a performance history of delivered invitational content in known contexts is combined with similarity measures for the delivered invitational content, with respect to a new invitational content, to generate a list of potential contexts for the new invitational content. Similarly, a performance history in known contexts with delivered invitational content can be combined with similarity measures for known contexts, with respect to a new context, to generate a list of potential content for the new context. Further, a combination of these methods can be used to pair new invitational content with new contexts.	05-23-2013
20130132392	Pangenetic Web Item Recommendation System - Computer based systems, methods, software and databases are presented in which correlations between web item preferences and pangenetic (genetic and epigenetic) attributes of individuals are used for pangenetic based web item recommendation in which a user can request and receive personalized online recommendations of web items that are based on the user's pangenetic makeup. Data masking can be used to maintain privacy of sensitive portions of the pangenetic data.	05-23-2013
20130132393	METHOD AND SYSTEM FOR DISPLAYING ACTIVITIES OF FRIENDS AND COMPUTER STORAGE MEDIUM THEREFOR - Method and system for displaying activities of friends. The method includes: generating activity information from the activities of friends according to the friend relationship chain; classifying the activity information according to the preset classification rule to obtain the corresponding classification collection; acquiring the screening condition, extracting activity information that meets the screening condition from the classification collection, and displaying the activity information extracted. According to the method and system, through the classification and screening of activity information, the activities of friends which meet screening conditions entered by a user can be obtained and displayed. Using this method and the system, a large amount of information can be quickly screened and all the friends' effective activities can be provided to the user, allowing the user to achieve personalized browsing.	05-23-2013
20130132394	Integration and Combination of Random Sampling and Document Batching - Methods and systems of integrated batching and random sampling of documents for enhanced functionality and quality control, such as validation, within a document review process are provided herein. According to various embodiments, a batching request may be received and may include a population size that corresponds to a total amount of documents available for sampling. The batching request may also include an acceptable margin of error. A random sample size may be calculated based on the batching request, and then a subset of documents corresponding to the random sample size may be selected from the total amount of documents available for sampling. The subset of documents may be grouped into one or more batches, and the one or more batches may be assigned to one or more review nodes.	05-23-2013
20130138651	SYSTEM AND METHOD EMPLOYING A SELF-ORGANIZING MAP LOAD FEATURE DATABASE TO IDENTIFY ELECTRIC LOAD TYPES OF DIFFERENT ELECTRIC LOADS - A method identifies electric load types of a plurality of different electric loads. The method includes providing a self-organizing map load feature database of a plurality of different electric load types and a plurality of neurons, each of the load types corresponding to a number of the neurons; employing a weight vector for each of the neurons; sensing a voltage signal and a current signal for each of the loads; determining a load feature vector including at least four different load features from the sensed voltage signal and the sensed current signal for a corresponding one of the loads; and identifying by a processor one of the load types by relating the load feature vector to the neurons of the database by identifying the weight vector of one of the neurons corresponding to the one of the load types that is a minimal distance to the load feature vector.	05-30-2013
20130138652	AUTOMATIC IDENTITY ENROLMENT - Biometric computer systems are systems which use one or more biometric identifiers to enroll, verify or identify a person. This disclosure concerns the automatic enrolment of people into biometric systems. Aspects include methods, computer systems, software and biometric systems. A first biometric identifier (i.e. face) and a second biometric identifier (e.g. iris) is captured (	05-30-2013
20130138653	METHOD FOR REDUCING AN AMOUNT OF STORAGE REQUIRED FOR MAINTAINING LARGE-SCALE COLLECTION OF MULTIMEDIA DATA ELEMENTS BY UNSUPERVISED CLUSTERING OF MULTIMEDIA DATA ELEMENTS - A method for reducing an amount of storage required for maintaining a large-scale collection of multimedia data elements by unsupervised clustering of multimedia data elements. The method comprises processing the multimedia data elements in the large-scale collection to generate a first cluster of multimedia data elements; storing the first cluster in a storage unit; repeating the generation of a new cluster from the first cluster and un-clustered multimedia elements in the large-scale collection until a single cluster is reached; and storing the new cluster generated at each iteration in the storage unit, wherein an N-th cluster generated at the N-th iteration is stored in the storage unit, wherein the amount of storage required to store the N-th cluster is less than an amount of storage of the large-scale collection, whereby the unsupervised clustering enables reducing the storage amount required to store the multimedia data elements in the large-scale collection.	05-30-2013
20130144880	BUSINESS PARTNER GROUPING - A method and system, the method including defining business criteria for business entities, the business criteria including at least one business object attribute; defining at least one business partner group set, each of the at least one business partner group set being associated with at least one business object attribute selected from the business criteria; and modifying, by the computer, a record of a business entity by assigning the business entity to one or more of the defined business partner group sets.	06-06-2013
20130144881	PARALLELIZATION OF ELECTRONIC DISCOVERY DOCUMENT INDEXING - A system and method for parallelizing document indexing in a data processing system. The data processing system includes a primary processor for receiving a list of data having embedded data associated therewith, at least one secondary processor to process the data as provided by the primary processor, a data processor to determine a characteristic of the embedded data and process the embedded data based upon the characteristic, and a messaging module to exchange at least one status message between the primary processor and the at least one secondary processor.	06-06-2013
20130151519	Ranking Programs in a Marketplace System - A marketplace system is described herein for ranking programs based, at least in part, on the assessed distinctiveness of the programs. In one implementation, the marketplace operates by: (a) accessing a set of programs; (b) extracting feature information from each of the programs; (c) generating similarity information for each program, based on the feature information; (d) ranking the programs based at least on the similarity information, to provide ranking information; and (e) providing a user interface presentation that has an effect of promoting at least one distinctive program in the set of applications on the basis of the ranking information.	06-13-2013
20130151520	INFERRING EMERGING AND EVOLVING TOPICS IN STREAMING TEXT - A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify a first group of topics as evolving topics and a second group of topics as emerging topics. The matrices include a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, the documents form a streaming dataset, and two forms of temporal regularizers are used to help identify the evolving topics and the emerging topics in the streaming dataset.	06-13-2013
20130151521	SYSTEMS AND METHODS FOR DYNAMIC PARTITIONING IN A RELATIONAL DATABASE - Systems and methods for dynamic partitioning in a relational database are described herein. A system can be configured to receive a data object definition statement to define a data object, where the data object definition statement associates an expression with the data object, and where the expression defines a correlation between an attribute of the data object and a prospective partition of the data object. The system can then receive a data manipulation statement to manipulate data in the data object, and in response to receiving the data manipulation statement, process the data manipulation statement, where processing includes initiating evaluation of the expression to identify a target partition based on the correlation between the attribute and the prospective partition.	06-13-2013
20130151522	EVENT MINING IN SOCIAL NETWORKS - A method and system for detecting an event from a social stream. The method includes the steps of: receiving a social stream from a social network, where the social stream includes at least one object and the object includes a text, sender information of the text, and recipient information of the text; assigning said object to a cluster based on a similarity value between the object and the clusters; monitoring changes in at least one of the clusters; and triggering an alarm when the changes in at least one of the clusters exceed a first threshold value, where at least one of the steps is carried out using a computer device.	06-13-2013
20130151523	PHOTO MANAGEMENT SYSTEM - A photo management system is provided to record occurrence dates of important events in the individual life course, classify the photos according to the occurrence dates and name the photo folders. Once the preset occurrence date of a specified event is approaching, the photo management system will remind the user to take and store photos. Consequently, the photo management system provides a photo book marked by individual life timing. Another photo management system is a family relationship-based photo management system for managing family relationship and sharing photos. By linking the birth date to the blood relation, linking the marriage date to the affinity relation and using the information of the face address book, the kinship between the user and others can be established to deduce the family tree.	06-13-2013
20130151524	Optimized Resizing For RCU-Protected Hash Tables - A technique for resizing a first RCU-protected hash table stored in a memory. A second RCU-protected hash table is allocated in the memory as a resized version of the first hash table having a different number of hash buckets, with the hash buckets being defined but initially having no hash table elements. The second hash table is populated by linking each hash bucket thereof to all hash buckets of the first hash table containing elements that hash to the second hash bucket. The second hash table is then published so that it is available for searching by hash table readers. The first table is freed from memory after waiting for a grace period which guarantees that no readers searching the first hash table will be affected by the freeing.	06-13-2013
20130151525	INFERRING EMERGING AND EVOLVING TOPICS IN STREAMING TEXT - A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify evolving topics and emerging topics. The matrices include a matrix X identifying a multitude of words in each of the documents, a matrix W identifying a multitude of topics in each of the documents, and a matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, two forms of temporal regularizers are used to help identify the evolving and emerging topics. In another embodiment, a two stage approach involving detection and clustering is used to help identify the evolving and emerging topics.	06-13-2013
20130151526	SNS TRAP COLLECTION SYSTEM AND URL COLLECTION METHOD BY THE SAME - A social networking service (SNS) trap collection system capable of accurately and effectively extracting and collecting information including a malicious code among information exchanged in an SNS, and a uniform resource locator (URL) collection method by the same. URL information for a malicious code included in a post (a bulletin script, a message, a note, or the like) exchanged is effectively collected by using an account ID and a password of account information and utilized for detecting a malicious code in the SNS, thus significantly reducing damage to users due to infection of a malicious code.	06-13-2013
20130151527	ASSIGNING SOCIAL NETWORKING SYSTEM USERS TO HOUSEHOLDS - Users of a social networking system are assigned to households using prediction models that rely, in part, on user profile information and social graph data. Information about users may be received by a social networking system through various channels (e.g., declared/profile information, user history, IP addresses, Global Positioning System (GPS) data from check-in events and/or continuously provided by mobile devices, external household information, and/or social information). Scoring models may use statistical analysis of the received user information to predict household membership for users. User attributes, such as previous names, date of birth, social graph data, locations, life events, and check-ins, may be factors in generating confidence scores of predicted household memberships. Weighted scoring models may use machine learning methods for measuring the accuracy of the household membership prediction. The social networking system may use a machine learning algorithm to analyze user information to determine confidence scores for matching potential households.	06-13-2013
20130151528	LOGGING DEVICE, LOGGING SYSTEM AND CONTROL METHOD FOR LOGGING DEVICE - A logging device of the present invention includes a collection unit for correlating a production data obtained from a production apparatus with an identification data specific to a product produced by the production apparatus and for collecting these data; and an output unit for outputting the identification data collected by the collection unit to a traceability file and for outputting the production data collected by the collection unit into an area in the traceability file, corresponding to the identification data correlated with the production data. According to the present invention, processing loads to the logging device can be reduced in creating a traceability file, and a traceability file can be created without requiring a particular memory area for storing data collected by the logging device.	06-13-2013
20130151529	FACTORIZATION OF SCENARIOS - A method for configuring a control interface for controlling a system including one or more pieces of home automation equipment, the control interface including an information screen on which may be displayed a time scale representing a time period with a defined duration, the method including steps of: (i): defining a plurality of associations, each association being defined between a scenario for controlling one or more pieces of home automation equipment and a triggering instant defined within the time period, at which the scenario has to be triggered by the control interface, (ii): producing a grouping of at least one portion of the association from among the plurality of defined associations, the triggering instants of which are defined within a time interval with a defined duration within the time period, (iii): positioning a collective reference mark on the time scale corresponding to the grouping at the time interval.	06-13-2013
20130151530	INFORMATION PROVIDING METHOD AND SYSTEM - Embodiments of the present invention disclose an information providing method and system. The method includes: receiving data collected through a control module; collecting a user identification and operation information corresponding to the user identification in the data; associating and storing the user identification and the operation information corresponding to the user identification; and providing information for a user according to operation information corresponding to user identifications on a social relationship chain of the user. The system includes an interface module, a collecting module, a storing module and an information providing module. By the embodiments, the information related to the operation information of contacts is provided for the user. Since the social relationship chain is utilized, the possibility that the user is interested in the information related to the operation information of contacts is greater, and thus targeted information may be provided for the user.	06-13-2013
20130159304	METHOD AND SYSTEM FOR PROVIDING STORAGE DEVICE FILE LOCATION INFORMATION - A method and system are disclosed that permit a host application to obtain cluster location data, for example logical addresses associated with the clusters of a file, and to allow a host application to communicate the logical block address mapping information to firmware of a storage device. The method includes the host transmitting one or more clusters or partial clusters having a signature to the storage device where the storage device knows or has been instructed by the host to look for the signature. The storage device may receive clusters having a signature and, responsive to a host request, return logical address information to a host for the location in the storage device of the marked clusters. The host may parse a data structure on the storage device to obtain remaining cluster location information using a file's first cluster location or may request that the storage device return the cluster location information.	06-20-2013
20130159305	Entity Clustering via Data Services - A method is provided for forming an entity cluster. In this method, a plurality of entities found in one or more data sources are identified. An entity may represent a word or a phrase found in the one or more data sources. The plurality of entities may then be organized into groups, where each group has a master entity and a set of subordinate entities. The groups are formed using a first comparison criteria. Then, using a second comparison criteria, a first group is associated with a second group. The second comparison criteria may compare the master entities associated with the first and second groups. Based on the association between the first group and the second group, the method can then determine that the first entity is related to the second entity.	06-20-2013
20130159306	System And Method For Generating, Updating, And Using Meaningful Tags - A system and method for generating tag glossaries and use thereof is provided. A set of tags is accessed. Each tag is associated with a glossary that includes one or more terms and definitions for the terms. A new tag is generated and a new glossary is generated for the new tag based on the glossaries associated with the set of tags. The tag glossaries can be used to provide context for documents associated with the tags, to determine appropriate tags for untagged documents, to help in search for other documents, and to build indices for documents or collections of documents.	06-20-2013
20130159307	DIMENSION LIMITS IN INFORMATION MINING AND ANALYSIS - Provided are methods, systems, and computer readable media for user interaction with database methods and systems. In an aspect, a user interface can be generated to permit dynamic display generation to view data. The system can comprise a visualization component to dynamically generate one or more visual representations of the data to present in the state space.	06-20-2013
20130159308	Interactive Global Map - Systems and methods are provided for generating an interactive map for displaying and analyzing a comprehensive intellectual property landscape within a given field. Based on content analysis of relevant patents and patent applications, this prior art map provides a systematic review of vast quantities of data, thereby allowing the user to discern critical technology and product trends, prior art references, and the strategies of both leading and emerging competitors. Each patent represented on the map can be analyzed within the context of the prior art landscape to uncover novel features, strong claims, and business and technology trends. This comprehensive view can provide a foundation for creating effective corporate strategies in tune with the realities of the intellectual property terrain.	06-20-2013
20130159309	METHOD AND APPARATUS FOR PREDICTING OBJECT PROPERTIES AND EVENTS USING SIMILARITY-BASED INFORMATION RETRIEVAL AND MODELING - Method and apparatus for predicting properties of a target object comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method utilizes a model comprising application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods.	06-20-2013
20130159310	METHOD AND APPARATUS FOR PREDICTING OBJECT PROPERTIES AND EVENTS USING SIMILARITY-BASED INFORMATION RETRIEVAL AND MODELING - Method and apparatus for predicting properties of a target object comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method comprises application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods. By way of example, one of a criminal activity and a fraudulent activity event, an intrusion event and a fire event and residual objects may be predicted and located and qualified such that, for example, properties of the residual objects may be qualified, for example, via black body radiation and micro-body databases including charcoal assemblages.	06-20-2013
20130159311	SYSTEM AND METHODS FOR GENERATION OF A CONCEPT BASED DATABASE - A method for generating a concept database respective of a plurality of multimedia data elements (MMDEs) comprises generating a plurality of items from a received MMDE of the plurality of MMDEs; determining the items that are of interest for signature generation; generating at least one signature responsive to at least one item of interest of the received MMDE of the plurality of MMDEs; clustering a plurality of signatures received from the signature generator responsive of the plurality of MMDEs; reducing the number of signatures in each cluster to create a signature reduced cluster (SRC) of the cluster; associating metadata with the SRC to a concept structure comprised of a plurality of SRCs and their associated metadata; and generating at least one index for mapping the received MMDE to at least one concept structure, wherein the concept database includes concept structures and the generated indices for the plurality of MMDEs.	06-20-2013
20130159312	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR MANAGING AND ORGANIZING PIECES OF CONTENT - A system for managing and organizing content includes a source capable of operating at least one application (i.e., content manager, taxonomy manager, etc.). The application(s) are capable of providing a plurality of pieces of content and a plurality of master taxonomies, each master taxonomy including at least one piece of content placed at one or more locations. After providing the pieces of content and the master taxonomies, the application(s) can identify a subset of the pieces of content and at least one master taxonomy, where the subset includes at least one piece of content. The application(s) can then generate at least one client taxonomy based upon the identified subset and the master taxonom(ies) such that the subset of the pieces of content can thereafter be organized in the client taxonom(ies). The system can also include a client capable of organizing the subset in the client taxonom(ies).	06-20-2013
20130166552	SYSTEMS AND METHODS FOR MERGING SOURCE RECORDS IN ACCORDANCE WITH SURVIVORSHIP RULES - According to some embodiments, a plurality of source records may be received from a plurality of data sources, with each source record including a plurality of fields. It may be determined that a match group of source records from different data sources relate to the same entity, and a single best record may be automatically created for the match group based on field values from different source records in the match group. The creating may include, for example, assigning a first set of fields to a first survivorship group associated with a first survivorship rule and a second set of fields to a second survivorship group associated with a second survivorship rule. All records in the match group may then be simultaneously ranked in accordance with the first and second survivorship rules using a single query. The best record could then be stored for subsequent use by other applications.	06-27-2013
20130166553	Hybrid Database Table Stored as Both Row and Column Store - A hybrid database table is stored as both a row and a column store. One or more techniques may be employed alone or in combination to enhance performance of the hybrid table by regulating access to, and/or the size of, the processing-intensive column store data. For example during an insert operation, the column store data may be searched for a uniqueness violation only after certain filtering and/or boundary conditions have been considered. In another technique, a hybrid table manager may control movement of data to the column store based upon considerations such as frequency of access, or underlying business logic. In still another technique, querying of the hybrid table may result in a search of the column store data only after an initial search of row store data fails to return a result.	06-27-2013
20130166554	Hybrid Database Table Stored as Both Row and Column Store - A hybrid database table is stored as both a row and a column store. One or more techniques may be employed alone or in combination to enhance performance of the hybrid table by regulating access to, and/or the size of, the processing-intensive column store data. For example during an insert operation, the column store data may be searched for a uniqueness violation only after certain filtering and/or boundary conditions have been considered. In another technique, a hybrid table manager may control movement of data to the column store based upon considerations such as frequency of access, or underlying business logic. In still another technique, querying of the hybrid table may result in a search of the column store data only after an initial search of row store data fails to return a result.	06-27-2013
20130166555	METHOD AND APPARATUS FOR MANAGING CONTACT DATA BY UTILIZING SOCIAL PROXIMITY INFORMATION - An approach is provided for efficiently managing contact data in a phonebook by utilizing social proximity information at a device. An application and/or a widget processes and analyzes contact details associated with one or more contacts in order to determine a similarity to one or more other contact details associated with one or more users and/or one or more other contacts. Further, the application and/or the widget can determine one or more social proximity information items associated with the one or more contacts, the one or more users and/or the one or more other contacts and can utilize the social proximity information to manage (e.g., organize, group, sort, etc.) the contact data in the phonebook.	06-27-2013
20130166556	Independent Table Nodes In Parallelized Database Environments - A recipient node of a multi-node data partitioning landscape can receive, directly from a requesting machine without being handled by a master node, a first data request related to a table. A target node of a plurality of processing nodes can be identified to handle the data request. The determining can include the recipient node applying partitioning information to determine a target data partition of the plurality of data partitions to which the data request should be directed and mapping information associating each data partition of the plurality of data partitions with an assigned node of the plurality of processing nodes. The recipient node can redirect the data request to the target node so that the target node can act on the target data partition in response to the data request.	06-27-2013
20130166557	UNIQUE VALUE CALCULATION IN PARTITIONED TABLES - An estimation algorithm can generate a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions. The column can be classified as categorical if the uniqueness metric is below a threshold and as non-categorical if the uniqueness metric is above the threshold. A first estimation factor can be assigned to the column if the column is classified as categorical or a larger second estimation factor can be assigned if the column is non-categorical. A cost estimate for system resources required to perform a database operation on the database table can be calculated. The cost estimate can include an estimated total number of distinct values in the column across all of the plurality of data partitions determined using the assigned first estimation factor or second estimation factor and a number of rows in the table as inputs to an estimation function.	06-27-2013
20130166558	METHOD AND SYSTEM FOR CLASSIFYING ARTICLE - The present invention discloses a method and system for classifying articles. The present invention is capable not only of distinguishing the type of an article but also, as a novel feature, of automatically generating an overview article in accordance with an initially prepared keyword combination or set of articles. Furthermore, the overview article comprises a representative topic corresponding to the content of the initially prepared articles, wherein the representative topic is also able to identify the field of the articles. Accordingly, by means of the overview article, the present invention is capable of decreasing the time required to understand the spirit and the technical aspects of the articles, so as to solve the long-standing problem of the prior art.	06-27-2013
20130166559	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - There is provided an information processing apparatus including a transmission section which transmits, to an external device that has collected an index pertaining to a feature of an output of a sensor for each classification, classification information for specifying the classification, and a reception section which receives information about the index corresponding to the classification information.	06-27-2013
20130173621	Clustering Devices In An Internet Of Things ('IoT') - Clustering devices in an Internet of Things (‘IoT’), including: receiving, by a device clustering module, a characteristic set for a device, wherein the characteristic set specifies one or more device attributes and an attribute value for each device attribute; clustering, by the device clustering module, the device into an attribute level cluster based on the one or more device attributes specified in the characteristic set for the device; and clustering, by the device clustering module, the device into a value level cluster based on the attribute value for each device attribute, wherein the value level cluster is a subset of the attribute level cluster.	07-04-2013
20130173622	SYSTEM AND METHOD FOR PROVIDING KEYWORD INFORMATION - A system and method for providing keyword information are provided. The method includes receiving, from a device, first keywords and first keyword information corresponding to the first keywords; classifying the first keywords and the first keyword information; receiving content from the device and extracting second keywords from the received content based on the classified first keyword information; and providing, to the device, second keyword information matching the extracted second keywords.	07-04-2013
20130173623	Method for Extracting Data from a Vision - Method for extracting data from a vision database (	07-04-2013
20130179448	Linking Single System Synchronous Inter-Domain Transaction Activity - An approach is provided to correlate transaction data occurring at two different domains running on a common operating system image without using static, or common, correlators. Request-type event records are collected at a first domain within the operating system image, with each of the request-type event records including execution identifiers and a unique token that indicates the order in which the corresponding request-type event occurred on the first domain. Similarly, response-type event records are collected at a second domain within the operating system image. The request-type event records are matched with the response-type event records based on the execution identifiers and an overall order that is indicated by unique tokens included in the records. The matching of request-type event records with response-type event records indicates a number of inter-domain transactions, which are recorded in a correlation data store.	07-11-2013
20130179449	DETECTING OVERLAPPING CLUSTERS - A technique for identifying overlapping clusters of items in a data set. The technique may be used in connection with a social network or other on-line environment in which users express approval for other users, such as through votes, tags or other inputs. These expressions of approval may be used to form clusters such that entities assigned to a cluster have a higher metric of approval from other entities within the cluster than from outside the cluster. Such clusters may be arrived at through a computationally efficient approach that involves randomly selecting one or more entities as a seed for a cluster. The cluster may be grown by testing other entities, similar to those already in the cluster, to determine whether they are more preferred by those already in the cluster than those outside the cluster. Once a cluster is grown to a desired size, it may be pruned.	07-11-2013
20130179450	CONTENT ANALYTICS SYSTEM CONFIGURED TO SUPPORT MULTIPLE TENANTS - Techniques are disclosed for a software as a service (SaaS) provider to host a content analytics tool used to evaluate data collections for multiple customers (referred to as tenants) using one dedicated and expandable computing infrastructure, without requiring that the service provider obtain, install, license, and manage a separate copy of the content analytics tools for each tenant. Customers are provided access to resources dedicated to their enterprise, but do not have access, or even awareness, of data collections or analytics resources hosted for other customers. That is, embodiments presented herein allow a provider to host content analytics tools used by customers to evaluate their enterprise data in a secure and timely manner.	07-11-2013
20130185300	DIVIDING DEVICE, DIVIDING METHOD, AND RECORDING MEDIUM - A dividing device includes: a memory configured to store a program including a procedure; and a processor configured to execute the program, the procedure including: extracting correlation information from source code of software, the information correlating an originating entity and a receiving entity of each relationship, the relationships being identified by dependent relationships among a group of entities, which is the group of elements that structure the software; and dividing the group of entities into clusters so that each cluster includes many of the dependent relationships whose weights are large, based on the weights related to the dependent relationships identified by the extracted correlation information.	07-18-2013
20130185301	INSERTING DATA INTO AN IN-MEMORY DISTRIBUTED NODAL DATABASE - A database loader loads data to an in-memory database across multiple nodes in a parallel computing system. The database loader uses SQL flags, historical information gained from monitoring prior query execution times and patterns, and node and network configuration to determine how to effectively cluster data attributes across multiple nodes. The database loader may also allow a system administrator to force placement of database structures in particular nodes.	07-18-2013
20130185302	INSERTING DATA INTO AN IN-MEMORY DISTRIBUTED NODAL DATABASE - A database loader loads data to an in-memory database across multiple nodes in a parallel computing system. The database loader uses SQL flags, historical information gained from monitoring prior query execution times and patterns, and node and network configuration to determine how to effectively cluster data attributes across multiple nodes. The database loader may also allow a system administrator to force placement of database structures in particular nodes.	07-18-2013
20130185303	CONCEPTUAL WORLD REPRESENTATION NATURAL LANGUAGE UNDERSTANDING SYSTEM AND METHOD - A Natural Language Understanding system is provided for indexing of free text documents. The system according to the invention utilizes typographical and functional segmentation of text to identify those portions of free text that carry meaning. The system then uses words and multi-word terms and phrases identified in the free text to identify concepts in the free text. The system uses a lexicon of terms linked to a formal ontology that is independent of a specific language to extract concepts from the free text based on the words and multi-word terms in the free text. The formal ontology contains both language independent domain knowledge concepts and language dependent linguistic concepts that govern the relationships between concepts and contain the rules about how language works. The system according to the current invention may preferably be used to index medical documents and assign codes from independent coding systems, such as SNOMED, ICD-9 and ICD-10. The system according to the current invention may also preferably make use of syntactic parsing to improve the efficiency of the method.	07-18-2013
20130191388	POPULATION AND/OR ANIMATION OF SPATIAL VISUALIZATION(S) - One or more techniques and/or systems are provided for populating and/or animating a spatial visualization, such as a map, a timeline, and/or other 2D and/or 3D visual representations of locations. The spatial visualization may be populated with events extracted from a data source (e.g., real-time events, news events, social network events, etc.), and may include relationships between events (e.g., based upon time, location, contextual similarity (e.g., social network check-in events at a restaurant), events referencing one another (e.g., an article describing a first event may comprise a hyperlink to an article describing a second event) etc.). Filter criteria (e.g., date, event type, location, etc.) may be applied to events and/or relationships when populating the spatial visualization. A sequence of events and corresponding relationships may be animated within the spatial visualization (e.g., as the events unfold over a (user) designated period of time).	07-25-2013
20130191389	Paragraph Property Detection and Style Reconstruction Engine - Embodiments of the present disclosure provide for analyzing paragraphs in a fixed format document to determine style clusters or groupings of each paragraph. In certain embodiments, the paragraphs are grouped into style clusters based on a first property. Each style cluster is then further divided into sub-groups based on a second property. Once the sub-groups have been determined, a third property associated with each paragraph in each sub-group is normalized based on a dominant one of the at least the third property.	07-25-2013
20130191390	Automatic Identification of Abstract Online Groups - Online abstract groups, in which members aren't explicitly connected, can be automatically identified by computer-implemented methods. The methods involve harvesting records from social media and extracting content-based and structure-based features from each record. Each record includes a social-media posting and is associated with one or more entities. Each feature is stored on a data storage device and includes a computer-readable representation of an attribute of one or more records. The methods further involve grouping records into record groups according to the features of each record. Further still the methods involve calculating an n-dimensional surface representing each record group and defining an outlier as a record having feature-based distances measured from every n-dimensional surface that exceed a threshold value. Each of the n-dimensional surfaces is described by a footprint that characterizes the respective record group as an online abstract group.	07-25-2013
20130191391	PERSONALIZATION ENGINE FOR BUILDING A DYNAMIC CLASSIFICATION DICTIONARY - A dynamic classification dictionary is built for use in profiling and targeting users for additional relevant content. Behavioral data is gathered from user activity, and user documents and actions are categorized. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to the manner in which the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun. The taxonomic nouns are aggregated into a composite set of taxonomic nouns, and the dynamic classification dictionary is built by storing the composite set of taxonomic nouns.	07-25-2013
20130191392	ADVANCED SUMMARIZATION BASED ON INTENTS - A method for summarizing content using weighted Formal Concept Analysis (wFCA) is provided. The method includes (i) identifying, by a processor, one or more keywords in the content based on parts of speech, (ii) disambiguating, by the processor, at least one ambiguous keyword from the one or more keywords using the wFCA, (iii) identifying, by the processor, an association between the one or more keywords and at least one sentence in the content, and (iv) generating, by the processor, a summary of the content based on the association.	07-25-2013
20130191393	EMOTIONAL MATCHING SYSTEM AND MATCHING METHOD FOR LINKING IDEAL MATES - The disclosure relates to an emotional matching system and a matching method for linking ideal mates by finding optimal ideal mates by identifying the level of attraction to the appearance of the other member and tendencies toward music, movies, and food through an emotional-based approach using brainwave changes.	07-25-2013
20130198185	ATTRIBUTE-BASED IDENTIFICATION SCHEMES FOR OBJECTS IN INTERNET OF THINGS - Methods and arrangements for object identification. An identification request is received from different objects of a network. Attributes and values of each object are ascertained, and at least one attribute-value pair from each object is filtered out. An ID is generated for each object based on at least one remaining attribute-value pair from the filtering.	08-01-2013
20130198186	DETERMINATION OF RELATIONSHIPS BETWEEN COLLECTIONS OF DISPARATE MEDIA TYPES - Architecture that automatically determines relationships between vector spaces of disparate media types, and outputs ranker signals based on these relationships, all in a single process. The architecture improves search result relevance by simultaneously clustering queries and documents, and enables the training of a model for creating one or more ranker signals using simultaneous clustering of queries and documents in their respective spaces.	08-01-2013
20130198187	Classifying Data Using Machine Learning - Techniques for data classification include receiving, at a local computing system, a query from a remote computing system, the query comprising data associated with a commodity, the data comprising one or more attributes of the commodity; matching the one or more attributes of the commodity with one or more terms of a plurality of terms in a word matrix that includes a plurality of nodes that each include a term of the plurality of terms and a plurality of links that each connect two or more nodes and define a similarity between the two or more nodes; generating, based on the matching, a numerical vector for the business enterprise commodity; identifying one or more classification regions that each define a classification of the commodity; and preparing the classifications for display at the remote computing system.	08-01-2013
20130198188	Apparatus and Methods For Anonymizing a Data Set - Methods and systems are disclosed for anonymizing a dataset that correlates a set of entities with respective attributes. The method comprises determining clusters of similar entities. Determining the clusters comprises (1) partitioning the entities into a first group with similar attributes to one another and a complement group of entities with similar attributes to one another and (2) recursively repeating the partitioning on the groups until every group meets one or more criteria. Partitioning a group comprises choosing a reference entity from the group, determining a symmetric set of attributes based on the reference entity attributes and on an average of the group's attributes, and assigning each entity to the first or second group depending on whether its attributes are more similar to those of the reference user or to those of the symmetric set.	08-01-2013
20130198189	MEDIUM STORING BUDGET DETERMINATION SUPPORT PROGRAM, BUDGET DETERMINATION SUPPORT METHOD, AND BUDGET DETERMINATION SUPPORT APPARATUS - A computer-readable recording medium storing a program for causing a computer to execute a procedure that supports budget determination, the procedure includes: receiving designation of a group that makes up an organization having a hierarchical structure; selecting, between a lower level or a higher level adjacent to a level of the group for which the designation has been received, a level to be referred to when a budget of the group for which the designation has been received is corrected; and correcting the budget relating to the designated group using, among budgets of groups at levels of the organization, a budget of a group belonging to the level to be referred to.	08-01-2013
20130198190	SELECTION AND OPERATIONS ON AXES OF COMPUTER-READABLE FILES AND GROUPS OF AXES THEREOF - An embodiment of the present invention provides a method of applying a set function on documents in an axis-based interface, the method comprising grouping a plurality of documents in a plurality of axes of documents, the documents from each axis of documents having commonality, grouping a plurality of axes of documents in a group of axes of documents, the documents from the group of axes of documents being disposed along a collation function, wherein at least some of the axes of documents are adapted to be used by a set function adapted to mathematically collectively manipulate documents thereof. Groups of axes of documents are also adapted to be used by a set function adapted to mathematically collectively manipulate documents thereof.	08-01-2013
20130198191	METHOD FOR DETECTING COMMUNITIES IN MASSIVE SOCIAL NETWORKS BY MEANS OF AN AGGLOMERATIVE APPROACH - Disclosed is a method for detecting communities in massive social networks by means of an agglomerative approach in which core communities are built and gradually clustered in an iterative manner into higher level communities until the algorithm converges (a stop condition is met), whereby it becomes possible to easily trace how the communities are being formed, resulting in an easily explainable model that allows the detection of overlapping communities. The disclosed method starts from data representing social interactions between individuals, building a weighted social graph where the vertices represent individuals and the links represent social relationships between individuals.	08-01-2013
20130204874	Hyper Adapter and Method for Accessing Documents in a Document Base - The present invention relates to a hyper adapter (	08-08-2013
20130204875	Automatic Configuration Of A Product Data Management System - The invention relates to a computer-implemented configuration system, a method of configuration and a computer program product. The configuration system serves to automatically configure a product data management system (P). The product data management system (P) serves to manage parts data sets (	08-08-2013
20130212103	RECORD LINKAGE BASED ON A TRAINED BLOCKING SCHEME - Some implementations disclosed herein provide techniques and arrangements to train a blocking scheme using both labeled data and unlabeled data. For example, training the blocking scheme may include iteratively: learning a conjunction, identifying first matches in the labeled data and the unlabeled data that are uncovered by the conjunction, and identifying second matches in the labeled data and the unlabeled data that are covered by the conjunction. The conjunctions learned in each iteration may be combined using a disjunction. A search engine may use the blocking scheme when searching for records that match an entity.	08-15-2013
20130212104	SYSTEM AND METHOD FOR DOCUMENT ANALYSIS, PROCESSING AND INFORMATION EXTRACTION - The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system stores a number of diffusion coordinates, wherein the number is linear in proportion to N.	08-15-2013
20130212105	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - There is provided an information processing apparatus including a cluster information acquiring unit that acquires information of clusters into which users and items are classified, based on item use logs of the users, an item score calculating unit that calculates scores of the items with respect to the users, based on first scores showing attributions of the users with respect to the clusters and second scores being set for the respective clusters and showing attributions of the items with respect to the clusters, which are included in the information of the clusters, and an item selecting unit that selects at least one item from the items according to the scores of the items.	08-15-2013
20130212106	APPARATUS FOR CLUSTERING A PLURALITY OF DOCUMENTS - According to an aspect, there are provided an apparatus, a program for causing a computer to function as such an apparatus, and a method, wherein the apparatus includes a selection section for selecting a plurality of sample documents from a plurality of documents and a first parameter generation section for analyzing the plurality of sample documents to generate an initial parameter matrix expressing a probability that each of a plurality of words included in the plurality of sample documents is included in each of a plurality of topics. The apparatus also includes a second parameter generation section for analyzing the plurality of documents by using each value included in the initial parameter matrix as an initial value to generate a parameter matrix expressing a probability that each of a plurality of words included in the plurality of documents is included in each of a plurality of topics.	08-15-2013
20130218893	EXECUTING IN-DATABASE DATA MINING PROCESSES - Various embodiments of systems and methods for executing in-database data mining processes are described herein. In one aspect, the method includes identifying a newly created chain comprising a plurality of components connected together to perform a data mining task, generating an identifier (ID) for the newly created chain, identifying metadata associated with the chain, and storing the ID and the metadata related to the newly created chain into a repository. Each component comprises a parameterized script including one or more parameters. Values of the parameters are stored in the repository. The parameters within the scripts are replaced by their corresponding values and the components of the chain are executed sequentially to generate a final output.	08-22-2013
20130218894	METHOD, APPARATUS, SYSTEM AND INTERFACE FOR GROUPING ARRAYS OF COMPUTER-READABLE FILES - A method is presented to combine a plurality of arrays of computer-readable files along a common collation function. The arrays can be embodied as axes of documents disposed along a timeline. Such a combination creates a group of axes of documents, improving the graphical interactions between two groups of documents. An interface, a computerized system and a method for enabling same are also presented.	08-22-2013
20130218895	Method and Apparatuses for Selectively Accessing Data Elements in a Data Library - For selectively accessing data elements in a data library, an image data object which can be shown on a display is provided. A plurality of image subobjects are defined which respectively correspond to a portion of a representation of the image data object. The data elements are associated with at least a respective one of the image subobjects. One of the image subobjects is selected from a representation of the image data object on the display. The data elements which are associated with the selected image subobject are then reproduced. A data object which comprises the data elements in the data library, the image data object and the association between the data elements and the image subobjects is transmitted to a communication terminal via a telecommunication network for the purpose of selectively accessing the data elements.	08-22-2013
20130226920	Systems, Methods and Apparatus for Identifying Links among Interactional Digital Data - The invention provides in some aspects methods of digital data processor-based analysis of digital data that represent interactions to identify distinct individuals and/or the entities with which they are affiliated (e.g., households, businesses, social or other groups) involved in those interactions. The methods can be employed, for example, to analyze digital data representing retail purchase, marketing and visitor interactions for tracking and/or reporting purposes.	08-29-2013
20130226921	IDENTIFYING AN AUTO-COMPLETE COMMUNICATION PATTERN - A method for identifying an auto-complete communication pattern within a sequence of request entities includes grouping the request entities into a plurality of clusters according to a criterion. Clusters are removed from the plurality according to at least one of pattern analysis, a cluster size, and a cluster timing. Remaining clusters are identified as having an auto-complete communication pattern.	08-29-2013
20130226922	Identification of Complementary Data Objects - In one aspect, the description relates to identifying complementary data objects, including providing a plurality of data objects, applying a clustering algorithm for grouping at least some of the data objects into two or more clusters, for each of the clusters, calculating a cluster center, calculating, for at least a first one of the cluster centers, a complementary cluster center, determining a second cluster center of a second cluster, the second cluster center being determined as the one of the cluster centers having the smallest distance in respect to the complementary cluster center, selecting at least one data object of the determined second cluster. Other features and aspects may be realized, depending upon the particular application.	08-29-2013
20130226923	Method and Device for Reassembling a Data File - Embodiments provide a method for reassembling a data file from a starting file fragment and a plurality of file fragments stored on a digital storage device. The method includes determining, from the plurality of file fragments, one or more matched file fragments which match the starting file fragment based on a first predetermined criterion; associating the one or more matched file fragments with the starting file fragment; and determining one or more candidate data files based on the one or more matched file fragments. The method further includes checking whether more than one file fragment has been determined to match the starting file fragment based on the first predetermined criterion. If more than one matched file fragment has been determined to match the starting file fragment based on the first predetermined criterion, the method further includes selecting a candidate data file from the candidate data files determined for the matched file fragments as the reassembled data file based on a second predetermined criterion.	08-29-2013
20130226924	SYSTEM AND METHOD TO DETERMINE THE VALIDITY OF AN INTERACTION ON A NETWORK - A computer implemented method can determine validity of web-based interactions. Web-based interaction data relating to a web-based interaction may be accessed. The web-based interaction data may include aggregate measure data that may include a number of unique queries per web-based session. The validity of the web-based interaction may be determined based on the aggregate measure data.	08-29-2013
20130232146	SCALE BETWEEN CONSUMER SYSTEMS AND PRODUCER SYSTEMS OF RESOURCE MONITORING DATA - A consumer system receives capabilities metadata from a producer system that includes resource class metrics for a resource class included in the producer system. Next, the consumer system creates a rule that corresponds to one of the consumer system's managed entities. The rule includes one or more prescriptions that reference the resource class metrics and specify a periodicity, which informs the producer system as to a time interval for which to send prescription results that include metric information pertaining to the resource class metrics. The consumer system sends the rule to the producer system and, in turn, the consumer system receives the prescription results from the producer system at the specified periodicity and applies the metric information to the managed entity.	09-05-2013
20130232147	GENERATING A TAXONOMY FROM UNSTRUCTURED INFORMATION - At least one term is extracted [	09-05-2013
20130238621	Entity Augmentation Service from Latent Relational Data - The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is performed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.	09-12-2013
20130238622	USER APPARATUS, SYSTEM AND METHOD FOR DYNAMICALLY RECLASSIFYING AND RETRIEVING TARGET INFORMATION OBJECT - A system, method and user apparatus dynamically reclassify and retrieve target information object(s) among multiple information objects stored on a memory. Multiple attribute classifiers correspond to the information objects. Displayable dynamical reclassifying hints (DRHs) are provided according to user input signal(s). When a first attribute classifier is determined by a central processing unit according to the user input signal, second attribute classifier(s) are determined and combined with one of the attribute classifiers visibly on a display unit, wherein the second attribute classifier and the combined attribute classifier correspond to the same one(s) of the information objects. The DRH(s) combine the attribute classifiers that yield the same search results, so as to eliminate possible repeated steps or processes that lead to the same search result(s), and also to reduce the remaining selectable attribute classifiers and the subsequent steps needed to retrieve the target information objects.	09-12-2013
20130238623	Method of Linking Electronic Database Records - A method of linking electronic database records, wherein each record is associated with a single member from a set of unique members, and wherein each member from the set of unique members has associated with it a plurality of identifiers for uniquely identifying the member. The method comprises the following steps. First, a set of combinations of identifiers is obtained by, for each record, determining a plurality of identifiers using the data stored in the record. Next, a set of clusters of linked combinations of identifiers is created by creating a link between any combinations that have equal identifiers. For each cluster from the set of clusters, a quality value for the cluster is calculated, and any cluster whose quality value is below a pre-determined threshold is split into two or more clusters by removing one or more links between combinations in the original cluster, provided that the resulting clusters have higher quality values than the original cluster. Finally, any records whose corresponding combinations of identifiers are members of the same cluster are linked.	09-12-2013
20130238624	SEARCH SYSTEM AND OPERATING METHOD THEREOF - A search system performs a method for conducting a search using a user device. The method includes receiving a search word through the user device. The method also includes displaying a homonym list indicating homonyms of the input search word on a display screen of the user device. The method further includes receiving a selection of one homonym item of the homonym list through the user device. The method still further includes conducting the search through an information search system using the selected homonym item as a search word.	09-12-2013
20130238625	COMPUTER SYSTEM AND METHOD FOR DE-IDENTIFICATION OF PATIENT AND/OR INDIVIDUAL HEALTH AND/OR MEDICAL RELATED INFORMATION, SUCH AS PATIENT MICRO-DATA - A computer-implemented method de-identifies data collected for patients. In at least one embodiment, the method comprises the sequential, non-sequential and/or sequence independent steps of providing information representative of at least one patient, at least one medical characteristic associated with at least one patient thereto, and a geographic area of the at least one patient, and providing at least one organizational structure for organizing medical characteristics. The method also includes associating the at least one organizational structure with at least one geographical area and at least one medical characteristic, and aggregating, in the at least one organizational structure, said information by medical characteristic and the at least one geographic area therein. Various alternative embodiments are additionally disclosed.	09-12-2013
20130238626	SYSTEMS AND METHODS FOR CLUSTER COMPARISON - Systems and methods for measuring similarity between a first set of clusters and a second set of clusters apply a first clustering procedure and a second clustering procedure to a set of objects to cluster the objects into a first set of clusters and a second set of clusters, respectively, calculate a similarity index between the first set of clusters and the second set of clusters, calculate an expected value of the similarity index, wherein the expected value is a value of the similarity index one would expect to obtain, on average, between a randomly generated third set of clusters and a randomly generated fourth set of clusters with a same number of clusters as the first set of clusters and the second set of clusters, respectively, and adjust the calculated similarity index based on the expected value of the similarity index.	09-12-2013
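The chance adjustment described in the entry above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the Rand index is used as the similarity index, the expected value is estimated by Monte Carlo sampling of random labelings, and the adjustment uses the common form (observed - expected) / (1 - expected). The application itself may define these differently.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def expected_index(n, k1, k2, trials=200, seed=0):
    """Monte Carlo estimate of the index between two random labelings.

    Labels are drawn uniformly from k1 and k2 clusters; a random draw may
    leave a cluster empty, so the cluster counts are only approximately matched.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = [rng.randrange(k1) for _ in range(n)]
        b = [rng.randrange(k2) for _ in range(n)]
        total += rand_index(a, b)
    return total / trials


def adjusted_index(a, b, trials=200, seed=0):
    """Rescale the observed index so that chance-level agreement maps to about 0."""
    observed = rand_index(a, b)
    expected = expected_index(len(a), len(set(a)), len(set(b)), trials, seed)
    return (observed - expected) / (1.0 - expected)


same = adjusted_index([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])
mixed = adjusted_index([0, 0, 0, 1, 1, 1], [0, 1, 0, 1, 0, 1])
```

Two labelings that induce the same partition score 1 after adjustment regardless of how the clusters are named, while unrelated labelings score lower, near 0 on average.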
20130246422	SYSTEM AND METHOD FOR CLUSTERING HOST INVENTORIES - A method in one example implementation includes obtaining a plurality of host file inventories corresponding respectively to a plurality of hosts, calculating input data using the plurality of host file inventories, and then providing the input data to a clustering procedure to group the plurality of hosts into one or more clusters of hosts. The method further includes each cluster of hosts being grouped using predetermined similarity criteria. In more specific embodiments, each of the host file inventories includes a set of one or more file identifiers with each file identifier representing a different executable software file on a corresponding one of the plurality of hosts. In other more specific embodiments, calculating the input data includes transforming the host file inventories into a matrix of keyword vectors in Euclidean space. In further embodiments, calculating the input data includes transforming the host file inventories into a similarity matrix.	09-19-2013
20130246423	SYSTEM AND METHOD FOR SELECTIVELY GROUPING AND MANAGING PROGRAM FILES - A method in one embodiment includes determining a frequency range corresponding to a subset of a plurality of program files on a plurality of hosts in a network environment. The method also includes generating a first set of counts including a first count that represents an aggregate amount of program files in a first grouping of one or more program files of the subset, where each of the one or more program files of the first grouping includes a first value of a primary attribute. In specific embodiments, each program file is unknown. In further embodiments, the primary attribute is one of a plurality of file attributes provided in file metadata. Other specific embodiments include either blocking or allowing execution of each of the program files of the first grouping. More specific embodiments include determining a unique identifier corresponding to at least one program file of the first grouping.	09-19-2013
20130246424	SYSTEM AND METHOD FOR INTELLIGENT STATE MANAGEMENT - A method is provided in one example embodiment and it includes receiving a state request and determining whether a state exists in a translation dictionary for the state request. The method further includes reproducing the state if it is not in the dictionary and adding a new state to the dictionary. In more specific embodiments, the method includes compiling a rule, based on the state, into a given state table. The rule affects data management for one or more documents that satisfy the rule. In yet other embodiments, the method includes determining that the state represents a final state such that a descriptor is added to the state. In one example, if the state is not referenced in the algorithm, then the state is released. If the state is referenced in the algorithm, then the state is replaced with the new state.	09-19-2013
20130246425	DOCUMENT VISUALIZATION SYSTEM - A document visualization system comprises an extraction unit (	09-19-2013
20130246426	Document Classification and Characterization - Data is received that characterizes each of a plurality of documents within a document set. Based on this data, the plurality of documents are grouped into a plurality of stacks using one or more grouping algorithms. A prime document is identified for each stack that includes attributes representative of the entire stack. Subsequently, data characterizing documents for each stack, including at least the identified prime document, is provided to at least one human reviewer. User-generated input that categorizes each provided document is later received from the human reviewer, and data characterizing the user-generated input can then be provided. Related apparatus, systems, techniques and articles are also described.	09-19-2013
20130246427	INFORMATION PROCESSING APPARATUS, COMPUTER-READABLE RECORDING MEDIUM, AND INFORMATION PROCESSING METHOD - An information processing apparatus includes a first storage device that stores in a first storage area a first data group that includes a first plurality of data to be processed successively, and stores in a second storage area a second data group that includes a second plurality of data to be processed successively, a second storage device that includes a third storage area that stores a command to access data stored in the first storage area and a fourth storage area that stores a command to access data stored in the second storage area, and a processor configured to store, in a corresponding storage area of the third storage area and the fourth storage area, a received command, select one of the third storage area and the fourth storage area, and process one or more command stored in a selected storage area.	09-19-2013
20130246428	SYSTEM FOR GENERATING A TABLE - The present invention relates to a system for generating a table comprising generating means for generating a table which contains at least a column or line depicting one or more first categories and at least a column or line depicting first values associated with said first categories and wherein the system further comprises selecting means for selecting one of said first categories by a user and adding means for enlarging the table upon selection of a category by said selecting means, said adding means being adapted to enlarge the table by adding a new column or line which comprises second categories into which said selected first category may be subdivided as well as second values associated with said second categories and wherein said new column or line does not comprise categories into which non selected first categories may be subdivided.	09-19-2013
20130254200	Collection and Categorization of Configuration Data - In an embodiment, a method is provided for collecting configuration data. In this example, configuration data associated with an application is searched. Additionally, metadata associated with the configuration data is searched. Changes made to the configuration data are detected, and the changes and associated metadata are stored in a storage device. The changes are then categorized based on the metadata.	09-26-2013
20130254201	DATA STRUCTURE, DATA STRUCTURE GENERATION METHOD, INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND COMPUTER-READABLE STORAGE MEDIUM HAVING STORED THEREIN INFORMATION PROCESSING PROGRAM - A method for generating a tree-type data structure composed of a plurality of data strings includes the steps of: summing, with respect to a plurality of data strings classified in a parent node, the numbers of data types of data, respectively, at at least one given string position in each of the plurality of data strings; and classifying, based on the numbers of the data types respectively summed at the at least one given string position in the summing step, the plurality of data strings into a plurality of child nodes, for the respective data types at a given string position.	09-26-2013
20130254202	PARALLELIZATION OF SYNTHETIC EVENTS WITH GENETIC SURPRISAL DATA REPRESENTING A GENETIC SEQUENCE OF AN ORGANISM - A method, system, and computer program product for parallelization of updating synthetic events with genetic surprisal data comprising dividing the synthetic event into cohort parts and assigning the cohort parts to one of a plurality of computer processing elements. Within each processing element: searching data records of patients for genetic surprisal data; generating a cluster comprising a centroid by populating the cluster based on all of the matches of the data records; calculating a new centroid for each cluster; calculating a Euclidean distance in multiple dimensions for each match of data records to the new centroid for each cluster; reassigning each match of data to the new centroid of each cluster based on the shortest calculated Euclidean distance to the new centroid for each cluster; and determining at least one cohort part from the clusters and recombining the cohort parts into updated synthetic events based on the metadata.	09-26-2013
20130254203	ORGANIZING NEARBY PICTURE HOTSPOTS - A method of accessing an image database containing location data and determining one or more clusters of the digital images based on their location data. A hotspot location is determined for representing the cluster of the digital images and the results are stored for later access. The computer is connected to a network and receives data from a device including data identifying a current location. After determining that the device is within a selected notification distance from the hotspot location, a notification is transmitted over the network.	09-26-2013
20130254204	Method and Apparatus of Publishing Information - The present disclosure discloses a method and an apparatus of publishing information in order to solve the problems of low efficiency and accuracy of published information in existing technology. The method segments primary information of a current page, extracts at least one feature term from the current page, determines a number of times that the extracted feature term appears in the current page, determines a category of the current page based on the determined number of times that the feature term appears in the current page and a set category model, and publishes relevant information that belongs to the determined category in the current page. By directly extracting a feature term from a current page and determining a category of the current page based on a number of times that the feature term appears in the current page and a set category model, the exemplary embodiments do not need to perform manual labeling for the current page. As such, the efficiency of information publication can be improved. Furthermore, the accuracy of the information publication is increased because no human error is introduced.	09-26-2013
20130254205	PROCESSING USER PROFILES OF USERS IN AN ELECTRONIC COMMUNITY - A method and system for processing user profiles of a plurality of users in an electronic community. Noun phrases are extracted from activities of each user logged in an activity log server, each user having an existing user profile stored in a user profile and relationship database that is external to the activity log server. The existing user profiles in the user profile and relationship database are updated from the extracted noun phrases, a keyword being associated with each determined noun phrase and being within a semantic hierarchical tree, the updating based on a usage frequency of the extracted noun phrases and an importance value of the keywords.	09-26-2013
20130262465	FULL AND SEMI-BATCH CLUSTERING - A method for clustering documents is provided. Each document is represented by a multidimensional data point. The data points are initially assigned to a respective cluster and serve as their initial representative points. Thereafter, in an iterative process, the data points are clustered among the clusters, by assigning the data points to the clusters based on a comparison measure of each data point with the cluster or its representative point, and a threshold of the comparison measure. Based on this clustering, a new representative point for each of the clusters can be computed. Optionally, overlapping clusters are merged. For the next iteration, the new representative points are used as the representative points. An assignment of the documents to the clusters is output, based on a clustering of the data points in the latest iteration. Multiple batches may be processed, retaining the initial clusters to which the original batch was assigned.	10-03-2013
20130262466	GROUP WORK SUPPORT METHOD - A method of supporting a group work includes collecting input contents, which are input by participants via terminals, from the terminals; arranging the collected input contents into input content groups based on groups to which the participants belong; setting a representative flag on each of representative input contents selected from the respective input content groups; extracting the representative input contents and matching input contents, which match a predetermined extracting condition and are different from the representative input contents, from the collected input contents; and displaying a list of the representative input contents and the matching input contents on a display device that all the participants are able to view at a same time.	10-03-2013
20130262467	METHOD AND APPARATUS FOR PROVIDING TOKEN-BASED CLASSIFICATION OF DEVICE INFORMATION - An approach is provided for providing token-based classification of device information. A token management platform determines a plurality of tokens. The tokens include at least in part one or more keywords, one or more representative media items, or a combination thereof. The token management platform processes and/or facilitates a processing of a communication history, one or more personal information sources, or a combination thereof associated with a user to determine one or more frequency counts of respective one or more of the tokens. The token management platform then determines to cause, at least in part, a generation of recommendation information based, at least in part, on the one or more frequency counts of the respective one or more tokens.	10-03-2013
20130268531	Finding Data in Connected Corpuses Using Examples - In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.	10-10-2013
20130268532	Clustered Information Processing and Searching with Structured-Unstructured Database Bridge - Systems and methods for indexing information and for performing searches are disclosed. In these systems and methods information is “ingested” into the system by clustering the information using a clustering algorithm such as k-means or k-medoids clustering. During the clustering process, a hybrid distance measurement is used that allows the systems and methods to determine similarity across a number of different types of information. Once the information is clustered, it is stored and “mirrored” both in a structured (e.g., relational) data repository and in an unstructured data repository. Methods according to the invention allow the retrieval of both direct search results and search results including related concepts. After clustered information is stored, future searches can be performed by searching the stored results in whichever data repository is most appropriate for the context.	10-10-2013
20130275429	SYSTEM AND METHOD FOR ENABLING CONTEXTUAL RECOMMENDATIONS AND COLLABORATION WITHIN CONTENT - A system for enabling contextual recommendations and collaboration recommendations, based on a user's current work, comprising a plurality of content collector software applications adapted to interface with a plurality of content management applications, an indexing engine software application, an expanded social network graph database, and a predictive content intelligence software application. The plurality of content collector software applications receive documents, document fragments, or other content objects from the plurality of content management applications, the indexing engine software application indexes the retrieved documents, document fragments, or other content objects and modifies the expanded social network graph database using results of the indexing, and the predictive content intelligence software application, using at least the results of the indexing and the expanded social network graph database, identifies at least a plurality of other content objects and a plurality of people that are relevant to the received documents, document fragments, or other content objects.	10-17-2013
20130275430	System and Method for Visually Representing Data - A computer-based system and method evaluates at least one data set, generates groupings based on the data set, and visually represents these groupings as a basis by which individual elements of the data set are evaluated.	10-17-2013
20130275431	VISUAL CLUSTERING METHOD - A method of visually clustering a database of component parts includes indexing a set of image data associated with the component parts, and storing the indexed image data in the database. The indexed image data is clustered dependent upon a visual criterion.	10-17-2013
20130282720	MECHANISM FOR FACILITATING EVALUATION OF DATA TYPES FOR DYNAMIC LIGHTWEIGHT OBJECTS IN AN ON-DEMAND SERVICES ENVIRONMENT - In accordance with embodiments, there are provided mechanisms and methods for facilitating evaluation of data types for dynamic lightweight objects in an on-demand services environment. In one embodiment and by way of example, a method includes uploading a data file having data at a first computing device in response to a request, and detecting data types relating to the data within the data file. The detecting includes scanning data rows and data columns of the data file. The method may further include classifying the detected data types into one or more categories, and creating one or more dynamic objects based on the one or more categories.	10-24-2013
20130282721	DISCRIMINATIVE CLASSIFICATION USING INDEX-BASED RANKING OF LARGE MULTIMEDIA ARCHIVES - Devices, systems, and methods of performing feature detection on a set of multimedia files are disclosed. One method of organization includes identifying a feature from each multimedia file within the set of multimedia files wherein each file has one feature, organizing the features based on their similarities wherein similar features are grouped based upon a proximity in a feature space and a representative feature is identified for each group, receiving a detection model having one or more detection criteria the detection model having previously been trained for detection using the organized features, and using the representative features to apply the detection model in a decreasing order of detection probability in order to detect the files satisfying the detection criteria within the set of multimedia files.	10-24-2013
20130282722	CLASSIFICATION OF DIGITAL CONTENT BY USING AGGREGATE SCORING - Aggregate scoring is used to help classify digital content such as content uploaded to multi-user websites (e.g., social networking websites). In one embodiment, specific categories are used that relate to a social implication of content. For example, text, images, audio or other data formats can provide communication perceived to fall into categories such as violent, abusive, rights management, pornographic or other types of communication. The categories are used to provide a raw score to items in various groupings of a site's content. Where items are related to other items such as by organizational, social, legal, data-driven, design methods, or by other principles or definitions, the related items' raw scores are aggregated to achieve a score for a particular grouping of items that reflects, at least in part, scores from two or more of the related items.	10-24-2013
20130282723	Maintaining A Historical Record Of Anonymized User Profile Data By Location For Users In A Mobile Environment - A system and method are provided for maintaining a historical record of anonymized user profile data for mobile device users. In one embodiment, a central system, which includes one or more servers, operates to obtain current locations and user profiles for users of mobile devices. The central system processes the current locations and the user profiles of the users over time to maintain a historical record of anonymized user profile data by location. By anonymizing the user data, privacy of the users of the mobile devices is maintained. The central system may then use the historical record of anonymized user profile data to respond to historical requests. The historical requests may be made by users of the mobile devices, subscribers, and/or third party services.	10-24-2013
20130282724	METHOD AND SYSTEM TO AUTOMATICALLY GENERATE SOFTWARE CODE - There is provided a method to automatically generate software code. The method receives a request for data, queries at least two data sources for the data based on the request, and receives results that include the data, which is populated to at least one data object.	10-24-2013
20130290333	SYSTEM FOR EXTRACTING CUSTOMER FEEDBACK FROM A MICROBLOG SITE - A system for extracting customer feedback from a microblog site includes a retrieval unit coupled to the microblog site to capture microblog updates. A filter unit coupled to the retrieval unit filters the captured microblog updates according to filter criteria that remove non-actionable items from the captured microblog updates. A learning unit coupled to the filter unit prioritizes the filtered microblog updates, and a classification unit coupled to the learning unit classifies the filtered and prioritized microblog updates. An action unit coupled to the classification unit performs appropriate actions based on the classified, filtered and prioritized microblog updates.	10-31-2013
20130290334	MANAGING STORAGE OF DATA ACROSS DISPARATE REPOSITORIES - In a method for managing storage of data across a plurality of disparate repositories, a partitioning strategy for storing the data into a plurality of partitions in at least one of a plurality of disparate repositories is acquired based upon a characteristic of the data. In addition, global metadata that describes the partitioning strategy is acquired and the global metadata is implemented in a plurality of disparate repositories to enable performance of the partitioning strategy in storing the data in the plurality of partitions across the plurality of disparate repositories in a location agnostic manner.	10-31-2013
20130290335	PARTITIONING MANAGEMENT OF SYSTEM RESOURCES ACROSS MULTIPLE USERS - Exemplary embodiments for partitioning management of storage resources in a computing storage environment across multiple users including an existing administrator and an existing non-administrator are provided. A method includes assigning the existing non-administrator a default user resource scope that is more limited than a user resource scope assigned to the existing administrator, associating existing storage resources with a default resource group, creating resource group objects for each new resource group, setting attributes in the resource group objects to define policies for storage resources to be associated with a corresponding new resource group, reassigning existing non-administrator from the default user resource scope to a new user resource scope, reassigning the existing storage resources from the default resource group to the new resource group, and applying the policies and access scopes allowing the existing non-administrator to create new storage resources within a scope of the corresponding new user resource scope.	10-31-2013
20130290336	FLOW LINE DETECTION PROCESS DATA DISTRIBUTION SYSTEM, FLOW LINE DETECTION PROCESS DATA DISTRIBUTION METHOD, AND PROGRAM - There is provided a flow line detection process data distribution system which can distribute data detected from a mobile body to precisely perform flow line detection when flow line detection is realized by distribution processing. An area processing means	10-31-2013
20130297603	MONITORING METHODS AND SYSTEMS FOR DATA CENTERS - A monitoring system includes a database storing configuration information about a plurality of objects in the data center; a first inventory instance that adds a first object to the database, where the first inventory instance classifies the first object based on a set of classification rules to select a set of monitoring rules for the first object based on its classification and add configuration information about the first object to the configuration database; and a first monitoring instance to monitor the first object, the monitoring instance monitoring status of the first object based on respective configuration information in the database; at least one of the first inventory instance and the first monitoring instance identifying a further object functionally connected to the first object, the further objects added to the database by the first or a second inventory instance and monitored by the first or a second monitoring instance.	11-07-2013
20130297604	ELECTRONIC DEVICE AND METHOD FOR CLASSIFICATION OF COMMUNICATION DATA OBJECTS - A method, system and electronic device are provided for classification of data objects such as messages. A number of rule engines, each of which may be associated with a different application or module, are provided on the electronic device. For each data object obtained by the electronic device, matching rule engines are identified, and the data object is processed by the matching rule engines to determine one or more classification values for the data object. The determined classification is stored in association with a data object identifier. Data objects can be subsequently collated according to their classification, or aggregations of data object listings can be collected and displayed in a plurality of views corresponding to the various classifications.	11-07-2013
20130297605	SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR PERFORMING GRAPH COLORING - A system, method, and computer program product are provided for performing graph coloring. In use, a graph with a plurality of vertices is identified. Additionally, the plurality of vertices of the graph is categorized, where the categorizing of the plurality of vertices is optimized.	11-07-2013
20130297606	SYSTEMS AND METHODS FOR DETECTING, IDENTIFYING AND CATEGORIZING INTERMEDIATE NODES - A system and method for obtaining node information from a variety of potential sources and storing the information in a logical repository, and a system and method for identifying and categorizing Intermediate Nodes using a combination of requesting and responding node information.	11-07-2013
20130297607	IDENTIFICATION OF PATTERN SIMILARITIES BY UNSUPERVISED CLUSTER ANALYSIS - A method is provided for unsupervised clustering of data to identify pattern similarities. A clustering algorithm randomly divides the data into k different subsets and measures the similarity between pairs of datapoints within the subsets, assigning a score to the pairs based on similarity, with the greatest similarity giving the highest correlation score. A distribution of the scores is plotted for each k. The highest value of k that has a distribution that remains concentrated near the highest correlation score corresponds to the number of classes having pattern similarities.	11-07-2013
20130297608	CONTENT PRESENTATION DEVICE, CONTENT PRESENTATION TERMINAL, CONTENT PRESENTATION SYSTEM, CONTENT PRESENTATION PROGRAM, AND CONTENT PRESENTATION METHOD - A content presentation device including: an obtaining unit which obtains a preference of a first user, a preference of a second user, content identification information identifying a plurality of content items associated with the second user, and relationship information indicating a relationship between the first user and the second user; a content evaluating unit which evaluates display of each of the content items, based on the relationship information and a preference correlation between the preference of the first user and the preference of the second user, wherein the content evaluating unit gives a high evaluation to a content item associated with the second user, the content item having the preference correlation lower than a certain preference correlation and an affinity score higher than a certain affinity score, and the affinity score indicating strength of the relationship between the first user and the second user.	11-07-2013
20130297609	PROVIDING RECONSTRUCTED DATA BASED ON STORED AGGREGATE DATA IN RESPONSE TO QUERIES FOR UNAVAILABLE DATA - In an embodiment, a method comprises dividing collected data into data clusters based on proximity of the data and adjusting the clusters based on density of data in individual clusters. Based on first data points in a first cluster, a first average point in the first cluster is determined. Based on second data points in a second cluster, a second average point in the second cluster is determined. Aggregate data, comprising the first average point and the second average point, are stored in storage. Upon receiving a request to provide data for a particular coordinate, the reconstructed data point is determined by interpolating between the first average point and the second average point at the particular coordinate. Accordingly, aggregated data may be stored and when a request specifies data that was not actually stored, a reconstructed data point with an approximated data value may be provided as a substitute.	11-07-2013
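The store-averages-then-interpolate idea in the entry above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the claimed method: the data are taken to be (x, y) pairs, clusters are formed by splitting wherever neighbouring x values are more than a hypothetical `max_gap` apart, the density-based adjustment of clusters is omitted, and interpolation is linear.

```python
def cluster_by_gap(points, max_gap):
    """Split (x, y) points, sorted by x, wherever neighbours are more than max_gap apart."""
    clusters = []
    for p in sorted(points):
        if clusters and p[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


def average_point(cluster):
    """Mean x and mean y of a cluster; this is the only data kept in storage."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def reconstruct(averages, x):
    """Linearly interpolate between the two stored averages that bracket x."""
    for (x0, y0), (x1, y1) in zip(averages, averages[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x lies outside the range of the stored averages")


raw = [(0, 10), (2, 14), (10, 30), (12, 34)]
stored = [average_point(c) for c in cluster_by_gap(raw, max_gap=3)]
value_at_6 = reconstruct(stored, 6)
```

Here no raw point exists at x = 6, so the value returned is an approximation built from the two stored cluster averages, which matches the substitution behaviour the abstract describes.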
20130297610	CHANGED FILES LIST WITH TIME BUCKETS FOR EFFICIENT STORAGE MANAGEMENT - There is provided, in a computer processing system, an apparatus for managing object data. The apparatus includes a changed objects manager for creating and managing a changed objects list that at least identifies the objects that have changed based on time of change. The changed objects list is associated with a plurality of time buckets. Each of the plurality of time buckets is associated with a respective date and time period and with object change records for objects having a timestamp falling within the respective date and time period. Each of the object change records is associated with a unique object identifier and the timestamp for a corresponding one of the objects. The timestamp specifies a date and a time corresponding to a latest one of a creation time or a most recent update time for the corresponding one of the objects.	11-07-2013
20130297611	METHOD AND APPARATUS FOR PROVIDING TEMPORAL CONTEXT FOR RECOMMENDING CONTENT FOR CONSUMPTION BY A USER DEVICE - A method for operating a system to provide temporal context for recommending items for consumption by a user device is described. The method comprises maintaining a record of items consumed by the user device or a group of devices within a reference period, together with the time of consumption of each item, and a content descriptor associated with each item. Temporal consumption periods are identified within the reference period, each consumption period spanning the consumption of one or more items with similar content descriptors, and each consumption period is associated with its respective content descriptor. An aggregated list is created of consumption periods recorded over a plurality of reference periods. Clusters of similar consumption periods are identified in the aggregated list, and recurring temporal patterns for user device behaviour are identified in each cluster. A profile is created for each user device based on the clusters and the recurring temporal patterns for each cluster, and this profile is used to provide the temporal context at the user device.	11-07-2013
20130304737	SYSTEM AND METHOD FOR THE CLASSIFICATION OF STORAGE - A classification system executing on one or more computer systems includes a processor and a memory coupled to the processor. The memory includes a discovery engine configured to navigate through non-volatile memory storage to discover an identity and location of one or more files in one or more computer storage systems by tracing the one or more files from file system mount points through file system objects and to disk objects. A classifier is configured to classify the one or more files into a classification category. The one or more files are associated with the classification category and stored in at least one data structure. Methods are also provided.	11-14-2013
20130304738	MANAGING MULTIMEDIA INFORMATION USING DYNAMIC SEMANTIC TABLES - Systems, methods and computer program products manage collections of information using latent semantic analysis. The collections of information may be text based such as collections of documents or non-text data such as audio, image, video or multimedia data. Semantic information groups are created by grouping collections of information according to a degree of relatedness. A system allocates discontiguous node locations of one or more distributed databases to the semantic information groups. The system manages a dynamic semantic table that maps the discontiguous node locations to a semantic virtual table having a contiguous memory space.	11-14-2013
20130304739	COMPUTING SYSTEM WITH DOMAIN INDEPENDENCE ORIENTATION MECHANISM AND METHOD OF OPERATION THEREOF - A computing system includes: a gather module configured to gather a distribution of a class bias score for a feature and across multiple domains; a transformation module, coupled to the gather module, configured to generate a transformation for a characteristic of a domain independence based on the class bias score; and a consolidation module, coupled to the transformation module, configured to compute a domain-independent class-bias score based on the transformation.	11-14-2013
20130304740	CLASSIFYING DATA USING MACHINE LEARNING - Techniques for data classification include matching one or more attributes of a commodity with one or more terms of a plurality of terms in a word matrix; generating, based on the matching, a vector for the commodity; and identifying, based on the vector, one or more classification regions that each define a classification of the commodity.	11-14-2013
20130311467	SYSTEM AND METHOD FOR RESOLVING ENTITY COREFERENCE - A method and a system for coreference resolution are provided. The method includes receiving a set of document clusters, each cluster in the set of document clusters including a set of text documents. Instances of each of a set of candidate named entities are identified in the document clusters. For pairs of the candidate named entities, at least one socio-temporal feature is computed that is based on the similarity of the distributions of identified instances of the respective candidate named entities among the document clusters. A decision for merging the candidate named entities into a common real named entity is based on the socio-temporal features.	11-21-2013
20130311468	Data Model Pattern Updating in a Data Collecting System - A pattern analysing device (	11-21-2013
20130311469	METHOD FOR LINE UP CONTENTS OF MEDIA EQUIPMENT, AND APPARATUS THEREOF - A content arranging method and apparatus in a media equipment and recording medium that stores a program source associated with the method are provided. The content arranging method includes extracting time information associated with stored contents from meta data of each of the stored contents, classifying the stored contents based on the extracted time information and a time interval for arranging the stored contents, determining at least one time item corresponding to the time interval, and arranging each of the classified contents under a corresponding time item. The method arranges the stored contents in the media equipment based on a time so that a user readily retrieves a desired content.	11-21-2013
20130311470	AUTOMATIC CLASSIFICATION OF INTERPERSONAL RELATIONSHIP BASED ON SOCIAL NETWORKING ACTIVITIES - Provided are methods and systems for automatic categorizing of interpersonal relationships of a user of a social networking service with the contacts of the user in the social networking service. The monitoring of relationships of a user is performed by integration within the social networking service to receive communication data of the user. The communication data is pre-analyzed using a user predefined configuration, and then analyzed according to timing and content factors. The timing and content factors include a frequency, a date, a time, an intensity, and content of communication of the user and the contacts of the user. Based on the analyzing, the interpersonal relationships of the user and contacts of the user are categorized and optionally stored and provided to the user via a graphical output interface.	11-21-2013
20130311471	TIME-SERIES DOCUMENT SUMMARIZATION DEVICE, TIME-SERIES DOCUMENT SUMMARIZATION METHOD AND COMPUTER-READABLE RECORDING MEDIUM - A time-series document summarization (	11-21-2013
20130311472	IMAGING PROTOCOL UPDATE AND/OR RECOMMENDER - A method includes obtaining electronically formatted information about previously performed imaging procedures, classifying the information into groups of protocols based on initially selected protocols for the previously performed imaging procedures and generating data indicative thereof, identifying deviations between the classified information and the corresponding initially selected protocols for the previously performed imaging procedures, and generating a signal indicative of the deviations. A method includes recommending at least one of a plurality of protocols for an imaging procedure based on at least one of a score, a probability, or a pre-determined rule, which is based on extracted medical concepts from patient information and extracted medical concepts from previously imaged patient information, and generating a signal indicative of the recommendation.	11-21-2013
20130318085	Methods And Apparatus For Use In Adding Contacts Between Profiles Of Different Social Networks - Techniques for use in adding contacts between profiles of different social networks are described. For each candidate contact in a first profile, the technique involves identifying, for a selected attribute type, whether a match exists between a stored attribute of the candidate contact and a stored attribute of a contact in a second profile. If the match exists, then the candidate contact is recommended, queued, or added as a contact in the second profile. Otherwise, if no match exists, then the candidate contact is excluded from being recommended, queued, or added as a contact in the second profile. A learning mechanism may set the operating mode to a recommending mode, a queuing mode, or an adding mode for adding contacts. The operating mode may be set based on predetermined conditions or events.	11-28-2013
20130318086	DISTRIBUTED FILE HIERARCHY MANAGEMENT IN A CLUSTERED REDIRECT-ON-WRITE FILE SYSTEM - Management of a file hierarchy for a clustered file system can be distributed across nodes of the cluster. A cluster file hierarchy is accessed to determine location of a file in response to a request to write to a file. A first node maintains the cluster file hierarchy. It is determined that management of a fileset object, which represents a fileset that includes the file, has been delegated to a second node based, at least in part, on said accessing the cluster file hierarchy. A node file hierarchy maintained by the second node is accessed responsive to determining the delegation. The cluster file hierarchy represents filesets of the clustered file system and the node hierarchy represents a subset of one or more of the filesets. Location of the file is determined based, at least in part, on said accessing the node file hierarchy.	11-28-2013
20130318087	METHODS, SYSTEMS, AND COMPUTER PROGRAM PRODUCTS FOR CATEGORIZING/RATING CONTENT UPLOADED TO A NETWORK FOR BROADCASTING - Methods, systems, and computer program products that automatically categorize and/or assign ratings to content (video and audio content) uploaded by individuals who want to broadcast the content to others via a communications network, such as an IPTV network, are provided. When an individual uploads content to a network, a network service automatically extracts an audio stream from the uploaded content. Words in the extracted audio stream are identified. For each identified word, a preexisting library of selected words is queried to determine if a match exists between words in the library and words in the extracted audio stream. The selected words in the library are associated with a particular content category or content rating. If a match exists between an identified word and a word in the library, the uploaded content is assigned a content category and/or rating associated with the matched word.	11-28-2013
20130325861	Data Clustering for Multi-Layer Social Link Analysis - Embodiments of the invention relate to a modeling activity area associated with groups of data items. Tools are provided to profile activity area involvement, both from the data item and from associated participants. The data items are placed into clusters and one or more activity areas are derived from the formed clusters. Each activity area is defined from the perspective of a single user. Participants in an activity area are connected to a user, but not necessarily to each other. The combination of formations of clusters and activity areas provides a multi-facetted organization of connections between data items and associated participants.	12-05-2013
20130325862	PIPELINED INCREMENTAL CLUSTERING ALGORITHM - Systems and methods are provided for large-scale, incremental clustering. A plurality of processing nodes each include a processor and a non-transitory computer readable medium. The non-transitory computer readable medium stores a plurality of clusters of feature vectors and machine executable instructions for determining a plurality of values for a distance metric relating each of the plurality of clusters to an input feature vector and selecting a cluster having a best value for the distance metric. An arbitrator is configured to receive the selected cluster and best value for the distance metric from each of the plurality of processing nodes and determine a winning cluster as one of the selected clusters and a new cluster. A multiplexer is configured to receive the winning cluster and provide the winning cluster and a new input feature vector to each of the plurality of processing nodes.	12-05-2013
20130325863	Data Clustering for Multi-Layer Social Link Analysis - Embodiments of the invention relate to a modeling activity area associated with groups of data items. Tools are provided to profile activity area involvement, both from the data item and from associated participants. The data items are placed into clusters and one or more activity areas are derived from the formed clusters. Each activity area is defined from the perspective of a single user. Participants in an activity area are connected to a user, but not necessarily to each other. The combination of formations of clusters and activity areas provides a multi-facetted organization of connections between data items and associated participants.	12-05-2013
20130325864	SYSTEMS AND METHODS FOR BUILDING A UNIVERSAL MULTIMEDIA LEARNER - The present disclosure describes a method and system called “Universal Learner (UL),” which provides a unified framework to understand multimedia signals. The UL utilizes the loosely annotated multimedia data on the Web, analyses it in various signal domains, such as text, image, audio and combinations thereof, and builds an association graph called the “Multimedia Brain,” which basically comprises visual signals, audio signals, text phrases and the like that capture a multitude of objects, experiences and their attributes and the links among them that capture similar intent or functional and contextual relationships.	12-05-2013
20130325865	Method and Server for Media Classification - The embodiments of the present invention relate to a method and system for classifying media. The classification is achieved by using annotation ontologies and by associating bottom level concepts of the annotation ontology tree with explanatory representation data of a selected representation domain and then comparing the explanatory representation data with a transformation of the media in the selected representation domain. In this way, tags can be generated which correspond to bottom level concepts of the ontology tree, which in turn correspond to explanatory representation data that can be found in the transformed media.	12-05-2013
20130332456	METHOD AND SYSTEM FOR DETECTING OPERATING SYSTEMS RUNNING ON NODES IN COMMUNICATION NETWORK - Fingerprinting operating systems running on nodes in a communication network. Responsive to obtaining an event to be analyzed with respect to a given node, generating a group of two or more OS profiles matching the event; generating a sufficient set of one or more significant events, i.e. events obtained in order to identify, among the matching OS profiles in the generated group, the OS profile uniquely characterizing the OS running on the given node; upon obtaining a significant event from the given node, generating a new group of one or more matching OS profiles, wherein said new group is generated in accordance with said obtained significant event and with at least one event previously analyzed with respect to the given node; and identifying the OS running on the given node with the help of said generated new group of one or more matching OS profiles.	12-12-2013
20130332457	SYSTEMS AND METHODS OF SELECTION, CHARACTERIZATION AND AUTOMATED SEQUENCING OF MEDIA CONTENT - In a computer system having at least one output device, a set of media programs is accessed. A playlist first portion including a first plurality of the media programs of the set is created. The programs of the first portion are arranged with respect to one another according to a respective first characteristic value of each of the programs of the first portion.	12-12-2013
20130339354	METHOD AND SYSTEM FOR MINING TRENDS AROUND TRENDING TERMS - A method and system for mining trends around trending terms. The method includes determining a plurality of articles, from one or more websites, in relation to a first entity for a time period. The first entity is a trending term. The method also includes generating comment clusters for the plurality of articles. Each comment cluster is generated for an associated article and includes a plurality of user comments. The method further includes extracting one or more entities from the plurality of user comments for each of the comment clusters, the one or more entities being related to the first entity. Further, the method includes enabling selection of a second entity, from the one or more entities, by the user. Moreover, the method includes rendering one or more user comments corresponding to the first entity and the second entity for the time period. The system includes an electronic device, a communication interface, a memory, and a processor.	12-19-2013
20130339355	CLUSTERING STREAMING GRAPHS - A system for clustering vertices in a streaming graph includes a structural sampler configured to receive a stream of edges. The structural sampler includes a reservoir manager configured to receive the stream of edges and create a structural reservoir and a support reservoir and a graph manager configured to receive the structural reservoir from the reservoir manager and to create a sampled graph from the structural reservoir, wherein the sampled graph includes one or more clusters that each include one or more connected vertices.	12-19-2013
20130339356	REAL-TIME DATA THRESHOLD GENERATION AND MONITORING - A first server is configured to receive one or more summarized data groups from a second server. Each summarized data group may include: information regarding a quantity of a group of records, where the group of records includes records associated with a record type and a time interval; information regarding a quantity of records associated with an indicator within the group of records; and information regarding a failure rate associated with the group of records based on the quantity of records associated with the group of records and the quantity of records associated with the indicator within the group of records. The first server is further configured to determine a threshold based on the summarized data groups and based on the failure rates associated with the summarized data groups and send an indication to a client device based on determining that the failure rate does not satisfy the threshold.	12-19-2013
20130339357	CLUSTERING STREAMING GRAPHS - Embodiments of the invention include methods for identifying one or more clusters in a streaming graph. The method includes receiving a stream of edges and sampling the stream of edges to create a structural reservoir and a support reservoir. The method also includes creating a sampled graph from the structural reservoir and identifying the one or more clusters in the sampled graph by grouping one or more connected vertices in the sampled graph.	12-19-2013
20130339358	Sharing Information With Other Users - Systems and techniques are described for facilitating sharing information. Some embodiments can receive a set of data items that is to be analyzed for sharing, analyze the set of data items based on a first set of criteria to obtain a subset of the set of data items that is a likely candidate for sharing, and present the subset of the set of data items to a first user. Additionally, some embodiments can receive a set of users that is to be analyzed for sharing information, analyze the set of users based on a second set of criteria to obtain a subset of the set of users with whom the information is likely to be shared, and present the subset of the set of users to the first user.	12-19-2013
20130339359	System and Method for Data Anonymization Using Hierarchical Data Clustering and Perturbation - A system and method for data anonymization using hierarchical data clustering and perturbation is provided. The system includes a computer system and an anonymization program executed by the computer system. The system converts the data of a high-dimensional dataset to a normalized vector space and applies clustering and perturbation techniques to anonymize the data. The conversion results in each record of the dataset being converted into a normalized vector that can be compared to other vectors. The vectors are divided into disjointed, small-sized clusters using hierarchical clustering processes. Multi-level clustering can be performed using suitable algorithms at different clustering levels. The records within each cluster are then perturbed such that the statistical properties of the clusters remain unchanged.	12-19-2013
20130339360	System And Method For Providing Classification Suggestions Using Document Injection - A system and method for providing classification suggestions using document injection is provided. Clusters of uncoded documents are accessed. A set of reference documents is obtained. Each reference document is associated with a classification code. A set of the uncoded documents selected from one or more of the clusters is identified and compared with the set of reference documents. Those reference documents that are similar to the set of uncoded documents are identified and injected into one or more of the clusters from which the set of uncoded documents is selected. The clusters and a visual suggestion for classification of at least one of the uncoded documents within one of the clusters are displayed.	12-19-2013
20130346408	NOTIFICATION CLASSIFICATION AND DISPLAY - A method can include receiving, by a notification module operable by a computing device, an instruction to generate a contextual notification and notification information associated with the instruction. The method also can include generating, by the notification module and in response to receiving the instruction, a notification object. In some examples, the method can include assigning, by the notification module and based on the notification information, the notification object to at least one notification class from a plurality of notification classes. The example method can also include generating, by the computing device and based at least in part on the at least one notification class to which the notification object is assigned, the contextual notification by populating the notification object with the notification information; and outputting the contextual notification in a manner based at least in part on the at least one notification class.	12-26-2013
20130346409	Systems and Methods for the Determining Annotator Performance in the Distributed Annotation of Source Data - Systems and methods for determining annotator performance in the distributed annotation of source data in accordance with embodiments of the invention are disclosed. In one embodiment of the invention, a method for clustering annotators includes obtaining a set of source data, determining a training data set representative of the set of source data, obtaining sets of annotations from a set of annotators for a portion of the training data set, for each annotator determining annotator recall metadata based on the set of annotations provided by the annotator for the training data set and determining annotator precision metadata based on the set of annotations provided by the annotator for the training data set, and grouping the annotators into annotator groups based on the annotator recall metadata and the annotator precision metadata.	12-26-2013
20130346410	INFORMATION PROCESSING APPARATUS AND METHOD, PROGRAM, AND RECORDING MEDIUM - Disclosed herein are an information processing apparatus and method, a program, and a recording medium, in which a content is recommended to each user on the basis of even the metadata that is assigned with no classification. A metadata analysis block resolves metadata acquired by a metadata acquisition block into components. A dictionary data generation block generates dictionary data in which genre is correlated with keyword and each component. An associated-information database generation block references the dictionary data to assign genre to the metadata which are assigned with no genre, thereby generating an associated-information database of content. An associated-information search block references the dictionary data to identify a genre from a keyword of interest data to search for associated information, thereby recommending content to the user. The present invention is applicable to personal computers or HDD recorders.	12-26-2013
20130346411	IDENTIFYING INCONSISTENCIES IN OBJECT SIMILARITIES FROM MULTIPLE INFORMATION SOURCES - A horizontal anomaly detection method includes receiving a plurality of objects described in a plurality of information sources, wherein each individual information source captures a plurality of similarity relationships between the objects, combining the information sources to determine a similarity matrix whose entries represent quantitative scores of similarity between pairs of the objects, and identifying at least one horizontal anomaly of the objects within the similarity matrix, wherein the horizontal anomalies are anomalous relationships across the plurality of information sources.	12-26-2013
20130346412	SYSTEM AND METHOD OF DETECTING COMMON PATTERNS WITHIN UNSTRUCTURED DATA ELEMENTS RETRIEVED FROM BIG DATA SOURCES - A method for detection of common patterns within unstructured data elements. The method includes extracting a plurality of unstructured data elements retrieved from a plurality of big data sources; generating at least one signature for each of the plurality of unstructured data elements; identifying common patterns among the generated signatures; clustering the signatures identified to have common patterns; and correlating the generated clusters to identify associations between their respective identified common patterns.	12-26-2013
20140006399	METHOD AND SYSTEM FOR RECOMMENDING WEBSITES	01-02-2014
20140006400	AUTOMATED ONLINE SOCIAL NETWORK INTER-ENTITY RELATIONSHIP MANAGEMENT	01-02-2014
20140006401	CLASSIFICATION OF DATA IN MAIN MEMORY DATABASE SYSTEMS	01-02-2014
20140006402	CONTENTS PROVIDING SCHEME USING IDENTIFICATION CODE	01-02-2014
20140006403	METHOD AND APPARATUS FOR SELECTING CLUSTERINGS TO CLASSIFY A DATA SET	01-02-2014
20140006404	RESOLVING DATABASE ENTITY INFORMATION	01-02-2014
20140006405	SYSTEM AND METHOD FOR CLUSTERING HOST INVENTORIES	01-02-2014
20140012847	STATISTICAL INSPECTION SYSTEMS AND METHODS FOR COMPONENTS AND COMPONENT RELATIONSHIPS - Embodiments of an inspection system and method for a collection of information objects are described. For example, a collection of executable software applications may be inspected for computer viruses, or a collection of genomes may be inspected for common or unique gene sequences. Information objects may contain identified sequences of instructions, each of which may be labeled with a symbol. In the software context, programming languages may include symbols that indicate functionality. In some embodiments, an inspection of the statistical properties of the information objects and their included symbols may allow for the symbols (and thus instruction sequences) to be grouped into logical components. In some embodiments, objects that include individual logical components may be grouped together. These groupings and their dependencies may be used to determine the structure of each object by detailing its constituent components, how they relate or depend on one another, and how the information object may function.	01-09-2014
20140012848	SYSTEMS AND METHODS FOR CLUSTER ANALYSIS WITH RELATIONAL TRUTH - Systems and methods for measuring similarity between a set of clusters and a set of object labels, wherein at least two of the object labels are related, receive a first set of clusters, wherein the first set of clusters was formed by clustering objects in a set of objects into clusters of the first set of clusters according to a clustering procedure; and calculate a similarity index between the first set of clusters and a set of object labels based at least in part on a relationship between two or more object labels in the set of object labels.	01-09-2014
20140012849	MULTILABEL CLASSIFICATION BY A HIERARCHY - A technique of extracting hierarchies for multilabel classification. The technique can process a plurality of labels related to a plurality of documents, using a clustering process, to cluster the labels into a plurality of clusterings representing a plurality of classes. The technique classifies the documents and predicts a plurality of performance characteristics, respectively, for the plurality of clusterings. The technique selects at least one of the clusterings using information from the performance characteristics and adds the selected clustering into a resulting hierarchy.	01-09-2014
20140012850	Method And Apparatus For Prioritizing Metadata - A method and an apparatus for prioritizing a metadata item associated with audio or video data are described. A metadata item is retrieved from a metadata repository or via an input. An analyzing unit determines a priority value of the metadata item using one of a plurality of prioritization methods. A storing unit then stores the priority value in a priority table and references the priority table in a metadata table.	01-09-2014
20140012851	APPARATUS AND METHOD FOR INCREMENTAL PHYSICAL DATA CLUSTERING - In a data storage and retrieval system wherein data arranged in nodes is stored and retrieved in pages, each page comprising a cluster of nodes, a method comprising: monitoring ongoing data retrieval to find retrieval patterns of nodes which are retrieved together and to identify changes in said retrieval patterns over time; and periodically reclustering the data nodes among said pages dynamically during usage of the data to reflect said changes, so that nodes more often retrieved together are migrated to cluster together and nodes more often required separately are migrated to cluster separately, thereby to keep small an overall number of page accesses of said data storage and retrieval system during data retrieval despite dynamic changes in patterns of data retrieval. The low number of page accesses allows the reclustering to be carried out concurrently with data access.	01-09-2014
20140019451	MULTI-LANGUAGE DOCUMENT CLUSTERING - A technique can include identifying a collection of documents to be clustered. The collection of documents can include foreign language documents and base language documents. The foreign language documents can be translated into the base language at a base language translation module. Keywords in the base language documents and keywords in the translated foreign language documents can be determined at a document indexing module. The base language documents can be clustered with the foreign language documents in a common set of document clusters based on the determined keywords in the base language documents and the determined keywords in the translated foreign language documents. In response to a search query in a first language, a listing of search results can be provided that includes documents in the first language and another language from a common document cluster.	01-16-2014
20140019452	METHOD AND APPARATUS FOR CLUSTERING SEARCH TERMS - A method and apparatus for clustering search terms are provided by the present invention. The method includes: A, establishing a candidate search term set, wherein the candidate search term set comprises a first search term provided by a user, and a second search term related to the first search term; B, performing a clustering operation on the first search term and the second search term related to the first search term in the candidate search term set according to a text characteristic and/or a semantic characteristic of the search terms. The accuracy and relevance of search term clustering can be improved by use of the method.	01-16-2014
20140025677	Group Management Apparatus, Substrate Processing System and Method of Managing Files of Substrate Processing Apparatus - There is provided a group management apparatus connected to a substrate processing apparatus configured to store at least a configuration file, the group management apparatus including a controller configured to: receive a command for generating a file group for the configuration file; receive the configuration file and at least one associated file related to the configuration file from the substrate processing apparatus according to the command for generating the file group; and generate the file group including the configuration file and the associated file received from the substrate processing apparatus and store the file group in a state where an output is possible.	01-23-2014
20140025678	USER-FRIENDLY DISPLAY OF DATA - A display apparatus for displaying accumulated data items includes obtaining means for obtaining a plurality of data items accumulated, classifying means for classifying the plurality of data items into N groups on the basis of predetermined criteria, display-control means for controlling an indication on a display unit such that the plurality of data items are displayed in N display regions corresponding to the N groups, and accepting means for accepting a specification of one display region from among the N display regions. When the accepting means accepts a specification of one display region from among the N display regions, the classifying means classifies a plurality of data items displayed in the specified display region into a further N groups, and the display-control means controls an indication on the display unit such that the plurality of data items are displayed in the N display regions corresponding to the N groups.	01-23-2014
20140025679	FORMING LOGICAL GROUP FOR USER BASED ON ENVIRONMENTAL INFORMATION FROM USER DEVICE - Systems and methods for the forming of user device groups are presented. In one example, a message including location information indicating a geographic location of a first user device is received from the first user device. Values representing logical connection strengths between the first user device and other user devices are calculated using the location information. A first device group is determined for the first user device based on the calculating of the values representing the logical connection strengths, the first device group including a plurality of the other user devices.	01-23-2014
20140025680	METHOD AND APPARATUS FOR AUTOMATICALLY TAGGING CONTENT - A content tagging and management capability is provided for enabling automatic tagging of content and management of tagged content. A method includes receiving content including an object, and automatically associating an information structure with the object included within the content to form thereby tagged content. The content may be received locally at a content capture device, and the information structure may be automatically associated with the object by the content capture device. The automatic tagging may be performed at the content capture device when the content is captured by the content capture device. The content may be received at a computer, and the information structure may be automatically associated with the object by the computer. The information structure may be available locally or retrieved from one or more remote devices.	01-23-2014
20140032552	DEFINING RELATIONSHIPS - Techniques for defining relationships are described. Defining relationships can include retrieving a number of event notifications that correspond to a number of nodes. Defining relationships can include defining a number of group patterns that correspond to the number of event notifications. Defining relationships can also include grouping the number of nodes into a number of groups that correlate with the number of group patterns, the number of groups defining a number of relationships between the number of nodes. Defining relationships can include assigning a number of weights to the number of relationships between the number of nodes, wherein the number of weights are based on a strength of the number of relationships between the number of nodes.	01-30-2014
20140032553	RELATIONSHIP DISCOVERY IN BUSINESS ANALYTICS - A subset of (k−1)-dimensional tables are received, wherein k is greater than 1. A set of k-dimensional tables is created by combining each of the (k−1)-dimensional tables with a non-included dimension corresponding to a 1-dimensional table. Significance of interaction and interaction effect size is computed for the created set of k-dimensional tables to determine dimension and measure interactions.	01-30-2014
20140032554	NOTE ATLAS - Presenting database items includes providing a plurality of clusters, where each of the clusters is formed by grouping database items according to location information associated therewith, creating a plurality of geographic elements based on the clusters, and presenting the geographic elements to a user using a note atlas that represents all of the geographic elements corresponding to a set of the database items, where indicators of corresponding clusters are provided with each of the geographic elements. A quantity of database items may be provided with each of the corresponding clusters. The note atlas may show at least two levels of detail corresponding to a world level of detail, a points of interest level of detail and a city level of detail. Points of interest may be determined by having a user provide points of interest on a map.	01-30-2014
20140032555	SYSTEM AND METHOD TO CLASSIFY TELEMETRY FROM AUTOMATION SYSTEMS - A formal ontology includes multiple context elements to describe elements and their context within a system in the domain. The structure includes multiple role functions to describe the function of elements in the system, multiple types to describe values being provided by the elements in the system, and multiple states to describe states of the elements in the system, wherein the context elements, role functions, types, and states are selectable to provide a full description of the system.	01-30-2014
20140032556	Internal Linking Co-Convergence Using Clustering With No Hierarchy - Certain implementations of the disclosed technology include systems and methods for linking entities in an internal database by utilizing co-convergence and clustering. The method may include clustering database records into a first set of clusters having corresponding first cluster identifications (IDs). The clustering may be based at least in part on determining similarity among corresponding field values. The method may include associating mutually matching database records, by performing at least one matching iteration for each of the database records. The method may include determining similarity among corresponding field values of the database records, re-clustering at least a portion of the database records into a second set of clusters, the re-clustering based at least in part on the associating mutually matching database records and on the determining similarity among corresponding field values of the database records.	01-30-2014
20140032557	Internal Linking Co-Convergence Using Clustering With Hierarchy - Certain implementations of the disclosed technology include systems and methods for internal co-convergence using clustering when there is hierarchy in the data structure. A method is included for clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record including one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records. The method includes receiving parent-child hierarchical relationship information for the hierarchical database records, re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the received parent-child hierarchical relationship information, and outputting hierarchical database record information, based at least in part on the re-clustering.	01-30-2014
20140040261	INPUT PARTITIONING AND MINIMIZATION FOR AUTOMATON IMPLEMENTATIONS OF CAPTURING GROUP REGULAR EXPRESSIONS - A method for submatch extraction may include receiving an input string, receiving a regular expression, and converting the regular expression with capturing groups into a plurality of finite automata to extract submatches. The method further includes using a first automaton to determine whether the input string is in a language described by the regular expression, and to process the input string, and using states of the first automaton in a second automaton to extract the submatches. In addition, input partitioning and automaton minimization techniques may be employed to reduce the storage area consumed by the plurality of finite automata.	02-06-2014
20140040262	TECHNIQUES FOR CLOUD-BASED SIMILARITY SEARCHES - Techniques for facilitating a similarity search of digital assets (e.g., audio files, image files, video files, etc.) are described. Consistent with some embodiments, a cloud-based search service manages one or more search tree data structures for use in organizing digital assets to make the digital assets searchable. Each digital asset is associated with a feature vector based on the various attributes and/or characteristics of the digital asset. The digital assets are then assigned to leaf nodes in one or more search tree data structures based on a measure of the distance between the feature vector of the digital asset and a virtual feature vector associated with a leaf node. When a search for similar digital assets is invoked, a prioritized breadth first search of a search tree is performed to identify the digital assets having the feature vectors closest in distance to the reference digital asset.	02-06-2014
20140040263	SEARCH AND CONTEXT BASED CREATION IN DYNAMIC WORKSPACES - The disclosure generally describes computer-implemented methods, software, and systems for search-, context-, and rule-based creation and runtime adaptation in dynamic workspaces. One computer-implemented method includes identifying a data artifact associated with each search result of at least one received search result, associating each identified data artifact with a module category of a plurality of module categories, injecting the identified artifacts into a content gallery, categorizing, by operation of at least one computer, the injected identified artifacts within the content gallery, presenting at least a subset of the injected identified artifacts on an enterprise workspace page associated with an enterprise workspace, and constructing a context associated with at least one of the enterprise workspace or the enterprise workspace page.	02-06-2014
20140040264	METHOD FOR ESTIMATION OF INFORMATION FLOW IN BIOLOGICAL NETWORKS - The present invention relates to a method for stratifying a patient into a clinically relevant group comprising the identification of the probability of an alteration within one or more sets of molecular data from a patient sample in comparison to a database of molecular data of known phenotypes, the inference of the activity of a biological network on the basis of the probabilities, the identification of a network information flow probability for the patient via the probability of interactions in the network, the creation of multiple instances of network information flow for the patient sample and the calculation of the distance of the patient from other subjects in a patient database using multiple instances of the network information flow. The invention further relates to a biomedical marker or group of biomedical markers associated with a high likelihood of responsiveness of a subject to a cancer therapy wherein the biomedical marker or group of biomedical markers comprises altered biological pathway markers, as well as to an assay for detecting, diagnosing, graduating, monitoring or prognosticating a medical condition, or for detecting, diagnosing, monitoring or prognosticating the responsiveness of a subject to a therapy against said medical condition, in particular ovarian cancer. Furthermore, a corresponding clinical decision support system is provided.	02-06-2014
20140040265	METHOD AND APPARATUS FOR REPRESENTING MULTIDIMENSIONAL DATA - The present invention relates to methods for representing multidimensional data. The methods of the present invention are well suited but not limited to the representation of multidimensional data in such a way as to enable the comparison and differentiation of data sets. For example, the invention may be applied to the representation of flow cytometric data. The invention further relates to a program storage device having instructions for controlling a computer system to perform the methods, and to a program storage device containing data structures used in the practice of the methods.	02-06-2014
20140040266	Training Program and Music Playlist Generation for Athletic Training - Systems and techniques for generating an athletic training program and selecting music for playing during the training program are described. Based on specified parameters, a training program module may generate a customized training program intended to help an athlete reach a goal. In conjunction therewith or independently thereof, a music selection module may generate a music playlist for playing during a training program. Music selection parameters may include training intensity, user speed, user location, user mood, a user's current performance (e.g., as compared to an expected performance) and the like. The music selection module may select songs from a personal library or a public database of music. Music selection may be made to maximize user motivation/inspiration.	02-06-2014
20140040267	INFORMATION SEARCHING APPARATUS, INFORMATION SEARCHING METHOD, AND COMPUTER PRODUCT - An information searching apparatus retrieves a sub graph matching an inquiry graph from a graph to be searched. The apparatus includes an extracting unit that extracts, from among clusters of nodes in the graph to be searched, plural cluster pairs that each include a first cluster and a second cluster including a node linked by a link to a node in the first cluster and a calculating unit that calculates a bonding strength for each of the cluster pairs. The apparatus further includes a determining unit that determines, among the cluster pairs and based on the bonding strength of each of the cluster pairs, a cluster pair to be merged; a merging unit that merges the cluster pair; and a searching unit that searches the merged clusters for a sub graph matching the inquiry graph. An output unit outputs a search result of the searching unit.	02-06-2014
20140040268	HIGH-DIMENSIONAL STRATIFIED SAMPLING - In one aspect, a processing device of an information processing system is operative to perform high-dimensional stratified sampling of a database comprising a plurality of records arranged in overlapping sub-groups. For a given record, the processing device determines which of the sub-groups the given record is associated with, and for each of the sub-groups associated with the given record, checks if a sampling rate of the sub-group is less than a specified sampling rate. If the sampling rate of each of the sub-groups is less than the specified sampling rate, the processing device samples the given record, and otherwise does not sample the given record. The determine, check and sample operations are repeated for additional records, and samples resulting from the sample operations are processed to generate information characterizing the database. Other aspects of the invention relate to determining which records to sample through iterative optimization of an objective function that may be based, for example, on a likelihood function of the sampled records.	02-06-2014
20140040269	SEARCH CLUSTERING - In one example embodiment, a method is illustrated as including retrieving item data. At least one base cluster having at least one document with common item data stored in a suffix ordering is constructed. The at least one base cluster is compacted to create a compacted cluster representation having a reduced duplicate suffix ordering amongst the clusters.	02-06-2014
20140046942	SYSTEM AND METHOD FOR COMPUTERIZED BATCHING OF HUGE POPULATIONS OF ELECTRONIC DOCUMENTS - A method for computerized batching of huge populations of electronic documents, including computerized assignment of electronic documents into at least one sequence of electronic document batches such that each document is assigned to a batch in the sequence of batches and such that there is no conflict between batching requirements, the following batching requirements being maintained by a suitably programmed processor: a. pre-defined subsets of documents are always kept together in the same batch, b. batches are equal in size, c. the population is partitioned into clusters, and all documents in any given batch belong to a single cluster rather than to two or more clusters.	02-13-2014
20140046943	METHODS AND DEVICES FOR STORING CONTENT BASED ON CLASSIFICATION OPTIONS - Methods and devices for storing content are described. In one example embodiment, a method includes: displaying, on a display of an electronic device, a plurality of selectable content classification options for classifying a content item, the selectable content classification options including a selectable option to classify the content item as an action item and a selectable option to classify the content item as an archive; receiving, via an input interface associated with the electronic device, a selection of one of the content classification options; and storing the content item in accordance with the selected content classification option.	02-13-2014
20140046944	SHAPE BASED PICTURE SEARCH - The present application relates to a method for implementing picture search and a website server thereof. A method for implementing picture search includes: classifying, according to keywords in advance in a picture database, corresponding pictures by shape of objects in the pictures, and determining a sample picture for each shape type; wherein, after a server receives a picture search request sent from a client, the method includes: searching, by the server, in the picture database for the sample picture of several shape types classified in advance corresponding to the keywords in said search request, and returning, to the client, the searched sample picture of the several shape types; receiving, by the server, the sample picture of a certain shape type determined by the client, and searching, in the picture database for the pictures which correspond to said keywords and satisfy a predetermined request with the characteristic value of said determined sample pictures; returning, by the server, said found pictures to the client. The present application enables the user to search pictures of similar shapes according to the shape types, thereby satisfying the user's search demands.	02-13-2014
20140046945	INDICATING DOCUMENTS IN A THREAD REACHING A THRESHOLD - Documents in a document thread include descriptive terms that have weights. An indication indicates when documents in the document thread reach a threshold of weight for the document thread.	02-13-2014
20140052726	HARDWARE IMPLEMENTATION OF THE AGGREGATION/GROUP BY OPERATION: HASH-TABLE METHOD - Techniques are described for performing grouping and aggregation operations. In one embodiment, a request is received to aggregate data grouped by a first column. In response to receiving the request, a group value in a row of a first column is mapped to an address. A pointer is stored for a first group at a first location identified by the address. The pointer identifies a second location of a set of aggregation data for the first group. An aggregate value included in the set of aggregation data is updated based on a value in the row of a second column.	02-20-2014
20140052727	DATA PROCESSING FOR DATABASE AGGREGATION OPERATION - Embodiments relate to a method, system, and computer program product for database aggregation operations. The method includes acquiring data located in data pages of extents and performing a database aggregation operation pre-processing on the acquired data. The method also includes storing the result of said pre-processing in summary data pages, the summary data pages being used for performing database aggregation operations rapidly.	02-20-2014
20140052728	TEXT CLUSTERING DEVICE, TEXT CLUSTERING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM - A text clustering device (	02-20-2014
20140059047	AUTOTRANSFORM SYSTEM - According to one embodiment, an apparatus stores a plurality of datapoints. A datapoint comprises a first value and a second value that depends upon the value of the first value. The apparatus associates the datapoint with a group from a plurality of groups. The group is associated with an identifying range and the datapoint is associated with the group based at least in part upon the first value of the datapoint and the identifying range of the group. The apparatus calculates a median of the second values of the datapoints associated with the group and a performance value by performing a regression based at least in part upon the identifying range and the calculated median of the group. The apparatus determines that the performance value exceeds a baseline value and in response, presents, on a display, an illustration depicting the identifying range and the associated median of the group.	02-27-2014
20140059048	Computer-Implemented System And Method For Providing Visual Suggestions For Cluster Classification - An embodiment provides a computer-implemented system and method for providing visual suggestions for cluster classification. One or more clusters comprising uncoded documents from a set are obtained. A different set of reference documents that are each classified with a code is designated. A cluster center in one of the clusters is identified. The cluster center is compared to one or more of the reference documents. Those of the reference documents that are similar to the cluster are identified based on the comparison. The classification codes of each of the similar reference documents are visually represented as a suggestion for assigning one of the classification codes to the cluster.	02-27-2014
20140059049	IDENTIFICATION OF CONTENT BY METADATA - Systems and methods for identifying content in electronic messages are provided. An electronic message may include certain content. The content is detected and analyzed to identify any metadata. The metadata may include a numerical signature characterizing the content. A thumbprint is generated based on the numerical signature. The thumbprint may then be compared to thumbprints of previously received messages. The comparison allows for classification of the electronic message as spam or not spam.	02-27-2014
20140067806	Retroactive Search of Objects Using K-D Tree - In one embodiment, a method includes at time t	03-06-2014
20140067807	MIGRATION OF TAGS ACROSS ENTITIES IN MANAGEMENT OF PERSONAL ELECTRONICALLY ENCODED ITEMS - A method performed on an electronic device for migrating tags across entities. The migration of the tags is performed following an analysis of one or more personal electronically encoded items associated with a previously created perspective or album associated with the previously created perspective, responsive to a user decision the creation of a new perspective, a new album associated with one of the previously created perspectives, or a new perspective and a new album associated with the new perspective, responsive to a user decision to treat the previously created perspective or album as an individual entity, and association of the previously created perspective or album with the new perspective or new album. The tags are respectively migrated from the new perspective or the new album to the associated previously created perspective or the previously created album and to associated ones of the one or more personal electronically encoded items.	03-06-2014
20140067808	Distributed Scalable Clustering and Community Detection - Techniques, an apparatus and an article of manufacture for distributed scalable clustering and community detection. A method includes generating a label for each node in a graph, wherein said label identifies a community in which a node participates, propagating each label locally within two or more segments of the graph based on a participation percentage of each node in at least one identified community within the graph, and deriving at least one cluster of nodes in the graph that corresponds to the at least one identified community based on said propagating.	03-06-2014
20140067809	NON-TRANSITORY COMPUTER-READABLE MEDIUM, INFORMATION CLASSIFICATION METHOD, AND INFORMATION PROCESSING APPARATUS - There is provided a non-transitory computer-readable medium storing a program causing a computer to execute a process. The process includes: accepting a search keyword; retrieving, from information items posted by users, a posted information item including the accepted search keyword, each of the posted information items including at least either of a text information item and an image information item, and acquiring posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and classifying, as image information items related to the search keyword, some of image information items included in the posted information items that have been acquired, and performing first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.	03-06-2014
20140067810	METHODS AND APPARATUS FOR PARTITIONING DATA - A data partitioning method includes defining a set of category levels associated with a plurality of entities stored within a first database, wherein the set of category levels is hierarchical (e.g., proceeding from higher to lower levels). Each of the plurality of entities is assigned to a category level within the set of category levels. One or more partition keys are defined for at least one of the category levels. The plurality of entities are then copied to a second database based on the set of category levels and the one or more partition keys.	03-06-2014
20140067811	Robust Adaptive Data Clustering in Evolving Environments - A computer-implemented method for automated data clustering and analysis. A computer takes a database having multiple entries and transforms the entries in the database into a set of intrinsic attributes for each entry. The computer then receives data defining one or more clustering trials to be run on the attributes from the entries in the database, each clustering trial being defined by a set of relevant intrinsic and extrinsic attributes. The computer automatically identifies the most significant intrinsic and/or extrinsic attributes of the entries being clustered for each clustering trial, and runs a clustering script to cluster the attributes in accordance with the significant attributes. The computer forms hierarchical linkages of the profiles and automatically calculates the cophenetic correlation coefficient for the linkages in each clustering trial. The invention then automatically calculates linkage threshold values for the linkages in each trial, creates cluster groups based on the threshold values, and outputs dendrograms and maps showing the results.	03-06-2014
20140067812	SYSTEMS AND METHODS FOR RANKING DOCUMENT CLUSTERS - Document cluster ranking systems and methods of ranking document clusters are described. In some example embodiments, the method comprises: obtaining, at a document cluster ranking system, a value associated with a first feature for each of a plurality of document clusters; based on the values associated with the first feature, automatically generating, at the document cluster ranking system, a plurality of first feature bins, each first feature bin defining a range of values and a bin identifier; and obtaining a score for one of the document clusters, by: i) identifying the first feature bin having a range of values which includes the obtained value associated with the first feature for that one of the document clusters; and ii) determining a score for that document cluster based on the first feature bin identifier for the identified first feature bin.	03-06-2014
20140067813	PARALLELIZATION OF SYNTHETIC EVENTS WITH GENETIC SURPRISAL DATA REPRESENTING A GENETIC SEQUENCE OF AN ORGANISM - A method, system, and computer program product for parallelization of updating synthetic events with genetic surprisal data comprising dividing the synthetic event into cohort parts and assigning the cohort parts to one of a plurality of computer processing elements. Within each processing element: searching data records of patients for genetic surprisal data; generating a cluster comprising a centroid by populating the cluster based on all of the matches of the data records; calculating a new centroid for each cluster; calculating a Euclidean distance in multiple dimensions for each match of data records to the new centroid for each cluster; reassigning each match of data to the new centroid of each cluster based on the shortest calculated Euclidean distance to the new centroid for each cluster; and determining at least one cohort part from the clusters and recombining the cohort parts into updated synthetic events based on the metadata.	03-06-2014
20140074838	ANOMALY, ASSOCIATION AND CLUSTERING DETECTION - Techniques are provided for anomaly, association and clustering detection. At least one code table is built for each attribute in a set of data. A first code table corresponding to a first attribute and a second code table corresponding to a second attribute are selected. The first code table and the second code table are merged into a merged code table, and a determination is made to accept or reject the merged code table. An anomaly is detected when a total compression cost for a data point is greater than a threshold compression cost inferred from one or more code tables. An association in a data table is detected by merging attribute groups, splitting data groups, and assigning data points to data groups. A cluster is inferred from a matrix of data and code words for each of the one or more code tables.	03-13-2014
20140074839	USER PROFILE BASED ON CLUSTERING TIERED DESCRIPTORS - A user of a network-based system may correspond to a user profile that describes the user. The user profile may describe the user using one or more descriptors of items that correspond to the user (e.g., items owned by the user, items liked by the user, or items rated by the user). In some situations, such a user profile may be characterized as a “taste profile” that describes an array or distribution of one or more tastes, preferences, or habits of the user. Accordingly, the user profile machine within the network-based system may generate the user profile by accessing descriptors of items that correspond to the user, clustering one or more of the descriptors, and generating the user profile based on one or more clusters of the descriptors.	03-13-2014
20140074840	SYSTEM AND METHOD FOR CLASSIFYING DATA DUMP DATA - An example of a system comprises a fingerprint calculator configured to receive data structure information and create a fingerprint as a function of the data structure information, a code generator configured to generate modified machine code, the modified machine code including the fingerprint embedded therein, a fingerprint identifier configured to identify the fingerprint in data received from a data dump, a data structure lookup table including the fingerprint and the data structure information associated with the fingerprint stored thereon, and a data interpreter configured to interpret, using data from the data dump and the data structure information, the data structure of at least a portion of the data from the data dump.	03-13-2014
20140074841	CONCURRENT ACCESS METHODS FOR TREE DATA STRUCTURES - In one embodiment, a non-transitory computer-readable medium stores instructions for implementing a file system, which include operations for acquiring an exclusive lock on a first node in an ordered tree data-structure, and adding an identifier and index of the first node to a path data structure. If the value of the index in the first node is non-zero, then each exclusive lock acquired between the first node and the root of the tree data structure is released. In any case, the operation proceeds to a second node, which is addressed at the index on the first node. In one embodiment, operations further include acquiring an exclusive lock on the second node, and, if the second node is a leaf node, performing updates to the second node, and then releasing each exclusive lock in the data-structure.	03-13-2014
20140074842	Computer Method and System for Detecting the Subject Matter of Online Communications - A computer-implemented system and method for monitoring the electronic communications of a subject to determine if dangerous behavior is occurring, especially in social networking platforms. The PG Guard™ web service permits a parent to monitor all of a child's activities on a social site such as Facebook®. To maintain the privacy of the child, the service only provides information to the parents about suspicious activities comprising, for example: conversations regarding violence, sex, alcohol, etc.; when a stranger interacts with their child (or vice versa); when their child discloses personal information; when a friend seems to be of a problematic nature; when a child uploads or is tagged in new images; and when a child adds or rejects a new friend. The method comprises generating a profile of the subject based on their keywords and the probability of their intended meaning, as well as their posted “Likes” and “Recommendations”.	03-13-2014
20140074843	SYSTEMS AND METHODS FOR DYNAMIC ANALYSIS, SORTING AND ACTIVE DISPLAY OF SEMANTIC-DRIVEN REPORTS OF COMMUNICATION REPOSITORIES - A collection module, an analysis module, and a report module are each operably coupled to a database. The collection module can receive data associated with a user and can send a signal indicative of an instruction to store the data associated with the user in the database. Based at least in part on a pattern analysis or a semantic analysis of the data associated with the user, the analysis module can analyze the data to define an aggregated data set representing one or more relationship attributes in the data associated with the user. The analysis module can send a signal indicative of an instruction to store the aggregated data set in the database. The report module can define a report representing the aggregated data set and can send a signal indicative of an instruction to present data associated with the report on a display of an electronic device.	03-13-2014
20140081973	SPIKE CLASSIFICATION - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying a spike in a rate of occurrence of events. One of the methods includes receiving data identifying a spike at a particular time in a rate of occurrence of events relating to a particular search query, where an event relating to the particular search query is a receipt event of the particular search query or an indexing event of a resource that satisfies the particular search query, fitting the occurrences of the events in a time window to a reference distribution of occurrences of events to determine a goodness of fit value, wherein the reference distribution models a random occurrence of events relating to search queries, comparing the goodness of fit value to a primary threshold, and classifying the spike as a spurious spike if the goodness of fit value satisfies the primary threshold.	03-20-2014
20140081974	Aggregating Electronic Content Items from Different Sources - Systems and methods are provided for aggregating electronic content items that are relevant to one another. In one embodiment, a content management application determines that a first electronic content item and a second electronic content item are relevant to one another. The first electronic content item is provided by a first client account and the second electronic content item is provided by a second client account. The content management application also aggregates the first and second electronic content items to form at least part of a collection of electronic content. The first and second electronic content items are aggregated based on determining that the first and second electronic content items are relevant to one another. The content management application also provides access to the collection of electronic content.	03-20-2014
20140081975	METHODS AND SYSTEMS FOR MEDIA FILE MANAGEMENT - Methods and systems for media file management are provided. When a plurality of media files in the electronic device are viewed, media data is generated in real time for the media files. In the generation of the media data, the media files are analyzed to obtain a theme for the media files. Then, a script file is identified according to the theme, and media data is produced for the media files according to the script file. In some embodiments, a frame buffer used for storing the media data is refreshed after each frame of the media data is rendered.	03-20-2014
20140081976	IMAGE DISPLAY APPARATUS AND CONTROL METHOD THEREOF - An image display apparatus which displays a list of image data, comprises a selection unit configured to select a classification condition of image data; a designation unit configured to designate a sort condition associated with a sorting order of image data; a grouping unit configured to execute grouping of image data according to the classification condition selected by the selection unit; a determination unit configured to determine whether or not to execute the grouping according to the classification condition selected by the selection unit and the sort condition designated by the designation unit; and a control unit configured to control whether or not image data are grouped and displayed in accordance with a determination result of the determination unit.	03-20-2014
20140089310	PRODUCT CLUSTER REPOSITORY AND INTERFACE: METHOD AND APPARATUS - The present invention is a method and apparatus for conducting transactions regarding similarity of products against a repository in which products are grouped in clusters according to their characteristics. A product suite repository interface facilitates such transactions. Such a repository is useful for consumers and participants in the supply chain. For example, a supplier could determine which products in its own offerings are related to those offered by a retailer. Partners in some effort might merge their offerings into a single catalog. A consumer might use the repository to find accessories that might enhance a purchased item.	03-27-2014
20140089311	SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM FOR CLASSIFYING PROBLEM QUERIES TO REDUCE EXCEPTION PROCESSING - A system, method, and computer-readable medium that facilitate classification of database requests as problematic based on estimated processing characteristics of the request are provided. Estimated processing characteristics may include estimated skew including central processing unit skew and input/output operation skew, central processing unit duration per input/output operation, and estimated memory usage. The processing characteristics are estimated on a request step basis. The request is classified as problematic responsive to determining that one or more of the estimated characteristics of a request step exceeds a corresponding threshold. In this manner, mechanisms for predicting bad query behavior are provided. Workload management of those requests may then be more successfully provided through workload throttles, filters, or even a more confident exception detection that correlates with the estimated bad behavior.	03-27-2014
20140095502	CLUSTERING A TABLE IN A RELATIONAL DATABASE MANAGEMENT SYSTEM - Techniques are provided that address the problems associated with prior approaches for clustering a fact table in a relational database management system. According to one aspect of the invention, a database server clusters a fact table in a database based on one or more dimension tables. More specifically, rows are stored in the fact table in a sorted order and the order in which the rows are sorted is based on values in one or more columns of one or more of the dimension tables. A user specifies the columns of the dimension tables on which the sorted order is based in “clustering criteria”. The database server uses the clustering criteria to automatically store the rows in the fact table in the sorted order in response to certain user-initiated database operations on the fact-table.	04-03-2014
20140095503	COMPILE-TIME GROUPING OF TUPLES IN A STREAMING APPLICATION - A system and a method for initializing a streaming application are disclosed. The method may include initializing a streaming application for execution on one or more compute nodes which are adapted to execute one or more stream operators. The method may, during a compiling of code, identify whether a processing condition exists at a first stream operator of a plurality of stream operators. The method may add a grouping condition to a second stream operator of the plurality of stream operators if the processing condition exists. The method may provide for the second stream operator to group tuples for sending to the first stream operator.	04-03-2014
20140095504	SYSTEMS AND METHODS FOR CATALOGING USER-GENERATED CONTENT - Systems and methods are described herein for cataloging user-generated content. In one embodiment, user-generated content is received, and metadata associated with the user-generated content may be captured. The metadata may be automatically analyzed to determine one or more common characteristics of the user generated content. For example, it may be determined that the user-generated content was recorded during a trip along a particular route, or during a period where the user was exhibiting stressful physiological signs. Groupings of user-generated content may be defined based on the common characteristics. Post-processing and file storage operations may be performed on the user-generated content associated with a grouping. Groupings may be selected for playback and the user-generated content may be streamed to a target device.	04-03-2014
20140095505	PERFORMANCE AND SCALABILITY IN AN INTELLIGENT DATA OPERATING LAYER SYSTEM - Systems and methods are described that provide an intelligence platform for distributed processing of big data sets, including both structured and unstructured data types, across two or more intelligent data operation engine servers. The intelligent data operation engine servers can form a conceptual understanding of content in each electronic file and then cooperate with a distributed index handler to index the conceptual understanding of the electronic file. A query pipeline and the distributed index handler in the intelligence platform cooperate with the two or more intelligent data operation engine servers to improve scalability and performance on the big data sets containing both structured and unstructured electronic files represented in the common index.	04-03-2014
20140095506	COMPILE-TIME GROUPING OF TUPLES IN A STREAMING APPLICATION - A system and a method for initializing a streaming application are disclosed. The method may include initializing a streaming application for execution on one or more compute nodes which are adapted to execute one or more stream operators. The method may, during a compiling of code, identify whether a processing condition exists at a first stream operator of a plurality of stream operators. The method may add a grouping condition to a second stream operator of the plurality of stream operators if the processing condition exists. The method may provide for the second stream operator to group tuples for sending to the first stream operator.	04-03-2014
20140095507	LIBRARY APPARATUS - A library apparatus according to an aspect of the present invention includes a plurality of magazines each including medium slots in which corresponding removable storage media are stored, and a magazine ID for identifying itself; at least one magazine slot each of which, at a time when one of the magazines has been loaded into itself, allocates physical slot numbers to the magazine; an ID reading unit; a logical library configuration information table including magazine presence or absence information which indicates whether or not any of the magazines is loaded in each of the at least one magazine slot; and a logical library management unit which updates the magazine presence or absence information on the basis of information indicating which of the at least one magazine slot the relevant magazine has been loaded into, and the magazine ID of the relevant magazine.	04-03-2014
20140101153	Naming Methodologies for a Hierarchical System - Methods and systems are disclosed for naming methodologies for a hierarchical system. In one embodiment, a computer implemented method of organizing instance names in a hierarchical system includes receiving a description of a hierarchical system that includes a plurality of instances arranged in different branches in a plurality of hierarchical levels in a physical data structure, creating an instance name data structure configured to describe the corresponding instances in the hierarchical system, where the instance name data structure comprises a map of indexes and a corresponding array of offsets configured to access naming information in a subsequent level, and associating names of instances in the hierarchical system to a corresponding set of unique integers which are arranged in a sequential manner.	04-10-2014
20140101154	SIMPLIFYING GROUPING OF DATA ITEMS STORED IN A DATABASE - An aspect of the present invention simplifies grouping of data items previously stored in a database, the data items being stored in the form of rows and columns in respective tables (in the database). In one embodiment, a system displays a cross product of values from two or more columns in the form of multiple lines, where each line contains a respective value from each of the two or more columns to specify a corresponding criterion (combination of values). In response to receiving inputs indicating the respective groups for each of the lines, the system determines a group for each data item (stored in the database) based on the received inputs. A user is accordingly required to only specify the desired groups corresponding to various combinations of values of the columns to cause grouping of data items in the database.	04-10-2014
20140101155	GENERATING A TUNABLE FINITE AUTOMATON FOR REGULAR EXPRESSION MATCHING - Deterministic Finite Automatons (DFAs) and Nondeterministic Finite Automatons (NFAs) are two typical automatons used in the Network Intrusion Detection System (NIDS). Although they both perform regular expression matching, they have quite different performance and memory usage properties. DFAs provide fast and deterministic matching performance but suffer from the well-known state explosion problem. NFAs are compact, but their matching performance is unpredictable and with no worst case guarantee. A new automaton representation of regular expressions, called Tunable Finite Automaton (TFA), is described. TFAs resolve the DFAs' state explosion problem and the NFAs' unpredictable performance problem. Different from a DFA, which has only one active state, a TFA allows multiple concurrent active states. Thus, the total number of states required by the TFA to track the matching status is much smaller than that required by the DFA. Different from an NFA, a TFA guarantees that the number of concurrent active states is bounded by a bound factor b that can be tuned during the construction of the TFA according to the needs of the application for speed and storage. A TFA can achieve significant reductions in the number of states and memory space.	04-10-2014
20140101156	REGROUPING NON-DETERMINISTIC FINITE AUTOMATON ACTIVE STATES TO MINIMIZE DISTINCT SUBSETS - Deterministic Finite Automatons (DFAs) and Nondeterministic Finite Automatons (NFAs) are two typical automatons used in the Network Intrusion Detection System (NIDS). Although they both perform regular expression matching, they have quite different performance and memory usage properties. DFAs provide fast and deterministic matching performance but suffer from the well-known state explosion problem. NFAs are compact, but their matching performance is unpredictable and with no worst case guarantee. A new automaton representation of regular expressions, called Tunable Finite Automaton (TFA), is described. TFAs resolve the DFAs' state explosion problem and the NFAs' unpredictable performance problem. Different from a DFA, which has only one active state, a TFA allows multiple concurrent active states. Thus, the total number of states required by the TFA to track the matching status is much smaller than that required by the DFA. Different from an NFA, a TFA guarantees that the number of concurrent active states is bounded by a bound factor b that can be tuned during the construction of the TFA according to the needs of the application for speed and storage. A TFA can achieve significant reductions in the number of states and memory space.	04-10-2014
20140101157	ENCODING NON-DETERMINISTIC FINITE AUTOMATON STATES EFFICIENTLY IN A MANNER THAT PERMITS SIMPLE AND FAST UNION OPERATIONS - Deterministic Finite Automatons (DFAs) and Nondeterministic Finite Automatons (NFAs) are two typical automatons used in the Network Intrusion Detection System (NIDS). Although they both perform regular expression matching, they have quite different performance and memory usage properties. DFAs provide fast and deterministic matching performance but suffer from the well-known state explosion problem. NFAs are compact, but their matching performance is unpredictable and with no worst case guarantee. A new automaton representation of regular expressions, called Tunable Finite Automaton (TFA), is described. TFAs resolve the DFAs' state explosion problem and the NFAs' unpredictable performance problem. Different from a DFA, which has only one active state, a TFA allows multiple concurrent active states. Thus, the total number of states required by the TFA to track the matching status is much smaller than that required by the DFA. Different from an NFA, a TFA guarantees that the number of concurrent active states is bounded by a bound factor b that can be tuned during the construction of the TFA according to the needs of the application for speed and storage. A TFA can achieve significant reductions in the number of states and memory space.	04-10-2014
20140101158	File Handling in a Hierarchical Storage System - A mechanism is provided for file handling in a hierarchical storage system. A user virtual file system scans, reads and analyses data or user behavior to create or modify at least one rule or metadata. The user virtual file system identifies logical or temporal relationships of files based on the at least one rule or the metadata. The user virtual file system groups identified related files in at least one container. The user virtual file system moves the at least one container to different tiers of storage based on the at least one rule or the metadata.	04-10-2014
20140101159	Knowledgebase Query Analysis - A computerized method of analyzing a knowledgebase comprising: assembling a collection of queries made by users to obtain information from the knowledgebase; identifying, in each query, sets of collocated words in that query to form a list of collocated word sets in the collection; and, from the list, identifying and presenting frequently collocated word sets in the collection. Likewise, a histogram of the scaled relative difference between the frequency of word sets at first and second time intervals may be presented.	04-10-2014
20140101160	SYSTEMS AND METHODS FOR CLASSIFYING AND TRANSFERRING INFORMATION IN A STORAGE NETWORK - Systems and methods for data classification to facilitate and improve data management within an enterprise are described. The disclosed systems and methods evaluate and define data management operations based on data characteristics rather than data location, among other things. Also provided are methods for generating a data structure of metadata that describes system data and storage operations. This data structure may be consulted to determine changes in system data rather than scanning the data files themselves.	04-10-2014
20140108403	License Reconciliation with Multiple License Types and Restrictions - Techniques for license reconciliation with multiple license types and restrictions. A method includes grouping a collection of multiple software installation instances, a collection of multiple hardware devices and a collection of multiple software licenses into multiple clusters, generating a reconciliation matrix for each cluster, wherein each row in the reconciliation matrix represents a software installation instance or a hardware device, each column in the reconciliation matrix represents a license type and/or an individual license, and each cell in the reconciliation matrix represents a license requirement and applicability of each software installation instance or hardware device, solving each reconciliation matrix, and generating a license reconciliation plan based on the solved reconciliation matrices.	04-17-2014
20140108404	License Reconciliation with Multiple License Types and Restrictions - Techniques for license reconciliation with multiple license types and restrictions include grouping a collection of multiple software installation instances, a collection of multiple hardware devices and a collection of multiple software licenses into multiple clusters, generating a reconciliation matrix for each cluster, wherein each row in the reconciliation matrix represents a software installation instance or a hardware device, each column in the reconciliation matrix represents a license type and/or an individual license, and each cell in the reconciliation matrix represents a license requirement and applicability of each software installation instance or hardware device, solving each reconciliation matrix, and generating a license reconciliation plan based on the solved reconciliation matrices.	04-17-2014
20140108405	USER-SPECIFIED IMAGE GROUPING SYSTEMS AND METHODS - Digital images may be filtered according to a first user-selectable filtering metadata dimension. The filtered digital images may also be grouped according to a second user-selectable pivoting metadata dimension. A group of the filtered digital images may additionally be selected and focused on. The focused group of filtered digital images may be further filtered and grouped according to further user-selectable metadata dimensions.	04-17-2014
20140108406	Computer-Implemented System and Method For Generating A Reference Set Via Clustering - A computer-implemented system and method for generating a reference set via clustering is provided. A collection of unclassified documents is obtained and grouped into clusters. N-documents are selected from each cluster and are combined as reference set candidates. One of the n-documents from each cluster is located closest to a center of that cluster. A classification code is assigned to each of the reference set candidates. Two or more of the reference set candidates are grouped as a reference set of classified documents.	04-17-2014
20140108407	Computer-Implemented System and Method For Generating A Reference Set Via Seed Documents - A computer-implemented system and method for generating a reference set via seed documents is provided. A collection of documents is obtained. One or more seed documents are identified. The seed documents are compared with the document collection and those documents that are similar to the seed documents are identified as reference set candidates. A size threshold is applied to the reference set candidates, which are grouped as the reference set when the size threshold is satisfied.	04-17-2014
20140114972	SHARING INFORMATION BETWEEN NEXUSES THAT USE DIFFERENT CLASSIFICATION SCHEMES FOR INFORMATION ACCESS CONTROL - Systems and methods for sharing information between distributed computer systems connected to one or more data networks. In particular, a replication system implementing methodologies for sharing database information between computer systems where the databases use different classification schemes for information access control is disclosed.	04-24-2014
20140114973	SYSTEMS AND METHODS FOR PROCESSING AND ORGANIZING ELECTRONIC CONTENT - The present disclosure generally relates to processing and organizing electronic content. In accordance with one implementation, a computer-implemented method is provided that comprises receiving source data from at least one content server, the source data being associated with electronic content. The method also includes generating local data based on at least one of an analysis of the received source data or an extraction from the received source data. Additionally, the method includes classifying the electronic content as being associated with one or more content stacks. Further, the method includes generating representations of the electronic content based on the local data and generating instructions to display at least one content stack on a user interface, each displayed content stack being operable to display one or more of the representations of the electronic content associated with the content stack based on the classification.	04-24-2014
20140114974	CO-CLUSTERING APPARATUS, CO-CLUSTERING METHOD, RECORDING MEDIUM, AND INTEGRATED CIRCUIT - A co-clustering apparatus that performs co-clustering processing on relational data to divide the relational data into cluster blocks, the apparatus including: a distribution tendency generating unit that generates a distribution tendency of statistic amounts of the cluster blocks in the entire relational data, each of the statistic amounts indicating a tendency of relations generated in the corresponding cluster block; a calculating unit that calculates an importance degree for each of the cluster blocks based on the statistic amount of the cluster block and the distribution tendency generated by the distribution tendency generating unit, using a calculation method for changing a result of calculation of the importance degree according to the distribution tendency; and an output unit that outputs at least one piece of information indicating the cluster blocks and information indicating the importance degree calculated for the at least one piece of information by the calculating unit.	04-24-2014
20140114975	METHOD, SYSTEM AND AGGREGATION ENGINE FOR PROVIDING STRUCTURAL REPRESENTATIONS OF PHYSICAL ENTITIES - The present disclosure relates to a method, a system, and an aggregation engine for providing a structural representation of a physical entity. Processing units provide representations of elements composing the physical entity. Each processing unit comprises a label, which represents an element, and a state. Links are established between the processing units. By iteration in the aggregation engine, the states and labels of the processing units are updated based on states and labels of linked processing units. A graphical representation of the physical entity is obtained based on the labels, on the states, and on the links.	04-24-2014
20140114976	ANALYSIS ENGINE CONTROL DEVICE - An analysis engine control device includes: an analysis data selecting for selecting at least two analysis data of a plurality of analysis data that are analysis results obtained by analysis by a plurality of analysis engines, respectively; and an analysis data integration calculating for executing new analysis by using at least two analysis data selected by the analysis data selecting as integration target analysis data. Based on classifications assigned according to one characteristic or each of a plurality of characteristics previously set of the analysis engines, the analysis data selecting selects analysis data obtained by the analysis engines that the classifications are at least partly different from each other, as the integration target analysis data.	04-24-2014
20140114977	SYSTEM AND METHOD FOR DOCUMENT ANALYSIS, PROCESSING AND INFORMATION EXTRACTION - The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system stores a number of diffusion coordinates, wherein the number is linear in proportion to N.	04-24-2014
20140122483	SYSTEM AND METHOD FOR DETERMINING A DURATION FOR USER ACTIVITIES BASED ON SOCIAL-NETWORK EVENTS - An activity-modeling system computes an amount of time that a user is expected to spend when performing activities of a certain type. During operation, the system can obtain a plurality of location events associated with the user, such that a respective location event indicates a time at which the user logged his location while performing an activity related to the activity type. The system selects, from the plurality of location events, a set of location events associated with the activity type. The system determines an activity start-time and an activity end-time for the activity type from the set of location events, and computes an activity-duration time for the activity type based on the determined activity start-time and the activity end-time.	05-01-2014
20140122484	System and Method for Flexible Distributed Massively Parallel Processing (MPP) Database - An embodiment method for massively parallel processing includes assigning a primary key to a first table in a database and a foreign key to a second table in the database, the foreign key of the second table identical to the primary key of the first table, determining a number of partition groups desired for the database, partitioning the first table into first partitions based on the primary key assigned and the number of partition groups desired, partitioning the second table into second partitions based on the foreign key assigned and the number of partition groups desired, and distributing the first partitions and the second partitions to the partition groups as partitioned. An embodiment system for implementing the embodiment methods is also disclosed.	05-01-2014
20140122485	METHOD AND APPARATUS FOR GENERATING A MEDIA COMPILATION BASED ON CRITERIA BASED SAMPLING - An approach is provided for initiating generation of a media compilation based on one or more sampling criteria. A sampling platform determines at least one subset of one or more media items captured of at least one event. The sampling platform also partitions the at least one subset of the one or more media items into one or more bins and generates at least one compilation of the at least one subset of the one or more items based, at least in part, on whether the one or more media items in the one or more bins at least substantially meet one or more sampling criteria.	05-01-2014
20140122486	AUTO-CLASSIFICATION SYSTEM AND METHOD WITH DYNAMIC USER FEEDBACK - An auto-classification system and method provides dynamic user feedback in a guide that is presented to the user. The feedback presented in the guide enables the user to refine the classification model by adding or removing exemplars, creating, editing or deleting rules, or performing other such adjustments to the classification model. This technology enhances the overall transparency and defensibility of the auto-classification process.	05-01-2014
20140122487	Food Supply Chain Automation Farm Testing System And Method - A computationally implemented system and method that is designed to, but is not limited to: electronically storing farming related test information generated through testing of one or more test samples of one or more first farming related items involved in farming related creation of one or more first biologically based substances; and electronically storing symbolic information recorded as being correlated with one or more tags, said one or more tags recorded as being at least temporarily in vicinity of one or more second farming related items involved in farming related creation of one or more second biologically based substances. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.	05-01-2014
20140122488	Food Supply Chain Automation Farm Testing System And Method - A computationally implemented system and method that is designed to, but is not limited to: electronically storing farming related test information generated through testing of one or more test samples of one or more first farming related items involved in farming related creation of one or more first biologically based substances; and electronically storing symbolic information recorded as being correlated with one or more tags, said one or more tags recorded as being at least temporarily in vicinity of one or more second farming related items involved in farming related creation of one or more second biologically based substances. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.	05-01-2014
20140122489	System and Method for Providing Differentiated Storage Service in a Database - In accordance with some embodiments, classification of input/output requests from a database to a storage system may be performed. Each input/output request may be associated with a database class, and each database class may be mapped to a quality of service policy. Thus, quality of service may be enforced such that different data blocks within the storage system of the database may be afforded appropriate quality of service.	05-01-2014
20140122490	Correlation of Written Notes to Digital Content - A system and method receives a plurality of data streams from a smart pen device and a computing device, and indexes the data streams to a synchronized time index. A processor receives a first data stream representing gesture data captured by a smart pen device, and a second data stream representing a sequence of states associated with applications executing on a computing device, such that each state identifies content displayed by a computing device while the gesture data is captured. For example, a state could be a particular page of a digital document displayed by the computing device. After receiving the first and second data streams, the processor indexes the data streams to a synchronized time index, and stores the indexed data streams in a memory.	05-01-2014
20140122491	SYSTEMS AND METHODS FOR AUTHENTICATING AND AIDING IN INDEXING OF AND SEARCHING FOR ELECTRONIC FILES - According to some aspects there is provided a system, method and a device for generating at least one electronic file. The method includes receiving primary data for at least one page to be included in an electronic file; receiving metadata associated with the primary data, the metadata comprising a plurality of tags and corresponding tag values associated therewith; generating a globally unique identifier associated with the page based upon the primary data and the metadata associated therewith; storing the globally unique identifier as a tag value for a unique identifier tag in the metadata associated with that page; generating the at least one page for the file, the at least one page comprising the page data and the metadata including the globally unique identifier; if the at least one page includes a plurality of pages, repeating above to generate a plurality of the pages for the electronic file; and storing the file comprising the at least one generated page in a data storage device.	05-01-2014
20140122492	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS - A method and system for evaluating cross-domain clusterability upon a target domain and a source domain. Target clusterability is calculated as an average of a respective clusterability of at least one target data item comprised by the target domain. Target-side matchability is calculated as an average of a respective matchability of each target centroid of the target domain to source centroids of the source domain, wherein the source domain comprises at least one source data item. Source-side matchability is calculated as an average of a respective matchability of each source centroid of said source centroids to the target centroids. Source-target pair matchability is calculated as an average of the target-side matchability and the source-side matchability. Cross-domain clusterability between the target domain and the source domain is calculated as a linear combination of the calculated target clusterability and the calculated source-target pair matchability. The cross-domain clusterability is transferred to a device.	05-01-2014
20140129558	Timeline-Based Data Visualization of Social Media Topic - A mechanism is provided in a data processing system for timeline-based social media data visualization. The mechanism receives social media data from at least one social media server. The mechanism filters the social media data to identify a plurality of social media posts related to a time-based event. The mechanism assigns the plurality of social media posts into a plurality of time periods within a timeline of the time-based event. The mechanism generates a timeline-based data visualization presenting the plurality of social media posts in relation to the timeline of the time-based event and presents the timeline-based data visualization.	05-08-2014
20140129559	Timeline-Based Data Visualization of Social Media Topic - A mechanism is provided in a data processing system for timeline-based social media data visualization. The mechanism receives social media data from at least one social media server. The mechanism filters the social media data to identify a plurality of social media posts related to a time-based event. The mechanism assigns the plurality of social media posts into a plurality of time periods within a timeline of the time-based event. The mechanism generates a timeline-based data visualization presenting the plurality of social media posts in relation to the timeline of the time-based event and presents the timeline-based data visualization.	05-08-2014
20140129560	CONTEXT LABELS FOR DATA CLUSTERS - Systems and methods for applying and using context labels for data clusters are provided herein. A method described herein for managing a context model associated with a mobile device includes obtaining first data points associated with a first data stream assigned to one or more first data sources; assigning ones of the first data points to respective clusters of a set of clusters such that each cluster is respectively assigned ones of the first data points that exhibit a threshold amount of similarity and are associated with times within a threshold amount of time of each other; compiling statistical features and inferences corresponding to the first data stream or one or more other data streams assigned to respective other data sources; assigning context labels to each of the set of clusters based on the statistical features and inferences.	05-08-2014
20140136537	ANALYSIS OF CLUSTERING SOLUTIONS - A computing system determines incremental values associated with a plurality of clustering solutions. Each of the clustering solutions groups stores of a retailer into clusters in a different way. For each clustering solution in the plurality of clustering solutions, the incremental value associated with the clustering solution indicates a difference between an estimated revenue associated with the clustering solution and revenue associated with a baseline clustering solution. The computing system then determines, based on the incremental values associated with the plurality of clustering solutions, the appropriate number of clusters. The clustering solutions that group the stores into more or fewer clusters than the appropriate number of clusters tend to be associated with incremental values that are the same or lower than the clustering solutions that group the stores into the appropriate number of clusters.	05-15-2014
20140136538	Method and Apparatus for Communications Analysis - A method of grouping communication sessions, the method comprising: selecting a plurality of communications sessions from a data stream; determining which data structures, of said communication sessions, occur more frequently than chance; and sorting the communication sessions into groups, wherein communication sessions which have similar data structures, determined to occur more frequently than chance, are sorted into the same group.	05-15-2014
20140136539	Computer-Implemented System And Method For Visual Document Classification - A computer-implemented system and method for visual document classification are provided. One or more uncoded documents, each associated with a visual representation, are obtained. Reference documents, each associated with a classification code and a visual representation of that classification code, are obtained. At least one of the uncoded documents is compared to the reference documents and the reference documents similar to the uncoded document are identified based on the comparison. A suggestion for assigning one of the classification codes to the uncoded document based on the classification codes of the similar reference documents is provided, including displaying the visual representation of the suggested classification code placed on a portion of the visual representation associated with the at least one uncoded document. An acceptance of the suggested classification code is received and a size of the displayed visual representation of the accepted classification code is increased.	05-15-2014
20140143247	METHOD AND SYSTEM TO CURATE MEDIA COLLECTIONS - Disclosed is a service which obtains media directly from users and from online sources, which obtains events and anniversaries from online sources, which obtains location and date information associated with photographs, which dynamically provides users with a selection of automatically curated collections of photographs based on the then-current location of the user, based on and relevant to personal and publicly recognized anniversaries and holidays (with dates obtained directly from the users and from online sources), based on specific people or locations associated with dates, events, and anniversaries, and which presents intelligently organized location-based collections which can be quickly re-organized by a user.	05-22-2014
20140143248	INTEGRATION TO CENTRAL ANALYTICS SYSTEMS - Embodiments may provide a system and method for providing aggregated data to a client from a plurality of data sources. The method may include maintaining data in a data repository, maintaining a first analytical view having a first set of metadata including a first set of attributes, and maintaining a second analytical view having a second set of metadata including a second set of attributes. The method may include receiving a request for information, the request specifying the first analytical view and the second analytical view. The information from the data repository in accordance with the first set of metadata and the second set of metadata may be extracted and the extracted information may be analyzed to generate aggregated data in accordance with the first set of metadata and the second set of metadata. The aggregated data may be provided to the client.	05-22-2014
20140143249	UNSUPERVISED PRIORITIZATION AND VISUALIZATION OF CLUSTERS - Techniques are disclosed that automatically identify and order the most differentiated clusters from a given collection of clusters within a dataset. A measure of dissimilarity is computed for each cluster from a defined reference cluster, and the clusters are ordered according to the chosen dissimilarity. At least N clusters are selected as the most differentiated clusters relative to the defined reference. Within each cluster, the top-M most distinguishing cluster attributes can be automatically identified by an analogous process that computes the dissimilarity of each cluster attribute to its corresponding attribute in the reference cluster, and orders the attributes by dissimilarity. This then allows for automatic surfacing of what it is about a cluster that differentiates its members relative to the population as a whole, and to provide insight on what action or treatment might be made to address that specific segment of the underlying population.	05-22-2014
20140143250	Centralized Tracking of User Interest Information from Distributed Information Sources - User interest information, including both explicit and implicit interests, is aggregated from numerous distributed information sources and stored in a canonical format. This user interest information can in turn be accessed, edited and analyzed to provide a variety of useful applications for end users and entities that provide information sources.	05-22-2014
20140143251	MASSIVE CLUSTERING OF DISCRETE DISTRIBUTIONS - The trend of analyzing big data in artificial intelligence requires more scalable machine learning algorithms, among which clustering is a fundamental and arguably the most widely applied method. To extend the applications of regular vector-based clustering algorithms, the Discrete Distribution (D2) clustering algorithm has been developed for clustering bags of weighted vectors which are well adopted in many emerging machine learning applications. The high computational complexity of D2-clustering limits its impact in solving massive learning problems. Here we present a parallel D2-clustering algorithm with substantially improved scalability. We develop a hierarchical structure for parallel computing in order to achieve a balance between the individual-node computation and the integration process of the algorithm. The parallel algorithm achieves significant speed-up with minor accuracy loss.	05-22-2014
20140143252	AUTOMATED PRESENTATION OF INFORMATION USING INFOGRAPHICS - In one embodiment, a web browser-based scheme combines structured data, infographic definitions, and visual styling information to render infographics and aggregate collections of infographics, referred to herein as “Vizumes” and “Personas.” In exemplary embodiments of the disclosure, a relational database and/or a file system stores user data, infographic definitions, templates and palettes; combines these elements to produce individual infographic representations or a collection of infographic/visualizations (Vizumes) on a single canvas; allows users to choose different infographic visualizations of the same underlying data; and allows users to change the layout, font style, and color palette to instantly produce different visual presentations from the same data.	05-22-2014
20140149407	REPORT VIEWER USING RADIOLOGICAL DESCRIPTORS - A method and a report viewer for viewing a structured report, such as a medical report describing radiological images using descriptors selected from a predefined list of descriptors, includes the acts of opening the medical report; and in response to the opening act, searching for a further report related to the descriptors of the medical report, and highlighting words and/or sentences in the further report that match keywords derived from the descriptors. The medical report and the further report may be displayed simultaneously with the words and/or sentences being highlighted. The further report may include an unstructured text report, and the method further includes mapping the descriptors to findings in the text report and highlighting the findings.	05-29-2014
20140149408	DATA MINING SHAPE BASED DATA - Embodiments of the disclosure include a method for data mining shape based data, the method includes receiving shape data for each of a plurality of data entries and creating a first abstract from the shape data for each of the plurality of data entries. The method also includes organizing the first abstracts into a plurality of groups based on a criterion and creating a second abstract for each data entry in the plurality of groups based on the criterion and information derived from the first abstract.	05-29-2014
20140149409	MASSIVE RULE-BASED CLASSIFICATION ENGINE - Systems and methods are disclosed herein for performing classification of documents or performing other tasks based on rules. A rule generator receives a request for a rule that will receive as an input a document and output an outcome such as a classification of the document, addition of the document to a whitelist or blacklist, or occurrence of some other outcome. The rules are applied to a document and the document and outcome of the rules are presented to a rater. A rating of the accuracy of the outcome is received from the rater and the rating is propagated to quality metrics of rules that contributed to the outcome. Rules with a quality metric above a threshold may be added to a production rule set. Rules with a quality metric below a threshold may be removed.	05-29-2014
20140149410	METHOD AND SYSTEM FOR IDENTIFYING CLUSTERS WITHIN A COLLECTION OF DATA ENTITIES - Embodiments of a method and system for identifying clusters in collections of data entities are generally described herein. In some embodiments, the method includes defining a metric space over the data entities. A distance function of the metric space may satisfy the triangle inequality. The method may include determining, based on the distance function of the metric space, a value for a number of clusters that minimizes a number of data bits used to define a model of the collection of the data entities. The model may thereby describe the collection of data entities using a minimum description length (MDL). The method may include assigning data entities of the collection of data entities to the clusters. The number of clusters to which the data entities are assigned may correspond to the determined value.	05-29-2014
20140149411	BUILDING, REUSING AND MANAGING AUTHORED CONTENT FOR INCIDENT MANAGEMENT - Building, reusing and calibrating a network of authored content, in one aspect, may comprise clustering a plurality of problem tickets into one or more clusters. The clusters may be associated to one or more FAQ nodes in a FAQ network. The associated one or more FAQ nodes may be checked to determine whether the nodes are part of a broken branch. If the one or more FAQ nodes lead to a broken branch, a user may be notified to update the branch, e.g., with an answer or resolution to the one or more FAQ nodes.	05-29-2014
20140149412	INFORMATION PROCESSING APPARATUS, CLUSTERING METHOD, AND RECORDING MEDIUM STORING CLUSTERING PROGRAM - An information processing apparatus, a clustering method, and a clustering program stored on a recording medium, each of which determines an initial value of a model parameter of an input data set based on a model parameter of a reference data set that is similar to the input data set and is previously clustered, modifies the initial value so as to match the input data set, and obtains a clustering result of the input data set using the updated initial value of the model parameter.	05-29-2014
20140149413	MANAGING A CLASSIFICATION SYSTEM AND ASSOCIATED SELECTION MECHANISM - Generating a wizard includes receiving a scheme as an input source to form a received scheme, wherein the received scheme is a taxonomy, receiving a defined set of content files to form received content files, and loading the received content files and the received scheme. The received content files can be tagged using the received scheme. A wizard can be generated using the received scheme. The generated wizard is capable of use with an application utilizing the scheme, wherein a change in the received scheme is directly represented in the generated wizard.	05-29-2014
20140149414	METHOD AND SYSTEM FOR COLLECTION, AGGREGATION AND DISTRIBUTION OF FREE-TEXT INFORMATION - An interactive computer system and method collects, aggregates and distributes information derived from free-text responses to questions. The system and method collect free-text responses from a subject user and aggregates them with free-text responses from other users. The system and method then uses these free-text responses in learning methodologies (such as temporal spacing) and styles to facilitate long-term learning and knowledge retention.	05-29-2014
20140149415	SYSTEM AND METHOD FOR PROVIDING SEARCH QUERY REFINEMENTS - A system and method for providing search query refinements are presented. A stored query and a stored document are associated as a logical pairing. A weight is assigned to the logical pairing. The search query is issued and a set of search documents is produced. At least one search document is matched to at least one stored document. The stored query and the assigned weight associated with the matching at least one stored document are retrieved. At least one cluster is formed based on the stored query and the assigned weight associated with the matching at least one stored document. The stored query associated with the matching at least one stored document is scored for the at least one cluster relative to at least one other cluster. At least one such scored search query is suggested as a set of query refinements.	05-29-2014
20140149416	SYSTEM AND METHOD FOR ASSET MANAGEMENT - In a method for asset management, a transmission of information is received from a first reporting source about an asset and a transmission of information is received from a second reporting source about the asset. The second reporting source uses a different means of transmission than the first reporting source. A database is populated with the information from the first reporting source and the information from the second reporting source such that information from the first reporting source and information from the second reporting source can be collected from the database. A computer system, accessing the database, generates an asset information report comprising at least a portion of information from the first reporting source and at least a portion of information from the second reporting source.	05-29-2014
20140156659	FUSING CONTEXTUAL INFERENCES SEMANTICALLY - System and methods for performing context inference in a computing device are disclosed. In one embodiment, a method of performing context inference includes: determining, at a computing device, a first context class using context-related data from at least one data source associated with a mobile device; and determining, at the mobile device, a fusion class based on the first context class, the fusion class being associated with at least one characteristic that is common to the first context class and a second context class that is different from the first context class.	06-05-2014
20140156660	METHODS AND SYSTEMS FOR QUANTIFYING AND TRACKING SOFTWARE APPLICATION QUALITY - A computer-implemented method and system for quantifying and tracking software application quality based on aggregated user reviews.	06-05-2014
20140156661	APPARATUS AND METHOD FOR DETECTING VEHICLE - A vehicle detecting apparatus obtains a magnetic signal from strength of a geomagnetic field generated due to a movement of a vehicle, and detects and classifies the moving vehicle on the basis of information regarding the latest magnetic signal change pattern indicating a change in the magnetic signal and information regarding a magnetic signal change pattern with respect to each vehicle stored in a database. The vehicle detecting apparatus receives detection and classification information regarding a vehicle which has actually moved, from an external information providing server, and gradually updates it through learning of a magnetic signal change pattern of the corresponding vehicle.	06-05-2014
20140156662	PREDICTING VARIABLE IDENTIFYING DEVICE, METHOD, AND PROGRAM - A predicting variable identifying device includes a candidate profile creating portion that sequentially rotates correspondence relationships between intervals in a reference profile and ranks, to create respective candidate profiles for each different amount of rotation; a group classifying portion that, based on a setting from the outside, classifies one interval or a plurality of continuous intervals into individual control groups; a rank match checking portion that checks, for each candidate profile B, whether or not the ranks match across the intervals that belong to identical control groups; and a predicting variable identifying portion that identifies, as predicting variables, those variables that indicate energy indicator values in the intervals going back by retrospection times, from a point in time for which a prediction is to be made, for those candidate profiles wherein the ranks have been confirmed to be matching for all of the control groups.	06-05-2014
20140156663	INFORMATION STORAGE MEDIUM STORING CONTENT, CONTENT PROVIDING METHOD, CONTENT REPRODUCING METHOD AND APPARATUS THEREFOR - An information storage medium is disclosed. The information storage medium includes one container file, wherein the container file includes a data box in which a plurality of component files forming content are arranged; and a meta box including information on locations in the content and in the data box with respect to the plurality of component files and characteristic information on the plurality of component files, and the plurality of component files are arranged in the data box according to the characteristic information.	06-05-2014
20140156664	Computer-Implemented System And Method For Populating Clusters Of Documents - A computer-implemented system and method for populating clusters of documents is provided. A set of clusters is placed in a display in relation to a common origin. One of a plurality of unclustered documents in the display is selected and an angle θ of the document from the common origin is determined. An angle σ of the cluster relative to the common origin is computed for each cluster. A difference is determined between the document angle θ and one such cluster angle σ. A predetermined variance is applied to the difference. The document is placed into the cluster when the difference is less than the variance.	06-05-2014
20140164376	HIERARCHICAL STRING CLUSTERING ON DIAGNOSTIC LOGS - A set of strings can be assigned to clusters utilizing one or more clustering techniques. In accordance with one aspect, hierarchical clustering can be performed in which there are several iterations of clustering. For instance, strings can be clustered based on string length, and each cluster can be assigned to separate sub-clusters based on edit distance between strings. In accordance with another aspect, clusters can be analyzed based on the similarity or difference of strings in a cluster to determine if a clustering error exists, and if a clustering error is detected, the cluster can be partitioned into separate clusters.	06-12-2014
20140164377	SYSTEMS FOR SYNCHROPHASOR DATA MANAGEMENT - A system includes a Synchrophasor Data Management System (SDMS), in which the SDMS includes a Synchrophasor Processor System (SPS). The SPS includes a Phasor Data Concentrator (PDC) configured to receive a first plurality of inputs from a first Phasor Measurement Unit (PMU), transform at least one of the first plurality of inputs into a first time aligned output by time aligning the at least one of the first plurality of inputs. The SPS further includes a virtual PMU configured to aggregate the first time aligned output into a PMU dataset, in which the SPS is configured to transmit the PMU dataset to a second PMU, an external PDC, a super PDC, or a combination thereof.	06-12-2014
20140164378	SOURCE RECORD MANAGEMENT FOR MASTER DATA - A method, system, and computer program product for source record management for master data are provided in the illustrative embodiments. A set of data records is received from a set of data sources. A first subset of data records received from a first data source is pre-processed. A match engine is requested to match a data record from the first subset using at least one record in the master data repository, the requesting resulting in a set of matched data records. The set of matched data records, which includes a first data record in the first subset and a second data record in a second subset, is post-processed. The second subset is received from a second data source. The first data record is assigned to a group of records, together representing the entity as a master data record.	06-12-2014
20140164379	Automatic Attribute Level Detection Methods - A method of detecting attribute levels in a dataset that includes determining whether column data in a column for a case identifier is the same, classifying the column data as case level attributes if all of the column data is identical, and classifying the column data as event level attributes if the column data is different for at least one data entry in the column.	06-12-2014
20140164380	METHOD AND APPARATUS FOR AGGREGATING, EXTRACTING AND PRESENTING REVIEW AND RATING DATA - In an embodiment, a system is provided. The system includes a ratings database and a website database. A web crawler is provided to crawl review sites and rating sites of the web. The web crawler is further to access data from the web and store data from the web in a ratings database. The web crawler is also to identify links to other review sites and ratings sites of the data from the web, to store the links in a website database, and to identify sites to crawl from the website database. A data analyzer is provided to analyze data of the ratings database and to normalize data from the web within the ratings database on an essentially continuous basis. A data presenter is provided to receive queries from users and to query the ratings database based on queries from users. The data presenter is to present data to a user responsive to queries and to filter data presented to the user responsive to further user queries.	06-12-2014
20140164381	METHOD AND SYSTEM FOR AGGREGATE BANDING - The invention provides a method of aggregate banding comprising defining an aggregate banding dimension for a first data source, the aggregate banding dimension including at least one aggregation variable, at least one banding variable, and at least one band based at least partly on the at least one banding variable; summarizing the data source based at least partly on the at least one aggregation variable, the summary including at least one distinct value of the at least one aggregation variable; and defining a mapping relationship of the at least one distinct value of the at least one aggregation variable to the or respective band(s) based on the value of the at least one banding variable. The invention further provides related systems and processor-executable instructions.	06-12-2014
20140164382	System and Method for Managing Online Dynamic Content - Methods and systems for managing and publishing online dynamic content. Content can be generated or managed at a publishing server. Content at the publishing server is periodically polled for additions, changes or deletions. Modifications are transmitted to a storage system, which may be further transmitted to a content distribution network, for delivery to a client.	06-12-2014
20140164383	COMPUTER-IMPLEMENTED METHOD AND SYSTEM FOR COMBINING KEYWORDS INTO LOGICAL CLUSTERS THAT SHARE SIMILAR BEHAVIOR WITH RESPECT TO A CONSIDERED DIMENSION - A computer-implemented method and system for combining keywords into logical clusters that share a similar behavior with respect to a considered dimension are disclosed. Various embodiments are operable to order a list of keywords from high activity to low activity, partition the list into at least two sets, a head partition including keywords with an activity level above a predefined threshold, a tail partition including the remainder of the keywords in the list, model the keywords in the head partition based on a set of variables, score the keywords in the head partition based on the modeling, and cluster head partition keywords with tail partition keywords having at least one common variable into at least one keyword cluster.	06-12-2014
20140172854	Apparatus and Methods For Anonymizing a Data Set - Methods and systems are disclosed for anonymizing a dataset that correlates a set of entities with respective attributes. The method may include: for each entity included in a set of entities, transforming two or more attribute values associated with the entity using received preference information, thereby creating for the entity a set of two or more transformed attribute values; clustering the entities included in the set of entities using said transformed attribute values to form at least a first entity cluster consisting of a first subset of the entities and a second entity cluster consisting of a second subset of the entities, wherein no entity included in the first entity cluster is included in the second entity cluster; anonymizing the first subset of entities; and anonymizing the second subset of entities.	06-19-2014
20140172855	FORMATION AND DESCRIPTION OF USER SUBGROUPS - A system forms sub-groups from a given user group of a social networking system and forms descriptions of the sub-groups that provide an intuitive understanding of sub-group composition, such as likings of the sub-groups. In one embodiment, a given user group of a social networking system is clustered into a plurality of sub-groups, and representative characteristics, such as the characteristics of a composite or actual member of the sub-group, are determined for each sub-group. In order to form sub-group descriptions, a set of objects, such as pages of the social networking system, is ranked with respect to the representative characteristics of the sub-group. The highest-ranking objects for a sub-group are then used to form the description of that sub-group. For example, the topics associated with each of the highest-ranking pages can be combined into the sub-group description.	06-19-2014
20140172856	METHOD AND SYSTEM FOR STORYTELLING ON A COMPUTING DEVICE - Disclosed is a method and system for enabling storytelling on a computing device. A processor analyzes a set of media items associated with a user, where each media item has associated metadata. The processor identifies, based on analysis of the associated metadata, one or more related characteristics among the media items in the set to form a cluster of media items associated with an event. The processor selects, based on analysis of the media items in the cluster, a plurality of templates from a template database, where each template is configured to represent a moment in the event. The processor edits selected media items in the cluster to fit into the selected templates. The processor creates a mixed-media module including the plurality of templates organized into a desired sequence for the selected templates.	06-19-2014
20140181106	SHARING PHOTOS - Implementations generally relate to sharing photos. In some implementations, a method includes collecting photos associated with one or more objects, where the photos are collected from a plurality of users. The method also includes collecting attention information associated with the one or more objects. The method also includes generating an attention map based on the attention information. The method also includes grouping the one or more photos into groups of photos based on the attention map. The method also includes causing the groups of photos to be displayed to a target user based on one or more predetermined criteria.	06-26-2014
20140181107	Private Queue for a Media Playback System - Embodiments are discussed for providing private playback queues in a media playback system such that users without access rights to the playback queue may not access the contents of the playback queue. The embodiments may involve receiving at a playback device of a network media system a playlist responsive to an instruction via a first controller interface, adding the playlist to a playback queue associated with the zone, receiving a request from a second controller interface for the information identifying the one or more items in the playback queue, determining that the second controller interface lacks a credential to receive the information identifying the one or more items in the playback queue, and providing the information identifying a subset of the one or more items in the playback queue to the second controller interface.	06-26-2014
20140181108	USING DATA FROM WEARABLE MONITORS TO IDENTIFY AN INDIVIDUAL - The methods and systems described herein may involve determining at least one lifeotype of at least one individual, analyzing the at least one lifeotype, and delivering content to at least one individual based on the analysis. The methods and systems described herein may involve providing a game, determining at least one lifeotype of at least one player of the game, analyzing the at least one lifeotype, and affecting the game play based on the analysis. The methods and systems described herein may involve providing an interactive space, determining at least one lifeotype of at least one individual in the space, analyzing the at least one lifeotype, and modifying at least one attribute of the space based on the analysis.	06-26-2014
20140181109	SYSTEM AND METHOD FOR ANALYSING TEXT STREAM MESSAGE THEREOF - A system and method for analyzing text stream messages for a micro-blog are provided. The system includes a sliding window module, storing a plurality of text stream messages from the micro-blog and updating the plurality of text stream messages once every preset duration; a dynamic text weight module, receiving the plurality of text stream messages and calculating the plurality of text stream messages for generating a burst weight according to a dynamic text stream weight algorithm; a clustering module, clustering the plurality of text stream messages for generating a plurality of clusters by a clustering algorithm according to the plurality of text stream messages and the burst weight; and a memory device, storing the clusters.	06-26-2014
20140181110	METHOD AND SYSTEM FOR STORYTELLING ON A COMPUTING DEVICE VIA MULTIPLE SOURCES - Disclosed is a method and system for enabling storytelling on a computing device. A processor (a) analyzes a first set of media items associated with a user, where each media item has associated metadata; (b) identifies, based on analysis of the associated metadata in the first set, one or more related characteristics among the media items in the first set; (c) forms a cluster of media items associated with an event based on the one or more characteristics; (d) repeats steps (a) and (b) for a second set of media items; (e) adds one or more media items in the second set having the one or more related characteristics, from a computing device associated with the second set, to the cluster; (f) edits selected media items in the cluster to fit into selected templates; and (g) creates a mixed-media module comprising the templates organized into a desired sequence.	06-26-2014
20140188881	System and Method To Label Unlabeled Data - In accordance with an embodiment of the invention, there is provided a technique for permitting a machine to discover classes and topics that data contains and to annotate data objects with those identified classes. The technique enables machines to group and annotate data objects in ways that are meaningful and intuitive for a user of the data objects. An interactive method uses clustering, along with feedback from a user on the clustering output, to discover a set of classes. The feedback from the user is used to guide the clustering process in the later stages, which results in better and better discovery of classes and annotation with more and more human feedback. A method can be used to produce labeled data that involves discovering classes and annotating a given dataset with the discovered class labels. This is advantageous for building a classifier that has wide applications, such as call routing and intent discovery.	07-03-2014
20140188882	SPECIFIC ONLINE RESOURCE IDENTIFICATION AND EXTRACTION - A method of automatically identifying and extracting distributed online resources may include locating in a website a candidate entry list page. The method may also include verifying the candidate entry list page as an entry list page using repeated pattern discovery. The method may also include segmenting the entry list page into a plurality of entry items. The method may also include extracting from the plurality of entry items a plurality of candidate target pages. The method may also include verifying at least some of the candidate target pages as target pages including analyzing a visual structure and presentation of the candidate target pages. The method may also include extracting metadata from the target pages. The method may also include organizing the target pages and/or the metadata in one or more databases.	07-03-2014
20140188883	INFORMATION EXTRACTING SERVER, INFORMATION EXTRACTING CLIENT, INFORMATION EXTRACTING METHOD, AND INFORMATION EXTRACTING PROGRAM - According to one embodiment, an information extracting method includes: collecting a text in which a keyword of interest appears, the keyword of interest, and a time of creation of the text; extracting a keyword included in the text except for the keyword of interest and configured to extract the time of creation; extracting the keyword having a time score obtained on the basis of an appearance frequency of the keyword in a time interval exceeding a first threshold value and a local score obtained from the appearance frequency of the keyword in a predetermined local area exceeding a second threshold value as a local hot word, and also extract the extracted time interval of the extracted keyword and the keyword of interest corresponding to the keyword; and storing the extracted local hot word, the time interval, and the keyword of interest.	07-03-2014
20140195534	CREATING DIMENSION/TOPIC TERM SUBGRAPHS - A term graph for a group (G), where G is defined by a given set of values d for a set of dimensions (D) relative to a topic (X), may be created by retrieving a graph (H) comprising terms related to an entity and associated with topic X; identifying a node (N) that represents topic X in graph H; identifying resources (R) associated with topic X in group G (used or accessed by, or otherwise associated with, values d in group G); compiling a list (L) of terms used in the identified resources (R); and creating, starting from node N, a connected subgraph S representing the term graph, wherein each node in subgraph S represents one of the terms from list L and has a path to node N.	07-10-2014
20140195535	METHODS AND APPARATUSES FOR PROVIDING AND DISPLAYING CLUSTER DATA - Embodiments of a method and apparatus for providing cluster data are generally described herein. In some embodiments, the method includes receiving, from a geographic information system (GIS) client device, geographical coordinates of a viewable extent of the GIS client device. The method may include dividing the viewable extent into portions. The method may include receiving geographically-referenced data points. The method may include determining the portions of the viewable extent in which the data points are located. The method may include transmitting, to the GIS client device, clustering information for the portions. The clustering information may be generated based on the data points determined to be within the portions. Other embodiments are also described.	07-10-2014
20140195536	CREATING DIMENSION/TOPIC TERM SUBGRAPHS - A term graph for a group (G), where G is defined by a given set of values d for a set of dimensions (D) relative to a topic (X), may be created by retrieving a graph (H) comprising terms related to an entity and associated with topic X; identifying a node (N) that represents topic X in graph H; identifying resources (R) associated with topic X in group G (used or accessed by, or otherwise associated with, values d in group G); compiling a list (L) of terms used in the identified resources (R); and creating, starting from node N, a connected subgraph S representing the term graph, wherein each node in subgraph S represents one of the terms from list L and has a path to node N.	07-10-2014
20140195537	CONTENT PROVIDING TECHNIQUES - Techniques for content providing and classifying users based on content search conditions are generally described. In some examples, the techniques may be embodied in apparatus, systems, and methods. An example content providing apparatus may include a receiving unit, a classifying unit, a content acquisition unit, and a determining unit. The receiving unit may be configured to receive content search conditions and the classifying unit may be configured to classify users into types according to the search conditions. The content acquisition unit may be configured to acquire content that includes non-text data based on the received search conditions and the determining unit may be configured to evaluate acquired content to identify data of the non-text data in the acquired content that is firstly processed to output based on the user type.	07-10-2014
20140195538	EFFICIENT ACTIVITY CLASSIFICATION FROM MOTION INPUTS - A device worn by a human performing a method using accelerometer data to classify human activities is disclosed. The method uses memory and computation efficiently. A stream of samples is divided into a sequence of sampling periods; each sample has an acceleration value for each axis (e.g., a 3-D accelerometer). The standard deviation of the values for each axis during each sampling period is calculated. A running sum of each axis' values can be maintained sample-by-sample. Each value is sorted into a bin of a histogram, by quantifying a deviance from a respective mean in standard deviations. A standard deviation produced from samples of a previous period can be used. The histogram is compared with histograms associated with particular activities and a classification output can be produced. Classification outputs from multiple sampling periods can be used for voting. A threshold amount of activity can be required to begin activity classification.	07-10-2014
20140201208	Classifying Samples Using Clustering - An unlabeled sample is classified using clustering. A set of samples containing labeled and unlabeled samples is established. Values of features are gathered from the samples contained in the datasets and a subset of features are selected. The labeled and unlabeled samples are clustered together based on similarity of the gathered values for the selected subset of features to produce a set of clusters, each cluster having a subset of samples from the set of samples. The selecting and clustering steps are recursively iterated on the subset of samples in each cluster in the set of clusters until at least one stopping condition is reached. The iterations produce a cluster having a labeled sample and an unlabeled sample. A label is propagated from the labeled sample in the cluster to the unlabeled sample in the cluster to classify the unlabeled sample.	07-17-2014
20140201209	INFORMATION PROCESSING DEVICE AND METHOD FOR MANAGING FILE	07-17-2014
20140207776	METHOD AND SYSTEM FOR LINKING DATA SOURCES FOR PROCESSING COMPOSITE CONCEPTS - A computer-implemented method and system and computer-readable medium are disclosed for linking an ontology provided by a content service (i.e. category ontology) with a word expansion ontology (i.e. lexical ontology). A user may provide an input such as a voice command to an application. The voice command is processed by a natural language processing (NLP) engine to derive the user's intent and to extract relevant entities embodied in the command. The NLP engine may create a composite concept set containing multiple permutations of the concepts (entities extracted) and provide the composite concept set to a concept mapper. The concept mapper searches a mapping file and applies one or more scoring operations to determine a best match between the composite concept set and at least one category provided by the category ontology. The content service is searched using the category and the results are displayed to the user.	07-24-2014
20140207777	COMPUTER IMPLEMENTED METHODS AND APPARATUS FOR IDENTIFYING SIMILAR LABELS USING COLLABORATIVE FILTERING - Disclosed are methods, apparatus, systems, and computer-readable storage media for identifying similar labels. In some implementations, one or more servers maintain a plurality of data entries in one or more database tables storing textual data, each data entry of a first portion of the data entries including: a text sequence, a label, and a text-to-label association score, and each data entry of a second portion of the data entries including: a first label, a second label, and a similarity score. The one or more servers analyze the data of the first portion of data entries to generate one or more pairs, each pair including information identifying a first label and a second label. The one or more servers calculate a similarity score for each of the one or more pairs and store the respective similarity scores in the second portion of the data entries.	07-24-2014
20140207778	SYSTEM AND METHODS THEREOF FOR GENERATION OF TAXONOMIES BASED ON AN ANALYSIS OF MULTIMEDIA CONTENT ELEMENTS - A method and system for generating concept structures and taxonomies based on received multimedia data elements (MMDEs) are provided. The method comprises receiving at least one MMDE; generating at least one signature for the at least one received MMDE; matching the at least one generated signature to a plurality of clusters to find at least one matching cluster; associating the at least one generated signature with each of the at least one matching cluster; and analyzing the at least one generated signature with respect to a signature reduced cluster (SRC) of each of the at least one matching cluster to generate a taxonomy, wherein the taxonomy relates to the at least one received MMDE and an MMDE respective of each of the at least one matching cluster.	07-24-2014
20140207779	MANAGING TAG CLOUDS - A method, data processing system, and computer program product for managing tags. A computer system identifies one or more groups of similar tags from a multiplicity of tags proposed for inclusion in a tag cloud. The computer system identifies one or more representative tags to represent the respective one or more groups of similar tags. The computer system displays the one or more representative tags in the tag cloud instead of all the similar tags in the one or more groups of similar tags, and concurrently displays other tags in the multiplicity of tags that are not included in the one or more groups of similar tags.	07-24-2014
20140214831	INTEGRATING SMART SOCIAL QUESTION AND ANSWERS ENABLED FOR USE WITH SOCIAL NETWORKING TOOLS - Embodiments include a program product and a method for providing responses to questions provided on a social media site. The method includes receiving, via a processor, a user question from a social networking site and decomposing and filtering the user question so that it can be further analyzed. The method also includes generating a list of most closely matched potential responders based on analysis of the user question and sending the most closely matched potential responders the user question. Upon receiving responses back from the most closely matched potential responders, these responses are aggregated by the processor in a final response format.	07-31-2014
20140214832	INFORMATION GATHERING VIA CROWD-SENSING - Methods and arrangements for gathering and managing crowd-sourced information. An event is identified using crowd-sourced information, and component parts of the event are identified using the crowd-sourced information. Information missing from the event is identified using the crowd-sourced information. Individuals associated with the event are identified, and additional crowd-sourced information on the event is harvested from the individuals.	07-31-2014
20140214833	SEARCHING THREADS - Searching threads can comprise extracting a number of keywords from a number of threads inside a discussion forum in response to a search query, clustering the number of keywords utilizing thread titles and thread content from within the number of threads, and searching for a thread from within the number of threads that is relevant to the search query based on the clustering.	07-31-2014
20140214834	CLUSTERING SIGNIFIERS IN A SEMANTICS GRAPH - Clustering signifiers in a semantics graph can comprise coarsening a semantics graph associated with an enterprise communication network containing a plurality of nodes into a number of sub-graphs containing supernodes; partitioning each of the number of sub-graphs into a number of clusters; and iteratively refining the number of clusters to reduce an edge-cut of the semantics graph, based on the number of clusters.	07-31-2014
20140214835	SYSTEM AND METHOD FOR AUTOMATICALLY CLASSIFYING DOCUMENTS - A system and method for automatically classifying documents using an annotated topic tree is provided. A set of topics may be extracted from a document corpus such that each document in the document corpus is associated with a topic model. A sample set of documents may be selected from the document corpus during a current sampling round. The topic models associated with the sample set of documents may be annotated by human reviewers with coding information. Each coded document may be coded as ‘responsive’, ‘non-responsive’, ‘arguably responsive’, ‘null’, and/or for other codes or issues, which are related to the topic model associated with that document. An annotated topic tree may be formed based on the annotated topic model. One or more machine learning algorithms may be used to project the information in the annotated topic tree to the rest of the document corpus. A voting algorithm which may comprise a plurality of machine learning algorithms may also be used to project the sampling judgments to the rest of the document corpus. To continuously enhance the performance of automatic classification of documents, the projection results may be analyzed after each sampling round.	07-31-2014
20140214836	SYSTEMS AND METHODS USING AN INDIVIDUALS PREDICTED TYPE AND CONTEXT FOR BEHAVIORAL MODIFICATION - The methods and systems described herein may involve determining at least one lifeotype of at least one individual, analyzing the at least one lifeotype, and delivering content to at least one individual based on the analysis. The methods and systems described herein may involve providing a game, determining at least one lifeotype of at least one player of the game, analyzing the at least one lifeotype, and affecting the game play based on the analysis. The methods and systems described herein may involve providing an interactive space, determining at least one lifeotype of at least one individual in the space, analyzing the at least one lifeotype, and modifying at least one attribute of the space based on the analysis.	07-31-2014
20140214837	AUTOMATICALLY ANALYZING OPERATION SEQUENCES - An automatic analysis method for operation sequences and a system thereof are disclosed. The method comprises: receiving at least one operation sequence containing at least one operation record, the operation record including an operation of switching from a previous user interface to a post user interface and an interval time of switching from the previous user interface to the post user interface; forming time-dependent operation record groups of respective operation sequences based on the interval time and a first time threshold, wherein the time-dependent operation record group includes operation records whose interval time is less than the first time threshold; comparing time-dependent operation record groups of respective operation sequences to obtain identical time-dependent operation record groups; and calculating a frequency that identical time-dependent operation record groups occur in the operation sequence to obtain the identical time-dependent operation record groups having high frequency.	07-31-2014
20140214838	METHOD AND SYSTEM FOR PROCESSING LARGE AMOUNTS OF DATA - A method of processing data by creating an inverted column index is presented. The method entails categorizing words in documents according to data type, generating a posting list for each of the words that are categorized, and organizing the words in an inverted column index format. In an inverted column index, each column represents a data type, and each of the words is encoded in a key and the posting list is encoded in a value associated with the key. In some cases, the words that are categorized may be the most commonly appearing words arranged in the order of frequency of appearance in each column. This indexing method provides an overview of words that are in a large dataset, allowing a user to choose the words that are of interest to him and “drill down” into contents that include that word by way of queries.	07-31-2014
20140214839	METHOD AND APPARATUS FOR GENERATING A SORTED LIST OF ITEMS - An electronic device and method for automatic generation of a sorted list of items related to a seed item comprises a relatedness determinator to compare the seed item with a plurality of further items and to determine a relatedness value for each further item with respect to the seed item. The device also has a clustering engine to cluster the further items by determining a relative relatedness between (among) the further items. Each further item is assigned to one cluster. The device also has a list generator to generate a sorted result list by sorting the further items according to both their relatedness value and their belonging (or membership) to a cluster, in that once an item is added to the sorted list, the relatedness-value-dependent ranking of the further items in that cluster is at least momentarily lowered so as to promote adding items of further clusters.	07-31-2014
20140222816	FEEDBACK ANALYSIS FOR CONTENT IMPROVEMENT TASKS - Provided are a method, computer program product, and system for improving content. Feedback related to the content is received from a reviewer. The feedback is analyzed with text analytics and classified based on the feedback analysis. A reviewer score is generated and a task is generated for reviewing the feedback, wherein the task includes the feedback classification and the reviewer score.	08-07-2014
20140222817	System and method for grouping segments of data sequences into clusters - A system and method for grouping segments of data sequences into clusters is a hierarchical clustering method that groups data points into clusters that are globular or compact. Cluster sets can be constructed only for each select level of a hierarchical sequence. Whether a level of a hierarchical sequence is meaningful is determinable prior to when the corresponding cluster set is constructible.	08-07-2014
20140222818	UPDATE CONTROL DEVICE, UPDATE CONTROL PROGRAM, AND UPDATE CONTROL METHOD - An update control device includes an acquiring unit, a classifying unit, and an update processing unit. The acquiring unit acquires component information that indicates a component in multiple devices. The classifying unit calculates the similarity of the component information related to the multiple devices acquired by the acquiring unit and classifies, on the basis of the calculated similarity, the multiple devices into one or multiple device groups. The update processing unit performs a process for updating systems of the devices that are classified into the same device group by the classifying unit.	08-07-2014
20140236946	NEIGHBORHOOD BLOCK COMMUNICATION METHOD AND SYSTEM - A method and system of communication in a neighborhood block includes obtaining member data associated with a first member of the community network and determining a first location associated with the first member based on the member data, where the first location is a neighborhood address. The method and system further include storing the member data in a member repository, determining a first number of points associated with the first member, wherein points are connections of members with each other through social links, and obtaining a first region of communication for the first member based on the first location and the first number of points. The method and system may include displaying the first region of communication on a geo-spatial map and/or bounding the first region of communication based on a connectedness of the first member in the first region of communication.	08-21-2014
20140236947	Computer-Implemented System And Method For Visually Suggesting Classification For Inclusion-Based Cluster Spines - A computer-implemented system and method for visually suggesting classification for inclusion-based document cluster spines are provided. A set of reference documents each associated with a classification code is designated. A different set of uncoded documents is obtained. One or more of the coded reference documents are combined with a plurality of uncoded documents into a combined document set. The documents in the combined document set are grouped into clusters. The clusters are organized along one or more spines, each spine including a vector. A visual suggestion for assigning one of the classification codes to one of the spines is provided, including visually representing each of the reference concepts in the clusters along that spine.	08-21-2014
20140236948	SYSTEM AND METHOD FOR IMPROVING APPLICATION CONNECTIVITY IN A CLUSTERED DATABASE ENVIRONMENT - A clustered database environment includes multiple database instances that appear as one server. An application server can use a data source and connection pools to connect with the clustered database. A notification service broadcasts notifications describing state changes in the database instances, which are then used by the data source and connection pools to control access to the database instances. A data source configuration allows for specification of a preferred affinity policy. A session affinity policy is used to provide database instance affinity for database access made under the context of a web session, whereby database operations are directed to a particular instance for a period of time when the application may be performing multiple, related updates to a specific data set. Directing such operations to a single database instance can be used to improve application performance due to increased local cache utilization.	08-21-2014
20140236949	USER INPUT AUTO-COMPLETION - Methods and computer program product relate to user input auto-completion. The methods and product are executable on a processing device in a computing system environment so as to provide an auto-completion scheme with enhanced capabilities that improve user efficiency when performing a task.	08-21-2014
20140236950	SYSTEM AND METHOD FOR SUPPORTING CLUSTER ANALYSIS AND APPARATUS SUPPORTING THE SAME - Disclosed is a cluster analysis supporting system, with respect to providing a cluster analysis function, including a cluster analysis service apparatus configured to request a distributed processing service apparatus to perform a k-means clustering based on k values within a predetermined range and a preset iteration frequency until a predefined converge condition is satisfied, and if center values of the k values are calculated from the distributed processing service apparatus, select an optimum center value among the center values, and control calculation and application of an optimum k value through an index calculation with respect to applying clustered indexes assigned based on the selected optimum center value to data, and the distributed processing service apparatus configured to perform the k-means clustering based on the k values and the preset iteration frequency provided from the cluster analysis service apparatus upon the request by the cluster analysis service apparatus.	08-21-2014
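The selection loop described in the entry above (run k-means for each k in a range, then pick a k through an index calculation) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the index, so a mean silhouette score is swapped in here, the deterministic farthest-point initialisation is also an assumption, and the function names `kmeans`, `silhouette` and `choose_k` are hypothetical.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, k):
    """Deterministic farthest-point initialisation (an assumption, not from the source)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with a preset iteration limit and a convergence check."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
    return centers, labels


def silhouette(points, labels):
    """Mean silhouette score, used here as the (assumed) index for comparing k values."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Run k-means for every candidate k and keep the result with the best index."""
    best = None
    for k in k_values:
        centers, labels = kmeans(points, k)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, centers, labels)
    return best[1], best[2], best[3]
```

In a distributed setting, each `kmeans(points, k)` call would be the unit of work handed to the processing service, with only the index comparison done centrally.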
20140244640	METHOD AND APPARATUS FOR CONTENT MANIPULATION - A method and system for organizing content data by adding one or more identifiers that place the content data in context or provide additional details about the content. The method includes attaching a label to image data and classifying the image data based on the label.	08-28-2014
20140244641	HOLISTIC CUSTOMER RECORD LINKAGE VIA PROFILE FINGERPRINTS - The present disclosure extends to methods, systems, and computer program products for linking customer profiles in a customer profile database. Customer profile data are transformed from text data to large, sparse bit sets. The bit sets are then clustered into clusters based on similarities between the bit sets. Evaluation and analysis of customer profiles within clusters permit linking of customer profiles that exhibit selected degrees of similarity. This technology is both fast and accurate, and it preserves confidentiality of customer information by converting text data to bit sets.	08-28-2014
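The idea in the entry above (turn profile text into sparse bit sets, then cluster by bit-set similarity) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the disclosed method: hashed character trigrams as the fingerprint, Jaccard similarity as the measure, a greedy single-pass clustering, and the names `fingerprint`, `jaccard` and `cluster_profiles`.

```python
import hashlib
import re


def fingerprint(text, n=3, bits=4096):
    """Map a profile string to a sparse set of bit positions via hashed character n-grams."""
    normalized = " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())
    grams = {normalized[i:i + n] for i in range(len(normalized) - n + 1)}
    # Only hashed positions are kept, so the original text is not stored.
    return {int(hashlib.md5(g.encode()).hexdigest(), 16) % bits for g in grams}


def jaccard(a, b):
    """Jaccard similarity between two sets of bit positions."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_profiles(profiles, threshold=0.5):
    """Greedy clustering: a profile joins the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed fingerprint, [profile indexes])
    for idx, text in enumerate(profiles):
        fp = fingerprint(text)
        for seed, members in clusters:
            if jaccard(fp, seed) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((fp, [idx]))
    return [members for _, members in clusters]
```

Profiles that land in the same cluster would then be candidates for linking; the threshold plays the role of the "selected degree of similarity".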
20140244642	DATA PARTIONING BASED ON END USER BEHAVIOR - End user data partitioning can include receiving a number of data queries for a data source from a user, developing a dimension relation graph based on attributes of the number of data queries, and partitioning the data source based on the dimension relation graph.	08-28-2014
20140244643	WORKLOAD IDENTIFICATION - An embodiment of the invention provides an apparatus and method for classifying a workload of a computing entity. In an embodiment, the computing entity samples a plurality of values for a plurality of parameters of the workload. Based on the plurality of values of each parameter, the computing entity determines a parameter from the plurality of parameters that the computing entity's response time is dependent on. Here, the computing entity's response time is indicative of a time required by the computing entity to respond to a service request from the workload. Further, based on the identified significant parameter, the computing entity classifies the workload of the computing entity by selecting a workload classification from a plurality of predefined workload classifications.	08-28-2014
20140244644	EVENT DETECTION ALGORITHMS - A method for analysing incoming data, comprising the steps of processing the incoming data in segments to output a sequence of segment types by extracting one or more properties of an incoming data segment and forming an Unknown Property Vector for each segment of data in the incoming data, and processing the sequence of segment types to identify events in the incoming data. The sequence of segment types is determined, for each segment, by reference to a set of Reference Property Vectors that are relevant to the Unknown Property Vector. This may involve application of first and/or second and/or further functions to identify at least a first subset of Reference Property Vectors that are relevant to the Unknown Property Vector. Alternatively, a logistic regression algorithm, derived using clustering or classification methods for identifying candidate vectors, may be used.	08-28-2014
20140244645	SYSTEM AND METHOD TO PROVIDE GROUPING OF WARNINGS GENERATED DURING STATIC ANALYSIS - The present disclosure generally relates to warnings generated based on static analysis and, more particularly, to grouping warnings generated based on static analysis. In one embodiment, a method for grouping a plurality of warnings generated based on a static analysis of an application program is provided. The method may include analyzing, by one or more processors using programmed instructions stored in a memory, the application program to generate the plurality of warnings; identifying, by the one or more processors, one or more similar warnings based on the plurality of warnings, the similar warnings having structurally and semantically similar expressions of interest (EOI); and generating, by the one or more processors, one or more groups of warnings based on the plurality of warnings, the one or more groups of warnings including one or more of corresponding identified similar warnings.	08-28-2014
20140244646	PROCESSING WEBPAGE DATA - A method, apparatus and system for processing webpage data. The method includes: in response to a webpage being opened, sending a link contained in the webpage to a network side device; receiving a group identification from the network side device, the group identification being determined by the network side device according to the link and used to specify a group the link belongs to; determining whether there is a browsed link belonging to the group specified by the group identification; and in response to determining there is a browsed link belonging to the group specified by the group identification, prompting that webpage content pointed by the link contained in the webpage has been browsed.	08-28-2014
20140244647	DERIVATION OF ONTOLOGICAL RELEVANCIES AMONG DIGITAL CONTENT - System for ontological evaluation and filtering of digital content evaluates metadata associated with content available from an original content server. The metadata is filtered and evaluated by a processing cluster to develop correlation among content for the formation of content “channels”. In general, the filtering and evaluation criteria use predictive algorithms and seek to identify content that is likely to be desired for download by the consumers located at, for example, a particular multi-dwelling unit. The content, once so correlated, is then grouped or aggregated into “channels”.	08-28-2014
20140244648	GEOGRAPHICAL DATA STORAGE ASSIGNMENT BASED ON ONTOLOGICAL RELEVANCY - System and method for distributing channelized content to a plurality of cohesive local networks. The channelized content is aggregated at a remote central processor with associated storage according to ontological relevancy, and then distributed over a network such as the internet to the local networks. The consumers connected to the respective local networks receive that channelized data with low latency and improved data rates.	08-28-2014
20140244649	SYSTEM AND METHODS FOR PREDICTING FUTURE TRENDS OF TERM TAXONOMIES USAGE - A method and system for predicting future trends of term taxonomies of user-generated content are provided. The method includes crawling one or more sources of user-generated content to collect phrases mentioned by users of the one or more data sources; periodically analyzing one or more term taxonomies to determine at least a trend of at least a non-sentiment phrase with respect to a plurality of sentiment phrases, wherein a term taxonomy is an association between a non-sentiment phrase and a sentiment phrase, and the non-sentiment and sentiment phrases are included in the collected phrases; and generating a prediction of future behavior of the at least one trend with respect to the one or more term taxonomies.	08-28-2014
20140244650	DISTRIBUTED EVENT PROCESSING - A distributed event processing method includes providing a plurality of connectors. Each provided connector is configured to acquire event data from an assigned data source, partition acquired event data into clusters, and divide each cluster into chunks. The method also includes collecting the chunks from the plurality of connectors and storing the chunks to a data file that can be queried.	08-28-2014
20140250125	IDENTIFYING AN INCIDENT-ADDRESSING STEP - Relationships among incident-addressing steps, applications, and incidents are determined, based on information relating to the applications and the incidents. For addressing a given incident that occurred with respect to a particular application, at least one incident-addressing step is identified using the determined relationships.	09-04-2014
20140250126	Photo Clustering into Moments - In one embodiment, a method includes automatically and without user input grouping one or more images captured by a first user into clusters of particular moments based at least in part on metadata associated with one or more of the images or data determined through analysis of one or more of the images. Each particular moment is associated with a particular geo-location and time. The method also includes, for each of one or more of the clusters, determining curating information corresponding to the cluster based at least in part on the metadata associated with images in the cluster, the data determined through analysis of images in the cluster, or social-graph information associated with images in the cluster; and providing the clusters of images and at least some of the curating information corresponding to them for display on a computing device of the first user.	09-04-2014
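Grouping images into moments tied to a geo-location and time, as in the entry above, can be sketched with a simple rule: walk the photos in time order and start a new moment whenever the time gap or the distance from the previous photo exceeds a threshold. The thresholds, the `Photo` record and the function names are hypothetical; the abstract does not specify the grouping criteria.

```python
import math
from dataclasses import dataclass


@dataclass
class Photo:
    name: str
    timestamp: float  # seconds since epoch, e.g. taken from image metadata
    lat: float        # degrees
    lon: float        # degrees


def haversine_km(a, b):
    """Great-circle distance between two photos in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def group_into_moments(photos, max_gap_s=3600, max_km=1.0):
    """Split time-ordered photos into moments at large time gaps or location jumps."""
    moments = []
    for photo in sorted(photos, key=lambda p: p.timestamp):
        if moments:
            last = moments[-1][-1]
            close_in_time = photo.timestamp - last.timestamp <= max_gap_s
            close_in_space = haversine_km(last, photo) <= max_km
            if close_in_time and close_in_space:
                moments[-1].append(photo)
                continue
        moments.append([photo])
    return moments
```

Curating information (a title, a representative photo) could then be derived per moment from the metadata of its members.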
20140258293	NAVIGATION SYSTEM WITH DEDUPER MECHANISM AND METHOD OF OPERATION THEREOF - A method of operation of a navigation system includes: determining a similarity level based on comparing a plurality of a point of interest (POI) record; generating a record cluster based on the similarity level for grouping the plurality of the POI record; and generating an exemplary POI based on the record cluster for displaying on a device.	09-11-2014
20140258294	ON-DEMAND CONTENT CLASSIFICATION USING AN OUT-OF-BAND COMMUNICATIONS CHANNEL FOR FACILITATING FILE ACTIVITY MONITORING AND CONTROL - Communications to a server over an in-band communications channel are monitored for requests to access a file. Based on the communications, a request to access a particular file stored by the server is identified. Security and/or audit rules are identified based on the request. A determination is thereafter made that the security and/or audit rules require evaluation of classification information for contents of the requested file. Thus, a determination is made as to whether classification information for the contents of the particular file is available, such as determining whether the classification information is stored in a local classification cache. Responsive to a determination that the classification information is not available, classification information is obtained for the contents of the particular file using an out-of-band communications channel. Thereafter, processing with respect to the request to access the particular file is performed based on the obtained classification information and the one or more security and/or audit rules.	09-11-2014
20140258295	Approximate K-Means via Cluster Closures - A set of data points is divided into a plurality of subsets of data points. A set of cluster closures is generated based at least in part on the subset of data points. Each cluster closure envelopes a corresponding cluster of a set of clusters and is comprised of data points of the enveloped cluster and data points neighboring the enveloped cluster. A k-Means approximator iteratively assigns data points to a cluster of the set of clusters and updates a set of cluster centroids corresponding to the set of clusters. The k-Means approximator assigns data points based at least in part on the set of cluster closures.	09-11-2014
20140258296	SYSTEM AND METHOD FOR MANAGEMENT OF NETWORK MONITORING INFORMATION - A system and method for management of network monitoring information includes a data collector for collecting real-time network information from network switching units, an aggregator for periodically aggregating the collected real-time network information and generating corresponding history information, a preprocessor for periodically determining results for first queries based on the collected real-time network information and the history information, a data storage system, and a data retriever for retrieving information from the data storage system. The data storage system stores the collected real-time network information, the aggregated history information, and the preprocessed results of the first queries. The data storage system also periodically purges the stored real-time information based on a first time-to-live value and the stored history information based on a second time-to-live value. The information is retrieved based on the stored real-time network and history information, the stored preprocessed results of the first queries, the first queries, and second queries different from the first queries.	09-11-2014
20140258297	Automatic grouping of photos into folders and naming the photo folders - This invention provides intelligent automatic grouping of photos. This is done by recognizing multiple photos taken from the same scene and grouping them together and displaying only one of the photos (auto-selected photo) from the group as the top-most visible photo on the screen. The user can see other photos from the same scene by placing the cursor on or clicking the auto-selected photo from the group. The device can automatically propose the best photo from the group and suggest photos that can be deleted from the group. This invention also provides another feature. Based on time and date, GPS location information, contacts' addresses, events (such as birthdays, Christmas, Valentine's Day), etc., the device automatically groups the photos in a folder and provides the proper name for it, such as “Valentine 2013 in Los Angeles” or “John's birthday”.	09-11-2014
20140258298	PROFESSIONAL ADVICE AGGREGATION SYSTEM - A forum server distinguishes credentialed professional users from other users while maintaining the anonymity of all users. All users supply information by which they are accurately and personally identified. Personally identifying information of credentialed professionals is used to verify the credential of the user through a credential authority server. In a user forum, both lay and professionally credentialed users can participate. Lay users are prevented from rating messages posted by credentialed professional users. In a professional consultation forum, only credentialed professionals are permitted to participate. Credentialed professional users are provided with a user interface by which they can rate and comment upon messages posted by other credentialed professionals. Thus, the ratings of peers are well-informed ratings. When a credentialed professional user participates in either forum, the aggregate peer rating of the user is displayed.	09-11-2014
20140258299	Method for Assigning Similarity-Based Codes to Life Form and Other Organisms - A system and method for assigning a classification code, name or identification number to a life form. The classification code is based on the similarity of a nucleic acid sequence of a life form to another life form. The classification code has a plurality of predetermined positions with each position corresponding to a threshold level of nucleic acid similarity to a reference life form having a nucleic acid sequence.	09-11-2014
20140258300	Independent Table Nodes In Parallelized Database Environments - A recipient node of a multi-node data partitioning landscape can receive, directly from a requesting machine without being handled by a master node, a first data request related to a table. A target node of a plurality of processing nodes can be identified to handle the data request. The determining can include the recipient node applying partitioning information to determine a target data partition of the plurality of data partitions to which the data request should be directed and mapping information associating each data partition of the plurality of data partitions with an assigned node of the plurality of processing nodes. The recipient node can redirect the data request to the target node so that the target node can act on the target data partition in response to the data request.	09-11-2014
20140280137	SENSOR ASSOCIATED DATA OF MULTIPLE DEVICES BASED COMPUTING - Computer-readable storage media, apparatus and method associated with storing a copy of local data in a historical data store, among other embodiments, are disclosed herein. In embodiments, one or more computer-readable storage media may contain instructions which when executed by a computing device may provide access of local data to one or more applications on the computing device for contemporaneous processing by the one or more applications. The local data may be associated, at least in part, with one or more sensors of the computing device. In some embodiments, a copy of the local data may be transmitted to a remote historical data store where it may be categorized and correlated with data from computing devices associated with one or more other users for further processing.	09-18-2014
20140280138	CONTEXT DEMOGRAPHIC DETERMINATION SYSTEM - Systems, methods, and devices for determining contexts and determining associated demographic profiles using information received from multiple demographic sensor enabled electronic devices, are disclosed. Contexts can be defined by a description of spatial and/or temporal components. Such contexts can be arbitrarily defined using semantically meaningful and absolute descriptions of time and location. Demographic sensor data is associated with or includes context data that describes the circumstances under which the data was determined. The demographic sensor data can include demographic sensor readings that are implicit indications of a demographic for the context. The sensor data can also include user reported data with explicit descriptions of a demographic for the context. The demographic sensor data can be filtered by context data according to a selected context. The filtered sensor data can then be analyzed to determine a demographic profile for the context that can be output to one or more users or entities.	09-18-2014
20140280139	Detection and Visualization of Schema-Less Data - Embodiments provide a viewer/editor for schema-less data, such as a NoSQL database. The data structures are displayed so that each entity type in the data uses a different color and variable column widths. This allows the user to identify relationships between entities. For a selected entity, only the properties applicable to that entity are displayed by the viewer/editor. The column width for each property is optimized to reduce confusion and to allow the user to focus on the selected data.	09-18-2014
20140280140	METHOD AND SYSTEM FOR DISPLAYING RECOMMENDED CONTENT SUCH AS MOVIES ASSOCIATED WITH A CLUSTER - A method and system for recommending content includes a user device having a memory storing a taxonomy table having content cluster identifiers therein. The user device receives an external recommendations list for the content cluster at the user device. The recommendations list has a plurality of content identifiers each having one content cluster identifier. A viewer tracking module generates a viewed content history for content relative to the content clusters identifiers that correspond to viewed content at the user device. A recommendation module generates an internal recommendations list by comparing the external recommendations list to the viewed content history at the user device. The internal recommendation list also presents recommendations capturing the distinct user tastes in a family viewing device. A display displays the internal recommendations list, with section headers of different granularity describing the nature of the recommended content at cluster, sub-genre and genre levels.	09-18-2014
20140280141	METHOD AND SYSTEM FOR GROUPING AND CLASSIFYING OBJECTS IN COMPUTED TOMOGRAPHY DATA - A method for classifying objects in volumetric computed tomography (CT) data is described. The method is implemented by a computing device having a processor and a memory coupled to the processor. The method includes receiving, by the computing device, one or more volumetric CT data sets, identifying, by the computing device, a first object in the one or more volumetric CT data sets, identifying, by the computing device, a second object in the one or more volumetric CT data sets, determining, by the computing device, a first similarity amount between the first object and the second object, identifying, by the computing device, a first group comprising at least the first object and the second object, based at least in part on the first similarity amount, and designating, by the computing device, all of the objects in the first group as non-contraband.	09-18-2014
20140280142	DATA ANALYTICS SYSTEM - Methods, computer readable media, and apparatuses for building data models and performing model-based analytics are presented. A data analytics system may be implemented including software components to provide a data analytics platform and/or a data analytics programming interface. The components of the data analytics system may allow users to create and execute software applications to build data models and perform model-based analytics functionality such as classification and prediction. For example, a programming interface may allow users to build data models of various predetermined model types, such as statistical data models, spatial statistical data models, graphical data models, pattern mining data models, clustering data models, and machine learning data models, among others, based on an input data set. A parallel computing infrastructure, for example, a cloud computing environment, may be used to build data models and perform model-based analytics with distributed processing and distributed storage.	09-18-2014
20140280143	PARTITIONING A GRAPH BY ITERATIVELY EXCLUDING EDGES - Methods, machines, and stored instructions are provided for partitioning a graph of nodes into clusters of nodes by iteratively excluding edges in the graph. For each node of at least a subset of nodes in the graph, a graph partitioning module determines whether to exclude edges for the node and, if so, selects for exclusion edge(s) to at least a subset of the node's neighbor(s). The module selects edge(s) to the node's neighbor(s) for exclusion based at least in part on a degree of overlap between the node's neighbor(s) and neighbor(s) of the node's neighbor(s). For any subset(s) that are not yet sufficiently partitioned into clusters, the module repeats the step of determining whether to exclude edges and, if so, selecting nodes for exclusion, and determining whether or not the nodes are sufficiently partitioned. Subset(s) of nodes that are already sufficiently partitioned may be skipped during the repeated steps.	09-18-2014
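The edge-exclusion idea in the entry above can be illustrated with a small sketch: an edge is dropped when the two endpoints share too few neighbors, the pass is repeated until nothing changes, and the remaining connected components are read off as clusters. The overlap measure (Jaccard over closed neighborhoods), the threshold and the stopping rule are assumptions made for illustration; the abstract leaves all three open, and the names `prune_once`, `components` and `partition` are hypothetical.

```python
def prune_once(graph, min_overlap):
    """Return a copy of the graph without edges whose endpoints overlap too little."""
    kept = {node: set() for node in graph}
    for u, neighbors in graph.items():
        for v in neighbors:
            nu = graph[u] | {u}  # closed neighborhood of u
            nv = graph[v] | {v}  # closed neighborhood of v
            if len(nu & nv) / len(nu | nv) >= min_overlap:
                kept[u].add(v)
                kept[v].add(u)
    return kept


def components(graph):
    """Connected components of an undirected graph, as sorted lists of nodes."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result.append(sorted(group))
    return sorted(result)


def partition(adjacency, min_overlap=0.5):
    """Repeat edge exclusion until no edge is removed, then return the clusters."""
    graph = {node: set(neighbors) for node, neighbors in adjacency.items()}
    while True:
        pruned = prune_once(graph, min_overlap)
        if pruned == graph:  # stable: treated here as 'sufficiently partitioned'
            break
        graph = pruned
    return components(graph)
```

For two triangles joined by a single bridge edge, the bridge endpoints share only each other, so the bridge is excluded and the triangles come out as separate clusters.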
20140280144	SYSTEM AND METHOD FOR CLUSTERING DATA IN INPUT AND OUTPUT SPACES - A method of clustering a plurality of documents having input and output space data is disclosed that uses both input and output space criteria. The method can include aggregating documents into clusters based on input and/or output space similarity measures, and then refining the clusters based on further input and/or output space similarity measures. Aggregating the documents into clusters can include forming a hierarchical tree based on the input and/or output space similarity measures where the hierarchical tree has a root node, branching into intermediate nodes, and branching into leaf nodes covering individual documents, where the hierarchical tree includes a leaf node for each document of the plurality of documents. The method can then include forming a forest of sub-trees of the hierarchical tree based on cluster criteria. Textual and numeric similarity measures can be used depending on the type and distribution of data in the input and output spaces.	09-18-2014
20140280145	SYSTEM AND METHOD FOR CLUSTERING DATA IN INPUT AND OUTPUT SPACES - A system for clustering a plurality of documents having input and output space data is disclosed that uses both input and output space criteria. The system can aggregate documents into clusters based on input and/or output space similarity measures, and then refine the clusters based on further input and/or output space similarity measures. Aggregation of documents into clusters can include forming a hierarchical tree based on the input and/or output space similarity measures where the hierarchical tree has a root node, branching into intermediate nodes, and branching into leaf nodes covering individual documents, where the hierarchical tree includes a leaf node for each document of the plurality of documents. The system can include forming a forest of sub-trees of the hierarchical tree based on cluster criteria. Textual and numeric similarity measures can be used depending on the type and distribution of data in the input and output spaces.	09-18-2014
20140280146	PER-ATTRIBUTE DATA CLUSTERING USING TRI-POINT DATA ARBITRATION - Systems, methods, and other embodiments associated with clustering using tri-point arbitration are described. In one embodiment, a method includes selecting a data point pair and a set of arbiter points. A tri-point arbitration similarity is calculated for data point pairs based, at least in part, on a distance between the first and second data points and the arbiter points. In one embodiment, similar data points are clustered.	09-18-2014
20140280147	DATABASE ONTOLOGY CREATION - According to an example embodiment, a device for providing information regarding database contents includes data storage and a processor associated with the data storage. The processor identifies a database including a plurality of members and feature information regarding at least one feature of the members, respectively. The processor determines at least one categorizing indicator from a source that is external to the database and determines whether there are any associated indicators in the feature information that correspond to the categorizing indicator. The processor identifies the members of the database having the associated indicators and associates the identified members with a category based on the categorizing indicator.	09-18-2014
20140280148	METHOD FOR ANALYZING AND CATEGORIZING SEMI-STRUCTURED DATA - A computer system interconnected to a community of users having a data processor input module programmed to receive communications from said users including one or more inputs regarding food recipes and store said inputs in accessible memory. A data processor determining module programmed to access stored data and to apply a data interpretative algorithm to said data to unify and organize disparate data inputs into a cohesive database relating to recipes. Also, a search entry module connected to the recipe database to permit access to the database to support a search algorithm applied to the database.	09-18-2014
20140280149	METHOD AND SYSTEM FOR CONTENT AGGREGATION UTILIZING CONTEXTUAL INDEXING - A content management system interconnects multiple information sources and enables rapid access to documents, files or email by creating a contextual indexing layer in which information is organized around the people and companies within an organization's network and then presents the linked information through an application interface layer allowing a user to find anything rapidly, without the use of traditional keyword searching.	09-18-2014
20140280150	Multi-source contextual information item grouping for document analysis - A method and system for processing informational items originating from a plurality of information sources into a derived document for topical analysis thereof. Informational items are collated from one of the sources in accordance with a predetermined plurality of relevant attributes and a key property value common to select ones of the relevant attributes. Informational items are then grouped from the plurality of sources associated with the key common property value to form a document, wherein the informational items therein are marked on the informational source thereof. The document is then analyzed for topical identification.	09-18-2014
20140280151	Computerized System and Method for Identifying Relationships - Methods, systems and articles of manufacture for discovering relationships among data elements within a dataset are disclosed. A first relationship is identified between a first data element and a second data element by identifying a correlation between a first attribute of the first data element and the first attribute of a second data element. A second relationship indicator is generated that is indicative of a relationship between the first data element and the second data element based on the correlation between the first attribute of the first and second data elements. Various embodiments can identify implicit relationships across one or more levels of explicit relationships where the explicit relationships can be across different attributes. Such techniques can be employed in various types of application programs.	09-18-2014
20140280152	COMPUTING SYSTEM WITH RELATIONSHIP MODEL MECHANISM AND METHOD OF OPERATION THEREOF - A computing system includes: a contact identification module configured to identify a contact-profile for representing a contact; a recording module, coupled to the contact identification module, configured to identify an interaction with the contact; a clustering module, coupled to the recording module, configured to generate a category cluster from processing the interaction; and a relationship modeling module, coupled to the clustering module, configured to generate a connection model including the category cluster for characterizing the interaction with the contact for displaying on a device.	09-18-2014
20140280153	SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING A GROUP COMMAND WITH A PREDICTIVE QUERY INTERFACE - Disclosed herein are systems and methods for implementing a GROUP command with a predictive query interface including means for generating indices from a dataset of columns and rows, the indices representing probabilistic relationships between the rows and the columns of the dataset; storing the indices within a database of a host organization; exposing the database of the host organization via a request interface; receiving, at the request interface, a query for the database specifying a GROUP command term and a specified column as a parameter for the GROUP command term; querying the database using the GROUP command term and passing the specified column to generate a predictive record set; and returning the predictive record set responsive to the query, the predictive record set having a plurality of groups specified therein, each of the returned groups of the predictive record set including a group of one or more rows of the dataset. Other related embodiments are further disclosed.	09-18-2014
20140280154	SCALABLE DATA TRANSFER IN AND OUT OF ANALYTICS CLUSTERS - Embodiments of the invention relate to analytics clusters and to efficiently supporting read and write requests in the cluster. In one aspect, one or more compute nodes within a region of the cluster are designated to support the request, and based upon the designation, the request is directly communicated between a requesting agent external to the cluster and the supporting compute node(s). The direct communication mitigates the functionality of the head node(s) supporting the compute node(s).	09-18-2014
20140280155	COMPUTER-IMPLEMENTED SYSTEMS AND METHODS FOR COMPARING AND ASSOCIATING OBJECTS - Computer-implemented systems and methods are disclosed for comparing and associating objects. In some embodiments, a method is provided for associating a first object with one or more objects within a plurality of objects, each object comprising a first plurality of properties, each property comprising data reflecting a characteristic of an entity represented by the object, the associated objects comprising matching data in corresponding properties for a second plurality of properties. The method may include executing, for each object within the plurality of objects and for the first object, the following: creating a slug for the object, the slug comprising the second plurality of properties from the object; and inputting the slug for the object into a Bloom filter. Further, the method may include creating for a bin within the Bloom filter corresponding to the slug for the first object, an association between objects whose slugs correspond to the bin if the slugs for those objects match.	09-18-2014
20140280156	GENERATING CUSTOM AUDIO CONTENT FOR AN EXERCISE SESSION - Systems, apparatuses, and methods can provide customized exercise sessions and customized videos corresponding to the exercise session. Audio clips can be dynamically selected to make custom audio content for an exercise session. The audio clips and metadata can be obtained, where the audio clips correspond to categories. The exercise session can include one or more components. A destination timeline for a component can include one or more first segments that require audio, and one or more second segments that can optionally have audio. Audio clips can be selected for the various segments, where a segment can be designated for a particular category of audio clips. Identification information for the selected audio clips can be saved and used to generate the custom audio content.	09-18-2014
20140280157	MANAGEMENT OF DATA FEEDS FROM DEVICES AND PUBLISHING AND CONSUMPTION OF DATA - The present invention is directed towards a computer-implemented method and system for managing device data feeds. The computer-implemented method and system comprise using a data model to describe type of data received from the devices, grouping the received type of data based on a data description, and forwarding the device data to a receiver endpoint as directed by the subscription information comprising a receiver endpoint and a rule uniquely identified by the subscription identifier using application programming interface key to manage access to the device data.	09-18-2014
20140280158	SYSTEMS AND METHODS FOR MANAGING LARGE VOLUMES OF DATA IN A DIGITAL EARTH ENVIRONMENT - A computer-implemented method for managing large volumes of data comprises dividing data about a number of features into a plurality of data groups, each of the groups having a plurality of features, each of the features having a plurality of properties, and each of the properties having a property value; for each of the groups, determining a number of distribution ranges for the property values for each of the properties; for each of the groups, determining a number of features having property values that are within each of the distribution ranges; and generating a summary associated with each of the groups, the summary comprising the properties of the features in the group and the number of the features that are within each of the distribution ranges for the properties.	09-18-2014
20140280159	DATABASE SEARCH - The present disclosure provides a method and apparatus for database search, by grouping data entries based on join conditions between the data entries as set in search conditions; and executing the search based on the grouping of the data entries, and can efficiently and effectively resolve the issues that are common to the existing MapReduce query processing systems, thereby being particularly suitable for big dataset analytics in a large cluster system.	09-18-2014
20140280160	SYSTEM FOR NON-DETERMINISTIC DISAMBIGUATION AND QUALITATIVE ENTITY MATCHING OF GEOGRAPHICAL LOCALE DATA FOR BUSINESS ENTITIES - There is provided a method that includes (a) receiving data that describes a location, (b) extrapolating, from the data, an address associated with the location, (c) identifying a segment of a thoroughfare that includes the address, (d) defining a polygon that has a perimeter that encompasses a geographic region that is in a vicinity of the segment, (e) obtaining geo-coordinates of a point within the polygon, (f) identifying an address at the geo-coordinates, and (g) identifying an entity that is associated with the address at the geo-coordinates. There is also provided a system that performs the method, and a storage device that contains instructions that control a processor to perform the method.	09-18-2014
20140280161	SYNTACTIC TAGGING IN A DOMAIN-SPECIFIC CONTEXT - This application relates generally to defining a domain-specific syntax characterizing a functional information system and performing operations on data entities represented by the domain-specific syntax, including defining a domain-specific syntax, receiving and storing a domain-specific data entity, assigning a syntactic tag to the domain-specific data entity, and electronically storing the tag assigned to the data entity in the electronic data store so that the tag is logically linked to the stored data entity.	09-18-2014
20140280162	DATABASES AND METHODS OF STORING, RETRIEVING, AND PROCESSING DATA - A non-transitory computer-readable medium having computer-readable instructions stored thereon which, when executed by a computer, cause the computer to perform a method of processing data comprising the steps of: receiving data associated with event instances; and for each of a plurality of iteration methods: partitioning the incoming event instances into logical data partitions; assigning an identifier to each event instance such that events classified in the same logical data partition receive the same identifier and a given event instance is always assigned the same identifier; and inserting each event instance into a doubly-linked list associated with the identifier in an appropriate location.	09-18-2014
20140280163	SYSTEMS AND METHODS FOR FIELD DATA COLLECTION - Systems and methods are provided for automated field data collection. A client system may download graphical representations and space hierarchy information associated with the project from a server. A user may open, on the client system, an area where the field data collection is to occur (e.g., a room in a building), whereby the client system automatically navigates to a pre-defined region of the graphical representation (e.g., an architectural floor plan). For each discrepancy identified, the user may touch a corresponding location on the graphical representation and select a discrepancy type from a list. The user may then associate additional data files, such as image files, audio files, video files, and GPS coordinates with the discrepancy. An organization responsible for correcting the discrepancy may be automatically assigned and/or notified based on an association made on the server.	09-18-2014
20140280164	SEMANTIC TO NON-SEMANTIC ROUTING FOR LOCATING A LIVE EXPERT - A system and method of semantic to non-semantic routing for locating an expert. An inquiry-type database has a first layer of inquiry types organized from underlying criteria groupings (humanly understandable descriptors). Additional layers are associated in a one-to-one correspondence with the first layer of inquiry types. Experts, having individualized knowledge, are listed in a skill-set database associated with the inquiry types. The skill-set database entries are linked to the associated inquiry-type by a numerical routing identifier. An expert is selected from the skill-set database entry linked by the numerical routing identifier. In another embodiment, multiple enterprises are mapped to separate layers of inquiry types having a one-to-one correspondence with the underlying groupings. A skill-set database entry is related to the inquiry type through a numerical routing identifier, the identifier being selected from a respective range of identifiers associated with the respective multiple enterprises.	09-18-2014
20140289245	OPTIMIZING A CLUSTERED VIRTUAL COMPUTING ENVIRONMENT - Exemplary embodiments of the present invention disclose a method, computer program product, and system for optimizing a clustered virtual computing environment. In exemplary embodiments, performance attributes are identified for a set of operating devices within the clustered virtual computing environment. Historical data of the identified performance attributes is obtained to create a historical data repository. A rulebase is developed using the historical data repository and input from user. A combined correlation pattern repository is generated using a first correlation pattern, a second correlation pattern and a scale-time invariant weight fraction.	09-25-2014
20140289246	Systems and Methods for the Distributed Categorization of Source Data - Systems and methods for the crowdsourced clustering of data items in accordance with embodiments of the invention are disclosed. In one embodiment of the invention, a method for determining categories for a set of source data includes obtaining a set of source data, determining a plurality of subsets of the source data, where a subset of the source data includes a plurality of pieces of source data in the set of source data, generating a set of pairwise annotations for the pieces of source data in each subset of source data, clustering the set of source data into related subsets of source data based on the sets of pairwise labels for each subset of source data, and identifying a category for each related subset of source data based on the clusterings of source data and the source data metadata for the pieces of source data in the group of source data.	09-25-2014
20140289247	ANNOTATION SEARCH APPARATUS AND METHOD - According to an embodiment, an annotation search apparatus includes a feature extractor and an annotation search unit. The feature extractor is configured to extract an annotation feature from an input document and an annotation appended by a user to the input document. The annotation search unit is configured to search annotation information items to retrieve at least one of the annotation information items according to an intended purpose of the user, one of the annotation information items corresponding to the input document and including the annotation feature.	09-25-2014
20140289248	DISPLAY APPARATUS AND METHOD FOR DISPLAYING INFORMATION REGARDING ACTIVITIES THEREOF - A display apparatus is provided, which includes a display; a storage storing information regarding user activities performed in the display apparatus for a time period; and a controller analyzing a pattern of the user activities performed for the time period based on the information regarding the activities, dividing the time period into a plurality of time periods based on the analyzed pattern, and controlling the display to display the information regarding the activities that belong to the respective divided time periods.	09-25-2014
20140289249	SYSTEM AND METHOD FOR MESSAGE CLUSTERING - The disclosure describes systems and methods for delivering communications associated with delivery conditions in which the occurrence of the delivery condition is determined by monitoring information received from a plurality of sources via multiple communication channels. The message delivery systems allow messages to be delivered to any “Who, What, When, Where” from any “Who, What, When, Where” upon the detection of an occurrence of one or more “Who, What, When, Where” delivery conditions. A message (which may be any data object including text-based messages, audio-based messages such as voicemail or other audio such as music or video-based prerecorded messages) is delivered in accordance with delivery conditions based on any available data, including topical, spatial, temporal, and/or social data. Furthermore, because the systems coordinate delivery of messages via multiple communication channels and through multiple devices, the communication channel for delivery of a message may be dynamically determined based on the delivery conditions.	09-25-2014
20140289250	DATA COLLECTION APPARATUS AND DATA COLLECTION PROGRAM - Provided are a first storage unit	09-25-2014
20140289251	DATA COLLECTION SYSTEM AND DATA COLLECTION SYSTEM PROGRAM - Provided are a group selector	09-25-2014
20140297639	APPARATUS, SYSTEM, AND METHOD FOR DETECTING COMPLEX ISSUES BASED ON SOCIAL MEDIA ANALYSIS - Disclosed are an apparatus, a system, and a method for detecting complex issues based on social media analysis according to the present invention. A system for detecting complex issues based on social media analysis according to the present invention includes: a unit issue detecting unit configured to receive a keyword from a user terminal, and to detect per-type unit issues associated with the received keyword; a complex issue detecting unit configured to detect per-type complex issues from the detected per-type unit issues; a complex issue ranking unit configured to analyze the detected per-type complex issues, and to rank the per-type complex issues based on the analysis result; and a complex issue configuring unit configured to configure the ranked per-type complex issues in a predetermined form that enables users to induce a micro trend, and to provide the configured form to a user.	10-02-2014
20140297640	CLUSTERING BASED PROCESS DEVIATION DETECTION - Systems and methods for data analysis include correlating event data to provide process instances. The process instances are clustered, using a processor, by representing the process instances as strings and determining distances between strings to form a plurality of clusters. One or more metrics are computed on the plurality of clusters to monitor deviation of the event data.	10-02-2014
20140297641	DISCUSSION SUPPORT METHOD, INFORMATION PROCESSING APPARATUS, AND STORAGE MEDIUM - A storage medium storing a discussion support program that causes a computer to execute a process, the process includes collecting opinion data obtained from each information terminal and a group characteristic corresponding to the opinion data collected in each group including a plurality of information terminals set in advance, and calculating a point of a participant related to the information terminal based on representative opinion data chosen from the opinion data collected in each group and the group characteristic corresponding to the group to which the information terminal having transmitted the representative opinion data belongs.	10-02-2014
20140297642	SYSTEMS AND METHODS FOR MAPPING PATIENT DATA FROM MOBILE DEVICES FOR TREATMENT ASSISTANCE - In various embodiments, a system comprises a map and a patient data assessment module. The map includes a plurality of groupings and interconnections of the groupings, each grouping having one or more patient members that share biological similarities, each interconnection interconnecting groupings that share at least one common patient member, the map identifying a set of groupings and a set of interconnections having a medical characteristic of a set of medical characteristics. The patient data assessment module may be configured to receive sensor data from a user's mobile device and to assess the sensor data to generate user medical attributes, to determine whether the user shares the biological similarities with the one or more patient members of each grouping based, at least in part, on the user medical attributes, thereby enabling association of the user with one or more of the set of medical characteristics.	10-02-2014
20140297643	SYNTHESIZED IDENTIFIERS FOR SYSTEM INFORMATION DATABASE - Techniques for managing system information are disclosed. In one embodiment, a piece of system information is received, and a synthesized link is created linking a system information identifier corresponding to the system information to a synthesized group identifier, where the synthesized group identifier represents a group to which the synthesized information/synthesized information identifier belongs.	10-02-2014
20140297644	Knowledge graph mining method and system - A knowledge graph mining method is described, including: users in a community are clustered to form a community user circle according to original community data of the users, attributes of the users, a bulletin board system participated in by the users, or a chat group of an instant messaging application participated in by the users, wherein the original community data of a user comprise information about following-up of the user on another user in the community, and an amount of topics which both the user and the other user take part in; and a knowledge graph corresponding to the community user circle is created according to user behavior data generated by users in the community user circle.	10-02-2014
20140297645	SYSTEM AND METHOD FOR AUTOMATICALLY CREATING A PHOTO CALENDAR - A system and method are disclosed for creating a photo calendar. A computer storage medium stores images taken in a time period spanning a plurality of capture months. A computer processor automatically divides the images into groups based on the capture months, distributes the images in one of the capture months to one or more calendar months according to an adjacency distribution function, and creates a design of a photo calendar comprising a plurality of calendar months and images distributed in the calendar months.	10-02-2014
20140304263	IN-DATABASE PROVISIONING OF DATA - A user uploads data sets through a client to a database. The data sets are provisioned in the database for in-database searching. The data sets are evaluated and classifications for the columns of the tables that include the data set are detected. Column content may be classified into different analysis types, aggregation types, formats, categories, hierarchies, etc. Metadata is generated based on the evaluation of the data sets. A schema is used to store the metadata that describes the detected classification of the columns. The schema is stored in the database and is used when a search in the database is performed.	10-09-2014
20140304264	MOBILE WEB-BASED PLATFORM FOR PROVIDING A CONTEXTUAL ALIGNMENT VIEW OF A CORPUS OF DOCUMENTS - A content platform for providing a mobile, web-based contextual alignment view of a corpus of documents is disclosed. A corpus of documents is mined to identify a set of topics. Each document in the corpus is analyzed to determine a set of opinions associated with the set of topics, the set of opinions including a corpus opinion. Each document in the corpus is classified based on alignment with the corpus opinion. The corpus of documents is presented to the user according to the document classification.	10-09-2014
20140304265	DISCOVERING AND PRESENTING DECOR HARMONIZED WITH A DECOR STYLE - Technology is disclosed for discovering décor harmonized with a décor style (“the technology”). The décor includes décor items, e.g. artworks, paintings, pictures, artifacts, architectural pieces, arrangement of artworks, color selection, room décor, rugs, mats, furnishings, household items, fashion, clothes, jewelry, car interiors, garden arrangements etc. The technology facilitates analyzing user input to identify a décor style from a décor style dictionary, obtaining décor that harmonizes with décor style, and presenting a representation of the décor to the user. The décor style dictionary includes décor styles that are generated based on an analysis of content, including images and description of décor, from a plurality of sources. The décor styles can be based on a number of concepts, including a theme of the décor, a color/color palette, a mood of the person, a fashion era, a type of architecture, etc. The technology facilitates presentation of discovered décor using computer generated imagery techniques.	10-09-2014
20140304266	DATA BASE INDEXING - The present disclosure relates to a method, and a system for structuring or re-structuring a plurality of data records, wherein the plurality of data records are organised in a hierarchical structure of a plurality of clusters. Each one of the plurality of clusters comprises one or more of the plurality of data records. The clustering of the plurality of clusters is based on a nearness of the data records in the clusters and the plurality of clusters are arranged in the hierarchical structure according to the nearness of the data records.	10-09-2014
20140304267	SUFFIX TREE SIMILARITY MEASURE FOR DOCUMENT CLUSTERING - The subject innovation provides for systems and methods to facilitate weighted suffix tree clustering. Conventional suffix tree cluster models can be augmented by incorporating quality measures to facilitate improved performance. Further the quality measure can be employed in determining cluster labels that show improvements in accuracy over conventional means. Additionally “stopnodes” can be defined to facilitate traversing suffix tree models efficiently. Quality measurements can be determined based in part on weighting factors applied to terms in a vector model, said terms being mapped from a suffix tree model.	10-09-2014
20140310278	CREATING GLOBAL AGGREGATED NAMESPACES FOR STORAGE MANAGEMENT - Embodiments are directed to creating global, aggregated namespaces for storage management and to providing consistent namespaces in a distributed storage system. In one scenario, a computer system defines data storage objects for each data storage node. The data storage objects uniquely identify storage elements of the data storage nodes, where each data storage object includes various associated attributes. The computer system replicates the defined data storage objects and any associated attributes from a first data storage node to a second, different data storage node among the data storage nodes. As such, the defined data storage objects are visible from any node in the data storage nodes. The computer system also aggregates the defined data storage objects for each of the data storage nodes and creates a global, aggregated namespace that includes the aggregated data storage objects for each of the data storage nodes.	10-16-2014
20140310279	MANAGEMENT OF FILE STORAGE LOCATIONS - The embodiments described may be directed toward a file management system for managing a file folder location, a method for managing one or more data clusters, and a method of recommending a file storage location. The method of recommending a file storage location may also include plotting one or more data points onto one or more vectors. A received data point may be obtained from a file save request. The method may also include creating one or more data clusters from the vector data points using a clustering mechanism.	10-16-2014
20140310280	SYSTEM AND METHOD FOR DISCOVERY, GROUPING AND SHARING OF MEDIA CONTENT - A system and method for simplifying discovery, grouping and sharing of entertainment content, including media content, live shows and events, and the availability of such content.	10-16-2014
20140310281	EFFICIENT AND FAULT-TOLERANT DISTRIBUTED ALGORITHM FOR LEARNING LATENT FACTOR MODELS THROUGH MATRIX FACTORIZATION - A method for estimating model parameters. The method comprises receiving a data set related to a plurality of users and associated content, partitioning the data set into a plurality of sub data sets in accordance with the users so that data associated with each user are not partitioned into more than one sub data set, storing each of the sub data sets in a separate one of a plurality of user data storages, each of said data storages being coupled with a separate one of a plurality of estimators, storing content associated with the plurality of users in a content storage, where the content storage is coupled to the plurality of estimators so that the content in the content storage is shared by the estimators, and estimating, asynchronously by each estimator, one or more parameters associated with a model based on data from one of the sub data sets.	10-16-2014
20140310282	GENERATING DATA CLUSTERS - Techniques are disclosed for prioritizing a plurality of clusters. Prioritizing clusters may generally include identifying a scoring strategy for prioritizing the plurality of clusters. Each cluster is generated from a seed and stores a collection of data retrieved using the seed. For each cluster, elements of the collection of data stored by the cluster are evaluated according to the scoring strategy and a score is assigned to the cluster based on the evaluation. The clusters may be ranked according to the respective scores assigned to the plurality of clusters. The collection of data stored by each cluster may include financial data evaluated by the scoring strategy for a risk of fraud. The score assigned to each cluster may correspond to an amount at risk.	10-16-2014
20140310283	COMBINED ACTIVITIES HISTORY ON A DEVICE - A method includes performing a first activity with content associated with a first content type selected from the group consisting of television programming, online content, on-device applications, search queries, information views, and other content types described using a predefined format, wherein the predefined format includes an action specification and a content specification; logging the first activity in accordance with the predefined format; performing a second activity with content associated with a second content type selected from the group consisting of television programming, online content, on-device applications, search queries, information views, and other content types described using the predefined format, the second content type being distinct from the first content type; and logging the second activity in accordance with the predefined format.	10-16-2014
20140317113	TABULAR DATA PARSING IN DOCUMENT(S) - One or more techniques and/or systems are provided for parsing tabular data of a document. That is, a document may comprise arbitrarily formatted content (e.g., an equipment inspection report generated by an engineer). Respective rows of the document may be clustered into one or more row clusters based upon row proximity and/or numeric content (e.g., rows having similar numeric content may comprise logically related information). One or more vertical clusters may be generated within respective row clusters based upon vertical overlap. In this way, row clusters and/or vertical clusters may be searched for one or more values that may be assigned to a search term. For example, a row cluster may comprise a search term “Average temp”. One or more vertical clusters within the row cluster may be searched for a word that matches a pattern criteria (e.g., a two digit number), which may be assigned to the search term.	10-23-2014
20140317114	METHODS AND APPARATUS TO MONITOR MEDIA PRESENTATIONS - Methods, apparatus, systems and articles of manufacture to monitor media are disclosed. An example apparatus includes a software development kit provider to provide a software development kit to enable an application developer to create a monitoring enabled application. The example apparatus further includes a monitoring data receiver to receive data collected from a media device executing the monitoring enabled application, the data collected via the monitoring enabled application, the collected data including a media identifier and at least one of a device identifier or a user identifier. The example apparatus further includes a data store to store the collected data, and a database proprietor interface to request demographic information from a database proprietor, the database proprietor interface to store the demographic information in association with the media identifier in the data store.	10-23-2014
20140317115	APPARATUS, SYSTEMS AND METHODS FOR DATA STORAGE AND/OR RETRIEVAL BASED ON A DATABASE MODEL-AGNOSTIC, SCHEMA-AGNOSTIC AND WORKLOAD-AGNOSTIC DATA STORAGE AND ACCESS MODELS - A database access model and storage structure that efficiently support concurrent OLTP and OLAP activity independently of the data model or schema used are described. The storage structure and access model presented avoid the need to design schemas for particular workloads or query patterns and avoid the need to design or implement indexing to support specific queries. Indeed, the access model presented is independent of the database model used and can equally support relational, object and hierarchical models amongst others.	10-23-2014
20140317116	FACILITATING COLLABORATION ON A RECORD AMONG A GROUP OF USERS OF A FEED-BASED ENTERPRISE NETWORK - Disclosed are some examples of systems, methods and storage media for associating a group of users to a record and facilitating collaboration on the record by the users via a group feed of an enterprise network. In some implementations, a system includes first data associating each of a plurality of group identifiers to one or more record identifiers, and second data associating each of a plurality of feed item identifiers to a respective group identifier or record identifier. In one implementation, the system is configured to receive a request for a first group feed associated with a first group identifier. Based on the request, the system identifies one or more first record identifiers associated with the first group identifier, identifies one or more first feed item identifiers associated with the first group identifier or the first record identifiers, and generates the first group feed to include the corresponding feed items.	10-23-2014
20140317117	METHOD, DEVICE AND COMPUTER STORAGE MEDIA FOR USER PREFERENCES INFORMATION COLLECTION - Disclosed are a method, device and computer storage media for user preferences information collection, belonging to the computer field. The method comprises: extracting keywords involved in a user operation; determining whether the pre-set tag library contains a tag identical to the keyword; and, if it does, storing that tag in the user's personal tag library. The present invention can improve the coverage and accuracy of user preferences information collection by extracting the keywords involved in user operations in daily applications and storing the tags that have the same name as the extracted keywords in the user's personal tag library.	10-23-2014
20140324861	Block Partitioning For Efficient Record Processing In Parallel Computing Environment - A computer-implemented method is disclosed for efficiently processing a large number of records. In the method, a computer system may obtain a plurality of records and count the number of records thereof corresponding to each block of a plurality of blocks. The computer system may also identify a plurality of partitions corresponding to selected blocks of the plurality of blocks. Each partition of the plurality of partitions may be substantially uniform in processing time. The computer system may then distribute a workload associated with a block or partition to each node of a plurality of nodes contained within the computer system. Each node may then process the block or partition in parallel such that each node completes the processing within a selected period of time.	10-30-2014
20140324862	CORRELATION FOR USER-SELECTED TIME RANGES OF VALUES FOR PERFORMANCE METRICS OF COMPONENTS IN AN INFORMATION-TECHNOLOGY ENVIRONMENT WITH LOG DATA FROM THAT INFORMATION-TECHNOLOGY ENVIRONMENT - Methods and computer-program products are provided for storing a set of performance measurements relating to performance of a component in an IT environment, and associating with the performance measurement a time at which the performance measurement was obtained for each performance measurement in the set of performance measurements. The methods and computer-program products include storing portions of log data produced by the IT environment, wherein each portion of log data has an associated time; providing a graphical user interface enabling selection of a time range; and receiving through the graphical user interface a selection of a time range. The methods and computer-program products further comprise retrieving one or more performance measurements, wherein each of the retrieved performance measurements has an associated time in the selected time range; retrieving one or more portions of log data, wherein each of the retrieved portions of log data has an associated time in the selected time range; displaying an indication of the retrieved performance measurements having their associated times in the selected time range; and displaying an indication of the retrieved portions of log data having their associated times in the selected time range.	10-30-2014
20140324863	Systems and Methods for Gathering and/or Presenting Information - The present invention provides systems and methods for presenting a quantity of information in a single tool. Such a tool includes a map of various objects, the objects having themes relating to a given overall concept, wherein at least one object contains information relating to other objects that have a relationship with that object.	10-30-2014
20140324864	GRAPH MATCHING BY SUB-GRAPH GROUPING AND INDEXING - Relational graphs may be used to extract information. Similarities between the relational graphs and the items they represent may be determined. For example, when applied to video searching, relational graphs may be obtained from searching videos to extract objects, events and/or relations therebetween. Each relational graph may comprise a plurality of nodes and edges, wherein at least some of the detected objects and events are represented by each node, and wherein each edge represents a relationship between two nodes. Subgraphs may be extracted from each relational graph and dimension reduction may be performed on the subgraphs to obtain a reduced variable set which may then be used to perform searches, such as similarity analyses of videos.	10-30-2014
20140324865	METHOD, PROGRAM, AND SYSTEM FOR CLASSIFICATION OF SYSTEM LOG - Method and system for classifying system logs. A data processing system reads a message in one line of a system log; prepares a root node of a tree structure in which each node holds a format; calculates a similarity between a log of the root node and the message; generates and stores a first format in the root node if the calculated similarity is equal to or greater than a threshold value; adds the message to a child node of the root node, in accordance with a given condition; searches for, after the first format is created, a second format similar to the first format in a format storage table; combines the first format and the similar format to produce a combined parent format, where the combined parent format holds a plurality of formats; and stores the combined parent format in the format storage table to produce a classified format.	10-30-2014
20140324866	System for decomposing events from managed infrastructures - An event clustering system includes an extraction engine in communication with an infrastructure. The extraction engine receives data from the infrastructure and produces events. An alert engine receives the events and creates alerts mapped into a matrix, M. A sigalizer engine includes one or more of an NMF engine, a k-means clustering engine and a topology proximity engine. The sigalizer engine determines one or more common steps from events and produces clusters relating to the alerts and/or events.	10-30-2014
20140324867	Situation dashboard system and method from event clustering - A computer-implemented method is provided that is stored on computer readable non-transitory media. One or more data fields are accessed within a file. The accessed data fields are mapped on a display computer system. The accessed one or more data fields are from one or more data sources that relate to situations from clustering messages received from managed infrastructure. The mapping is performed based on an input of the situation summaries using a graphical user interface. Displayed on the display computer system are one or more dashboards of situations relative to summaries from clustering messages received from managed infrastructure. The one or more dashboards include at least one of actions that a user can take relative to clustered messages.	10-30-2014
20140324868	METHOD FOR RAPID DATA CLASSIFICATION - A method for rapid data classification comprises selecting a data queue in which data with the same type is arranged adjacently. A starting point pointer, a middle point pointer, and an ending point pointer are enabled to point to a starting point, a middle point, and an ending point of the data queue, respectively. The method further comprises determining whether the types of the data pointed to by the starting point pointer and the middle point pointer are the same, and classifying the data with one type. The method further comprises enabling the starting point pointer to point to a starting position of a next type of the data, determining whether the types of the data pointed to by the starting point pointer and the middle point pointer are the same, and classifying the next type of data.	10-30-2014
20140324869	VARIANT DATABASE - The invention provides a system and method for describing polymorphisms or genetic variants based on information about mutations and relationships among them. The invention uses object-oriented concepts to describe variants as variant objects and relations among those variants as variant relation objects, each object being an instance of an abstract class of genomic feature and able to contain any number of other objects. Information about genetic disorders is stored in association with the object that represents the pathogenic variant. Genetic test results are used to access corresponding objects to provide a report based on variants or polymorphisms in a patient's genetic material.	10-30-2014
20140324870	SIMILARITY DETECTING APPARATUS AND DIRECTIONAL NEAREST NEIGHBOR DETECTING METHOD - In order to detect similar data from a great deal of data at high speed, a similarity detecting apparatus includes a random number generating unit	10-30-2014
20140330826	METHODS AND SYSTEMS FOR DATA REDUCTION IN CLUSTER ANALYSIS IN DISTRIBUTED DATA ENVIRONMENTS - Systems and methods for data reduction of a data set are included. A computing system may group data points in a data set into a number of data point bubbles represented by a number of representative points. A data point bubble may include one or more data points from the data set and a representative point from the data set. The computing system may calculate a cluster assignment for the representative point by executing a clustering algorithm using the number of representative points.	11-06-2014
20140330827	METHODS AND SYSTEMS TO OPERATE ON GROUP-BY SETS WITH HIGH CARDINALITY - This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.	11-06-2014
20140330828	SYSTEM AND METHOD FOR SIGNATURE-BASED UNSUPERVISED CLUSTERING OF DATA ELEMENTS - A method and system for signature-based unsupervised clustering of data elements. The method comprises receiving a plurality of clusters; generating a triangular matrix respective of the clusters; generating a signature for each of the clusters; generating a match score between each of two different clusters; storing the match score in a cell of the triangular matrix corresponding to the two clusters; determining whether any of the match scores is above a predefined threshold value; clustering every two clusters that are determined to have a score above a predetermined threshold; and repeating the generation of a triangular matrix respective of the clusters until a single cluster is reached. The system comprises an interface; a processor; a memory for storing at least one cluster; and a memory coupled to the processor, the memory containing instructions that, when executed by the processor, configure the system to perform the steps of the method.	11-06-2014
20140330829	SYSTEMS AND METHODS FOR A CACHE-SENSITIVE INDEX USING PARTIAL KEYS - Systems and methods are disclosed for a cache-sensitive index that uses fixed-size partial keys. The index may include a node comprising a child group pointer, a number of partial keys and a similar number of full-key pointers. The node may also include a record count. The nodes are organized into groups. The groups may contain a number of nodes one greater than the number of partial keys in a node and the nodes in a group may be stored contiguously in memory. The child group pointer and the number of partial keys may fit within a cache line. A method is disclosed for traversing the index, for bulk-loading the index, and for live deletion of records from the index.	11-06-2014
20140337338	EFFICIENT MULTI-TENANT SPATIAL AND RELATIONAL INDEXING - Methods, computer systems, and computer-storage media are provided for increasing the efficiency of a multi-tenant geospatial data index. Efficiency is increased by using a multi-tenant model for storing and serving the data, processing raw geospatial data received from tenants into a runtime-optimized format, and by partitioning tenant geospatial data into a processor memory portion and a file system memory portion. Efficiency is also increased by executing a staged upload of the processor memory portion and the file system memory portion to a subset of host machines in order to check for invalid data before uploading the data to the remaining host machines. Additionally, efficiency is increased by optimizing geospatial search queries using query filters stored in a query filter cache, and executing the query filters initially against the processor memory.	11-13-2014
20140337339	Creating and Organizing Events in an Activity Stream - A system and method for creating and organizing events includes an activity stream application that captures, searches and collaborates on one or more events. The events include unstructured data comprising text, digital ink, an audio clip and an image. The activity stream application receives user input and generates a new event and combines related events into the same activity. The activity stream application receives a search query and searches for events that are relevant to the search query. In one embodiment, the search query includes contextual information that includes at least one of at a similar time, at a similar location, in a similar situation and a relatedness of event attributes.	11-13-2014
20140337340	METHODS AND SYSTEMS FOR ON-DEVICE SOCIAL GROUPING - Methods, systems, and computer readable media for social grouping are provided to perform social grouping of a user's contacts based on the user's interactions with the contacts. A set of attributes associated with interactions between a user and a set of contacts may be determined by a first device. The set of attributes associated with the interactions may be related to the first device. The set of contacts may be organized into a set of groups based on the set of attributes.	11-13-2014
20140337341	Auto-Tagging In Geo-Social Networking System - In one embodiment, a social networking system automatically tags one or more users to an image file by creating a list of potential matches, and selecting a subset of potential matches based on location, asking a first user to confirm the subset of potential matches, and tagging one or more matched users to the image file.	11-13-2014
20140337342	DATA MANAGEMENT DEVICE, DATA MANAGEMENT METHOD, DATA MANAGEMENT PROGRAM, AND INFORMATION PROCESSING DEVICE - A data management device includes a first storage unit configured to store data; a second storage unit configured to store data, to which access is possible at a high speed compared to the first storage unit; and a processor configured to execute a process including reading, from the first storage unit or the second storage unit, data according to an input data request, and outputting the read data, analyzing relevance between data items stored in the first storage unit or the second storage unit, based on history of data requests that have been input, and dividing, into groups, the data items stored in the first storage unit or the second storage unit based on a result of the analysis, and storing, in the second storage unit, the data items in units of the groups into which the data items have been divided.	11-13-2014
20140337343	METHOD, COMPUTER PROGRAM AND COMPUTER FOR DETECTING COMMUNITIES IN SOCIAL MEDIA - The present invention provides at least a method that includes: extracting a plurality of partial communities from a plurality of users, based on the relationships of companion messages; computing a first degree of similarity for showing the similarity of the companion partial communities, based on the relationship of a user belonging to one partial community with a user belonging to the other partial community, from among the plurality of communities; computing a second degree of similarity for showing the similarity of companion partial communities, based on words within the messages sent by users belonging to both partial communities and under the condition that the first similarity be higher than a predetermined first threshold value; and creating an integrated community by integrating the companion partial communities under the condition that the second similarity be higher than a predetermined second threshold value.	11-13-2014
20140344270	DATA CLUSTERING AND USER MODELING FOR NEXT-BEST-ACTION DECISIONS - Embodiments herein provide data clustering and user modeling for next-best-action decisions. Specifically, a modeling tool is configured to: receive indicators within unstructured social data from a plurality of users; analyze the unstructured social data of each of the plurality of users to assign a set of feature vectors to each of the plurality of users, each feature vector corresponding to one or more personality characteristics of each of the plurality of users; and analyze the feature vectors to identify two or more users from the plurality of users sharing a set of similar feature vectors. The modeling tool is further configured to: group the two or more users from the plurality of users sharing the set of similar feature vectors to form a cluster; identify attributes of the cluster; and input the attributes of the cluster into a predictive model to determine an offer corresponding to the cluster.	11-20-2014
20140344271	REQUIREMENTS CHARACTERISATION - A method of determining a requirements characterisation profile for an entity is disclosed. The method includes the steps of receiving classification parameters defining a requirement for an entity and selecting, in dependence on the classification parameters, a set of entities from a database of previously assessed entities. The method further includes retrieving from the database characterisation parameters of the selected set of entities, and constructing, in dependence on the characterisation parameters, a requirements characterisation profile for the entity. An apparatus is provided that includes means for performing such a method.	11-20-2014
20140344272	METHOD FOR IDENTIFYING AND EMPLOYING HIGH RISK GENOMIC MARKERS FOR THE PREDICTION OF SPECIFIC DISEASES - A reorganization of genomic data into a simpler standard form leads to more transparent data analyses. The customary selection practice that focuses on high odds ratios loci is shown to be biased, reflecting quality of presently reported risk loci for T2D. A selection criterion, based on Shannon information theory, brings clarity to this issue and provides a rational and optimal basis for selecting potential risk loci. This is used to determine an optimal disease classifier. Within the framework of the FUSION database this leads to a relatively successful degree of T2D prediction and nearly an order of magnitude more effective in detecting T2D. Chromosome 7 is strongly associated with T2D. A hypothesis of this study is that the genomic disease signal is possibly weak, and instead of focusing on individual loci a collection of loci contribute to a composite Score, which functions as the determinant of disease or its absence.	11-20-2014
20140344273	System and method for categorizing time expenditure of a computing device user - The present invention relates to the analysis and categorization of time expenditure of users of computing devices having a graphical user interface. A computer implemented method for categorizing time expenditure of a computing device user is provided, comprising detecting an item of content has been added to one or more digital content repositories; selecting a content identifier from the item of content; determining whether text of the content identifier contains an analysis identifier, the analysis identifier having an identification string, and whereupon an analysis identifier is determined to be absent from the content data of the content identifier building an analysis identifier, and altering the content data of the content identifier to include the built analysis identifier; and receiving a plurality of user activity data records, each user activity data record associated with a user identifier, and each user activity data record containing a time indicator and the active window title of the computing device at or during that time indicator.	11-20-2014
20140344274	INFORMATION STRUCTURING SYSTEM - The present invention periodically and exhaustively extracts analysis dimension candidates from text data, such as medical literature disclosed on the Internet. A name of disease, a medical agent, a checkup, and an operation included in actual clinical data are linked to the analysis dimension candidates. In the analysis dimension candidates, clinically important candidates and non-important candidates including extraction errors are mixed. To distinguish the candidates, weighting is provided to the link. First, a weight is made large when a level of an evidence of a medical literature from which an analysis dimension candidate is extracted is high. In literature groups of each name of disease, the degree of cooccurrence between a word of an analysis dimension candidate, and a word related to a medical agent/checkup/operation is calculated, and the weight of the link is made larger according to the magnitude of the degree of cooccurrence.	11-20-2014
20140344275	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, INFORMATION PROCESSING PROGRAM, AND RECORDING MEDIUM - In selection of transaction objects, attribute fields regarded as important by a user are specified. An information processing apparatus includes: a first extracting means that extracts transaction objects whose transaction object information was browsed within a preset time, from browsing time of the transaction object information of the transaction objects registered in a list where the transaction objects selected by a user are registered, among the transaction objects whose transaction object information including attribute values of the transaction objects have been browsed to the user; a second extracting means that extracts transaction objects that have not been registered in the list among the transaction objects extracted by the first extracting means; and a specifying means that specifies attribute fields regarded as important by the user among a plurality of attribute fields of the transaction objects, based on attribute values differently set between transaction objects registered in the list and transaction objects extracted by the second extracting means.	11-20-2014
20140344276	Method and System for Generating Evaluation Information, and Computer Storage Medium - A method for generating evaluation information, including the following steps: obtaining first information from user behavior information; determining whether the first information matches key matching information, and if yes, then obtaining a category to which the key matching information corresponding to the first information belongs; and generating evaluation information according to the category. The method and system for providing evaluation information, and a corresponding computer storage medium, obtain first information from the user behavior information, and obtain a corresponding category according to the first information which matches the preset key matching information, thus generating evaluation information corresponding to the category. The generated evaluation information varies as the first information varies, and dynamic adjustment of evaluation information is achieved.	11-20-2014
20140344277	RELEASED OFFENDER GEOSPATIAL LOCATION INFORMATION CLEARINGHOUSE - A clearinghouse for integrating information related to released criminal offenders. The clearinghouse includes a computer system that receives geospatial location information related to released criminal offenders from multiple disparate systems, the geospatial location information including date and time information. The computer system then converts the information to a homogenous data format. The present invention further includes a method of integrating information related to released criminal offenders.	11-20-2014
20140344278	METHOD FOR INTERACTING WITH A GROUP OF INDIVIDUALS AS A SINGLE CONTACT - A method for establishing a group of individuals as a single contact entity eligible for contact services within a contact center includes the steps of: (a) identifying a group and each group member according to existing group rules and member profiles; (b) identifying and quantifying the unifying aspects of the members in the group; (c) aggregating the contact information for each group member relative to communications channels common to the group members and to the contact center; and (d) establishing one or more temporary and/or permanent group channels between the contact center and the group members.	11-20-2014
20140351253	DYNAMIC RESPONSE ENTRY - Methods, systems, and devices for dynamic response entry are disclosed herein. In some embodiments, a dynamic response entry system can include a user device that can be a proctor device or a testee device. The testee device can display a list to a testee for a predetermined time period. After the passing of the predetermined time period, the displaying of the list to the testee can be terminated. The testee can provide responses to one or several questions, which responses can be input into the proctor device. The input responses can be evaluated and categorized and displayed according to the evaluation and categorization.	11-27-2014
20140351254	Unique Value Calculation in Partitioned Table - An estimation algorithm can generate a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions. The column can be classified as categorical if the uniqueness metric is below a threshold and as non-categorical if the uniqueness metric is above the threshold. A first estimation factor can be assigned to the column if the column is classified as categorical or a larger second estimation factor can be assigned if the column is non-categorical. A cost estimate for system resources required to perform a database operation on the database table can be calculated. The cost estimate can include an estimated total number of distinct values in the column across all of the plurality of data partitions determined using the assigned first estimation factor or second estimation factor and a number of rows in the table as inputs to an estimation function.	11-27-2014
20140358919	Automatic Isolation and Selection of Screenshots from an Electronic Content Repository - Automatic isolation of screenshots from other captured content items stored in an electronic content repository is provided. When a screen capture is performed on an electronic device, such as a smartphone, screen resolution information for the capturing device is stored with the captured content item (e.g., screenshot). When a user subsequently desires to recall a given stored captured screenshot, the resolution associated with each stored content item may be used for isolating screenshots from other stored content items like photographs, text items, clip art, and the like by comparing the resolutions of any of the stored content items with a screen resolution of the user's device or with known screen resolutions of various devices that may be used for capturing screen images.	12-04-2014
20140358920	RESOLVING PAIRWISE LINKS TO GROUPS - The present invention extends to methods, systems, and computer program products for resolving pairwise links to groups. Embodiments of the invention use an iterative algorithm to transform a collection of pairwise links to groups of records that correspond to the same entity. The algorithm can be stopped after any number of iterations for an increasingly accurate approximation result. The algorithm essentially guarantees a correct solution for groups of size up to the number of iterations. This algorithm scales linearly on the size of the record set, with little impact from the number of links.	12-04-2014
20140358921	DELEGATING RESEMBLING DATA OF AN ORGANIZATION TO A LINKED DEVICE - A computerized method for pooling objects in a computerized system having a storage for objects, comprising identifying in the computerized system objects having at least one common metadata entity associated with the objects, and including the identified objects in a pool of objects, and an apparatus for performing the same.	12-04-2014
20140358922	Routing of Questions to Appropriately Trained Question and Answer System Pipelines Using Clustering - Mechanisms for selecting a pipeline of a question and answer (QA) system to process an input question are provided. An input question is received and analyzed to identify at least one feature of the input question. Clustering of the input question, with one or more previously generated clusters of questions, is performed based on the at least one feature of the input question. Based on results of the clustering, a matching cluster, of the one or more previously generated clusters, is identified with which the input question is associated. A QA system pipeline associated with the matching cluster is identified and the input question is processed using the identified QA system pipeline to generate one or more candidate answers for the input question. Each cluster in the one or more previously generated clusters has an associated QA system pipeline.	12-04-2014
20140358923	Systems And Methods For Automatically Determining Text Classification - A software product and a method determine classification of text displayed within a browser on a computer. A processor within a server is used to generate a consolidated index of tokens contained within the text. The processor is used to identify a first classification of the text by matching each of one or more terms of a first association defined within a rule set with the tokens of the consolidated index. The first association associates the one or more terms with the first classification. The first classification is indicated with the text by interacting with the browser. The server may continually receive characters from a communication stream and report any matched classifications therein.	12-04-2014
20140358924	DATA ANALYSIS APPARATUS AND METHOD - The present invention relates to a heterogeneous data cluster generation apparatus and method and a data clustering method and apparatus, and more particularly, to a data clustering method and apparatus which cluster data measured by different sensors into a number of groups. Aspects of the present invention provide an apparatus and method for generating clusters by putting together heterogeneous data which are values measured by different types of sensors. Aspects of the present invention also provide an apparatus and method for generating clusters by setting indices in order to effectively cluster multi-dimensional data, massive data, or scattered data.	12-04-2014
20140358925	SYSTEM AND METHOD FOR STORING CONTENT ON A CONTENT DELIVERY NETWORK - Aspects of the present disclosure involve systems, methods, computer program products, and the like, for grouping a plurality of content files in a content delivery network (CDN) for easier storage and access. In one embodiment, the CDN may store related files in one or more container files within the CDN to reduce the number of stored files. In addition, a manifest provided to the requesting device relating to the content may be altered to point to the container files rather than the separate content files within the container. The manifest may also provide information to the requesting device to extract and process the content files within the container file in the proper order for playing on the requesting device.	12-04-2014
20140358926	SYSTEM, METHOD AND COMPUTER PROGRAM FOR MULTI-DIMENSIONAL TEMPORAL AND RELATIVE DATA MINING FRAMEWORK, ANALYSIS & SUB-GROUPING - The present invention relates to a system, method and computer program product that is a multi-dimensional data mining environment and that is operable to apply a series of temporal and relative rules (i.e., STDM	12-04-2014
20140358927	OVERRIDE OF AUTOMATICALLY SHARED META-DATA OF MEDIA - An override of automatically shared meta-data of media method and apparatus are disclosed. In one embodiment, a method of a server device includes automatically populating a hierarchy using a play-list history data associated with a media data of a client device, and modifying the hierarchy based on a user override. The hierarchy may be a hierarchy of the play-list history data of certain items associated with the media data of the client device. A modified hierarchy may be generated based on an addition, deletion and/or an adjust modifying operation of the user override on the hierarchy, and may be automatically populated on a new mark-up language file. A new compatibility rating may be determined between the user and the other users based on the similar attributes between the modified hierarchy and the other hierarchies, and each user may be enabled to view mark-up language files of the other users.	12-04-2014
20140365490	METHOD AND SYSTEM EMPLOYING GRAPHICAL ELECTRIC LOAD CATEGORIZATION TO IDENTIFY ONE OF A PLURALITY OF DIFFERENT ELECTRIC LOAD TYPES - A system for different electric loads includes sensors structured to sense voltage and current signals for each of the different electric loads; a hierarchical load feature database having a plurality of layers, with one of the layers including a plurality of different load categories; and a processor. The processor acquires voltage and current waveforms from the sensors for a corresponding one of the different electric loads; maps a voltage-current trajectory to a grid including a plurality of cells, each of which is assigned a binary value of zero or one; extracts a plurality of different features from the mapped grid of cells as a graphical signature of the corresponding one of the different electric loads; derives a category of the corresponding one of the different electric loads from the database; and identifies one of a plurality of different electric load types for the corresponding one of the different electric loads.	12-11-2014
20140365491	Method for managing personalized playing lists of the type comprising a URL template and a list of segment identifiers - A first splicer manages a get-list request coming (	12-11-2014
20140365492	Data Partitioning Method and Apparatus - A data partitioning method and apparatus. The method includes: determining tuple relationship information according to received mixed loads and structure information of a database; determining tuple split cost information according to the tuple relationship information and a feature about whether the mixed loads are executable in parallel; obtaining multiple partitioning schemes according to the tuple split cost information, and determining, from the partitioning schemes, a partitioning scheme with a minimum total cost value as an optimum partitioning scheme to perform partitioning processing on data stored in the database. In the data partitioning method and apparatus, optimum partitioning is performed on data associated with the mixed loads in a database; after partitioning, the data has features of a transaction load and an analytical load in the mixed loads, thereby improving working performance of the database system oriented to the mixed loads.	12-11-2014
20140372438	DETERMINISTIC PROGRESSIVE BIG DATA ANALYTICS - A plurality of data items that are annotated with progress markers may be obtained. The progress markers may indicate progress points associated with atemporal processing progress of the respective data items. Deterministic, massively parallel, progressive processing may be initiated on the plurality of data items on a plurality of devices, the progress markers indicating which of the plurality of data items are to be incorporated into results of the progressive processing, the progress markers further indicating an ordering for incorporation of the respective data items into the results.	12-18-2014
20140372439	SYSTEMS AND METHODS FOR CREATING A VISUAL VOCABULARY - Systems, devices, and methods for creating a visual vocabulary extract a plurality of descriptors from one or more labeled images; cluster the descriptors into augmented-space clusters in an augmented space, wherein the augmented space includes visual similarities and label similarities; generate a descriptor-space cluster in a descriptor space based on the augmented-space clusters, wherein one or more augmented-space clusters are associated with the descriptor-space cluster; and generate augmented-space classifiers for the augmented-space clusters that are associated with the descriptor-space cluster based on the augmented-space clusters.	12-18-2014
20140372440	MMA Glove Incorporating a Tightly Secured Wireless Impact Processing Circuit - An improved mixed martial art (“MMA”) glove includes an impact sensing circuit board that holds a microcontroller, a three-axis accelerometer, a wireless interface chip, and is coupled to an impact sensing circuit. The circuit board is securely mounted to the wrist portion of the improved MMA glove by one or more sewing holes.	12-18-2014
20140372441	CONFLATING ENTITIES USING A PERSISTENT ENTITY INDEX - Systems, methods, and computer-readable storage media are provided for conflating entities using a persistent entity index. Information (including attributes) pertaining to a plurality of entities is received. The received information is either matched with one or more existing entities in the persistent entity index or, if no match is found, selected for addition to the persistent entity index. The persistent entity index includes entity-attribute pairs associated therewith. Attributes associated with matching entities for which information is received are aggregated and/or reconciled with the entity-attribute pairs associated with existing entities included in the persistent entity index. The persistent entity index may be incrementally updated at predetermined time intervals to ensure the accuracy and freshness of the information associated therewith.	12-18-2014
20140372442	K-GRID FOR CLUSTERING DATA OBJECTS - Algorithms and systems for clustering information objects. Objects including metadata may be populated within a k-dimensional grid (K-Grid). A distance function between objects may be calculated, and a cost function may be calculated. Optimization may occur over several iterations by applying random mutation operations on the K-Grid and re-calculation of the cost function.	12-18-2014
20140372443	Tracking System and Method - A method of tracking an entity includes generating data relevant to the entity at a first location, storing the data at a server, and accessing at least a portion of the data at the first location or at a second location. The data can include location information, time information, image data, text, and/or biometric data. The data can be encrypted, organized, categorized, updated, accumulated with other data, classified, and/or disseminated, and an automated search can include an image feature recognition and facial recognition search. The entity can be a person, an animal, or an object. A communications system includes a processing device and a server that includes an automated search engine. The server can be configured to perform data analysis, such as data grouping.	12-18-2014
20140372444	DATA CLUSTERING APPARATUS AND METHOD - Provided are a data clustering apparatus and method, which can rapidly and accurately cluster data. The data clustering apparatus includes an index discriminating unit discriminating an index corresponding to an input position of new data input to a space for data clustering, including a lattice-type segmented space having lattice unit spaces set with different indexes, and a clustering unit creating a new cluster in the discriminated index using the input new data as a representative value when a cluster is not created at the discriminated index.	12-18-2014
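The lattice-index idea in entry 20140372444 can be sketched as follows. This is a minimal illustration, not the claimed apparatus: the cell size, the dictionary layout, and the choice to append a point to an already existing cluster are assumptions, since the abstract only describes the case where no cluster yet exists at the index.

```python
def lattice_index(point, cell_size):
    """Return the index of the lattice unit space that contains the point."""
    return tuple(int(coord // cell_size) for coord in point)


def add_point(clusters, point, cell_size=1.0):
    """Place a new point into the cluster stored at its lattice index.

    If no cluster exists at that index, a new one is created with the point
    as its representative value. Appending to an existing cluster is an
    assumption; the abstract does not describe that branch.
    """
    idx = lattice_index(point, cell_size)
    if idx not in clusters:
        clusters[idx] = {"representative": point, "members": [point]}
    else:
        clusters[idx]["members"].append(point)
    return idx


clusters = {}
for p in [(0.2, 0.3), (0.8, 0.1), (2.5, 2.5), (2.9, 2.1)]:
    add_point(clusters, p)
```

Because the index is computed directly from the coordinates, assigning a point costs a single dictionary lookup rather than a comparison against every existing cluster.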
20140379713	COMPUTING A MOMENT FOR CATEGORIZING A DOCUMENT - For documents in a collection, respective data structures containing information representing occurrence of terms in the corresponding documents are generated. For a first one of the documents, at least one moment is computed based on the information in the data structure corresponding to the first document, where the at least one moment represents at least one characteristic of a distribution of values derived from the information in the data structure corresponding to the first document. The at least one moment is useable to categorize the first document into one of a plurality of classes of documents.	12-25-2014
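As a rough illustration of entry 20140379713, the sketch below computes the first two moments (mean and variance) of a document's term-count distribution. Whitespace tokenization, the use of raw term counts as the distribution, and the function name are assumptions; the abstract does not say which values or which moments are used.

```python
from collections import Counter


def term_count_moments(text):
    """Return (mean, variance) of the per-term occurrence counts of a document."""
    counts = list(Counter(text.lower().split()).values())
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n
    return mean, variance


# Hypothetical document: "a" occurs once, "b" twice, "c" three times.
mean, variance = term_count_moments("a b b c c c")
```

Such moments could then serve as features for a classifier that assigns the document to one of several classes.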
20140379714	DETECTING HARDWARE AND SOFTWARE PROBLEMS IN REMOTE SYSTEMS - A method for detecting hardware and/or software anomalies in remote systems. The method may include aggregating, in a centralized electronic database, by an electronic database server, data received via a network from each of the remote systems, the data relating to operating statistics of one or more subcomponents of the remote systems over time. The method may also include utilizing an electronic database client communicatively coupled to the centralized database to automatically periodically access and analyze data stored in the centralized database to identify anomalies in hardware and/or software components of the remote systems. In one embodiment, the data relating to operating statistics of the subcomponents may include data from statistics counters corresponding to the subcomponents, each statistics counter, in one state, indicative of an identifiable error. In this regard, analyzing data stored in the centralized database may involve comparing data from the statistics counters to identify the anomalies.	12-25-2014
20140379715	Grouping of Objects in a Distributed Storage System Based on Journals and Placement Policies - Managing placement of object replicas is performed at a first instance of a distributed storage system. One or more journals are opened for storage of object chunks. Each journal is associated with a single placement policy. A first object is received comprising at least a first object chunk. The first object is associated with a first placement policy. The first object chunk is stored in a first journal whose associated placement policy matches the first placement policy. The first journal stores only object chunks for objects whose placement policies match the first placement policy. For the first journal, the receiving and storing operations are repeated for multiple objects whose associated placement policies match the first placement policy, until a first termination condition occurs. Then, the first journal is closed. Subsequently, the first journal is replicated to a second instance of the distributed storage system according to the first placement policy.	12-25-2014
20150012536	GROUP DECISION METHOD ON RANKING OF A LARGE NUMBER OF ALTERNATIVES - In a group decision method on ranking of a large number of alternatives, multiple alternatives are re-grouped into subgroups, and each alternative in every subgroup is evaluated based on information of each alternative to generate a ranking number for each alternative in every subgroup. Then, a normalized score for each alternative in every subgroup is determined to generate an average normalized score for each alternative, so as to increase the accuracy of ranking results of alternatives in a large scale competition.	01-08-2015
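The averaging step in entry 20150012536 might look like the sketch below. The normalization formula (best rank maps to 1.0, worst to 0.0 within each subgroup) and all names are assumptions; the abstract does not state how ranking numbers are normalized.

```python
from collections import defaultdict


def average_normalized_scores(subgroup_rankings):
    """Average each alternative's normalized score over the subgroups it appears in.

    Each ranking lists alternatives from best to worst. Within a subgroup of
    size n, position p (0-based) gets score (n - 1 - p) / (n - 1); this linear
    normalization is an assumption.
    """
    collected = defaultdict(list)
    for ranking in subgroup_rankings:
        n = len(ranking)
        for position, alternative in enumerate(ranking):
            score = 1.0 if n == 1 else (n - 1 - position) / (n - 1)
            collected[alternative].append(score)
    return {alt: sum(s) / len(s) for alt, s in collected.items()}


# Hypothetical subgroups of different sizes.
rankings = [["A", "B", "C"], ["B", "D", "A"], ["C", "D"]]
scores = average_normalized_scores(rankings)
best = max(scores, key=scores.get)
```

Normalizing before averaging keeps a first place in a small subgroup comparable to a first place in a large one.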
20150012537	ELECTRONIC DEVICE FOR INTEGRATING AND SEARCHING CONTENTS AND METHOD THEREOF - An electronic device for integrating and searching contents and a method thereof. The method includes extracting tag information of at least one content item executed in at least one application, classifying the at least one executed content item based on the extracted tag information according to types of the contents, and displaying the classified contents according to a set order.	01-08-2015
20150012538	FLEXIBLE NAMESPACE PRIORITIZATION - Access to resources on a computer may be provided by using a first namespace of resources and a second namespace of resources, where one or more names are common to both namespaces and those names refer to different respective instances of resources. A request is received for a first resource name from an application, where the first resource name exists in the first resource namespace and in the second resource namespace. In response to the request, whether to obtain a resource from the first namespace or from the second namespace is determined by applying one or more resource policies to the first resource namespace and to the second resource namespace.	01-08-2015
20150012539	SYSTEM AND METHOD FOR CLUSTERING DISTRIBUTED HASH TABLE ENTRIES - A distributed storage system may store data object instances in persistent storage and may store keymap information for those data object instances in a distributed hash table on multiple computing nodes. Each data object instance may include a composite key containing a user key. The keymap information for each data object instance may map the user key to a locator and the locator to the data object instance. A request to store or retrieve keymap information for a data object instance may be routed to a particular computing node based on a consistent hashing scheme in which a hash function is applied to a portion of the composite key of the data object instance. Thus, related entries may be clustered on the same computing nodes. The portion of the key to which the hash function is applied may include a pre-determined number of bits or be identified using a delimiter.	01-08-2015
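The routing scheme in entry 20150012539 can be sketched as a consistent-hash ring that hashes only the portion of the composite key before a delimiter. The class name, the "/" delimiter, SHA-256 as the hash function, and the use of virtual nodes are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(text):
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class PrefixHashRing:
    """Consistent-hash ring that hashes only the key prefix before a delimiter."""

    def __init__(self, nodes, delimiter="/", vnodes=64):
        self.delimiter = delimiter
        # Each node is placed on the ring at several points (virtual nodes).
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, composite_key):
        # Only the prefix is hashed, so keys sharing it land on the same node.
        prefix = composite_key.split(self.delimiter, 1)[0]
        pos = bisect.bisect_right(self._points, _hash(prefix)) % len(self._ring)
        return self._ring[pos][1]


ring = PrefixHashRing(["node-a", "node-b", "node-c"])
first = ring.node_for("photos/2014/a.jpg")
second = ring.node_for("photos/2015/b.jpg")
```

Both keys share the prefix "photos", so they are routed to the same node, which is the clustering effect the abstract describes.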
20150019551	REGION CLASSIFICATION BASED ON REGIONAL DISTRIBUTION INFORMATION - A region classification server classifies a region based on region distribution information. The region classification server may include several databases to facilitate the classification of the region, including a classification type database, a classification evaluation database, and a region classification database. The region classification server may also include one or more interfaces for receiving region distribution information, such as an automated business listing interface and a user-input classification interface. The region classification server may also classify a region by modifying a classification evaluation stored in the classification evaluation database, where the modification is based on received user information. Moreover, the region classification server may provide the region classifications to other systems in communication with the region classification server, such as search engine providers, augmented reality developers, or other third-party entities.	01-15-2015
20150019552	MERGING SETS OF DATA OBJECTS FOR DISPLAY - Disclosed is a method of merging a first set of data objects and a second set of data objects for displaying on a display screen, the method comprising: retrieving a first identifier associated with a first memory location, the first memory location for storing the first set of data objects; retrieving a second identifier associated with a second memory location, the second memory location for storing the second set of data objects; comparing the first identifier and the second identifier; and grouping one or more first data objects from the first set of data objects and one or more second data objects from the second set of data objects based on the comparison.	01-15-2015
20150019553	DATA CONSOLIDATION MECHANISMS FOR INTERNET OF THINGS INTEGRATION PLATFORM - A method of consolidating Internet of Things (IoT) devices connected via an IoT network is disclosed. The method includes extracting a first data record from a data source connected to the IoT integration platform; analyzing the first data record to generate a derivative record relevant to a user context; aggregating the first data record and the derivative record to a contextually grouped data cluster; and presenting the derivative record on an integration interface along with other data records in the data cluster.	01-15-2015
20150019554	NUMBER OF CLUSTERS ESTIMATION - A method of determining a number of clusters for a dataset is provided. Centroid locations for a defined number of clusters are determined using a clustering algorithm. Boundaries for each of the defined clusters are defined. A reference distribution that includes a plurality of data points is created. The plurality of data points are within the defined boundary of at least one cluster of the defined clusters. Second centroid locations for the defined number of clusters are determined using the clustering algorithm and the reference distribution. A gap statistic for the defined number of clusters based on a comparison between a first residual sum of squares and a second residual sum of squares is computed. The processing is repeated for a next number of clusters to create. An estimated best number of clusters for the received data is determined by comparing the gap statistic computed for each iteration of the number of clusters.	01-15-2015
20150019555	METHOD FOR ENRICHING A MULTIMEDIA CONTENT, AND CORRESPONDING DEVICE - According to the invention, the method comprises the following steps of:	01-15-2015
20150019556	SYSTEM AND METHOD FOR MANAGING DEDUPLICATED COPIES OF DATA USING TEMPORAL RELATIONSHIPS AMONG COPIES - Systems and methods are disclosed for managing deduplicated images of data objects that change over time. The method includes: organizing unique content of each data object as a plurality of content segments and storing the content segments in a data store; for each data object, creating an organized arrangement of hash structures, wherein each structure, for a subset of the hash structures, includes a hash signature for a corresponding content segment and is associated with a reference to the corresponding content segment, and for each data object, maintaining an organized arrangement of temporal structures to represent a corresponding data object over time, wherein each structure is associated with a temporal state of the data object, and wherein each temporal state is associated with the hash structures representing the content of the data object during that temporal state.	01-15-2015
20150019557	DYNAMICALLY PROCESSING AN EVENT USING AN EXTENSIBLE DATA MODEL - Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes specifying attributes of the event in a data model, the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.	01-15-2015
20150026178	SUBJECT-MATTER ANALYSIS OF TABULAR DATA - A method for subject-matter analysis of tabular data is provided in the illustrative embodiments. A first document including the tabular data is received. A library of functional signatures for a first subject-matter domain is selected. A determination is made whether a threshold number of functional signatures from the selected library are applicable to the tabular data, wherein a functional signature is applicable to the tabular data when values in the tabular data correspond to an operation and a table structure specified in the functional signature. Responsive to the threshold number of functional signatures from the selected library being applicable to the tabular data, a processor and a memory process the first document according to a process for the first subject matter domain selected from a plurality of processes for respective subject matter domains.	01-22-2015
20150026179	ELECTRONIC DEVICE AND METHOD FOR PROCESSING CLIPS OF DOCUMENTS - According to one embodiment, an electronic device includes a display processor and a processor. The display processor is configured to display on a screen a plurality of clips. Each of the plurality of clips corresponds to at least a part of a document. The processor is configured to designate a first clip group in the plurality of clips as a search key in accordance with an operation by a user, and to acquire information regarding one or more second clips of the plurality of clips, the one or more second clips being related to the first clip group. The display processor is further configured to display the one or more second clips as a search result corresponding to the search key.	01-22-2015
20150026180	COLLATION DEVICE, COLLATION METHOD, AND COMPUTER-READABLE RECORDING MEDIUM - A collation device specifies a combination of event conditions in parallel with each other based on a query. The collation device sets the combination of event conditions in a parallel relation to the same window, connects a plurality of windows in series, and generates a similar query that sets a window interval condition between the windows in a connected relation. The collation device compares a similar query with collation data, and detects a combination of events that satisfies a condition of the similar query from among events included in the collation data.	01-22-2015
20150026181	Matching Anonymized User Identifiers Across Differently Anonymized Data Sets - Provided is a process of obtaining a plurality of location data sets from different providers of user geolocation history, each location data set including a plurality of user-activity records, each user-activity record being associated with a user identifier and including geolocations of the corresponding user and times that the corresponding user was at the geolocations, the different providers having different user identifiers for a given corresponding user; matching, by one or more processors, the user identifiers between the location data sets based on geolocations of the corresponding user and times that the corresponding user was at the geolocations; and storing the matched user identifiers in association with one another in corresponding user profiles.	01-22-2015
20150026182	SYSTEMS AND METHODS FOR GENERATION OF SEARCHABLE STRUCTURES RESPECTIVE OF MULTIMEDIA DATA CONTENT - Methods and systems for generating concept structures from signature reduced clusters (SRCs) are provided. The method includes retrieving at least one SRC including a cluster of signatures respective of a plurality of multimedia data elements (MMDEs); generating at least one metadata for each signature of the cluster of signatures; identifying a number of repetitions of a metadata of the at least one generated metadata; and determining whether the number of repetitions of the metadata exceeds a predefined repetition threshold; upon determining that the number of repetitions of the metadata exceeds the predefined repetition threshold, identifying the metadata as representative of the SRC; comparing the representative metadata to metadata that is representative of at least one previously generated SRC to determine a metadata match; and upon determining the metadata match, identifying the retrieved SRC and the matching previously generated SRC as a concept structure.	01-22-2015
20150032746	SYSTEM AND METHOD FOR DISCOVERING AND EXPLORING CONCEPTS AND ROOT CAUSES OF EVENTS - A method for determining a cause of events detected in a plurality of interactions includes: identifying, on a processor, a plurality of elements in the interactions; detecting, on the processor, a plurality of sequences of elements in the interactions; mining, on the processor, the plurality of sequences for generating a set of supported patterns; computing, on the processor, association rules from the set of supported patterns; and returning the computed association rules.	01-29-2015
20150032747	METHOD FOR SYSTEMATIC MASS NORMALIZATION OF TITLES - A method for normalizing raw titles to canonical titles is described. The method includes designating a set of canonical titles, generating a set of n-grams for each canonical title, assigning a set of attributes to each n-gram, assigning a set of labels to each of the attributes, and storing the labeled canonical title and labeled n-grams in a database. In some examples, a new title may be mapped to an existing canonical title in the database by generating a set of n-grams for the new title, looking up the n-grams in the database of canonical titles, retrieving the set of labels assigned to n-grams in the database that match n-grams from the new title, and assigning those labels to the corresponding attributes of the new title. The new title may then be mapped to a canonical title on the basis of similarly labeled attributes.	01-29-2015
20150032748	GROUP-BASED DOCUMENT RETRIEVAL - Embodiments relate to retrieving a document from a plurality of document groups in which mutually related documents are each included. An aspect includes acquiring a retrieval condition that includes a plurality of conditions and at least one logical operator that connects the plurality of conditions. Another aspect includes identifying, with respect to each condition of the plurality of conditions, a document group including a document satisfying the condition from among the plurality of document groups. Another aspect includes identifying a document that satisfies at least one condition. Another aspect includes determining a document that is a retrieval result by making a selection to omit or retain that depends on the at least one logical operator. Another aspect includes generating information showing the document that is the retrieval result based on the retrieval condition.	01-29-2015
20150032749	METHOD OF CREATING CLASSIFICATION PATTERN, APPARATUS, AND RECORDING MEDIUM - A method includes: extracting a partial character string including a reserved word and a character string immediately previous or subsequent to the reserved word from each of a plurality of pieces of target data, the plurality of pieces of target data conforming to a first pattern character string including the reserved word defined by a protocol; detecting target data including the partial character string among the plurality of pieces of target data; specifying a first partial character string from the extracted partial character string based on the detected target data; and creating, by a processor, a second pattern character string for classifying the plurality of pieces of target data based on the first pattern character string and the first partial character string.	01-29-2015
20150032750	METHOD FOR DISCOVERING RELATIONSHIPS IN DATA BY DYNAMIC QUANTUM CLUSTERING - Data clustering is provided according to a dynamical framework based on quantum mechanical time evolution of states corresponding to data points. To expedite computations, we can approximate the time-dependent Hamiltonian formalism by a truncated calculation within a set of Gaussian wave-functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition and/or feature filtering.	01-29-2015
20150039611	DISCOVERY OF RELATED ENTITIES IN A MASTER DATA MANAGEMENT SYSTEM - Methods and arrangements for discovering entity types for a set of records. A set of records is input, with each record comprising attributes with associated attribute values. The records are grouped into candidate entity types in view of at least one of: the attribute values of the records, at least one domain ontology and at least one dimension hierarchy. An interestingness measure of each candidate entity type is calculated, via estimating interestingness based on at least one factor selected from the group consisting of: a correlation between attribute values of records, a number of attributes, a log of queries issued to a server, and an average group size for candidate entity types. At least one candidate entity type is validated based on the calculated interestingness measures. Other variants and embodiments are broadly contemplated herein.	02-05-2015
20150039612	STORAGE-BASED DATA ANALYTICS KNOWLEDGE MANAGEMENT SYSTEM - Aspects of the present invention provide a solution for augmenting managed data. In an embodiment, a data item that is being and/or has been accessed by a user is analyzed to retrieve a set of features. Based on this set of features, the data item is indexed. An analytical analysis is performed on the data item based on the set of features. This analytical analysis can be used to obtain a group of data items that is related to the data item. This group of related data items can be returned to the user, such as when the user reacquires network connectivity.	02-05-2015
20150039613	FRAMEWORK FOR LARGE-SCALE MULTI-LABEL CLASSIFICATION - A framework for large-scale multi-label classification of an electronic document is described. An example multi-label classification system is configured to identify seed labels that represent respective one or more candidate content topics associated with the electronic document and determine additional labels based on the seed labels and label correlation data derived from member profiles maintained by an on-line social network system. The multi-label classification system then constructs a graph comprising nodes that correspond to the seed labels and the additional labels. A clustering algorithm is applied to the constructed graph to produce a labels graph. The labels graph is deemed to include nodes that correspond to topics discussed or referenced in the electronic document.	02-05-2015
20150039614	METHOD AND SYSTEM FOR RAPID SEARCHING OF GENOMIC DATA AND USES THEREOF - A method, apparatus and system for transforming genomic data into a computer database environment comprising a forward lookup table and a plurality of reverse lookup tables which relate consecutive overlapping reference sequence segments to reference sequences stored in the forward lookup table enables rapid and precise matching of undefined biological sequences with reference sequences.	02-05-2015
20150039615	POS DEVICE - An embodiment of a POS system according to the present invention stores log information and RAS information into a storage device (	02-05-2015
20150039616	DISCOVERY AND SHARING OF PHOTOS BETWEEN DEVICES - Described are systems, media, and methods for discovery of media relevant to a user and sharing the media with the user.	02-05-2015
20150046452	GEOTAGGING UNSTRUCTURED TEXT - Mechanisms are described to extract location information from unstructured text, comprising: building a language model from geo-tagged text; building a classifier for differentiating referred and physical location; given unstructured text, identifying referred location using the language model (that is, the location to which the unstructured text refers); given the unstructured text, identifying if referred location is also the physical location using the classifier; and predicting (that is, performing calculation(s) and/or estimation(s) of degree of confidence) of referred and physical location.	02-12-2015
20150046453	TUNABLE HARDWARE SORT ENGINE FOR PERFORMING COMPOSITE SORTING ALGORITHMS - Embodiments include methods, systems and computer program products for performing a composite sort on a tunable hardware sort engine. The method includes determining desired sort performance parameters, configuring a composite sort engine based on the desired sort performance parameters, and receiving a plurality of keys having a payload associated with each of the plurality of keys. The method also includes reserving DRAM storage for each of the payloads, generating a tag for each of the plurality of keys, the tag identifying the DRAM storage reserved for each of the payloads, and storing the payloads in the portions of the DRAM storage. The method further includes generating a composite key for each of the plurality of keys, sorting the composite keys by the composite sort engine, and retrieving the payloads associated with the sorted composite keys from the DRAM storage. The method also includes outputting the payloads associated with the sorted composite keys.	02-12-2015
20150046454	SOCIAL NETWORK POSTING ANALYSIS USING DEGREE OF SEPARATION CORRELATION - A degree of social network separation of a social network user that generated expressive content of a social media posting is identified relative to a specified social network user for each of a group of social media postings. Social media postings with an equivalent identified degree of social network separation relative to the specified social network user are grouped. Differences between the expressive content of the grouped social media postings at different degrees of social network separation are determined. The determined differences between the expressive content of the grouped social media postings at the different degrees of social network separation are rendered.	02-12-2015
20150046455	METHOD FOR STORING XML DATA INTO RELATIONAL DATABASE - A method for storing XML data into a relational database, including the following steps: splitting an XML Schema into one or more mapping configuration files, each mapping configuration file corresponding to a relational database table; parsing an XML text, and according to the associative relationship in the mapping configuration files, inserting the data in the XML text into the multiple relational database tables; and accessing the database to read the data in the XML text. The method stores XML file data into a relational database, and accelerates data reading and access speed.	02-12-2015
20150046456	METHODS AND APPARATUS FOR POINT CLOUD DATA PROCESSING - Methods and apparatus are provided for processing data representing three-dimensional points organized in a data structure wherein each point has multiple components, the data is organized in a respective layer per component, each layer is segmented in cells of a two-dimensional grid, the cells are arranged such that the components of a given point are contained in corresponding cells of multiple layers, the cells are grouped in patches by layer, and the patches are arranged such that the components of an array of points are represented by corresponding patches of multiple layers. At least one first criterion and at least one second criterion are obtained. Data are retrieved from cells of patches meeting the at least one first criterion and from layers meeting the at least one second criterion. The retrieved data are processed to obtain a derivative data set.	02-12-2015
20150046457	SYSTEMS AND METHODS FOR DETERMINING OPTIMAL PARAMETERS FOR DYNAMIC QUANTUM CLUSTERING ANALYSES - In the present work, quantum clustering is extended to provide a dynamical approach for data clustering using a time-dependent Schrödinger equation. To expedite computations, we can approximate the time-dependent Hamiltonian formalism by a truncated calculation within a set of Gaussian wave-functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition and/or feature filtering. Additionally, the parameters of the analysis can be modified in order to improve the efficiency of the dynamic quantum clustering processes.	02-12-2015
20150052134	Method and Apparatus for Storing Sparse Graph Data as Multi-Dimensional Cluster - A system for storing graph data as a multi-dimensional cluster has a database with a graph dataset containing data and relationships between data pairs and a schema list of storage methods that use a table with columns and rows associated with data or relationships. An analyzer module collects statistics of a graph dataset and a dimension identification module identifies a plurality of dimensions that each represent a column in the table. A schema creation and loading module creates a modified storage method having a plurality of distinct table blocks and a plurality of table block indexes, one index for each table block, and arranges the data and relationships in the given graph dataset in accordance with the modified storage method to create the multi-dimensional cluster.	02-19-2015
20150052135	AUTOMATED DOCUMENT CLUSTERING IN A COLLABORATIVE MULTI-USER DOCUMENT STORE - Methods, systems and techniques for managing revisions of documents in a collaborative, multiuser document store are provided. Example embodiments provide an Automated Document Revision Management Server (“ADRMS”) to automatically cluster and remove revisions of file content for easy navigation and management. Revisions are trimmed when necessary to conserve storage space. The ADRMS creates logical clusters of revisions based upon some measure of their similarities. That is, revisions that are similar and can be represented by the latest revision in the cluster formulate one cluster, and those that are markedly dissimilar are placed in a different cluster. The logic used to cluster revisions accounts for time-based factors, content-based factors, and context-based factors to determine whether a revision is incremental and can be grouped in the same cluster or is significant enough to warrant a new cluster. Revisions may be trimmed based upon age and/or available space by a revision trimming component.	02-19-2015
20150052136	Image Categorization Database and Related Applications - The present invention relates to a system and method for providing a new way of categorizing and searching image files. More specifically, the present invention provides a system and method that can provide image categorization in a game-like environment wherein the participants can freely input language units that describe or refer to corresponding images, wherein the inputted language units can be categorized and corresponding scores will be given according to which categories the inputted language units fall into, wherein an image categorization database can be established and images can be searched according to the various types of categorization, and the categories of the language units in the image database can be used to rank image search results.	02-19-2015
20150052137	APPARATUS FOR COLLECTING CONTENTS USING SOCIAL RELATION CHARACTER AND METHOD THEREOF - Disclosed is an apparatus for collecting contents using social relation characters, which includes: an input unit for receiving search information from a main user; a database for storing SNS subscriber list of the main user and related users in relation to the main user and group information in relation to friendship in an SNS; and a content managing unit for searching contents in relation to the received search information by using the group information from contents possessed by the main user and the related users in an SNS server, defining the searched contents as a first content group, calculating a first interest index for each content included in the first content group based on additional information input by the related users, and determining a predetermined content, on which interest of the related users is focused, from the searched contents based on the calculated first interest index.	02-19-2015
20150052138	CLASSIFYING SOCIAL ENTITIES AND APPLYING UNIQUE POLICIES ON SOCIAL ENTITIES BASED ON CROWD-SOURCED DATA - Technology is disclosed for detecting, classifying, and/or enforcing rules on social networking activity. The technology can scan and collect social content data from one or more social networks, store the social content data, classify content data posted to a social network, and create and apply a set of social data content rules to future posted social content data.	02-19-2015
20150052139	IMAGE SEARCH DEVICE, IMAGE SEARCH METHOD, PROGRAM, AND COMPUTER-READABLE STORAGE MEDIUM - An image search device includes a common memory and a plurality of parallel processors for executing a same instruction. The image search device transfers, from storage, a plurality of representative feature vectors, which respectively represent a plurality of clusters including a plurality of image feature vectors, stores, in the common memory, one or more query feature vectors extracted from an image serving as a query, calculates a distance between the plurality of transferred representative feature vectors and the query feature vector using the plurality of parallel processors, and selects one or more of a plurality of images based on a distance between the plurality of image feature vectors, which belong to the cluster selected by the calculated distance, and the query feature vector.	02-19-2015
20150052140	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - Disclosed is an information processing apparatus including an expression extraction unit, a feature extraction unit, a clustering unit, a related expression extraction unit, and an output unit. The expression extraction unit extracts a plurality of expressions from a plurality of documents. The feature extraction unit extracts feature amounts of the extracted respective expressions while distinguishing the expressions having the same notation. The clustering unit clusters the extracted respective expressions together while distinguishing the expressions having the same notation and calculates assignment degree vectors having assignment degrees of the respective expressions to two or more respective clusters as components. The related expression extraction unit extracts related expressions having the assignment degree vectors similar to those of a provided input expression while distinguishing the expressions having the same notation. The output unit outputs the related expressions and identification information for identifying the related expressions.	02-19-2015
20150052141	ELECTRONIC DEVICE AND METHOD FOR TRANSMITTING FILES - Method of transmitting ancillary files to users in support of main files includes acquiring information as to what is read by users within a predetermined period of time. According to the information as to what is read by users, the users are classified into groups using a clustering method. A current user is determined and a group that comprises the current user is determined. Target files read by the other users in the determined group are transmitted for the current user.	02-19-2015
20150052142	SYSTEM AND METHOD FOR GENERATION OF SIGNATURES FOR MULTIMEDIA DATA ELEMENTS - A method and system for generating a complex signature respective of a multimedia data element (MMDE) are provided. The method includes partitioning the MMDE into a plurality of different minimum size MMDEs; generating, for each of the different minimum MMDEs, at least one signature, wherein generation of each at least one signature is performed by a plurality of computational cores, each computational core having at least one configurable property characterizing the core, and wherein configuration of the at least one configurable property respective of each core results in statistical independence among the plurality of cores; and assembling at least a complex signature for the MMDE comprised of a plurality of the generated signatures.	02-19-2015
20150058344	METHODS AND SYSTEMS FOR MONITORING AND ANALYZING SOCIAL MEDIA DATA - A system and method for analyzing social media data by obtaining social media data from a social media platform, where the social media data includes documents from multiple users of the social media platform; classifying the documents using a sentiment classifier; tokenizing the documents into terms; associating a sentiment with each term; detecting a first event based on a number of occurrences of a first term in the documents; and providing information associated with the event to a user, where the information includes the first term and a sentiment associated with the first term.	02-26-2015
20150058345	REALTIME ACTIVITY SUGGESTION FROM SOCIAL AND EVENT DATA - Architecture that aggregates realtime geo-referenced data over areas such as physical world geographical areas and virtually-defined areas such as by geofences to provide users with a quick overview and suggestion of activities to do across an area of interest in the spatial extent. The geo-referenced data can be supplied by a provider and/or user. When in combination, event listings can be obtained from providers and social data (e.g., check-in) can be obtained from social websites and/or businesses that make check-in data available freely or under subscription, for example. At least one advantageous outcome of the disclosed aggregation approach is that privacy issues, which currently exist in the industry by showing exact locations of user-contributed data, are overcome. While aggregating over larger spatial extents having high activity, the events supplied by provider listings are assigned scores that show trending and/or high-user activity volumes, and therefore, can be suggested to users.	02-26-2015
20150058346	METHOD AND DEVICE FOR ASSIGNING TIME INFORMATION TO A MULTIMEDIA CONTENT - The invention concerns a device (D) for assigning time information to a main multimedia content related to a given object. To this end, said device comprises:	02-26-2015
20150058347	METHOD, SYSTEM AND SOFTWARE FOR ASSOCIATING ATTRIBUTES WITHIN DIGITAL MEDIA PRESENTATIONS - Disclosed are a system, method and software to associate attributes with digital media assets. Digital media contains specific assets, such as images, that can be replaced with other assets. The system, method and software permit the association of attributes with specific assets. The association of attributes and assets enables the provision of content that is enhanced and more impacting for a user.	02-26-2015
20150066925	Method and Apparatus for Classifying Data Items Based on Sound Tags - A method for grouping data items in a mobile device is disclosed. In this method, a plurality of data items and a sound tag associated with each of the plurality of data items are stored, and the sound tag includes a sound feature extracted from an input sound indicative of an environmental context for the data item. Further, the method may include generating a new data item, receiving an environmental sound, generating a sound tag associated with the new data item by extracting a sound feature from the environmental sound, and grouping the new data item with at least one of the plurality of data items based on the sound tags associated with the new data item and the plurality of data items.	03-05-2015
20150066926	METHOD AND SYSTEM OF MACHINE-TO-MACHINE VERTICAL INTEGRATION WITH PUBLISHER SUBSCRIBER ARCHITECTURE - An approach for machine-to-machine vertical integration with publisher-subscriber architecture is provided. The approach includes dynamically registering a plurality of sensors associated with a plurality of subscribers. The sensors are monitored as part of a sensor platform to maintain sensor data for the plurality of subscribers. Also, the sensor data are acquired from the plurality of sensors as a collective data set to determine one or more trends. The collective data set is tagged with metadata for the determined trend.	03-05-2015
20150066927	Generating a Non-Deterministic Finite Automata (NFA) Graph for Regular Expression Patterns with Advanced Features - In an embodiment, a method of compiling a pattern into a non-deterministic finite automata (NFA) graph includes examining the pattern for a plurality of elements and a plurality of node types. Each node type can correspond with an element. Each element of the pattern can be matched at least zero times. The method further includes generating a plurality of nodes of the NFA graph. Each of the plurality of nodes can be configured to match for one of the plurality of elements. The node can indicate the next node address in the NFA graph, a count value, and/or node type corresponding to the element. The node can also indicate the element representing a character, character class or string. The character can also be a value or a letter.	03-05-2015
20150066928	ASSOCIATED-DATA PROCESSOR, ASSOCIATED-DATA PROCESSING METHOD AND INFORMATION STORAGE MEDIUM - According to one embodiment of the present disclosure, a first data file, a second data file, a third data file and a file controlling logic portion are provided. In the first data file, basic data of a plurality of programs are stored. The basic data indicates a plurality of attributes related to each program. The second data file associates viewers with a plurality of programs. The third data file associates programs with a plurality of viewers. The file controlling logic portion is configured to access the first to third data files, generate a cluster in which data is associated depending on a theme and specify a viewer and/or a program included in the cluster.	03-05-2015
20150066929	METHOD FOR MAPPING MEDIA COMPONENTS EMPLOYING MACHINE LEARNING - The present document relates to cloud computing. In particular, the present document relates to methods and systems for cloud computing which enable the efficient and flexible placement of application components within a cloud. A computing device (	03-05-2015
20150066930	GENERATION OF METADATA AND COMPUTATIONAL MODEL FOR VISUAL EXPLORATION SYSTEM - Various mechanisms are described for generating metadata describing relationships among data sets. Quantitative data can be analyzed to determine relationships, and metadata representing the determined relationships can then be stored. Visualizations can then be generated from the metadata, and a navigational model can be defined based on the generated set of visualizations. The navigational models can provide robust visual mechanics for implementing intuitive navigational schemes that facilitate interaction with data on any suitable output device, including small screens as found on smartphones and/or tablets.	03-05-2015
20150066931	INFORMATION PROCESSING APPARATUS AND METHOD - According to one embodiment, an information processing apparatus includes a collection unit, a storage and a retrieval unit. The collection unit collects first metadata from information sources, the first metadata relating to information that has no common standard between the information sources and including first attributes and first attribute values. The storage stores each of the first attributes and first attribute values corresponding to each of the first metadata. The retrieval unit retrieves the first metadata, based on corresponding relations of the first attributes and the first attribute values with second attributes and second attribute values in second metadata newly obtained, to extract corresponding metadata that is one of the first metadata and corresponds to the second metadata.	03-05-2015
20150066932	AGRICULTURAL SPATIAL DATA PROCESSING SYSTEMS AND METHODS - Systems and methods are provided for correlating data from agricultural operations and displaying the resulting correlations. In some embodiments, data is gathered during two agricultural operations, a bitmap is rendered of the first operation, and second operation data from a live location is associated with a bitmap value at coordinates associated with the live location.	03-05-2015
20150066933	COMPUTER-IMPLEMENTED METHODS AND SYSTEMS FOR GENERATING VISUAL REPRESENTATIONS OF COMPLEX AND VOLUMINOUS MARKETING AND SALES AND OTHER DATA - Computer-implemented visualization techniques are disclosed for transforming data into graphical objects that can be viewed by a user on an electronic display. The visualization techniques enable users to more easily and quickly understand the data, which can be complex and voluminous. The data may include, but is not limited to, marketing and sales data.	03-05-2015
20150066934	AUTOMATIC CLASSIFICATION OF SEGMENTED PORTIONS OF WEB PAGES - Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems.	03-05-2015
20150066935	CROWDSOURCING AND CONSOLIDATING USER NOTES TAKEN IN A VIRTUAL MEETING - Arrangements relate to crowdsourcing and consolidating user notes taken within a virtual meeting. Notes from one or more meeting attendees can be received. The received user notes can be analyzed to identify a key element therein using natural language processing. The analysis of received user notes can be performed by a processor. Consolidated system notes can be generated. The consolidated system notes can include the key element.	03-05-2015
20150066936	Image Capture and Identification System and Process - A digital image of the object is captured and the object is recognized from a plurality of objects in a database. An information address corresponding to the object is then used to access information and initiate communication pertinent to the object.	03-05-2015
20150066937	EFFICIENT STORAGE OF DATA ALLOWING FOR MULTIPLE LEVEL GRANULARITY RETRIEVAL - Data series are stored at multiple resolutions in a computer-readable data storage medium. In particular, time series data values of the data series are received with associated timestamps. Corresponding storage elements in the computer-readable data storage medium are identified based on the time stamps. Aggregate values are determined by summing the time series data values. The time series data values stored in the corresponding storage elements are replaced by the aggregate values. Combined data values of the aggregate values are stored in storage elements in the computer-readable storage medium at a first resolution and second resolution, where the second resolution is half of the first resolution.	03-05-2015
20150074109	CREATION AND USE OF CLOSELY-MATCHED GROUPS TO AID IN INITIATING AND SUSTAINING BEHAVIORAL CHANGE - A method and system for sharing data between a plurality of users in an online group on a communications system includes receiving data from a plurality of users. The data includes personal characteristics about the users. The personal characteristics are analyzed to determine groups of personal characteristics. The users are clustered into closely matched groups based on the groups of personal characteristics. A plurality of activity information is generated about the users in each of the closely matched groups. The activity information may include a physical activity, a location, and a time of day. Users may be allowed access to the activity information about other users in each of the closely matched groups, respectively.	03-12-2015
20150074110	PLATFORM SYSTEM FOR OBJECT TAGGING AND METHOD THEREOF - The present invention relates to a platform system for object tagging and a method thereof. According to the platform system for object tagging of the present invention, since a user can add information in person, which he/she has about an individual object in contents provided by a website, into the contents of the website, as tag information, it is possible to enable an individual user to accurately, freely, and conveniently express information about an object in contents and to enable a plurality of users to actively create information and share more information by providing more users with information.	03-12-2015
20150074111	INFORMATION INTEGRATION CONTROL SYSTEM, SOCIAL INFRASTRUCTURE OPERATION SYSTEM, OPERATION METHOD, LOCAL APPARATUS AND SERVER APPARATUS - According to one embodiment, an information integration control system includes a collector, a storage module and a generator. The collector collects infrastructure information concerning a social infrastructure, user information about a user who uses the social infrastructure, and management information of a manager who manages the social infrastructure and the user. The storage module stores the infrastructure information, the user information, and the management information which are collected. The generator generates control information for the social infrastructure based on the infrastructure information, the user information, and the management information which are stored in the storage module.	03-12-2015
20150081706	EVENT TIMELINE GENERATION - A method selects events from an event log for presentation along a timeline. The method may receive information associated with the timeline to define an interval of interest and a partition size, and divide the timeline into a plurality of segments based on the partition size. The method may further identify each segment having at least one relevant event within the segment, where a relevant event is an event which starts within a segment and overlaps with the interval of interest. The method may determine parameters associated with at least one relevant event for each identified segment, and provide the determined parameters along with an index which designates each identified segment. The determined parameters may be provided to a client to generate the timeline of the at least one relevant event. An apparatus can implement the method to select events from an event log which are associated with a defined time interval.	03-19-2015
20150081707	MANAGING A GROUPING WINDOW ON AN OPERATOR GRAPH - Embodiments of the disclosure provide a method, system, and computer program product for managing a windowing operation. The method can include determining a sentinel value that defines a start of a grouping window for a stream of tuples and a terminating sentinel value that defines the end of the grouping window based upon an attribute contained in the stream of tuples. The stream of tuples can be monitored for the sentinel value and the terminating sentinel value by a stream operator. The stream operator can initiate a windowing operation that defines the start of the grouping window in response to a presence of the sentinel value and terminate the windowing operation in response to a presence of the terminating sentinel value.	03-19-2015
20150081708	MANAGING A GROUPING WINDOW ON AN OPERATOR GRAPH - Embodiments of the disclosure provide a method, system, and computer program product for managing a windowing operation. The method can include determining a sentinel value that defines a start of a grouping window for a stream of tuples and a terminating sentinel value that defines the end of the grouping window based upon an attribute contained in the stream of tuples. The stream of tuples can be monitored for the sentinel value and the terminating sentinel value by a stream operator. The stream operator can initiate a windowing operation that defines the start of the grouping window in response to a presence of the sentinel value and terminate the windowing operation in response to a presence of the terminating sentinel value.	03-19-2015
20150081709	PROCESSING DEVICE, PROCESSING METHOD, PROGRAM, AND RECORDING MEDIUM - An acquirer acquires, for the on-the-path categories situated on the path from the topmost category of a hierarchical structure comprising categories into which products or services are classified to each of a category of interest and the categories immediately below the category of interest, the frequencies of the names of the on-the-path categories and a keyword co-occurring in a search query given to a search device. An identifier identifies the category of interest as a category candidate immediately above a category of which the name is given by the keyword when the frequencies acquired for the on-the-path categories satisfy a candidate condition associated by the search device.	03-19-2015
20150081710	DATA TYPING WITH PROBABILISTIC MAPS HAVING IMBALANCED ERROR COSTS - A plurality of data keys are associated with a plurality of type values; query frequencies of the data keys are known. A computer memory is divided into a plurality of tranches, each tranche including a probabilistic or non-probabilistic data structure. The data keys are stored in the tranches in accordance with their query frequencies such that, e.g., frequently queried data keys are stored in data structures having higher accuracy and infrequently queried keys are stored in data structures having less accuracy (and consequently requiring less memory space).	03-19-2015
20150081711	LINKING ONTOLOGIES TO EXPAND SUPPORTED LANGUAGE - A computer-implemented method, system using at least one computing device, and computer program product are disclosed for linking an ontology provided by a content service with a word expansion ontology. The content service ontology is referred to herein as a category ontology and the word expansion ontology is referred to herein as a lexical ontology. A user may provide an input such as an input command to an application. The input command is processed by a natural language processing (NLP) engine to derive the user's intent and to extract relevant entities embodied in the command. The NLP engine may create a composite concept set containing multiple permutations of the concepts (entities extracted) and provide the composite concept set to a concept mapper. The concept mapper searches an ontology map and applies one or more scoring operations to determine a best match between the composite concept set and at least one category provided by the category ontology. The content service is searched using the category and the results are displayed to the user.	03-19-2015
20150081712	Method And System For Attaching A Metatag To A Digital Image - A system and method for tagging an image of an individual in a plurality of photos is disclosed herein. A feature vector of an individual is used to analyze a set of photos on a social networking website such as www.facebook.com to determine if an image of the individual is present in a photo of the set of photos. Photos having an image of the individual are tagged preferably by listing a URL or URI for each of the photos in a database.	03-19-2015
20150088884	CROWDSOURCED RESPONSES MANAGEMENT TO CASES - A response management system and method configured to allow management of responses to a case submitted by an entity in a crowdsourced network. The system includes a processing circuit that receives, from the entity over the crowdsourced network, the case. The processing circuit federates the case, into a plurality of federated cases, based on one or more parameters. The processing circuit routes the federated cases to the crowd of respondents in the network. The processing circuit receives responses from the crowd for each of the federated cases. The processing circuit integrates responses for each of the federated cases to yield a single integrated response for the case. The processing circuit publishes the integrated response anonymously that is viewable publicly.	03-26-2015
20150088885	AGGREGATING DIMENSIONAL DATA USING DENSE CONTAINERS - Methods, computer systems, and stored instructions are described herein for densely grouping dimensional data and/or aggregating data using a data structure, such as one that is constructed based on dimensional data. When smaller tables are joined with a larger table, a server may analyze the smaller tables first to determine actual value combinations that occur in the smaller tables, and these actual value combinations are used to more efficiently process the larger table. A dense data structure may be generated by processing dimensional data before processing data from a fact table. The dense data structure may be generated by compressing ranges of values that are possible in dimensions into a range of values that actually occurs in the dimensions. The compressed range of values may be represented by dense set identifiers rather than the actual compressed range of values.	03-26-2015
20150088886	SYSTEM AND METHOD FOR ENHANCING THE NORMALIZATION OF PARCEL DATA - Systems and methods are disclosed that attempt to verify data related to a parcel against mapping and addressing data. If the data cannot be verified, then an enhanced normalization process may be performed. The enhanced normalization process may attempt to group the parcel with some other parcels based on parcel characteristics. If the grouping is successful, then normalization may be performed on the grouping. Normalization may involve checking whether the grouping satisfies a set of grouping criteria.	03-26-2015
20150088887	MANAGING MULTIPLE WINDOWS ON AN OPERATOR GRAPH - Embodiments of the disclosure provide a method, system, and computer program product for managing a windowing operation. The method for grouping processing of a stream of tuples with each tuple containing one or more attributes can include receiving the stream of tuples to be processed by a plurality of processing elements operating on one or more computer processors. The method can also include processing, with a first processing method, a group of tuples from the stream of tuples into a grouping window. The method can also include processing, with a second processing method, a subgroup of tuples from the group of tuples into a subgrouping window. The second processing method can include identifying a sub-membership condition.	03-26-2015
20150088888	Concept Driven Automatic Section Identification - Mechanisms are provided for generating section metadata for an electronic document. These mechanisms receive a document and analyze the document to identify concepts present within textual content of the document. The mechanisms correlate concepts within the textual content with one another to identify concept groups based on the application of one or more rules defining related concepts or concept patterns. The mechanisms determine sections of text within the textual content based on the correlation of concepts within the textual content. Based on results of the determining, the mechanisms generate section metadata for the document and store the section metadata in association with the document for use by a document processing system.	03-26-2015
20150088889	MANAGING MULTIPLE WINDOWS ON AN OPERATOR GRAPH - Embodiments of the disclosure provide a method, system, and computer program product for managing a windowing operation. The method for grouping processing of a stream of tuples with each tuple containing one or more attributes can include receiving the stream of tuples to be processed by a plurality of processing elements operating on one or more computer processors. The method can also include processing, with a first processing method, a group of tuples from the stream of tuples into a grouping window. The method can also include processing, with a second processing method, a subgroup of tuples from the group of tuples into a subgrouping window. The second processing method can include identifying a sub-membership condition.	03-26-2015
20150088890	SYSTEM AND METHOD FOR EFFICIENTLY PROVIDING MEDIA AND ASSOCIATED METADATA - An electronic device with one or more processors, memory and a display obtains a file header for a file corresponding to a plurality of clusters, where the file header includes a cluster index. The device receives a request to seek to a respective position within the file and, in response to receiving the request: identifies a cluster of the plurality of clusters that includes content that corresponds to the respective position based on the cluster index; obtains a cluster header associated with the cluster based on information retrieved from the cluster index, where the cluster header includes a content index; and after obtaining the cluster header, identifies respective content within the cluster corresponding to the respective position based on the content index. The device provides at least a portion of content corresponding to the file to a presentation device for presentation to a user, starting with the respective content.	03-26-2015
20150088891	SYSTEM AND METHOD FOR IDENTIFYING AVAILABILITY OF MEDIA ITEMS - A system, computer-readable storage medium storing at least one program, and a computer-implemented method for identifying availability of media items is presented. A search query is received from a client device of a user. Instances of media items that satisfy the search query and that are available on content sources accessible to the client device of the user are identified. Aggregate information for the media items is determined based on the instances of the media items. The aggregate information for the media items is transmitted to the client device.	03-26-2015
20150088892	SYSTEM AND METHOD FOR CREATING, MANAGING, AND REUSING SCHEMA TYPE DEFINITIONS IN SERVICES ORIENTED ARCHITECTURE SERVICES, GROUPED IN THE FORM OF LIBRARIES - A computer-implemented system and method for creating, managing, and reusing schema type definitions in SOA services, grouped in the form of libraries are disclosed. The method in an example embodiment includes: grouping a plurality of Extensible Mark-up Language (XML) schema (XSD) types, each XSD type defined in an individual XSD file; using a processor to bundle the plurality of individual XSD types into a type library, the type library including a type information file to register the individual XSD types in the type library, the type library further including a type dependencies file to register dependencies between the individual XSD types in the same or different type library; importing types from a different type library, when defining derived types or aggregated types; generating Java artifacts from the XSD types; and associating the Java artifacts with corresponding XSD types in the type information file of the type library.	03-26-2015
20150088893	LOCALIZED DATA AFFINITY SYSTEM AND HYBRID METHOD - A method, system, and computer program for processing records is disclosed. The records are associated with record sets, based on a record number contained in the record. Record sets are associated with physically separate processor sets, which include one or more processors. Records are electronically routed to associated processor sets for processing, based on the record set associated with the record. Records are processed on processors in the processor sets. Furthermore, various localized affinities can be established. Process affinity can link server processes with processor sets. Cache affinity can link database caches with processor sets. Data affinity can link incoming data to processor sets.	03-26-2015
20150095332	AUTOMATIC LOG SENSOR TUNING - A process for automatically tuning a set of collectors and/or sensors includes: collecting first machine data by a first sensor in a collection framework, processing the first machine data by a first collector in the collection framework to yield first collected machine data, performing analytics on the first collected machine data to generate analytics output, and tuning, based, at least in part, on the analytics output, at least one of the following: the first sensor and the first collector.	04-02-2015
20150095333	Activity Based Analytics - An approach for filtering data into a geo-activity zone cell is presented. An area of interest specifying an individual, organization, or entity is selected. Data is extracted from streaming data and from data at rest. Metadata of the extracted data is determined. The metadata includes time and date stamp(s) and contextual information specifying the area of interest. A first portion of the metadata includes geospatial tag(s) specifying the area of interest, and a second portion of the metadata is initially missing geospatial tag(s). The missing geospatial tag(s) are determined and added to the second portion of the metadata by extracting a location from profile data and/or inferring the location based on a region-based geo-topic model. The extracted data is filtered into a geo-activity zone cell based on the first and second portions of metadata being within metadata boundaries.	04-02-2015
20150095334	DATA ANALYSIS SUPPORT SYSTEM - A data analysis support system according to the present invention assumes any of multiple indices to be an objective variable, implements clustering and collectively outputs indices belonging to the identical cluster.	04-02-2015
20150095335	CHANGE MANAGEMENT SYSTEM IN A PROCESS CONTROL ARCHITECTURE - A computer-implemented system and method of managing changes to a process control system are provided. The method includes obtaining a plurality of changes to the process control system. The plurality of changes are categorized into a plurality of categories. Each change is assigned an initial status. The categorized changes are displayed with their associated status to a user to receive user action relative to at least one categorized change. A status of the at least one categorized change is stored.	04-02-2015
20150095336	Geo-Spatial Asset Clustering - A system and method are provided for visualizing a site having at least one cluster of assets. The method comprises determining a location for an asset; determining at least one additional asset within a particular region, based on the location of the asset; determining a subset of assets to be analyzed, the subset including the asset; performing an asset clustering of the subset to create at least one cluster of assets; and providing an output with at least one cluster of assets.	04-02-2015
20150095337	System And Method For Providing Visual Suggestions For Document Classification Via Injection - A system and method for providing visual suggestions for document classification via injection are provided. Clusters of unclassified documents and a set of reference documents, each associated with a classification code, are obtained. One or more of the unclassified documents within one such cluster are compared to the reference documents. The reference documents that are similar to the compared unclassified documents are identified for the cluster. The similar reference documents are then injected into the cluster. Each of the similar reference documents in the cluster is displayed with a visual indicator representative of the associated classification code. The unclassified documents of the cluster are also displayed. A suggestion for classification for one of the unclassified documents within each cluster is provided based on the visual indicators of the similar reference documents.	04-02-2015
20150100574	SYSTEMS AND METHODS FOR MAPPING AND ROUTING BASED ON CLUSTERING - Classifications associated with a plurality of nodes may be identified. The classifications may be grouped into first level communities based on edge weights between the classifications. The first level communities may be grouped into second level communities based on edge weights between the first level communities. A sorted list of the plurality of nodes may be generated based on the classifications, the first level communities, and the second level communities. Unique identifiers (IDs) may be assigned sequentially to the sorted list of the plurality of nodes.	04-09-2015
20150100575	ELECTRONIC COMPUTING DEVICE, PERSONALIZED DATA RECOMMENDING METHOD THEREOF, AND NON-TRANSITORY MACHINE-READABLE MEDIUM THEREOF - An electronic computing device, a personalized information providing method thereof, and a non-transitory machine-readable medium thereof are provided. The electronic computing device establishes a first and a second tree structure data according to a first data of a first user and a second data of a second user arranged in a period respectively by using an ontology construction algorithm, and calculates a similarity between the first and the second tree structure data by using a similarity evaluating algorithm, and then analyzes the similarity to subsume the first and the second tree structure data into a group by using a clustering algorithm. The electronic computing device determines a difference between the first and the second tree structure data according to the group and generates recommending information corresponding to the first user which is arranged in the period according to the difference, and then enables a monitor to display the recommending information.	04-09-2015
20150100576	Default Network - A system and a method are disclosed for creating a default network from a social network computing system. The default network contains user profiles of users who have one or more attributes in common. The attributes on which a default network is based may be chosen for users or may be chosen by users according to interest. A default network is used to suggest that a user having a user profile in the default network establish a connection with another user whose user profile is in the same default network. Users with user profiles in the same default network may participate in interactions such as making referrals, sharing content, and posing questions. User profiles in a default network may be filtered for display according to characteristics of user profiles.	04-09-2015
20150100577	IMAGE PROCESSING APPARATUS AND METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM - An image processing apparatus includes the following elements. A memory stores therein plural items of image information. A selector selects a specific item of image information from among the plural items of image information as a selected item of image information. A specifying unit specifies, from among the plural items of image information, an item of image information indicating an image which has been captured on an identical date as a date on which an image indicated by the selected item of image information has been captured. A subject extracting unit extracts information concerning a subject of an item of image information. A destination setting unit sets a sending destination corresponding to information concerning a subject of an item of image information extracted by the subject extracting unit as a destination of the item of image information specified by the specifying unit.	04-09-2015
20150100578	SYSTEMS AND METHODS FOR ADDING DESCRIPTIVE METADATA TO DIGITAL CONTENT - Methods, computer-readable media, and systems for scheduling the association of metadata with content. In an embodiment, event information is obtained from a virtual calendar, wherein the event information comprises at least one event detail and one or more parameters defining a time period. First metadata is generated based on the event detail, and is stored, in association with the time period, in a memory. Then, subsequently, during the time period, the first metadata may be retrieved from the memory, and associated with one or more content items generated on the device.	04-09-2015
20150100579	MANAGEMENT METHOD AND INFORMATION PROCESSING APPARATUS - In an information processing apparatus for managing a system including apparatuses classified into clusters, an acquiring unit acquires history records from a memory unit based on scheduled change information indicating a scheduled change in configuration information of apparatuses accounting for a first rate amongst apparatuses belonging to a particular cluster. Each history record includes content related to a change in the configuration information of at least one or more apparatuses amongst apparatuses belonging to the same cluster. The acquiring unit acquires, from the memory unit, history records each associated with a change in the configuration information of apparatuses accounting for a second rate amongst apparatuses belonging to the same cluster. The second rate satisfies a predetermined similarity relationship with the first rate. A predicting unit predicts, based on the acquired history records, an impact on the system due to implementing the scheduled change.	04-09-2015
20150100580	METHOD FOR MANAGING COMMUNICATION RECORDS AND ELECTRONIC DEVICE THEREOF - A method for aggregating communication records is provided and includes obtaining a communication report corresponding to each of one or more communication events. The communication events indicate an occurrence of communication between a first user and a second user using a communication mode from among a plurality of communication modes. Each of the communication records is classified into one or more communication categories based on one or more classification parameters. The communication records are aggregated based on the classifying weightage assigned to one or more contextual parameters corresponding to each of the plurality of communication records. The contextual parameters indicate relevance of the corresponding communication record and the communication mode corresponding to the communication record.	04-09-2015
20150100581	METHOD AND SYSTEM FOR PROVIDING ASSISTANCE TO A RESPONDER - A method and system are described for providing assistance to a person responding to a request for subjective information. A type which may be used to provide information to a responder is associated with a request based on content of the request and user information. A user interface and suggested responses are provided to a responder based on a type and a subject matter of the request which taken together with the systems and interfaces provide enhanced communications between users while improving network efficiency.	04-09-2015
20150106373	GENERATING SEARCH DATABASE BASED ON EARTH'S MAGNETIC FIELD MEASUREMENTS - There is provided a database entity for generating a search database, comprising: at least one processor and at least one memory including a computer program code, wherein the at least one memory and the computer program code are configured, with the at least one processor, to cause the database entity at least to: acquire, from each of the plurality of mobile devices, an indication of at least one object; acquire a reference Earth's magnetic field, EMF, fingerprint representing at least one of magnitude and direction of the EMF in a location and/or environment to which the at least one object is related; associate each object with the corresponding reference EMF fingerprint; and generate a database of associations between the reference EMF fingerprints and the objects.	04-16-2015
20150106374	Recommendation System, Method and Non-Transitory Computer Readable Storage Medium for Storing Thereof - A recommendation method includes providing an ontology database, in which the ontology database includes a plurality of entities, and the entities are arranged in an ontology hierarchy structure with N hierarchy levels; storing a plurality of j	04-16-2015
20150106375	POLICY BASED AUTOMATIC PHYSICAL SCHEMA MANAGEMENT - Provided are techniques for cyclic based data partitioning policy with automatic physical schema management. A data partitioning policy for data is received, wherein the data partitioning policy identifies a condition for automatically implementing the data partitioning policy and criteria for modifying a set of partitions. In response to the condition occurring, the data partitioning policy is automatically applied to select at least one partition from the set of partitions based on the criteria. An operation is performed on the at least one partition to modify the set of partitions.	04-16-2015
20150112989	OPUS ENTERPRISE REPORT SYSTEM - An approach for managing large amounts of information from many controllers at multiple locations. A configuration map regarded as a dataset may be used for identifying and retrieving a group of data being sought in the form of instances of the dataset. The instances may be stored. A report may be used to select certain instances of the data according to a format of the report. The report may be manually or automatically provided. A profile may be developed to obtain instances of a dataset that match the profile and show instances that do not necessarily match the profile. The instances that do not match the profile may be reset to settings of the profile or be noted as approved exceptions that should not be reset.	04-23-2015
20150112990	Cross Application Framework for Aggregating Data Relating to People, Locations, and Entities - Some embodiments provide a cross application framework that supports a number of different applications and/or services to aggregate data relating to people, locations, and entities. The framework of some embodiments aggregates, from various data sources, different types of data, such as multimedia, communications, social media data, and location data. Once the data is aggregated, the framework provides the data to each requesting application. When an application is used to search for a person, the framework may provide the application with the person's emails, text messages, videos, photos, and social network activities.	04-23-2015
20150112991	INLINE HIERARCHY METHOD AND SOFTWARE, AND BUSINESS METHODS THEREFORE - An inline hierarchical method and software is disclosed that includes expanding a conventional data hierarchy beyond a parent-child relationship that opens up new searching, correlating, analyzing and displaying options, creating new levels of grouping various identifiers. This software derives a multiple tiered hierarchical structure from user-defined and/or computer determined inline tags within an element or multiple elements of an object, as well as business methods therefore. Grouping these tags together in new ways provides increased flexibility in data management not achievable before.	04-23-2015
20150112992	METHOD FOR CLASSIFYING CONTENTS AND ELECTRONIC DEVICE THEREOF - An electronic device and method for classifying contents using updated post information are provided. The electronic device may be configured to implement the method, and includes a communication module configured to upload to a server at least one content item, and receive from the server a set of information about the at least one content item. A processor of the electronic device may be configured to extract a subset of information associated with the at least one content item from the set of information, the subset including at least one reference value, and classify the at least one content item with other content items when the at least one reference value correlates with reference values of the other content items.	04-23-2015
20150112993	Method and Apparatus for Importing and Exporting Contact - A method and an apparatus for importing and exporting a contact. The method includes: receiving a user identifier of a user and a data package sent by a terminal, where the data package includes a configuration file and a contact file corresponding to a contact group, the configuration file is used to store a correspondence between a group identifier (ID) and a file ID of the contact file, and the contact file is used to store contact information of a contact included in the contact group; acquiring, from the data package, the group ID and the file ID corresponding to the contact group; acquiring, from the data package according to the file ID, the contact information included in the contact group; and storing a correspondence between the user identifier, the group ID, and the contact information. In the present invention, group information can be retained.	04-23-2015
20150120730	TEXT SAMPLE ENTRY GROUP FORMULATION - Storing text samples in a manner that the text samples may be quickly searched. The text samples are assigned a text sample identifier and are each parsed to thereby extract text components from the text samples. Text components that have the same content are assigned the same text component identifier. For each parsed text component, a text component entry is created that includes the assigned text component identifier as well as the text sample identifier for the text sample from which the text component was parsed. A text sample entry group is created for each text sample that contains the text component entries in sequence for the text components found within the text sample. The text sample entry groups are stored so as to be scannable during a future search.	04-30-2015
20150120731	PREFERENCE BASED CLUSTERING - To cluster objects associated with a dataset, a selection of criteria is received. For the received criteria, preference information is received to perform a preference-based clustering of the objects. Based on the preference information, a uni-criterion preference degree corresponding to each of the selected criteria is computed. The uni-criterion preference degrees of all the selected criteria are aggregated to compute a universal preference degree. Based on a preference-type and the computed preference degree, a relationship matrix is generated. The matrix, representing a similarity measure between the objects, is generated. The objects are clustered according to the relationship matrix. A visualization of the clustered objects is rendered on an associated user interface.	04-30-2015
20150120732	PREFERENCE-BASED DATA REPRESENTATION FRAMEWORK - Described herein is a technology for facilitating preference-based data representation. In accordance with one aspect of the technology, preference information is acquired from a user. Rank scores of objects are generated based at least in part on the user preference information. The objects are grouped into one or more clusters of objects based on the rank scores. A visualization of the one or more clusters of objects is then generated.	04-30-2015
20150120733	SYSTEMS AND METHODS FOR IMPROVED COVERAGE OF INPUT MEDIA IN CONTENT SUMMARIZATION - The disclosed technology includes techniques for improved content coverage in automatically-generated content summaries. The technique may include clustering a set of input content, determining diffusion for each cluster, and selecting representatives of each cluster to optimize other secondary metrics. Various types of input content may be used, including groups of images, video clips, or other multimedia content. Contiguous content may be manually or programmatically divided into discrete portions before clustering, for example, a lengthy video divided into a number of short clips. In some implementations, the disclosed technique may be implemented effectively on a mobile device. In other words, the processing required may be computationally feasible for execution on a smartphone or similar device.	04-30-2015
20150120734	APPARATUS AND METHOD FOR MANAGING DATA CLUSTER - Disclosed are an apparatus and method for managing data clusters. The data cluster management apparatus may include: a cluster selection unit configured to calculate a similarity of each of the data clusters with respect to input data, and select, based on the similarity, a data cluster from among the data clusters; and a cluster update unit configured to determine, based on the selected data cluster and the input data, whether the input data is included in the selected data cluster, and use the input data in accordance with the determination to create a new data cluster or update the selected data cluster.	04-30-2015
20150127646	BIG DATA ANALYTICS - Data may be processed based on a type of application, a user, a type of device, a user profile, and/or a device profile. Data may be processed in real time as it is received. The data may be segmented and/or partitioned into portions. For each portion, a user identifier associated with the portion may be determined. For each portion, an application associated with the portion may be determined. For each portion, a device associated with the portion may be determined. It may be determined whether a portion of the data is to be further processed based on the user identifier, the application, and/or the device. A portion of data that is to be further processed may be prioritized, directed to specific processing, and/or classified. Subsequent processing and/or storage of a portion of data may be based on a result of at least one of the prioritizing, the directing, and/or the classifying.	05-07-2015
20150127647	ELEMENT COMPUTATION-COMMUNICATION PARALLELIZATION METHOD IMPLEMENTED ON CUBED-SPHERE GRIDS BASED ON SPECTRAL ELEMENT METHOD AND HARDWARE DEVICE PERFORMING THE SAME - A method of parallelizing computation in an element and communication between elements in a cubed-sphere coordinate system based on a spectral element method is disclosed. The method is performed in a hardware device including a computation part, a memory and a communication buffer. A first grid value at a first grid point in a first element among elements within a first group is computed according to a predetermined numerical equation substantially at the same time as a second grid value at a second grid point in a second element of the first group is sent to or received from the communication buffer.	05-07-2015
20150127648	IMAGE DESCRIPTOR FOR MEDIA CONTENT - A method for generating image descriptors for media content of images represented by a set of key-points, fn, is recommended which determines for each key-point of the image, designated as a central key-point, a neighbourhood of other key-points, fml, whose features are expressed relative to those of the central key-point. A sparse photo-geometric descriptor, SPGD, of each key-point in the image being a representation of the geometry and intensity content of a feature and its neighbourhood is provided to perform an efficient image querying for efficient searches. The approach demonstrates that incorporating geometrical constraints in image registration applications does not need to be a computationally demanding operation carried out to refine a query response short-list.	05-07-2015
20150127649	EFFICIENT IMPLEMENTATIONS FOR MAPREDUCE SYSTEMS - In some embodiments, a processor configured to function as at least a first Reducer in a MapReduce system may receive a set of mapped [key, value] pairs output from a Mapper in the MapReduce system, identify within the set of mapped [key, value] pairs one or more [key, value] pairs for whose keys the first Reducer is not responsible, and transfer those [key, value] pairs to one or more other Reducers in the MapReduce system. In some embodiments, a system including at least one processor may receive a data packet including a set of mapped [key, value] pairs corresponding to a plurality of keys handled by a plurality of Reducers in a MapReduce system. For each mapped [key, value] pair, the system may identify the corresponding key and one of the Reducers responsible for that key, and provide the mapped [key, value] pair to the Reducer for processing.	05-07-2015
20150127650	SYSTEMS AND METHODS FOR METRIC DATA SMOOTHING - An exemplary method may comprise receiving a matrix for a set of documents, each cell of the matrix including a frequency value indicating a number of instances of a corresponding text segment in a corresponding document, receiving an indication of a relationship between two text segments, each of the two text segments associated with a first column and a second column, respectively, of the matrix, adjusting, for each document, a frequency value of the second column based on the frequency value of the first column, projecting each frequency value into a reference space to generate a set of projection values, identifying a plurality of subsets of the reference space, clustering, for each subset of the plurality of subsets, at least some documents that correspond to projection values, and generating a graph of nodes, each of the nodes identifying one or more of the documents corresponding to each cluster.	05-07-2015
20150127651	Computer-Implemented System And Method For Grafting Cluster Spines - A computer-implemented system and method for grafting cluster spines is provided. Cluster spines, each having two or more clusters of documents, are obtained. A score vector is generated for each of the cluster spines based on the documents within the clusters for that spine. The score vectors of the cluster spines are compared. Those cluster spines that are sufficiently dissimilar from the other cluster spines based on the comparison are placed into a display. At least one remaining cluster spine is grafted onto one of the displayed spines such that no overlap of the placed spine and the remaining spine occurs.	05-07-2015
20150134659	DETERMINING COLLECTIONS CAPABLE OF INCLUDING AN OBJECT PRESENTED BY A SOCIAL NETWORKING SYSTEM - A social networking system allows users to create collections including objects associated with products, services, games, videos, books or other similar items. An object is associated with a type and one or more actions are associated with the type to identify actions capable of being performed on the object. When an object is presented to a user, the type of the object is compared to types of objects capable of being included in a collection. If the type of the object is capable of being included in a collection, one or more collections associated with actions associated with the type of the object are identified. Information describing the identified collections is generated and presented to the user. By selecting information identifying a collection, the user includes the object in the collection corresponding to the selected information.	05-14-2015
20150134660	DATA CLUSTERING SYSTEM AND METHOD - A system includes identification of a first dataset comprising n data samples, identification of b data samples of the n data samples of the first dataset, wherein b is less than n, creation of a first plurality of datasets, each of the first plurality of datasets comprising m data samples, where m is greater than b, and wherein each of the m data samples of each of the first plurality of datasets is selected from the b data samples, identification of c data samples of the n data samples of the first dataset, wherein c is less than n, and wherein the c data samples are not identical to the b data samples, creation of a second plurality of datasets, each of the second plurality of datasets comprising p data samples, where p is greater than c, and wherein each of the p data samples of each of the second plurality of datasets is selected from the c data samples, identification, for each of the b data samples, of a cluster based on the first plurality of datasets, and identification, for each of the c data samples, of a cluster based on the second plurality of datasets.	05-14-2015
20150134661	Multi-Source Media Aggregation - A user interface to match a requested set of media items for display in a media item arrangement requires an efficient method of obtaining properties of the requested media items. The requested media items may span across multiple connected sources and be associated with multiple users. A first cache layer of a multi-layer cache system stores a flat representation of metadata items corresponding to media items available from connected sources. A second cache layer compiles metadata items from the first cache layer into sets of metadata items for various media item groupings. A third cache layer compiles sets of metadata items from the second cache layer into ordered sets of metadata items. The ordered sets of metadata items may be used to identify an appropriate media item arrangement in which to display the associated media items.	05-14-2015
20150134662	Systems And Methods For Transmission And Pre-Processing Of Sequencing Data - “Omic” digital data transport systems and methods are disclosed. The disclosed systems and methods employ a transport server that assembles larger numbers of omic output files into a transport group on the basis of machine specific annotation from one or more sequencing devices and user input related to one or more attributes for the omic output files.	05-14-2015
20150142800	GENERATING ELECTRONIC SUMMARIES OF ONLINE MEETINGS - An improved technique of organizing content of online meetings involves generating an electronic summary based on a textual metadata derived from content presented in an online meeting. An online meeting server collects content such as audio, video, and slide files presented in a particular online meeting. From metadata associated with such content, the online meeting server generates an electronic summary of the particular online meeting which includes a textual description of the content. The online meeting server then stores the electronic summary and the content presented in the particular online meeting in a repository that is configured to store content from other online meetings.	05-21-2015
20150142801	System and Method for Intelligently Categorizing Data to Delete Specified Amounts of Data Based on Selected Data Characteristics - A data processing system and a computer program product assign stored documents within a distributed storage system (DSS) to various document categories to enable a target number of documents to be deleted. An intelligent storage management (ISM) utility identifies a data storage threshold value used to control data storage within the DSS. If a current storage usage exceeds the data storage threshold value, the ISM utility calculates, based on the current storage usage, a target number of documents that can be deleted from the DSS. The ISM utility utilizes a recursive process which includes assigning stored documents to groups including a set of document categories based on data characteristics of the stored documents. The ISM utility further utilizes the recursive process to delete, based on an established ordering of the groups, all of the stored documents assigned to a subset of the groups in order to remove the target number of stored documents.	05-21-2015
20150142802	INDUSTRIAL GEOSPATIAL ANALYSIS TOOL FOR ENERGY EVALUATION - An industrial analytic system processes industrial data. A database engine provides access to a plurality of database management systems that serve energy consumption and product sales data. An input filter selectively passes filtered data streams that comprise energy sales data, location data, and business classification code data in datasets by removing selected datasets that do not include energy information. A standard deviation filter removes datasets from the filtered data streams that fall outside of a predetermined variation from an average value. A computation module analyzes the correlation between electrical energy consumption within a standard industrial classification code represented in the datasets and a programmable criterion.	05-21-2015
20150142803	SYSTEM AND METHOD FOR COMMUNICATION BETWEEN REPOSITORIES - At least one of the embodiments described herein relates generally to a method of communicating between a first repository and a second repository. The method can be performed at a second repository, and may include the acts of: identifying a content object stored in the first repository; identifying metadata for the content object stored in the first repository, the metadata comprising a link to an interface associated with the content object, the interface being provided by the first repository; retrieving the metadata from the first repository; and storing a harvested content object corresponding to the content object, the harvested content object comprising the metadata that includes the link to the interface, wherein the interface is accessible by the second repository to communicate information related to the harvested content object to the first repository.	05-21-2015
20150142804	METHODS, APPARATUSES AND COMPUTER PROGRAM PRODUCTS FOR UTILIZING SUBTYPING TO SUPPORT EVOLUTION OF DATA TYPES - An apparatus utilizing subtyping to evolve data types includes a processor and memory storing executable computer program code causing the apparatus to at least perform operations defining a subtype relationship in an object model supporting types of instances to share data. The program code further causes the apparatus to define a constraint specifying an instance of child type is also an instance of parent type such that instances of child type are instances of parent type. The program code further causes the apparatus to define a constraint specifying the child type is a subtype of the parent type which is the parent of the child type. The program code further causes the apparatus to evaluate an instance(s) from an application(s) or device(s) to determine whether the instance(s) is valid based on detecting whether the instance(s) complies with the constraints. Corresponding methods and computer program products are also provided.	05-21-2015
20150142805	DATA PROCESSING APPARATUS AND DATA PROCESSING METHOD - To obtain a clustering result which is less unnatural to the user, a data processing apparatus clusters at least one data arranged in time-series; generates, if new data is added, a reference value for defining a group division criterion in the clustering using at least one existing data that exists forward of the new data in time-series after a forward boundary; and determines based on the reference value whether a group division boundary exists between the new data and existing data positioned immediately before the new data.	05-21-2015
20150142806	SYSTEM AND METHOD FOR PRESENTING USER GENERATED DIGITAL INFORMATION - A system and method for presenting user generated digital information. The system includes geo-located objects (GLOBs), GLOB Data Sheets (GDSs), a grouping of GLOBs, an implicit location module and a display device. The plurality of geo-located objects (GLOBs) includes at least one GLOB having a geographic location tag generated by a location component. Each GLOB Data Sheet (GDS) among the plurality of GDSs is associated with a GLOB among the plurality of GLOBs. The grouping of GLOBs is configured to be organized based on location. The implicit location module geocodes at least one digital information object that does not have a corresponding location by interpolating along a path between a first GLOB that has a first location determined with a first geographic location tag and a second GLOB that has a location determined with a second geographic location tag. The display device presents the organized grouping of GLOBs.	05-21-2015
20150142807	METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR USING A DISTRIBUTED ASSOCIATIVE MEMORY BASE TO DETERMINE DATA CORRELATIONS AND CONVERGENCE THEREIN - Associative memory systems, methods and computer program products are provided. An associative memory system includes a distributed associative memory base including a network of networks of associative memory networks. A respective associative memory network includes associations among a respective observer memory and a plurality of observed memories that are observed by the respective observer memory. Ones of the associative memory networks are physically and/or logically independent from other ones of the associative memory networks. A processing system is configured to observe associations into, and imagine associations from, the distributed associative memory base using multiple streaming queues that correspond to respective ones of multiple rows in the associative memory networks. The processing system is further configured to determine a cognitive distance between a term and a class of terms, the cognitive distance being returned responsive to a query of the distributed associative memory base.	05-21-2015
20150142808	SYSTEM AND METHOD FOR EFFICIENTLY DETERMINING K IN DATA CLUSTERING - A system is configured to perform an iterative method for efficiently determining a value k in k-means data clustering. The method includes performing a k-means algorithm for each number k of a set of numbers in a range of 1 to K	05-21-2015
20150142809	SYSTEMS AND METHODS FOR PROVIDING A CONTENT ITEM DATABASE AND IDENTIFYING CONTENT ITEMS - Systems and methods are provided for identifying unsolicited or unwanted electronic communications, such as spam. The disclosed embodiments also encompass systems and methods for selecting content items from a content item database. Consistent with certain embodiments, computer-implemented systems and methods may use a clustering based statistical content matching anti-spam algorithm to identify and filter spam. Such an anti-spam algorithm may be implemented to determine a degree of similarity between an incoming e-mail and a collection of one or more spam e-mails stored in a database. If the degree of similarity exceeds a predetermined threshold, the incoming e-mail may be classified as spam. Further, in accordance with other embodiments, systems and methods may be provided to determine a degree of similarity between a query or search string from a user and content items stored in a database. If the degree of similarity exceeds a predetermined threshold, the content item from the database may be identified as a content item that matches the query or search string provided by the user.	05-21-2015
20150149461	System and method for analyzing unstructured data on applications, devices or networks - A system, method, computer program and apparatus for facilitating the automated reading, decryption, retrieval, gathering, analyzing, indexing, segmentation, classification, grouping, comparing and storing of unstructured data from a set of one or more highly related computer programs, web applications or products which service a particular data transaction or system need.	05-28-2015
20150149462	ONLINE THREAD RETRIEVAL USING THREAD STRUCTURE AND QUERY SUBJECTIVITY - Methods and arrangements for handling queries for a discussion thread. A contemplated method includes: receiving a query; automatically classifying the query as subjective, objective or neither; and upon classifying the query as subjective or objective: calculating, for discussion threads of the query, at least one of: a subjectivity score and an objectivity score; determining a degree of relevance to the query of at least one of: the discussion threads, and at least one post in the at least one discussion thread; and ranking the discussion threads based on said calculating and determining. Other variants and embodiments are broadly contemplated herein.	05-28-2015
20150149463	METHOD AND SYSTEM FOR PERFORMING TOPIC CREATION FOR SOCIAL DATA - Disclosed is a system, method, and computer program product for performing theme analysis and creating topics with regard to social data. A user interface is provided that allows the user to view and control the process/mechanism for creating topics. The topic creation process can be facilitated and automated using a volatility index.	05-28-2015
20150149464	PROCESSING OF DATA RELATING TO ENTITIES - A method is provided for data processing, which includes generating a signal for control of display on a screen of a graphical interface including a graph composed of links and of nodes demarcated by vignettes, each vignette representing an entity. The graph contains a first vignette representing a first entity. The method selects a set of entities as a function of at least one second criterion, from among a plurality of entities meeting a first selection criterion in relation to the first entity. The first vignette of the graph is linked in the graph directly to one or more second vignettes, each representing a second entity of the set of entities. The number of second vignettes is dependent on a current threshold value determined on the basis of the plurality of entities. Also provided is a device implementing respectively a method of processing.	05-28-2015
20150293930	PORTABLE MEMORY DEVICE DATA MODELING FOR EFFECTIVE PROCESSING FOR A GAS TURBINE ENGINE - A method to process Portable Memory Device (PMD) files from an electronic engine control system includes mapping each of a multiple of Health Report Code (HRC) records from a Portable Memory Device (PMD) such that each Health Report Code (HRC) record is accessible through a specific HRC number related to each of the multiple of Health Report Code (HRC) records.	10-15-2015
20150293956	INDEXING OF LARGE SCALE PATIENT SET - Systems and methods for indexing data include formulating an objective function to index a dataset, a portion of the dataset including supervision information. A data property component of the objective function is determined, which utilizes a property of the dataset to group data of the dataset. A supervised component of the objective function is determined, which utilizes the supervision information to group data of the dataset. The objective function is optimized using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.	10-15-2015
20150293967	GROUP-BY PROCESSING FOR DATA CONTAINING SINGLETON GROUPS - According to one embodiment of the present invention, a system performs a grouping operation for a database query. The system assigns data elements to groups and aggregates information for a group in response to assigning the group two or more data elements. The system passes the aggregated information for a group of two or more data elements for processing in accordance with the query, and passes information for a data element of a single-member group in a received form for processing in accordance with the query. Embodiments of the present invention further include a method and computer program product for grouping data elements in substantially the same manners described above.	10-15-2015
20150293989	Computer-Implemented System And Method For Generating An Interest Profile For A User From Existing Online Profiles - A computer-implemented system and method for generating an interest profile for a user from user generated content on existing online profiles is provided. Interest categories are maintained in a database and each interest category is associated with an initial interest index score. User generated items are selected from an existing online profile and each user generated item is associated with a weight. For each user generated item, similarity mapping is performed with each of the interest categories in the database by extracting artifacts from each user generated item, calculating an artifact similarity score for each of the extracted artifacts, obtaining a user generated item similarity score for the user generated item, and determining a current interest index score for the user generated item. An interest profile for the user is generated by applying the current interest index score to the interest category.	10-15-2015
20150293990	PRESENTING A TRUSTED TAG CLOUD - A method for presenting a trusted tag cloud to a user. The method includes associating a number of tags with a first user who applies the tags, and calculating a weight of the tags being examined by a second user. The weight may be based on the identity of the second user, the identity of the first user, and examining the relationship between the two. The tags may then be presented to the user in accordance with the value of the weight.	10-15-2015
20150301799	PARALLELIZED IN-PLACE RADIX SORTING - Methods for sorting a data set. Data items each having a first portion and a second portion are stored. The first and second portions are stored separately and each has a separate set of keys. The first portion has a pointer indicating the second portion. At least some of the first set of keys for each data item is stored in a local memory of a first processor. At least one data stripe set is defined with one stripe within each bucket. An in-place partial bucket radix sort is performed on data items within one data stripe set with a first processor using an initial key. Incorrectly sorted data items are grouped into respective incorrect data item groups within each bucket. A radix sort is then performed using the initial radix on the incorrect data item groups. A first level sorted output is produced.	10-22-2015
20150302078	DETERMINING LOGICAL GROUPS WITHOUT USING PERSONAL INFORMATION - Systems and methods for the forming of user device groups are presented. In one example, logical relationship information describing logical relationships among a plurality of user devices is accessed. Scores for each of a plurality of possible groups are generated based at least partially on the logical relationship information and information about a first user device, with the scores not being based on any personally identifiable information about the first user of the first user device. A first group is selected from the plurality of possible groups based on the scores. Then the first user device is added to the first group.	10-22-2015
20150302081	MERGING OBJECT CLUSTERS - A determination is made as to whether to merge clusters of objects. Semantic information is input for at least one of the objects. A compactness of a candidate cluster to be formed when a first cluster and a second cluster are merged is evaluated. A cluster quality of the candidate cluster is evaluated, based on the semantic information. The first cluster and the second cluster are merged in a case that the compactness of the candidate cluster relative to a compactness of the first and second clusters exceeds a compactness threshold, and the cluster quality of the candidate cluster relative to a cluster quality of the first and second clusters exceeds a cluster quality threshold.	10-22-2015
20150310047	System and Method for Composing a Multidimensional Index Key in Data Blocks - Embodiments are provided for composing multidimensional keys for data blocks organized according to space filling curve approaches in database systems. An embodiment method includes organizing multidimensional data in a storage using a space filling curve algorithm. A plurality of data access paths for allowing access to the data are generated in a hierarchical index topology including an intermediate index page and a plurality of leaf pages. A plurality of odometer-type keys, which point to corresponding data blocks of the multidimensional data in the storage, are digitally composed in the leaf pages using bit clustering in a dimension-by-dimension manner of the multidimensional data. The odometer-type keys have numerical values that determine access to the data blocks according to the space filling curve algorithm. The composition of the odometer-type keys is independent of the numerical values of the odometer-type keys.	10-29-2015
20150310085	System for decomposing clustering events from managed infrastructures - An event clustering system is configured to generate reports. An extraction engine is in communication with an infrastructure. The extraction engine in operation receives data from the infrastructure and produces events. An alert engine receives the events and creates alerts mapped into a matrix, M. A sigalizer engine includes one or more of an NMF engine, a k-means clustering engine and a topology proximity engine. The sigalizer engine determines one or more common steps from events and produces clusters relating to the alerts and/or events. A reporting engine is configured to be coupled to the event clustering system.	10-29-2015
20150310086	System for decomposing clustering events from managed infrastructures coupled to a data extraction device - An event clustering system is provided. An extraction engine is in communication with an infrastructure. The extraction engine in operation receives data from the infrastructure and produces events. An alert engine receives the events and creates alerts mapped into a matrix, M. A sigalizer engine includes one or more of an NMF engine, a k-means clustering engine and a topology proximity engine. The sigalizer engine determines one or more common steps from events and produces clusters relating to the alerts and/or events. A data extraction device is configured to be coupled to the event clustering system.	10-29-2015
20150310090	Clustered Information Processing and Searching with Structured-Unstructured Database Bridge - Systems and methods for indexing information and for performing searches are disclosed. In these systems and methods information is “ingested” into the system by clustering the information using a clustering algorithm such as k-means or k-medoids clustering. During the clustering process, a hybrid distance measurement is used that allows the systems and methods to determine similarity across a number of different types of information. Once the information is clustered, it is stored and “mirrored” both in a structured (e.g., relational) data repository and in an unstructured data repository. Methods according to the invention allow the retrieval of both direct search results and search results including related concepts. After clustered information is stored, future searches can be performed by searching the stored results in whichever data repository is most appropriate for the context.	10-29-2015
20150310093	METHOD OF PROVIDING CONTENTS OF AN ELECTRONIC DEVICE - An electronic device that uses a method of providing contents by an electronic device is provided. The method includes analyzing at least one element of log information, generating at least one emotion content and at least one log content based on the analyzed at least one element of log information, determining whether there are emotion contents generated based on the log information of the at least one log content, generating at least one combined content by using the at least one log content and the determined emotion contents, and grouping the at least one log content and the at least one combined content, and displaying at least one content group.	10-29-2015
20150310110	METHOD AND APPARATUS FOR SEARCHING NON-PUBLIC DATA USING A SINGLE SEARCH QUERY - Method and apparatus for facilitating real-time searching of non-public data using a single search query are provided. Method includes facilitating reporting of availability of companion application of remote source unit to auto discovery module to enable client device to automatically discover remote source unit and to enable client device to search, in real-time, non-public data on remote source unit using single search query. Companion application is non-public application. Single search query comprises a search term. Method includes enabling automatic access to non-public data on remote source unit, by single search query. Method includes facilitating receipt, at remote source unit, of single search query with search term. Method includes, in response to single search query, searching, in real-time, non-public data on remote source unit using search term; retrieving, in real-time, non-public search result comprising one or more file names or folder names; and transmitting non-public search result in real-time.	10-29-2015
20150317124	AUDIO INTERACTION METHOD, APPARATUS, AND SYSTEM - Implementation manners of the present disclosure provide an audio interaction method, apparatus, and system. The method includes: determining a user attribute tag, and grouping users into N groups based on the user attribute tag, N being a positive integer that is at least 2; recording an audio file of a user, and extracting decibel information of the audio file of the user from the recorded audio file of the user; and comparing the extracted decibel information of the audio file of the user with decibel information of a user in another group different from a group the user is in, and presenting a comparison result. In the implementation manners of the present disclosure, interaction between grouped users is achieved in an audio manner; therefore, the interactive effect is better and interactive efficiency is improved.	11-05-2015
20150317376	METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR AUTOMATING EXPERTISE MANAGEMENT USING SOCIAL AND ENTERPRISE DATA - A method includes performing contextual association of entities using multi-source data. For each context the method performs co-clustering to identify distinct expert-skill associations; constructing single-entity unipartite graph representations and performing a random walk within each single-entity unipartite graph; for each single-entity unipartite graph, obtaining steady state distributions using the random walks to obtain clusters of experts and skills; performing a weighted two-way random walk across entity graphs (graph edges), giving preference to traversal within members of the same co-cluster; and performing link prediction for each context by dynamically adding edges, and obtaining overall skills predictions, analyses and inferences by merging the contexts and weighting the links of each context. The method can also use the context-specific weights obtained from the co-association information in a matrix completion procedure, and finally merge the context-specific outputs to obtain overall skills predictions, analyses and inferences. A computer program product and a system are also disclosed for performing the method.	11-05-2015
20150317377	AUTOMATIC CREATION OF RULES FOR IDENTIFYING EVENT BOUNDARIES IN MACHINE DATA - Methods and apparatus consistent with the invention provide the ability to organize and build understandings of machine data generated by a variety of information-processing environments. Machine data is a product of information-processing systems (e.g., activity logs, configuration files, messages, database records) and represents the evidence of particular events that have taken place and been recorded in raw data format. In one embodiment, machine data is turned into a machine data web by organizing machine data into events and then linking events together.	11-05-2015
20150324408	HYBRID STORAGE METHOD AND APPARATUS - A hybrid storage apparatus including a table generator for generating a table; a column group generator for generating a column group by collecting at least one column among one or more columns included in the table; and a segment allocation unit for allocating a base segment to the table and a group segment to the column group including the at least one column of the table. The base segment includes group segment link information regarding the group segment.	11-12-2015
20150324441	SYSTEM AND METHOD FOR HIGH PERFORMANCE K-MEANS CLUSTERING ON GPU WITH SMART KERNELS - Provided is a high-performance implementation of the k-means clustering algorithm on a graphics processing unit (GPU), which leverages a set of GPU kernels with complementary strengths for datasets of various dimensions and for different numbers of clusters. The concepts of non-dominated GPU kernels and efficient strategies to select high-throughput kernels that match the arguments of the clustering problem with the underlying GPU hardware for maximum speedup are provided.	11-12-2015
20150324443	STORAGE CLUSTERING SYSTEMS AND METHODS FOR PROVIDING ACCESS TO CLUSTERED STORAGE - A storage clustering system comprises storage front-ends and clustering modules. At least one clustering module receives an access command from a client. When the access command instructs that a data item be stored, a clustering module invokes at least one computing module to compute at least one derivative value of the data item, and at least one clustering module stores, based on an index, the derivative value or at least part of the data item through a storage front-end, and accordingly updates an instance of metadata. When the access command instructs that a data item be fetched, a clustering module examines the metadata to select a storage front-end, through which a clustering module fetches the data item. When the storage front-end returns a derivative value instead, the fetching clustering module examines the index according to the derivative value to synthesize the data item for the client.	11-12-2015
20150324447	HYBRID DATABASE MANAGEMENT SYSTEM AND METHOD OF MANAGING TABLES THEREIN - A method of managing tables in a hybrid database management system includes: classifying data constituting tables with respect to each of partitions; classifying the data constituting the partitions into hot data and cold data, based on data attribute, with respect to each of the partitions; storing the hot data and the cold data in different logical storage spaces; checking data attributes of the hot data and the cold data at preset periods and reclassifying the hot data and the cold data based on the checked data attributes; and updating logical storage spaces of the reclassified hot data and the reclassified cold data.	11-12-2015
20150324450	SYNTACTIC LOCI AND FIELDS IN A FUNCTIONAL INFORMATION SYSTEM - The invention relates to systems and methods using a logical data model for aggregating data entities in a functional information system supported upon a computing platform, and also for providing systems and methods for analyzing economic information using a functional coordinate system.	11-12-2015
20150324452	METHOD OF CREATING OR UPDATING A CONTAINER FILE FOR STORING IMAGE FILES - A method of managing a container file of data files is provided. The method includes creating a container file having a container file metadata section by creating one or more empty records in the storage device. Each record of the one or more empty records includes a data file section reserved for storing a data file, a file metadata section reserved for storing metadata about the data file, and the file metadata section precedes or follows the data file section, a record metadata section including information about the record and having at least a record status mark indicating that the record is empty. The method further includes setting a container status mark in the container file metadata section to available, after creating the container file.	11-12-2015
20150324464	SEARCHING METHOD AND APPARATUS - A searching method and a searching apparatus are provided. The method includes: obtaining a search term input from a client device; determining a type of the search term, and obtaining a knowledge graph corresponding to the type of the search term; and returning the knowledge graph corresponding to the type of the search term to the client device, such that the client device displays information contained in the knowledge graph in a structured form.	11-12-2015
20150324486	GROUPING RECORDS IN BUCKETS DISTRIBUTED ACROSS NODES OF A DISTRIBUTED DATABASE SYSTEM TO PERFORM COMPARISON OF THE GROUPED RECORDS - Provided are a computer program product, system, and method for grouping records in buckets distributed across nodes of a distributed database system to perform comparison of the grouped records. Upon receiving a record, data in the received record is processed to determine at least one containing bucket having attributes matching those of the received record, wherein the at least one containing bucket comprises at least one of a plurality of buckets, and wherein the buckets are assigned to the local node and the external nodes. A determination is made of at least one of the containing buckets assigned to at least one of the external nodes. At least a portion of the data in the received record is forwarded to each of the determined at least one external node to perform comparison matching with other records in the containing bucket at the external node.	11-12-2015
20150331980	APPARATUS AND METHOD FOR CLASSIFYING CONTEXT TYPES FOR MULTIVARIATE MODELING - A method is provided for determining two or more context types having an associated fault to be modeled by the same multivariate model. The method includes selecting a fault and selecting two or more context types associated with the fault. The method further includes accessing data stored for the selected context types. The method further includes generating rankings of process data tags for each selected context type. Each ranking includes process data tags ranked according to relative contributions of each process data tag in the ranking to the fault. The method further includes classifying the context types into one or more classes based on the process data tags included in each ranking. The one or more classes include a first class of the context types. The method further includes deploying a multivariate model operable to monitor processing equipment for the selected fault for the first class of context types.	11-19-2015
20150339319	FILE MANAGEMENT AMONG DIFFERENT ZONES OF STORAGE MEDIA - Apparatus and methods for managing files among different zones of storage media in at least one non-volatile storage device. At least a first zone is associated with a first type of storage media and a second zone is associated with a second type of storage media. A file having at least one attribute is accepted with the at least one attribute describing a characteristic of the file. It is determined whether the at least one attribute meets an attribute criteria and the file is stored in the first zone and/or the second zone based on the determination of whether the at least one attribute meets the attribute criteria.	11-26-2015
20150339364	VISUALIZATION DEVICE, VISUALIZATION METHOD AND VISUALIZATION PROGRAM - A visualization device includes: an evaluation index calculation unit	11-26-2015
20150339371	METHOD AND APPARATUS FOR CLASSIFYING SIGNIFICANT PLACES INTO PLACE CATEGORIES - An approach is provided for classifying significant places (stay points) into place categories. A classification platform determines user contextual information associated with at least one significant place. The classification platform further causes, at least in part, a comparison of the user contextual information against reference contextual information associated with one or more place categories. The classification platform also causes, at least in part, a classification of the at least one significant place into the one or more place categories based, at least in part, on the comparison.	11-26-2015
20150339373	GRAPHICAL INTERFACE FOR RELEVANCE-BASED RENDERING OF ELECTRONIC MESSAGES FROM MULTIPLE ACCOUNTS - A method and a device are disclosed including a user interface software component configured to dynamically display and manage grouped graphical representations, such as tiles, of electronic data and messages, such as files and emails, that are rendered based on relevance scores, and are associated with two or more user accounts. The messages may be grouped visually or logically. The user interface may automatically categorize messages by assigning certain attributes based on heuristics or other information. The user interface is further configured to allow zooming in and out for more or less details, respectively. The user interface is further configured to allow automatic and/or dynamic changes to the appearance and contents of tiles based on the relevance scores and other factors, and further allow searching for, dispositioning, and taking various actions on one or a group of messages.	11-26-2015
20150347468	GROUPING DATA IN A DATABASE - According to embodiments of the present invention, two or more attributes that are included in a plurality of attributes are aggregated into a group defined by a first data definition language syntax. The first data definition language syntax defines the group as having a groupID and one or more of an attribute definition defined in a comma-separated list and a group definition. The attribute definition is defined by a second data definition syntax. The first data definition language syntax includes the second data definition language syntax. The first data definition language syntax is structured in a manner to allow a database operation associated with the group to be applied to all attributes and/or groups included therein.	12-03-2015
20150347556	SUGGESTING PRE-CREATED GROUPS BASED ON A USER WEB IDENTITY AND ONLINE INTERACTIONS - In one aspect, a method for generating groupings of users at a social networking service is provided, the method includes determining identifying information for a user, identifying one or more other users having a set of identifying information in common with the user, generating a group including the user and the one or more other users, associating the set of identifying information common between the user and the one or more other users with the group and providing recommendations to the user for activity with respect to the one or more other users based on the set of identifying information.	12-03-2015
20150347559	STORAGE CLUSTER DATA SHIFTING - A method performed by a computing system includes creating a graph, wherein nodes of the graph represent data objects of a data storage cluster, wherein edges of the graph represent joins between data objects represented by both nodes of respective edges, wherein node values of the nodes and weights of the edges are based on statistics related to use of the data objects. The method further includes assigning a first subset of the data objects to a relational database storage node within the data storage cluster, the first subset of data objects being represented by nodes within a cluster of the graph, and assigning a second subset of the data objects to a non-relational database storage node within the data storage cluster, the second subset of data objects being represented by nodes within the graph that are not part of a cluster.	12-03-2015
20150347564	CATEGORY NAME EXTRACTION DEVICE, CATEGORY NAME EXTRACTION METHOD, AND CATEGORY NAME EXTRACTION PROGRAM - A category name extraction device includes a specifying means configured to specify a word contained in a plurality of item information respectively belonging to a plurality of categories in parallel structure, qualifying or being qualified by a name of a category where each item information belongs, and being in common to a plurality of item information belonging to a plurality of different categories, as a reference word, an extraction means configured to extract a word contained in a phrase contained in item information belonging to any of the plurality of categories, qualifying or being qualified by the reference word, and being different from names of the plurality of categories, as a candidate category name, and an output means configured to output the candidate category name extracted by the extraction means.	12-03-2015
20150347567	METHOD FOR CREATION AND EDITING OF A MASSIVE CONSTRAINT NETWORK - A method for editing a position of a selected design element in a constraint network. The method includes receiving a selection of a design element in a geometric model from a user. The method also includes searching a database for a positioning group related to the selected design element. The method then includes displaying the positioning group related to the selected design element to the user. The method further includes receiving an updated positioning group from the user. The method finally includes storing the updated positioning group to the database.	12-03-2015
20150347574	INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, PROGRAM, AND INFORMATION PROCESSING SYSTEM - There is provided an information processing device including an event cluster creation unit configured to create an event cluster including, among a plurality of types of content, reference content serving as a reference and related content, the related content having a different type from the reference content and indicating the same event as the reference content, and a meta information appending unit configured to create meta information about the event on the basis of the event cluster and append the meta information to the event cluster.	12-03-2015
20150347579	MEDIA FILE MARKING METHOD AND APPARATUS - Embodiments of the present invention provide a media file marking method and apparatus. The media file marking method includes: receiving reaction information that is generated when a user watches a media file and is sent by at least one first terminal device, where the reaction information is collected by the first terminal device when the user watches the media file; generating marking information of the media file according to the reaction information sent by the at least one first terminal device; and sending the marking information of the media file to a second terminal device that plays the media file, so that the second terminal device displays the marking information of the media file. The media file marking method and apparatus provided in the embodiments of the present invention are used to improve accuracy of choosing a media file by a user.	12-03-2015
20150356164	METHOD AND DEVICE FOR CLUSTERING FILE - In a method and a device for clustering files of the present application, to cluster files to be processed, information fingerprints of the files to be processed are obtained by processing information fingerprints of features of a plurality of information blocks contained in the file to be processed and are compared, and files to be processed with the same information fingerprint are taken as one cluster, so as to realize the clustering of files. The features of the information blocks in the files to be processed are identified by means of information fingerprints in this way, and then clustering is performed according to identifiers. Compared to prior-art methods using similarity comparisons, the method and device of the present application, which calculate and cluster an identifier of a feature, greatly reduce the data to be calculated and the degree of complexity.	12-10-2015
20150356165	METHOD AND SYSTEM FOR PROVIDING CONTENT TO USERS BASED ON FREQUENCY OF INTERACTION - A system and method for providing content to a user based on at least one prior user experience are provided. First content is transmitted to a user, wherein at least some of the first content is transmitted in response to one or more user content selections. Frequency information based on the inputs and/or the first content is stored. A request for content is received from the user. Second content is selected based on the frequency information.	12-10-2015
20150370885	METHOD AND SYSTEM FOR CLUSTERING EVENT MESSAGES AND MANAGING EVENT-MESSAGE CLUSTERS - The current document is directed to methods and systems for processing, classifying, and efficiently storing large volumes of event messages generated in modern computing systems. In a disclosed implementation, received event messages are assigned to event-message clusters based on non-parameter tokens identified within the event messages. A parsing function is generated for each cluster that is used to extract data from incoming event messages and to prepare event records from event messages that more efficiently and accessibly store event information. The parsing functions also provide an alternative basis for assignment of event messages to clusters.	12-24-2015
20150370886	INFORMATION PROCESSING DEVICE WHICH CARRIES OUT RISK ANALYSIS AND RISK ANALYSIS METHOD - An information processing device includes: a unit configured to compute a service influence degree for each risk factor with respect to each service, on the basis of information which indicates a relation between components which have the risk factors and other components which are influenced by the state of the components, information which denotes characteristics of the respective risk factors, and information which denotes a correspondence between the services and these components; and a unit configured to compute, on the basis of the computed service influence degrees, similarities between specific risk factors and other risk factors, and for generating and outputting a set of component identification information on the basis of the computed similarities.	12-24-2015
20150379022	Integrating Execution of Computing Analytics within a Mapreduce Processing Environment - Embodiments of the disclosure can include MapReduce systems and methods with integral mapper and reducer compute runtime environments. An example system with an integral reducer compute runtime environment can include mappers and reducers executable on a computer cluster. The mappers can be operable to receive raw input data and generate first input data based on the raw input data. The mappers can be operable to generate first result data based on the first input data. Based on the first result data, the mappers can be operable to generate (K, V) pairs. The reducers can be operable to receive the (K, V) pairs and generate second input data based on the (K, V) pairs. The reducers can be operable to transmit the second input data to an integral compute runtime environment being run within the reducers and operable to generate second result data based on the second input data. Based on the second result data, the reducers can be operable to generate output data.	12-31-2015
20150379109	METHOD AND SYSTEM FOR CATEGORIZING ITEMS IN BOTH ACTUAL AND VIRTUAL CATEGORIES - A method of facilitating location of a data item describing, for example, a product or service that is the subject of an on-line auction, commences with the presentation of a category navigation interface. The category navigation interface allows a user to navigate a “virtual” hierarchy of categories and to select a target virtual category of the virtual hierarchy. The target virtual category is then identified as being linked to an actual category, within an actual hierarchy of categories. Database items are classified only in terms of the real hierarchy of categories, and not the virtual hierarchy of categories. Having then identified an actual category to which the virtual category is mapped, data items of the real category are identified responsive to the user selection of the virtual category of the virtual hierarchy of categories.	12-31-2015
20150379111	CROWDSOURCING AUTOMATION SENSOR DATA - A computer-implemented method for crowdsourcing automation sensor data is described. In one embodiment, the method includes receiving data generated by a plurality of building automation systems and categorizing each of the plurality of building automation systems. The data generated by the plurality of building automation systems includes patterns of behavior identified by each of the plurality of building automation systems. The method includes sorting each of the plurality of building automation systems by category and analyzing the data generated by the plurality of building automation systems according to the category of each building automation system.	12-31-2015
20150379114	DATA TRANSMISSION DEVICE, DATA SHARING SYSTEM, DATA SHARING METHOD, AND MESSAGE EXCHANGING SYSTEM - A data transmission device that transmits data to another node, the data transmission device including: data storing unit for storing data; summary information storing unit for classifying data stored in the data storing unit into prescribed groups and for storing summary information that represents the number of pieces of data for each group; receiving unit for receiving summary information from the other node; selecting unit for selecting data to be transmitted based on the summary information received from the other node; and transmitting unit for transmitting the data selected by the selecting unit. The selecting unit favorably preferentially selects data included in a group with a smaller number of pieces of data based on the summary information received from the other node. Due to such a configuration, information with a high possibility of not being possessed by a communication partner can be selected and transmitted in a data sharing system.	12-31-2015
20150379116	NETWORK SYSTEM, MEMBERSHIP-BASED SOCIAL NETWORK SERVICE SYSTEM, IMAGE DISPLAY METHOD, AND STORAGE MEDIUM STORING PROGRAM - There is provided a network system in which image data items are uploaded from a plurality of user terminals to a server and images are opened to public among the users. The system includes a category division unit configured to divide the works classified into the categories into a first group of works with each of which the counted browse request number of times is greater than or equal to a predetermined number, and a second group of works other than the works in the first group, and classify one of the first and second groups of the divided works as another category different from the categories.	12-31-2015
20150379117	METHOD AND SYSTEM FOR DETERMINING SETS OF VARIANT ITEMS - Various embodiments of a method and system for determining sets of variant items are described. Various embodiments may include a system configured to generate multiple item pairs each corresponding to a particular item and another item determined to be similar to the particular item. For the particular item and the other item, each item pair may include a respective sequence of text strings (e.g., a title). For each item pair, the system may perform a corresponding text alignment and determine one or more misalignments of the item pair. The system may also assign a similarity score to each item pair; the similarity score may be dependent on the misalignment(s) determined for the particular item pair. Based on each aligned item pair and the similarity score assigned to that aligned item pair, the system may generate an indication specifying that each of a set of items are variants of each other.	12-31-2015
20150379197	SELECTION DEVICE FOR CANDIDATE SEQUENCE INFORMATION FOR SIMILARITY DETERMINATION, SELECTION METHOD, AND USE FOR SUCH DEVICE AND METHOD - The present invention provides a device for determining the similarities between sequence information pieces easily. The candidate selection device	12-31-2015
20150381517	SYSTEM AND METHOD FOR GENERATING RANDOM LINKED DATA ACCORDING TO AN RDF DATASET PROFILE - A method, computer program product, and computer system for gathering statistics, by a computing device, for a set of resources associated with a framework. A profile is generated based upon, at least in part, the gathered statistics. A data set is selected for generation of a new resource. The new resource is generated using the profile generated based upon the gathered statistics.	12-31-2015
20160004740	ORGANIZATION OF DATA WITHIN A DATABASE - A computer implemented method is provided for processing data representing a data entity having sub entities. The method includes analyzing queries to the data entity for deriving information about sets of the sub entities frequently queried together, and grouping the sub entities to a number of banks, each bank having a maximum width, based on the information about sets of sub entities frequently queried together, in order to reduce an average number of banks to be accessed for data retrieval.	01-07-2016
20160004762	Hilbert Curve Partitioning for Parallelization of DBSCAN - DBSCAN clustering analyses can be improved by pre-processing of a data set using a Hilbert curve to intelligently identify the centers for initial partitional analysis by a partitional clustering algorithm such as CLARANS. Partitions output by the partitional clustering algorithm can be processed by DBSCAN running in parallel before intermediate cluster results are merged.	01-07-2016
20160004764	SYSTEM AND METHOD FOR NEWS EVENTS DETECTION AND VISUALIZATION - Systems and methods are disclosed for news events detection and visualization. In accordance with one implementation, a method is provided for news events detection and visualization. The method includes, for example, obtaining a document, obtaining from the document a plurality of tokens, obtaining a document vector based on a plurality of frequencies associated with the plurality of tokens, obtaining one or more clusters of documents, each cluster associated with a plurality of documents and a cluster vector, determining a matching cluster from the one or more clusters based at least on the similarity between the document vector and the cluster vector of the matching cluster, and updating a database to associate the document with the matching cluster.	01-07-2016
20160004765	Predictive Cluster Analytics Optimization - Cluster analysis of data points in a data set can be optimized by identification of a preferred cluster analysis method. This identification can be based on indexing the data using a Hilbert curve and determining whether the data points are predominantly in spherical or non-spherical clusters. Methods, systems, and articles of manufacture are described.	01-07-2016
20160019243	TEMPLATE METADATA - A template metadata system and a process for attaching descriptive information to metadata of a template are described. The template metadata process can include, but is not limited to, characterizing template properties for purposes of finding, filtering, organizing, and processing a database of templates. The template metadata system includes a server, a template database, a network, and an end user device. In one embodiment, the template metadata system can generate input parameters for discovering a pre-existing template from among a collection of templates stored in a template database.	01-21-2016
20160019254	TIERED DATA STORAGE ARCHITECTURE - The disclosure is directed to storing data in different tiers of a database based on the access pattern of the data. Immutable data, e.g., data that does not change or changes less often than a specified threshold, is stored in a first storage tier of the database, and mutable data, e.g., data that changes more often than immutable data, is stored in a second storage tier of the database. The second storage tier of the database is more performant than the first storage tier, e.g., the second storage tier has a higher write endurance and a lower write latency than the first storage tier. All writes to the database are performed at the second storage tier and reads on both storage tiers. The storage tiers are synchronized, e.g., the set of data is copied from the second to the first storage tier based on a trigger, e.g., a specified schedule.	01-21-2016
20160026625	Global optimization strategies for indexing, ranking and clustering multimedia documents - A method for a system that indexes, ranks, and clusters multimedia documents using assessment means, scoring means, stochastic means, and organizing means that optimizes parameter sets comprising object parameters. The method creates a plurality of individual parameter sets, the parameter sets comprising information sharing system object parameters for describing a model, structures, shape, design, process, search query sets, and dynamic search spaces to be optimized using selective variations, constructive variations, clustering variations, and stochastic variations. The optimizations are guided by document query terms of the search query set object parameter that are initially optimized by assessment means, scoring means, stochastic means, and organizing means that lead to selective variations, constructive variations, clustering variations, and stochastic variations of the parameter sets. The global optimization of parameter sets leads to stochastic improvements to all object parameters by selective variations, constructive variations, clustering variations, and stochastic variations.	01-28-2016
20160026705	MANAGEMENT OF A VIRTUAL SPACE - According to an example embodiment of the present invention, there is provided an apparatus comprising at least one receiver configured to receive sensor information and indoor positioning information, at least one processing core configured to select information from a group comprising the sensor information and the indoor positioning information based at least in part on a determination concerning the sensor information, and at least one transmitter configured to cause transmission of either the selected information, or of information derived from the selected information	01-28-2016
20160026708	SELECTION OF DATA STORAGE SETTINGS FOR AN APPLICATION - The present disclosure is directed towards a data storage setting arrangement in a federated database system that includes applications configured to handle data in corresponding databases. The arrangement includes a communication interface for obtaining application requirement data (AIC) related to database usage, a database determining unit for predicting database type using the application requirement data and selecting database based on the database type prediction, a processing type determining unit for predicting database processing type based on the application requirement data and selecting database processing type based on the processing type prediction, a data model creating unit for creating a data model for storing of data in the database based on the application requirement data and an instantiating unit for instantiating a connection between the application and the selected database based on the selected database processing type and data model.	01-28-2016
20160034451	DIGITAL ASSET DOCK (DAD) - A method, system, apparatus, article of manufacture, and computer program product provide the ability to ingest a media content file. The media content file to be uploaded and managed in an enterprise media framework (EMF) is selected. Media content file(s) to be tagged are also selected. A mask matcher identifies a mask (having multiple parts) that identifies a file structure of information associated with the media content file. For each of the multiple parts and based on the information associated with the media content file, metadata is calculated and applied to the media content file.	02-04-2016
20160034512	CONTEXT-BASED METADATA GENERATION AND AUTOMATIC ANNOTATION OF ELECTRONIC MEDIA IN A COMPUTER NETWORK - Computerized systems for automating content annotation (e.g., tag creation and/or expansion) for low-content items within a computer network by leveraging intelligence of other data sources within a network to generate secondary content (e.g., a “context”) for items (e.g., documents) for use in a tagging process. For example, based on user assigned tags for an item, secondary content information can be generated and used to determine a new list of candidate tags for the item. Additionally, the context of an input item may be compared against the respective contexts of a plurality of other items to determine respective levels of similarity between the input item and each of the plurality of other items in order to annotate the input item. Techniques involving web-distance based clustering and leveraging crowd-sourced information sources to remove noisy data from annotated results are also described.	02-04-2016
20160034524	SYSTEMS AND METHODS FOR ENHANCING USER DATA DERIVED FROM DIGITAL COMMUNICATIONS - A computer-implemented method for enhancing and utilizing user data derived from digital interactions includes receiving a submission generated by input into a client side application interface by a first user on a first computing device, and determining, based on attributes of the submission, that the submission is in response to an issue-specific communication advertising information concerning a first issue, the issue-specific communication indicating a request for a financial transaction, and that the financial transaction related to the issue-specific communication is requested. The method includes generating a first dataset associated with the first user, searching one or more additional datasets for additional data to be associated with data elements of the first dataset, associating the additional data from the one or more additional datasets with the first user, and generating a data model corresponding to the first user.	02-04-2016
20160034525	GENERATION OF A SEARCH QUERY TO APPROXIMATE REPLICATION OF A CLUSTER OF EVENTS - A processing device performs a preliminary grouping of data items in a dataset to define one or more clusters and for each cluster, identifies a set of search terms for a search query that would retrieve data items in the cluster upon execution of the search query against the dataset.	02-04-2016
20160034554	LARGE-SCALE DATA CLUSTERING WITH DYNAMIC SOCIAL CONTEXT - A system and method for dynamic, semi-supervised clustering comprises receiving data attributes, generating a set of ensemble partitions using the data attributes, forming a convex hull using the set of ensemble partitions, generating a simplex vector by performing ensemble clustering on the convex hull, receiving dynamic links, deriving an optimal simplex vector using the simplex vector and the dynamic links, computing a current optimal clustering result using the optimal simplex vector, and outputting the current optimal clustering result.	02-04-2016
20160034556	SYSTEM AND METHOD FOR COMPUTERIZED BATCHING OF HUGE POPULATIONS OF ELECTRONIC DOCUMENTS - A method for computerized batching of huge populations of electronic documents, including computerized assignment of electronic documents into at least one sequence of electronic document batches such that each document is assigned to a batch in the sequence of batches and such that there is no conflict between batching requirements, the following batching requirements being maintained by a suitably programmed processor: a. pre-defined subsets of documents are always kept together in the same batch, b. batches are equal in size, c. the population is partitioned into clusters, and all documents in any given batch belong to a single cluster rather than to two or more clusters.	02-04-2016
20160034557	INFORMATION PROCESSING DEVICE, DATA PROCESSING METHOD THEREFOR, AND RECORDING MEDIUM - An information processing device includes: a feature quantity obtaining unit which obtains a feature quantity of an object to be extracted, which is extracted from a retrieval target, and specific information specifying an appearing location of the object; a feature quantity holding unit which, when storing the feature quantity in a feature quantity table, adds new identification information to the feature quantity and holds the feature quantity in the feature quantity table when a similar feature quantity in which a similarity with the feature quantity is no less than a threshold is not included in the feature quantity table, and outputs identification information of the similar feature quantity as identification information of the feature quantity when the similar feature quantity is included in the feature quantity table; and a retrieval table holding unit which holds the specific information associated with the added identification information or the outputted identification information.	02-04-2016
20160034561	LANDMARK POINT SELECTION - An exemplary method comprises receiving data points, selecting a first subset of the data points to generate an initial set of landmarks, each data point of the first subset defining a landmark point and for each non-landmark data point: calculating first data point distances between a respective non-landmark data point and each landmark point of the initial set of landmarks, identifying a first shortest data point distance from among the first data point distances between the respective non-landmark data point and each landmark point of the initial set of landmarks, and storing the first shortest data point distance as a first landmark distance for the respective non-landmark data point. The method further comprises identifying a non-landmark data point with a longest first landmark distance in comparison with other first landmark distances and adding the identified non-landmark data point as a first landmark point to the initial set of landmarks.	02-04-2016
20160042018	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR PROVIDING A TEAM OBJECT IN ASSOCIATION WITH AN OBJECT - In accordance with embodiments, there are provided mechanisms and methods for providing a team object in association with an object. These mechanisms and methods for providing a team object in association with an object can allow for centralized management of a team in association with an object. For example, members of the team may be automatically identified (e.g. without manual intervention) for receiving notifications in association with an object.	02-11-2016
20160042052	METHOD FOR CHARACTERIZING DATA SETS - A method is disclosed for characterizing a first data set of digital values and a second data set of digital values. The method includes determining a first similarity measure indicating a similarity of the digital values within the first data set; determining a second similarity measure indicating a similarity of the digital values within the second data set; determining a correlation value on the basis of the first data set and the second data set; and electronically outputting the correlation value, the first similarity measure and the second similarity measure.	02-11-2016
20160042055	METHOD AND DEVICE FOR ESTABLISHING LABEL LIBRARY AND SEARCHING FOR USER - A method for establishing a label library includes receiving label information made by a labeling user regarding a labeled user in a social network, storing the labeling user, the labeled user, and the label information as a relationship, and establishing a label library based on the relationship.	02-11-2016
20160042056	System and Method for Storing Data in Clusters Located Remotely from Each Other - A system for storing data includes a plurality of clusters located remotely from each other in which the data is stored. Each cluster has a token server that controls access to the data with only one token server responsible for any piece of data. Each cluster has a plurality of Cache Appliances. Each cluster has at least one back-end file server in which the data is stored. The system includes a communication network through which the servers and appliances communicate with each other. Also described are a Cache Appliance cluster, in which data is stored in back-end servers within each of a plurality of clusters located remotely from each other, and a method for storing data.	02-11-2016
20160042086	HYBRID BINARY XML STORAGE MODEL FOR EFFICIENT XML PROCESSING - A method for storing XML documents in a hybrid navigation/streaming format is provided to allow efficient storage and processing of queries on the XML data that provides the benefits of both navigation and streaming and ameliorates the disadvantages of each. Each XML document to be stored is independently analyzed to determine a combination of navigable and streamable storage formats that optimizes the processing of the data for anticipated access patterns.	02-11-2016
20160048515	SPATIAL DATA PROCESSING - A system and method for spatial data processing are described. Path bundle data packages from a viewing device are accessed and processed. The path bundle data packages identify a user interaction of the viewing device with an augmented reality content relative to and based on a physical object captured by the viewing device. The path bundle data packages are generated based on the sensor data using a data model comprising a data header and a data payload. The data header comprises a contextual header having data identifying the viewing device and a user of the viewing device. A path header having data identifies the path of the interaction with the augmented reality content. A sensor header having data identifies the plurality of sensors. The data payload comprises dynamically sized sampling data from the sensor data. The path bundle data packages are normalized and aggregated. Analytics computation is performed on the normalized and aggregated path bundle data packages.	02-18-2016
20160048544	SYSTEM AND METHOD FOR HIERARCHICAL CATEGORIZATION OF COLLABORATIVE TAGGING - Systems and methods for hierarchically categorizing collaborative indexing tags include, in one aspect, receiving data (e.g., metadata) relating to a post, including a user-selected first category identifier, a user-defined second category identifier, and content. The systems and methods further include correlating, or mapping, the user-selected first category identifier with a first category identifier database, correlating the user-defined second category identifier with a second category identifier database, and populating a data structure with such information for storage and later retrieval.	02-18-2016
20160048573	DYNAMIC FEATURE SET MANAGEMENT - In an example, a network is described with a plurality of data sources. Each data source may provide a feature, such as a data type that the data source collects or generates. A data aggregator may be connected to the network, and configured to collect, classify, and merge features as appropriate. The data aggregator includes a discriminator for classifying features, a merger, unmerger, converter, and evaluator. Features are provided to one or more expert systems configured to control one or more systems based on the features. Feedback to the data aggregator is used to evaluate the success of a merge. When a merge is found to be unhelpful, features may be unmerged.	02-18-2016
20160048576	GROUPING APPARATUS AND GROUPING METHOD - A grouping apparatus includes a processor configured to execute a process including calculating, at a plurality of times between a first time and a second time, the strength of correlation between each of a plurality of pieces of information and a moving body as a level of relationship for each of the plurality of pieces of information, wherein each of the plurality of pieces of information is provided to each of a plurality of objects arranged in a space and the moving body moves within the space, on the basis of the level of relationship for each of the plurality of objects calculated at the plurality of times, calculating similarities, integrating the similarities, and calculating integrated similarity, and grouping the plurality of pieces of information on the basis of the integrated similarity.	02-18-2016
20160048577	CLUSTER COMPUTATION USING RANDOM SUBSETS OF VARIABLES - A computing device to compute clusters using random subsets of variables is provided. Each data point of a plurality of data points is associated with a variable to define a plurality of variables. A subset of the plurality of variables is randomly selected. The subset does not include all of the plurality of variables. A number of clusters into which to segment the received data is determined. Cluster data that defines each cluster of the determined number of clusters is determined by executing a clustering algorithm with the received data using only the plurality of data points defined for each observation that are associated with the randomly selected subset of the plurality of variables. The determined cluster data is stored to cluster second data into the determined number of clusters. The second data is different from the received data.	02-18-2016
20160048578	DETERMINATION OF COMPOSITE CLUSTERS - A computing device to compute composite clusters is provided. A first and a second plurality of centroid locations are computed by executing a clustering algorithm with a first portion of data and a first input parameter and a second portion of the data and a second input parameter, respectively. The first portion is different from the second portion or the first input parameter is different from the second input parameter. A plurality of composite centroid locations is computed using the computed first and second plurality of centroid locations to define a composite set of clusters. An observation is selected. A cluster of the composite set of clusters to which to assign the observation is determined using the plurality of composite centroid locations. The selecting and the determining are repeated with each observation of the plurality of observations as the observation to define cluster assignments for the plurality of observations.	02-18-2016
20160048579	PROBABILISTIC CLUSTER ASSIGNMENT - A computing device to assign observations to clusters based on a statistical probability is provided. A first cluster assignment is defined by assigning the plurality of observations to a first set of clusters. A second cluster assignment is defined by assigning the plurality of observations to a second set of clusters. A set of composite clusters is defined based on the defined first set of clusters and the defined second set of clusters. For each observation, a statistical probability value for assigning an observation to each composite cluster of the defined set of composite clusters is computed based on the first and second cluster assignments and a composite cluster assignment is defined by assigning the observation to a cluster of the set of composite clusters based on the computed statistical probability value. The defined composite cluster assignment is stored.	02-18-2016
20160048580	METHOD AND SYSTEM FOR PROVIDING DELEGATED CLASSIFICATION AND LEARNING SERVICES - An approach is provided for delegated classification and learning services. The approach involves receiving a request for a deployment of a sensor classification service to an environment, wherein the environment includes one or more sensors. The approach also involves packaging one or more classification operations associated with the sensor classification service into a virtualization container, wherein the one or more classification operations are for processing sensor data collected from the one or more sensors. The approach further involves deploying the virtualization container to a node that is local to the environment to perform the one or more classification operations locally in the environment.	02-18-2016
20160048581	PRESENTING CONTEXT FOR CONTACTS - An embodiment provides a method, including: detecting, using a processor, an electronic communication between a user device and an entity device; thereafter accessing, using a processor, a contextual information store including automatically selected text data derived from past communications associated with the entity device; and providing, using an output element of the device, contextual information obtained from the contextual information store during the electronic communication between the user device and the entity device. Other aspects are described and claimed.	02-18-2016
20160048586	CLASSIFYING URLS - According to an example, a Trie is formed from URLs and nodes of the Trie are assigned a weight. A node is selected based on its weight and child nodes of the selected node are merged together. A URL classification is output based on a path in the Trie.	02-18-2016
20160048633	SYSTEMS AND METHODS FOR GENOMIC VARIANT ANNOTATION - A system for annotating genomic variant files includes an application server, an annotation database, a genomic database, and an annotation processing computer system. The genomic database may be graph-oriented. The annotation processing computer system can process variant files in batch modes and includes annotation modules designed to improve the speed of the annotation process. The batch modes may include batch transmission and/or batch annotation.	02-18-2016
20160055157	DIGITAL INFORMATION ANALYSIS SYSTEM, DIGITAL INFORMATION ANALYSIS METHOD, AND DIGITAL INFORMATION ANALYSIS PROGRAM - A digital information analysis system includes: a relevance information acquiring unit that acquires relevance information attached by a classifier to each of multiple pieces of digital information; a relevance score calculating unit that calculates a relevance score determined according to relevance between each of the multiple pieces of digital information and the predetermined specific matter; a ratio calculation unit that calculates a ratio of the number of pieces of relevance information, attached to the digital information included in the range, to the total number of pieces of digital information having the relevance scores included in each range; and a display unit.	02-25-2016
20160055231	Application Representation For Application Editions - A disclosed system, method, and computer-readable storage medium automatically identify, cluster, and cross-reference various editions of an application. The editions are clustered and associated with a canonical application structure describing the general functionality of each edition in the cluster. When an application search query is received from a client device, one or more canonical applications corresponding to the query are identified and provided to the client device. Enhancing the relevancy of search results by merging several editions of an application into one canonical application structure reduces unwanted and redundant results on a search result page.	02-25-2016
20160055238	DOCUMENT ANALYSIS APPARATUS AND DOCUMENT ANALYSIS PROGRAM - According to one embodiment, a document analysis apparatus is an apparatus comprising a first document storage circuit for storing first documents that include words, belong to respective categories constituting a hierarchical structure, and only comprise opinion documents for a desirable object, and a second document storage circuit for storing second documents that include words, belong or do not belong to the categories constituting the hierarchical structure and comprise opinion documents for the desirable object and documents other than the opinion documents, and the apparatus is configured to classify, into one of the categories constituting the hierarchical structure, the second documents that do not belong to the respective categories among the second documents stored in the second document storage circuit.	02-25-2016
20160062993	METHOD AND ELECTRONIC DEVICE FOR CLASSIFYING CONTENTS - A method of classifying contents comprising configuring one or more categories in a hierarchical structure, mapping one or more contents and the one or more categories based on at least one piece of information on the one or more contents and information on the one or more categories, and updating the hierarchical structure of the categories based on a preset condition when content-related information of each category determined according to the mapping meets the preset condition.	03-03-2016
20160063042	COMPUTER-READABLE RECORDING MEDIUM, DATA PLACEMENT METHOD, AND DATA PLACEMENT DEVICE - A data placement device creates a similarity index for each of computational resources based on a similarity between each of the pieces of acquired data and each of the pieces of data stored in the computational resources. The data placement device allocates the pieces of the data to each of the computational resources, on the basis of the similarity index of each of the computational resources with respect to the pieces of the data, by using a matching system in which the similarity index associated with each allocation becomes stable in a direction in which the similarity index is small. The data placement device places the pieces of the acquired data into the computational resources on the basis of the allocation result.	03-03-2016
20160063085	SYSTEM AND METHOD FOR ANTICIPATING CRIMINAL BEHAVIOUR - A system and method for anticipating criminal behaviour. The system includes or is connectable to a database including records, each record including data representative of a criminal incident. The system includes a pre-processing unit arranged for scanning each record for identifying data-items relating to a plurality of predetermined data types, wherein the plurality of predetermined data types includes all or a sub-set of: Arena, Time(frame), Context, Protagonist, Antagonist, Motivation, Primary Objective, Means, modus operandi, Resistance, Symbolism, and Red herring of the criminal incident. The system includes a classifying unit arranged for assigning to each identified data-item a category value of one of a plurality of predetermined category values associated with said predetermined data-type. The system includes a processing unit arranged for constructing a matrix containing a row for each record, and containing columns related to the predetermined data-types, the cells of the matrix containing the determined category values. The system includes an input unit, arranged for receiving user input, the user input including category values of a criminal incident for some, but not all, of the predetermined data types. The system includes a scenario generator arranged for estimating, on the basis of the user input and on the basis of the matrix, a category value for the predetermined data type(s) not included in the user input. The system includes an output unit arranged for outputting the estimated category value for the predetermined data type(s) not included in the user input.	03-03-2016
20160063088	METHOD AND SYSTEM FOR DETERMINING RELATIONSHIP BETWEEN USERS BASED ON PHYSICAL ADDRESSES OF WIRELESS SIGNAL SOURCES - Systems and methods are provided for determining relationships between users. The systems and methods may include receiving, from a target user terminal and one or more other user terminals, one or more physical addresses associated with one or more wireless signal sources. The system may determine one or more relationships between a target user associated with the target user terminal and one or more other users associated with the one or more other user terminals based on the received one or more physical addresses. The one or more wireless signal sources are within search ranges of the target user terminal and the one or more other user terminals.	03-03-2016
20160063089	LEVERAGING ENTERPRISE CONTENT - A method and system for leveraging content is provided. The method includes receiving data associated with a subscriber and registering the subscriber with an ECM computing system. Devices belonging to the subscriber are connected to the ECM computing system and metadata associated with content retrieved from the devices is generated. The content in the devices is classified into formal content and informal content. Multiple searches for additional content are monitored and multifaceted search results associated with the formal content and the informal content are generated and presented to the subscriber. The subscriber has an option to request informal content on additional end user devices from respective end users based on metadata presented by search results.	03-03-2016
20160063097	Data Clustering System, Methods, and Techniques - Data having some similarities and some dissimilarities may be clustered or grouped according to the similarities and dissimilarities. The data may be clustered using agglomerative clustering techniques. The clusters may be used as suggestions for generating groups where a user may demonstrate certain criteria for grouping. The system may learn from the criteria and extrapolate the groupings to readily sort data into appropriate groups. The system may be easily refined as the user gains an understanding of the data.	03-03-2016
20160070763	PARALLEL FREQUENT SEQUENTIAL PATTERN DETECTING - Techniques for parallel frequent sequential pattern detection are provided. A sequence database is split into separate datasets and each node is given a specific dataset to resolve specific frequent items occurring in its specific dataset based on counts. Then, each node groups its frequent items into “n” (varying) length sequences representing sequential patterns present in the original sequence database. The nodes process in parallel with one another and collectively produce a complete set of the sequential patterns defined in the original sequence database.	03-10-2016
20160070776	LOGICAL OPERATION METHOD AND INFORMATION PROCESSING DEVICE - A logical operation among plural sets in large-scale data (big data) is performed efficiently. The sets targeted for the logical operation are classified into predetermined common segments, each with a size allocatable to a memory, and the logical operation is performed with respect to each segment on the memory. The common segment is configured in such a manner that all the records of the sets are classified without duplications. Then, a direct sum of results of the logical operation on the respective segments is calculated, thereby obtaining a result of the logical operation. The size of the common segment is determined so that the records being classified are loadable on the memory.	03-10-2016
20160070778	TECHNIQUES FOR DYNAMIC PARTITIONING IN A DISTRIBUTED PARALLEL COMPUTATIONAL ENVIRONMENT - An apparatus includes an organization component to retrieve from task instructions an indication of a type of organization of data set subportions prior to performance of a computation and a data item by which the data set subportions are to be organized, organize the data set subportion among others based on the data item and type of organization, monitor availability of a first processing resource and a first storage resource of a node device employed to organize the data set subportions, and based on insufficient availability of at least one of the first processing resource or the first storage resource, interrupt the organization of the data set subportions, and dispatch a first set of one or more organized data set subportions to be processed; and a performance component to execute the task instructions to process the organized data set subportion.	03-10-2016
20160070779	METHOD, APPARATUS, AND COMPUTER-READABLE MEDIUM FOR EFFICIENTLY PERFORMING OPERATIONS ON DISTINCT DATA VALUES - An apparatus, computer-readable medium, and computer-implemented method for efficiently performing operations on distinct data values, including receiving a query directed to a column of data, the query defining one or more group sets for grouping the data retrieved in response to the query, and for each of the one or more group sets, generating one or more entity map vectors, the length of each entity map vector being equal to the number of unique data values in a domain which corresponds to the column of data, the position of each bit in the entity map vector corresponding to the lexical position of a corresponding unique data value in a lexically ordered list of the unique data values, and the value of each bit in the entity map vector indicating the presence or absence of the corresponding unique data value in the group set.	03-10-2016
20160070780	MANAGEMENT OF FILE STORAGE LOCATIONS - The embodiments described may be directed toward a file management system for managing a file folder location, a method for managing one or more data clusters, and a method of recommending a file storage location. The method of recommending a file storage location may also include plotting one or more data points onto one or more vectors. A received data point may be obtained from a file save request. The method may also include creating one or more data clusters from the vector data points using a clustering mechanism.	03-10-2016
20160078032	METHOD OF ON-LINE MEDIA SCORING - The present invention relates to an on-line media scoring method comprising calculating view share, clip match duration percentage, and website balance to generate an on-line media score for the media and to determine the popularity of the media across different Internet sites.	03-17-2016
20160078033	Physical Visual ID as Means to Tie Disparate Media Collections - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatically associating collections of media files are provided. A first tying-identifier is determined from at least one media file in a first media collection. A second tying-identifier is determined from at least one media file in a second media collection. The first tying-identifier is then compared to the second tying-identifier. The first media collection and the second media collection are then associated together based on the similarity between the first tying-identifier and the second tying-identifier.	03-17-2016
20160078120	EXTRACTING AND PROCESSING METRICS FROM SYSTEM GENERATED EVENTS - In an example, a processing system of a database system may categorize event data taken from logged interactions of users with a multi-tenant information system to provide a metric. The processing system of the database system may periodically calculate the metric for a particular one of the tenants, and electronically store the periodically calculated metrics for accessing responsive to a query of the particular tenant.	03-17-2016
20160078124	USING DISTINGUISHING PROPERTIES TO CLASSIFY MESSAGES - A system and method are disclosed for classifying a message. The method includes receiving the message, identifying in the message a distinguishing property; generating a signature using the distinguishing property; and comparing the signature to a database of signatures generated by previously classified messages.	03-17-2016
20160078125	PARTICIPANT GROUPING FOR ENHANCED INTERACTIVE EXPERIENCE - Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant's handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.	03-17-2016
20160085750	STORAGE APPARATUS AND STORAGE APPARATUS CONTROL METHOD - A storage apparatus includes a storage unit configured to include a plurality of tiers, a group management unit configured to manage files stored in any of the tiers in the storage unit by dividing the files into a plurality of groups, a movement target selection unit configured to select a group, out of the groups, which satisfies a tier movement condition as a movement target of the tier, and a file movement unit configured to move a subordinate file belonging to the group selected by the movement target selection unit, from a tier in which the subordinate file is stored to another tier.	03-24-2016
20160085768	INFORMATION PROCESSING SYSTEM, AND INFORMATION PROCESSING METHOD - An information processing system includes first and second information processing terminals; and an information processing device that is connected to the first and second information processing terminals. The information processing system includes a receiver for receiving message data and file specifying data for specifying a file that are transmitted from the first and second information processing terminals; a storage unit for classifying the file that is specified by the file specifying data based on access right data representing access rights that are set for users of the first and second information processing terminals and classification data, and for storing the classified file; and a transmitter for transmitting, to the first or second information processing terminal, the message data that is received by the receiver and message data that includes file storage data that represents the file that is stored in the storage unit.	03-24-2016
20160085801	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR UPDATING DATABASE OBJECTS WITH REPORT AGGREGATIONS - In accordance with embodiments, there are provided mechanisms and methods for updating database objects with report aggregations. These mechanisms and methods for updating database objects with report aggregations can enable high-level configurations defining the use of report aggregations in specified database objects. This can allow user-friendly configurations of data aggregations and uses thereof, efficient generation of reports having data aggregations, and/or aggregations that are not limited based on user access privileges but that summarize all desired data.	03-24-2016
20160085842	SYSTEM AND METHOD FOR AVOIDING OBJECT IDENTIFIER COLLISIONS IN A PEERED CLUSTER ENVIRONMENT - A system and method for avoiding object identifier collisions in a cluster environment is provided. Upon creation of the cluster, volume location databases negotiate ranges for data set identifiers (DSIDs) between a first site and a second site of the cluster. Any pre-existing objects are remapped into an object identifier range associated with the particular site hosting the object.	03-24-2016
20160085849	Computer-Implemented System And Method For Generating Clusters For Placement Into A Display - A computer-implemented system and method for generating clusters for placement into a display is provided. A set of clusters is generated from a document set. A single cluster of related documents from the document set is obtained and at least one new cluster is added. One such document in the set is compared to the cluster. A difference in distance between the document and a common origin and the cluster and the common origin is determined. The document is designated as the new cluster when the difference fails to satisfy a predetermined threshold. One or more cluster spines each having two or more clusters placed along a vector are placed into a display. The clusters along each spine are identified as similar and the clusters of one such spine are also similar to further clusters located along a further spine having a small cosine rotation from that cluster spine.	03-24-2016
20160092504	Recognition of Free-form Gestures from Orientation Tracking of a Handheld or Wearable Device - A user performs a gesture with a hand-held or wearable device capable of sensing its own orientation. Orientation data, in the form of a sequence of rotation vectors, is collected throughout the duration of the gesture. To construct a trace representing the shape of the gesture and the direction of device motion, the orientation data is processed by a robotic chain model with four or fewer degrees of freedom, simulating a set of joints moved by the user to perform the gesture (e.g., a shoulder and an elbow). To classify the gesture, a trace is compared to contents of a training database including many different users' versions of the gesture and analyzed by a learning module such as a support vector machine.	03-31-2016
20160092550	AUTOMATED SEARCH INTENT DISCOVERY - A first named entity, in a first query, may be identified. A first type, of the first named entity, may be determined and a first prefix and a first postfix, associated with the first named entity in the first query, may be identified. The first prefix and the first postfix may be assigned to a first group. The first group may designate one or more prefixes and one or more postfixes as being associated with the first type. A second named entity, associated with the first prefix and the first postfix in the first group, may be identified in a second query. Responsive to the second named entity being associated with the first type, a first search intent case comprising the first prefix, the first postfix, and the first type may be added to a database.	03-31-2016
20160092552	METHOD AND SYSTEM FOR IMPLEMENTING EFFICIENT CLASSIFICATION AND EXPLORATION OF DATA - Disclosed is a system, method, and computer program product for analyzing sets of data in an efficient manner, such that analytics can be effectively performed over that data. Classification operations can be performed to generate groups of similar log records. This permits classification of the log records in a cohesive and informative manner.	03-31-2016
20160092558	Hybrid Cluster-Based Data Intake and Query - Various embodiments describe multi-site cluster-based data intake and query systems, including cloud-based data intake and query systems. Using a hybrid search system that includes cloud-based data intake and query systems working in concert with so-called “on-premises” data intake and query systems can promote the scalability of search functionality. In addition, the hybrid search system can enable data isolation in a manner in which sensitive data is maintained “on premises” and information or data that is not sensitive can be moved to the cloud-based system. Further, the cloud-based system can enable efficient leveraging of data that may already exist in the cloud.	03-31-2016
20160092566	CLUSTERING REPETITIVE STRUCTURE OF ASYNCHRONOUS WEB APPLICATION CONTENT - A processor determines whether a DOM includes a repetitive pattern of a combination, formed by a tag of a leaf node and a tag of a parent node of the leaf node. Determining the repetitive pattern of the combination, the processor identifies a first inner cluster by collapsing multiple instances of the repetitive pattern into a single instance. The processor generates an LSH signature for the single instance of the repetitive pattern. The processor determines an outer cluster, based on grouping one or more inner clusters, as part of a section rooted at a source node of the DOM, in which the source node is a parent node of the one or more inner clusters. Determining that a pair of outer clusters are near repetitive, the processor limits web content exploration to one of the pair of outer clusters.	03-31-2016
20160098471	Fast OLAP Query Execution in Main Memory on Large Data in a Cluster - Techniques are described for efficient execution of analytical queries on large amounts of data in a parallel database cluster while making maximal use of the available hardware.	04-07-2016
20160098472	MAP-REDUCE JOB VIRTUALIZATION - A method for map-reduce job virtualization is disclosed. The method includes receiving a map-reduce job written in a first map-reduce language. The map-reduce job is to be performed in parallel on a plurality of nodes of a plurality of clusters. The method also includes selecting one or more clusters to run the map-reduce job. The method further includes identifying a second map-reduce language associated with the selected clusters. The method also includes converting the first map-reduce language of the map-reduce job into the second map-reduce language. The method further causes the map-reduce job in the second map-reduce language to be run on the plurality of nodes of the selected clusters.	04-07-2016
20160098473	GROUPING METHOD AND APPARATUS - A grouping apparatus acquires a plurality of messages output from a plurality of output source apparatuses, respectively. The grouping apparatus acquires a plurality of explanatory texts respectively relating to the plurality of messages, from documents respectively relating to the output source apparatuses. The grouping apparatus generates a plurality of message groups that include related messages respectively, based on the plurality of explanatory texts for the respective messages.	04-07-2016
20160103842	SKELETON DATA POINT CLUSTERING - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for clustering data points. One of the methods includes maintaining data representing a respective ordered tuple of skeleton data points for each of a plurality of clusters. One or more intersecting clusters are determined for a new data point. An updated tuple of skeleton data points is generated for an updated cluster by selecting updated skeleton data points, including selecting the new data point or an existing jth skeleton data point of one of the one or more intersecting clusters according to which random value, of the jth random value for the new data point or the random value for the jth existing skeleton data point, is closest to a limiting value. The new data point is then assigned to the updated cluster.	04-14-2016
20160103905	INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM - There is provided an information processing device including a display control unit configured to display pieces of content at a first position of a screen, a condition setting unit configured to set a clustering condition for the pieces of content in accordance with a user operation, and a clustering unit configured to classify the pieces of content into a cluster in accordance with the clustering condition. The display control unit moves a display of the pieces of content from the first position toward a second position corresponding to the cluster.	04-14-2016
20160103958	SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS FOR MERGING A NEW NUCLEOTIDE OR AMINO ACID SEQUENCE INTO OPERATIONAL TAXONOMIC UNITS - The present disclosure provides a method for filtering sequence clusters during a process of merging a newly generated nucleotide or amino acid sequence with a set of previously clustered sequences. In another aspect, the disclosure provides a method for assigning newly generated nucleotide or amino acid sequences to presumptive species called operational taxonomic units. In yet another embodiment, the sequences are derived from the cytochrome c oxidase I gene.	04-14-2016
20160110363	METHOD AND SYSTEM FOR MEASURING AND MATCHING INDIVIDUAL CULTURAL PREFERENCES AND FOR TARGETING OF CULTURE RELATED CONTENT AND ADVERTISING TO THE MOST RELEVANT AUDIENCE - A system and method for measuring individual cultural preferences and matching them with those of other individuals, and/or with culture related content. The system and method are designed to increase ROI in marketing, as well as to promote more frequent communications between internet users by: a) providing internet users with the ability to find their “peers”, i.e., people with the closest culture related preferences; b) delivering to internet users precisely targeted and highly relevant recommendations regarding culture related content and products, automatically generated based on selections made by the users' “peers”; c) defining the most appropriate target audience for a set of cultural content items; d) defining the most appropriate set of cultural content items for a user segment of a given social network; and e) increasing the exposure, and thus effectiveness, of advertisements by motivating internet users to establish new relationships with their “peers”.	04-21-2016
20160110442	Segmentation Discovery, Evaluation and Implementation Platform - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described that enable clustering and evaluation of data. A data set is identified for which to evaluate cluster solutions, the data set including a plurality of records each including a plurality of attributes. Different attributes are identified, including target driver attributes, cluster candidate attributes, and profile attributes. One or more clustering algorithms are identified and applied to the data set to generate cluster solutions. Each cluster solution groups records in the data set into different clusters based on the cluster candidate attributes. A score is calculated for each cluster solution based at least on the target driver attributes, the cluster candidate attributes, and the profile attributes. A user interface is generated for presentation to a user showing the generated cluster solution organized according to the calculated score for each cluster solution.	04-21-2016
20160110443	MULTIDIMENSIONAL DATA REPRESENTATION - A system for data aggregation and representation, comprising a data aggregator that may request or receive input from a plurality of data sources, a visualization engine that may generate representations of data, and an interaction manager that may handle interactions from an analyst, and a method for multidimensional representation of data.	04-21-2016
20160110444	METHOD, APPARATUS, AND COMPUTER-READABLE MEDIUM FOR OPTIMIZED DATA SUBSETTING - An apparatus, computer-readable medium, and computer-implemented method for data subsetting, including receiving a request for a subset of data from a plurality of tables, generating an entity graph corresponding to the plurality of tables, expanding the entity graph if the entity graph does not have any cycles, and performing acyclic subset processing on the expanded entity graph if the entity graph does not have any cycles and the expanded entity graph does not have any cycles.	04-21-2016
20160124992	METHOD FOR THE CONTINUOUS PROCESSING OF TWO-LEVEL DATA ON A SYSTEM WITH A PLURALITY OF NODES - A method for continuous processing of two-level data on a system with a plurality of nodes for processing the data includes determining a system state representing at least one of actual or possible performance capabilities of the system, determining already processed data on the nodes, splitting and assigning high-level input data for processing with lower level data on one or more of the plurality of nodes according to the determined system state, processing requirements of the data and already processed data in a form of at least of lower level data on the nodes such that in case of the already processed data, data to be processed is compared with the already processed data, and input data is split and assigned to the plurality of nodes such that an amount of data to be exchanged for processing the input data on the respective nodes is minimized.	05-05-2016
20160125061	SYSTEM AND METHOD FOR CONTENT SELECTION - A server and method for providing a content selection is provided. The server receives content targeting parameters and obtains content items from at least one content site based on the content targeting parameters. The server can further identify content descriptors for the content items and generate a first content cluster from a subset of the content items based on the content descriptors. The server can further generate a second content cluster from a second subset of the content items based on the content descriptors and rank the first and the second content clusters in an order of usefulness. The ranking of the content clusters can be based on at least one of an importance of content, a recentness of the content items and a size of the content cluster.	05-05-2016
20160125062	MULTI-SCALE TIMELING PHOTOGRAPH ALBUM MANAGEMENT WITH INCREMENTAL SPECTRAL PHOTOGRAPH CLUSTERING - Described embodiments utilize a multi-scale timeline approach in which photographs in a photograph album are organized into photograph clusters associated with multiple timeline scales. Each photograph cluster can be represented on a display screen as a thumbnail indicator in a photograph album window for efficient browsing. The thumbnail indicator can comprise one or more images automatically selected to represent a large number of photographs from the photograph cluster. Activation of one of the thumbnail indicators can trigger a change to another timeline scale. A multi-scale timeline one-step incremental spectral photograph clustering algorithm can also be utilized to quickly add or delete photographs from the cluster photographs at multiple timeline scales with a complexity approaching O(n). Near-duplicate photographs can also be automatically detected based on both time and visual features, and collapsed to a single representation to save space on the relatively small display screens typically included in mobile devices.	05-05-2016
20160125074	CUSTOMIZED CONTENT FOR SOCIAL BROWSING FLOW - Provided are techniques for providing customized content for social browsing flow. In response to accessing existing content, a group is identified from a plurality of groups created from behavioral and profile analysis. Additional content is created for the existing content to provide a customized browsing experience based on the identified group. The additional content is displayed with the existing content.	05-05-2016
20160132504	COMPUTER-IMPLEMENTED METHODS AND SYSTEMS FOR CLUSTERING USER REVIEWS AND RANKING CLUSTERS - Computer-implemented methods and systems are disclosed for organizing user reviews, especially computer app reviews, into clusters and ranking the clusters so that the reviews may be more meaningfully analyzed.	05-12-2016
20160132520	METHOD AND APPARATUS FOR FINDING FILE IN STORAGE DEVICE AND ROUTER - The present disclosure relates to a method and an apparatus for finding a file in a storage device, and a router. The method includes: establishing an index for a specified file in the storage device according to file classifications or an identification of a device that uploaded the file, the storage device being a storage device configured in the router; and providing a classification view of the specified file for a user device, the classification view including identifications of the respective file classifications. The present disclosure provides the classification of the specified file for the user device according to file types or the identification of the device that uploaded the file, so that different user devices may more flexibly perform classification finding when viewing files in the storage device of the router, whereby it is convenient for a user to find the file, the operation is simple and efficient, and the user's experience is good.	05-12-2016
20160132546	Data Analytics System - A data analytics system for manipulating and analyzing data and usable in preventing instances of incompatibility as desired, is disclosed.	05-12-2016
20160140115	Strategies for indexing, ranking and clustering multimedia documents - A method for a system that indexes, ranks, and clusters multimedia documents using organizing means, scoring means, and stochastic means that optimizes parameter sets comprising object parameters. The method creates a plurality of individual parameter sets, the parameter sets comprising information sharing system object parameters for describing a model, structures, shape, design, process, search query sets, and dynamic search spaces to be optimized using selective variations, constructive variations, clustering variations, and stochastic variations. The optimizations of the search space are guided by document query terms of the search query set object parameter. The search space is defined in terms of the terminal set, function set, and fitness measures (NN computations). The quality and speed of an EC application is controlled by the algorithm control parameters (the rates associated with node selection and document migration) and terminal criterion (continuous search for optimal distribution of multimedia documents). Satisfying the terminal criteria for the optimization of the dynamic search space object parameter provides a synchronization point at which additional multimedia documents can be added to the global multimedia document object parameter.	05-19-2016
20160140207	SYSTEMS AND METHODS FOR AGGREGATING INFORMATION-ASSET CLASSIFICATIONS - The disclosed computer-implemented method for aggregating information-asset classifications may include (1) identifying a data collection that includes two or more information assets, (2) identifying a classification for each of the information assets, (3) deriving, based at least in part on the classifications of the information assets, an aggregate classification for the data collection, and (4) associating the aggregate classification with the data collection to enable a data management system to enforce a data management policy based on the aggregate classification. Various other methods, systems, and computer-readable media are also disclosed.	05-19-2016
20160140208	Fast Grouping of Time Series - In some examples, a time-series data set can be analyzed and grouped in a fast and efficient manner. For instance, fast grouping of multiple time-series into clusters can be implemented through data reduction, determining cluster population, and fast matching by locality sensitive hashing. In some situations, a user can select a level of granularity for grouping time-series into clusters, which can involve trade-offs between the number of clusters and the maximum distance between two time-series in a cluster.	05-19-2016
20160140210	SYSTEMS AND METHODS FOR AUTOMATIC IDENTIFICATION OF POTENTIAL MATERIAL FACTS IN DOCUMENTS - Systems and methods to identify potential material fact sentences in electronic legal documents obtained from electronic repositories are disclosed. A system includes a processing device and a storage medium in communication with the processing device. The storage medium includes programming instructions that cause the processing device to obtain a document and parse text within the document to determine whether each paragraph in the document is a fact paragraph, a discussion paragraph, or an outcome paragraph based on at least one of a heading associated with the paragraph and features of the paragraph. The storage medium further includes programming instructions that cause the processing device to extract each sentence in the fact paragraph, direct a trained sentence classifier to determine whether each sentence is a potential material fact sentence or a non-material fact sentence based on features of the sentence, and identify potential material fact sentences.	05-19-2016
20160140211	SEGMENTATION AND STRATIFICATION OF DATA ENTITIES IN A DATABASE SYSTEM - A stratified or segmented composite data structure can be formed by selecting a group of data entities, stratifying or segmenting them according to attributes, and assigning relative weights to the components based on their stratified or segmented positions. The attributes are selected from a universe of possible values. Further positive and negative biases can be applied at any arbitrary point or position, including to individual data entities, groups of arbitrarily selected data entities, or arbitrary positions.	05-19-2016
20160140222	EXAMPLE-BASED ITEM CLASSIFICATION - Item classification rules are created based on examples selected by a user, such as by selecting a subset of emails, and the rule is used across a larger set of items to obtain automatic classification of similar items according to the rule. Based on an analysis, a candidate classification rule is generated identifying text-based features shared among the items of the subset. The user can review the candidate rule as well as a resultant subset of items generated by the rule, and either accept the candidate rule or make an adjustment to the examples and then perform one or more iterations of the analysis to refine the rule. Adjustments can be made by removing items incorrectly included in a resultant subset and/or adding items incorrectly excluded from a resultant subset, and using the adjusted subset in a next iteration.	05-19-2016
20160147816	SAMPLE SELECTION USING HYBRID CLUSTERING AND EXPOSURE OPTIMIZATION - According to some embodiments, a system includes a communication device operative to communicate with a user to receive a data set including a plurality of samples at a clustering module; a clustering module to receive the data set, store the data set, and calculate one or more clusters of samples using a clustering strategy; an optimization module to receive and store the one or more clusters of samples from the clustering module and generate one or more samples from the one or more clusters of samples using an optimization strategy; a memory for storing program instructions; at least one sample selection platform processor, coupled to the memory, and in communication with the clustering module and the optimization module and operative to execute program instructions to: calculate one or more clusters of samples based on the clustering strategy by executing the clustering module; analyze the data associated with the one or more clusters received from the clustering module using the optimization strategy associated with the optimization module to automatically select one or more samples from the one or more clusters; and provide one or more samples generated by the optimization module for replication in a validation model. Numerous other aspects are provided.	05-26-2016
20160147865	DETECTING FEACAL AND URINE EVENTS BY REFERENCE TO COLLECTIONS OF DATA - A method for analysing incoming data, comprising the steps of processing the incoming data in segments to output a sequence of segment types by extracting one or more properties of an incoming data segment and forming an Unknown Property Vector for each segment of data in the incoming data, and processing the sequence of segment types to identify events in the incoming data. The sequence of segment types is determined, for each segment, by analysing the Unknown Property Vector by reference to one or more collections of vectors obtained from a set of Reference Property Vectors. Each of the one or more collections of vectors may be selected from the set of Reference Property Vectors randomly or based on relevance or clustering.	05-26-2016
20160147867	INFORMATION MATCHING APPARATUS, INFORMATION MATCHING METHOD, AND COMPUTER READABLE STORAGE MEDIUM HAVING STORED INFORMATION MATCHING PROGRAM - An information matching apparatus includes a target DB corresponding to a check target that stores therein records; a narrow-down condition creating unit that combines, in accordance with values of check items in a check source record using AND, a search condition defined by a search definition indicating a condition for excluding candidates in check target records that are less likely to have a similarity to or a relationship with a name identification source record and each grouping condition defined by a grouping definition indicating a condition for limiting a checking area of the check target records to create a narrow-down condition for narrowing down the check target records; and a searching unit that searches the target DB corresponding to the check target for a check target record in accordance with the created narrow-down condition.	05-26-2016
20160147877	INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING PROGRAM - The present invention provides an information processing apparatus which can direct a user to a playlist different from a playlist being reproduced. There is provided the information processing apparatus including a content storage unit storing a plurality of contents therein, a playlist storage unit storing a plurality of playlists which is related to at least some of the plurality of contents, a reproducing unit sequentially reproducing a plurality of contents belonging to a first playlist in a plurality of playlists, a candidate content extracting unit extracting one or more candidate contents relating to a content being reproduced by the reproducing unit from the content storage unit, a playlist extracting unit extracting a second playlist to which the extracted candidate contents belong from the playlist storage unit, and a playlist switching unit switching a playlist to be reproduced by the reproducing unit from the first playlist into the second playlist.	05-26-2016
20160154793	NORMALIZING NON-NUMERIC FEATURES OF FILES	06-02-2016
20160154813	METHODS AND SYSTEMS FOR MANAGING DATA	06-02-2016
20160154859	System, Method and Software for Providing Persistent Entity Identification and Linking Entity Information in a Data Repository	06-02-2016
20160154877	ANOMALY, ASSOCIATION AND CLUSTERING DETECTION	06-02-2016
20160154878	System For Linked And Networked Document Objects	06-02-2016
20160162563	INTELLIGENT XML FILE FRAGMENTATION - An XML fragmenting mechanism uses an XML schema for the XML file to split up the XML file in a hierarchal structure of data blocks for storage in a storage system with a limited block size such as a cluster coordination service. The XML fragmenting mechanism creates an XML file map to document the structure of the XML file in the storage system. The XML fragmenting mechanism stores the data blocks in the storage system according to the XML file map and supports retrieval of all or part of the data in a format that supports XML validation.	06-09-2016
20160162564	INTELLIGENT XML FILE FRAGMENTATION - An XML fragmenting mechanism uses an XML schema for the XML file to split up the XML file in a hierarchal structure of data blocks for storage in a storage system with a limited block size such as a cluster coordination service. The XML fragmenting mechanism creates an XML file map to document the structure of the XML file in the storage system. The XML fragmenting mechanism stores the data blocks in the storage system according to the XML file map and supports retrieval of all or part of the data in a format that supports XML validation.	06-09-2016
20160162565	METHOD AND DEVICE FOR GENERATING MUSIC PLAYLIST - A method for generating a music playlist includes: classifying a plurality of songs into first songs and second songs, the first songs being sample songs with mood vectors, and the second songs being new songs with no mood vectors; comparing physical attributes of the first songs to physical attributes of each second song; determining which first song of the first songs has physical attributes most similar to the physical attributes of each second song; assigning the mood vector of the determined first song having the most similar physical attributes to each second song; and generating a music playlist containing songs, all with mood vectors, by combining the second songs with mood vectors assigned thereto and the first songs.	06-09-2016
20160162566	Model Navigation Constrained by Classification - A method, system and computer-usable medium are disclosed for efficient searching of a semantic model of resources and resource relationships. A query is received from an application. In turn the query is processed to determine an application usage classification for the application, which is then used to reference an index of subsets of the semantic model to identify a subset of the semantic model associated with the application usage classification. The identified subset of the semantic model is then used to modify the query, which is then used as a modified query to query the semantic model. In response, a sub-graph of the semantic model corresponding to the subset of the semantic model is received, which in turn is provided to the application.	06-09-2016
20160162570	SYSTEM FOR EXTRACTING CUSTOMER FEEDBACK FROM A MICROBLOG SITE - A system for extracting customer feedback from a microblog site includes a retrieval unit coupled to the microblog site to capture microblog updates. A filter unit coupled to the retrieval unit filters the captured microblog updates according to filter criteria that remove non-actionable items from the captured microblog updates. A learning unit coupled to the filter unit prioritizes the filtered microblog updates, and a classification unit coupled to the learning unit classifies the filtered and prioritized microblog updates. An action unit coupled to the classification unit performs appropriate actions based on the classified, filtered and prioritized microblog updates.	06-09-2016
20160162572	ALERTING SYSTEM BASED ON NEWLY DISAMBIGUATED FEATURES - The present disclosure relates to a method of alerting users regarding newly disambiguated features. More specifically, a newly disambiguated feature may pass through different filters/restrictions, such as, the known knowledge base. The disclosed known knowledge base may filter the newly disambiguated feature, comparing the newly disambiguated features to the existing features to discover a new feature of interest. Particularly, the disclosed new feature of interest may include a new person, a new phone number, a new place, a new company, among others. Finally, if there is a new feature that did not match with the existing disambiguated features in the known knowledge base, then an alert may be emitted to a user.	06-09-2016
20160162573	Controlling Access to Resources Based on Affinity Planes and Sectors - A first person (which may be a natural person, organization, brand, or other entity) has one or more affinity planes. Each affinity plane represents a distinct closeness of relationship with the first person. The first person also has one or more sectors, each of which may be associated with a domain. Each of the other people may be associated with zero or more of the first person's affinity planes and zero or more of the first person's sectors. Each of the first person's resources may be associated with zero or more of the first person's affinity planes and zero or more of the first person's sectors. A request by one of the other people to access one of the first person's resources is granted based on the overlap between the affinity planes and sectors associated with the requestor and the affinity planes and sectors associated with the requested resource.	06-09-2016
20160164975	METHOD AND APPARATUS FOR MASHING UP HETEROGENEOUS SENSORS, AND RECORDING MEDIUM THEREOF - Disclosed is a method for mashing up heterogeneous sensors, which includes collecting an activity log from at least one sensor, converting the collected activity log into a common markup format, extracting an activity of a user from the activity log, based on an activity model which defines relations among activities, and outputting the extracted activity of the user in a semantic unit. Thus, it is possible to integrally utilize information of heterogeneous sensors, provide meaningful and intuitive information to a user, and share information of various applications and devices.	06-09-2016
20160171080	SYSTEM AND METHOD FOR ENHANCING THE NORMALIZATION OF PARCEL DATA	06-16-2016
20160171083	SYSTEM AND METHOD FOR NEWS EVENTS DETECTION AND VISUALIZATION	06-16-2016
20160171085	SYSTEM AND METHOD FOR NEWS EVENTS DETECTION AND VISUALIZATION	06-16-2016
20160171086	GROUPING DATA IN A DATABASE	06-16-2016
20160171087	METHOD AND DEVICE FOR USING CONTENT OF A CONTENT LIBRARY	06-16-2016
20160171105	SYSTEMS AND METHODS FOR LOCATING USER AND ACCOUNT INFORMATION	06-16-2016
20160179847	ITERATIVE IMAGE SEARCH ALGORITHM INFORMED BY CONTINUOUS HUMAN-MACHINE INPUT FEEDBACK	06-23-2016
20160179905	MAPPING DATA INTO AN AUTHORIZED DATA SOURCE	06-23-2016
20160179922	TECHNIQUES FOR REAL-TIME GENERATION OF TEMPORAL COMPARATIVE AND SUPERLATIVE ANALYTICS IN NATURAL LANGUAGE FOR REAL-TIME DYNAMIC DATA ANALYTICS	06-23-2016
20160182678	Method, Device and System for Carrying Out Telecommunication Capability Group Sending	06-23-2016
20160188579	PSEUDO INTERNAL NUMBERING MECHANISM - Various embodiments of systems and methods to provide pseudo internal numbering for uniquely and continuously numbering of legally bound documents are described herein. In one aspect, an external numbering range object (NRO) is generated in a computer system. The range of numbers assignable by the external NRO is split into a set of intervals based on a prefix. In another aspect, an internal NRO is generated corresponding to a subset of the intervals of the external NRO. The correspondence between the internal NRO and the subset of intervals is determined by a part of the prefix. In yet another aspect, the unique and continuous numbers generated by the internal NRO are correlated with the numbers in the intervals of the subset of intervals of the external NRO based on a correspondence between values of the prefix of the external NRO and a prefix of the internal NRO.	06-30-2016
20160188674	EMOTION-BASED CONTENT RECOMMENDATION APPARATUS AND METHOD - An apparatus and method capable of recommending content suitable for a user using emotion annotation information is provided. The emotion-based content recommendation apparatus includes a content annotation information database (DB) configured to store basic annotation information and emotion information for each content; a user profile information DB configured to store preferred emotion information in addition to basic profile information for each user; and a content recommendation management module configured to recommend a content list suitable for an emotion of a user based on the emotion information for each content and the preferred emotion information for each user.	06-30-2016
20160188692	BEHAVIORALLY CONSISTENT CLUSTER-WIDE DATA WRANGLING BASED ON LOCALLY PROCESSED SAMPLED DATA - Example embodiments involve a system, computer-readable storage medium storing at least one program, and computer-implemented method for behaviorally consistent data wrangling. A local client device selects a set of raw sample data from a remote datastore. A local execution engine then applies one or more local data wrangling operations to the raw sample data. If the results of the local data wrangling operations are satisfactory, the local data wrangling operations may then be transferred to a remote data wrangling cluster. A remote execution engine being executed by the remote data wrangling cluster then applies the data wrangling operations to the larger set of raw data from which the sample raw data was obtained. As the remote execution engine and the local execution engine are of the same type, the data wrangling behavior exhibited by the local execution engine is reflected in the data wrangling behavior of the remote execution engine.	06-30-2016
20160188693	HIERARCHICAL DE-DUPLICATION TECHNIQUES FOR TRACKING FITNESS METRICS - A method is provided for tracking fitness data. A server device receives a first instance of fitness data from a first device and a second instance of fitness data from a second device. The fitness data comprises information about exercise activity of a user. The server determines that the first instance of the fitness data was received from the first device and that the second instance of the fitness data was received from the second device. The server selects a preferred instance of the fitness data comprising one of the first instance of the fitness data or the second instance of the fitness data based on a classification of the first instance and the second instance. The server incorporates the preferred instance of the fitness data into a fitness profile of the user.	06-30-2016
20160188697	DESIGNING A CHOROPLETH MAP - The invention notably relates to a computer-implemented method of designing a choropleth map, wherein the method comprises providing a map, and a number (n) of numerical values (x	06-30-2016
20160188699	INDEXING OF LARGE SCALE PATIENT SET - Systems and methods for indexing data include formulating an objective function to index a dataset, a portion of the dataset including supervision information. A data property component of the objective function is determined, which utilizes a property of the dataset to group data of the dataset. A supervised component of the objective function is determined, which utilizes the supervision information to group data of the dataset. The objective function is optimized using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.	06-30-2016
20160188710	METHOD AND SYSTEM FOR MIGRATING DATA TO NOT ONLY STRUCTURED QUERY LANGUAGE (NoSQL) DATABASE - A method, non-transitory computer readable medium, and data migration computing device that retrieve database metadata information, query statements information, and query scripts information for each database table from a relational database system. Then, query patterns of each database table are identified from the query statements information, and the query workload of each database table is identified from the query scripts information. Next, table key information and table index information of each of the database tables are determined based on the correlation between the database metadata information and the query patterns of the corresponding database tables. Then, a data model of a Not Only Structured Query Language (NoSQL) database is generated using the database metadata information, the query patterns, the query workload, the table key information, and the table index information. Then, the data model of the NoSQL database is verified. Lastly, data from the relational database is migrated to the NoSQL database.	06-30-2016
20160196323	METHODS AND APPARATUS FOR ASSESSING PATHWAYS FOR BIO-CHEMICAL SYNTHESIS	07-07-2016
20160196328	CROSS-DOMAIN CLUSTERABILITY EVALUATION FOR CROSS-GUIDED DATA CLUSTERING BASED ON ALIGNMENT BETWEEN DATA DOMAINS	07-07-2016
20160196330	SYSTEM AND METHOD FOR DOCUMENT GROUPING	07-07-2016
20160196332	METHOD AND SYSTEM FOR DISAMBIGUATING INFORMATIONAL OBJECTS	07-07-2016
20160196333	INFORMATION EXCHANGE ENGINE PROVIDING A CRITICAL INFRASTRUCTURE LAYER AND METHODS OF USE THEREOF	07-07-2016
20160203169	DATA PARTITION AND TRANSFORMATION METHODS AND APPARATUSES	07-14-2016
20160203206	Method and Apparatus for Storing Sparse Graph Data as Multi-Dimensional Cluster	07-14-2016
20160203207	ELECTRONIC APPARATUS AND CLASSIFYING METHOD	07-14-2016
20160203210	Method for classifying a data segment with regard to its further processing	07-14-2016
20160203212	SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR DETERMINING PREFERENCES OF AN ENTITY	07-14-2016
20160203213	ACCOUNT ASSOCIATION SYSTEMS AND METHODS	07-14-2016
20160253405	ALGORITHM TO APPLY A PREDICATE TO DATA SETS	09-01-2016
20160253407	Clustering Method for a Point of Interest and Related Apparatus	09-01-2016
20160253408	Computer-Implemented System And Method For Providing Classification Suggestions	09-01-2016
20160253412	AUTO-CLASSIFICATION SYSTEM AND METHOD WITH DYNAMIC USER FEEDBACK	09-01-2016
20160253750	SYSTEMS AND USER INTERFACES FOR DYNAMIC AND INTERACTIVE INVESTIGATION OF BAD ACTOR BEHAVIOR BASED ON AUTOMATIC CLUSTERING OF RELATED DATA IN VARIOUS DATA STRUCTURES	09-01-2016
20160378755	SYSTEM AND METHOD FOR CREATING USER PROFILES BASED ON MULTIMEDIA CONTENT - A system and method for creating user profiles based on multimedia content. The method may include identifying a plurality of multimedia content elements associated with a user; generating at least one signature for each of the plurality of multimedia content elements; analyzing the at least one signature to identify at least one concept matching the multimedia content elements; generating, based on the at least one matching concept, at least one contextual insight, wherein each contextual insight indicates a preference of the user; and generating, based on the at least one contextual insight, a user profile for the user.	12-29-2016
20160378774	Predicting Geolocation Of Users On Social Networks - A system and method for predicting the location of a user of social media utilizing information related to the interaction of the user with other users of the social media is described.	12-29-2016
20160378776	Identifying Groups For Recommendation To A Social Networking System User Based On User Location And Locations Associated With Groups - A social networking system selects a set of groups for presentation to a user of the social networking system. To select groups, the social networking system identifies candidate groups and selects the set of groups from the candidate groups. To identify certain candidate groups, the social networking system determines a location associated with various groups based on locations associated with users included in the group. For example, the social networking system determines a centroid of a group based on locations associated with users included in the group and associates the centroid with the group if at least a threshold percentage of distances between locations associated with users included in the group and the centroid do not exceed a threshold distance. Groups associated with locations within a threshold distance of a location associated with the user are identified as candidate groups.	12-29-2016
20160378858	CLUSTERING OF SEARCH RESULTS - One particular embodiment clusters a plurality of documents using one or more clustering algorithms to obtain one or more first sets of clusters, wherein: each first set of clusters results from clustering the documents using one of the clustering algorithms; and with respect to each first set of clusters, each of the documents belongs to one of the clusters from the first set of clusters; accesses a search query; identifies a search result in response to the search query, wherein the search result comprises two or more of the documents; and clusters the search result to obtain a second set of clusters, wherein each document of the search result belongs to one of the clusters from the second set of clusters.	12-29-2016
20160378882	INFLUENCE MAP GENERATOR MACHINE - A map generator machine generates influence maps based on profiles of entities, such as members of an online social networking service. The entities can be treated as nodes within a social graph, and each node may be represented by a corresponding node profile. The machine is configured to access a database of node profiles and rank the nodes according to seniority information contained in the node profiles. The machine is further configured to group nodes into clusters by skill similarity, based on skill descriptors included in their corresponding node profiles. The machine is also configured to generate one or more maps to depict one or more subsets of the nodes. As generated, such a map is a graphical presentation of at least some of the nodes of the social graph, and the map may be generated with visual indicators of seniority and skill similarity.	12-29-2016
20160379239	USER-CONTROLLED PARENTAL CONTROL RATINGS - According to a general aspect, a method includes maintaining rating groups, each rating group providing a rating for content compiled based on information received from a user evaluating the content. The method also includes receiving, from a first user, a selection of a first rating group to be applied to a set of users associated with the first user. The method also includes receiving, from a user, a request for a piece of content. The method also includes determining that the user from which the request was received belongs to the set of users associated with the first user. The method also includes, based upon the determination that the user belonged to the set of users associated with the first user, accessing information associated with the first rating group and determining whether the first rating group includes a rating for the requested piece of content.	12-29-2016
20170235728	INFORMATION PROCESSING APPARATUS, METHOD, PROGRAM AND STORAGE MEDIUM	08-17-2017
20170235777	Method for Effective Dating Object Models	08-17-2017
20170235811	LOCATING DATA IN A SET WITH A SINGLE INDEX USING MULTIPLE PROPERTY VALUES	08-17-2017
20170235812	AUTOMATED AGGREGATION OF SOCIAL CONTACT GROUPS	08-17-2017
20170235814	Method and a System for Efficient Data Sorting	08-17-2017
20170235846	DATA PROCESSING SYSTEM AND METHOD OF ASSOCIATING INTERNET DEVICES BASED UPON DEVICE USAGE	08-17-2017
20170235847	DATA PARTIONING BASED ON END USER BEHAVIOR	08-17-2017
20170236023	Fast Pattern Discovery for Log Analytics	08-17-2017
20180025008	SYSTEMS AND METHODS FOR HOMOGENEOUS ENTITY GROUPING	01-25-2018
20180025020	SYSTEM AND METHOD FOR CONTEXTUALLY ENRICHING A CONCEPT DATABASE	01-25-2018
20180025036	BUILDING OF OBJECT INDEX FOR COMBINATORIAL OBJECT SEARCH	01-25-2018
20180025064	SYSTEM AND METHOD FOR SYNCHRONIZING IDENTIFIERS ASSOCIATED WITH USERS	01-25-2018
20190146919	PROVIDING A DYNAMIC DIGITAL CONTENT CACHE	05-16-2019
20190146982	CLUSTER EVALUATION IN UNSUPERVISED LEARNING OF CONTINUOUS DATA	05-16-2019
20190146986	COMPUTER-IMPLEMENTED METHOD AND SYSTEM FOR COMPETENCY INFORMATION MANAGEMENT	05-16-2019
20190147063	METHOD AND APPARATUS FOR GENERATING INFORMATION	05-16-2019
20190147093	DATA COLLECTION METHOD, INFORMATION PROCESSING APPARATUS, AND DISTRIBUTED PROCESSING SYSTEM	05-16-2019
20190147103	Automatic Hierarchical Classification and Metadata Identification of Document Using Machine Learning and Fuzzy Matching	05-16-2019
20190147104	METHOD AND APPARATUS FOR CONSTRUCTING ARTIFICIAL INTELLIGENCE APPLICATION	05-16-2019
20220138230	SYSTEM AND METHOD FOR OPERATING A DIGITAL STORAGE SYSTEM - A system and method for managing a storage system may include generating, for a data block, a set of tags and a unique name. A set of tags may represent a context. A service related to the data block may be provided in response to receiving at least one of: a tag, a set of tags and a unique name.	05-05-2022
20220138232	VISUALIZATION METHOD, VISUALIZATION DEVICE AND COMPUTER-READABLE STORAGE MEDIUM - A visualization device visualizes plural clustering results. A clustering result ordering unit orders the plural clustering results based on quality criteria. Each of the clustering results includes covariate clusters. A hierarchical arrangement unit creates a hierarchical tree structure including the covariate clusters as nodes. The created hierarchical structure is displayed.	05-05-2022
20220138238	MASSIVE SCALE HETEROGENEOUS DATA INGESTION AND USER RESOLUTION - This disclosure relates to data association, attribution, annotation, and interpretation systems and related methods of efficiently organizing heterogeneous data at a massive scale. Incoming data is received and identifying information (“information”) is extracted from it. Multiple dimensionality reducing functions are applied to the information, and based on the function results, the information is grouped into sets of similar information. Filtering rules are applied to the sets to exclude non-matching information in the sets. The sets are then merged into groups of information based on whether the sets contain at least one common piece of information. A common link may be associated with information in a group. If the incoming data includes the identifying information associated with the common link, the incoming data is assigned the common link. In some embodiments, incoming data is not altered but is assigned into domains.	05-05-2022
20220138280	Digital Platform for Trading and Management of Investment Securities - A stratified or segmented composite data structure can be formed by selecting a group of data entities, stratifying or segmenting them according to attributes, and assigning relative weights to the components based on their stratified or segmented positions. The attributes are selected from a universe of possible values. Further positive and negative biases can be applied at any arbitrary point or position, including to individual data entities, groups of arbitrarily selected data entities, or arbitrary positions.	05-05-2022

