Patent application number | Description | Published |
20090080526 | DETECTING VISUAL GESTURAL PATTERNS - A processing device and method are provided for capturing images, via an image-capturing component of a processing device, and determining a motion of the processing device. An adaptive search center technique may be employed to determine a search center with respect to multiple equal-sized regions of an image frame, based on previously estimated motion vectors. One of several fast block matching methods may be used, based on one or more conditions, to match a block of pixels of one image frame with a second block of pixels of a second image. Upon matching blocks of pixels, motion vectors of the multiple equal-sized regions may be estimated. The motion may be determined, based on the estimated motion vectors, and an associated action may be performed. Various embodiments may implement techniques to distinguish motion blur from de-focus blur and to determine a change in lighting condition. | 03-26-2009 |
20090091802 | Local Image Descriptors Using Linear Discriminant Embedding - To render the comparison of image patches more efficient, the data of an image patch can be projected into a smaller-dimensioned subspace, resulting in a descriptor of the image patch. The projection into the descriptor subspace is known as a linear discriminant embedding, and can be performed with reference to a linear discriminant embedding matrix. The linear discriminant embedding matrix can be constructed from projection vectors that maximize those elements that are shared by matching image patches or that are used to distinguish non-matching image patches, while also minimizing those elements that are common to non-matching image patches or that distinguish matching image patches. The determination of such projection vectors can be limited such that only orthogonal vectors comprise the linear discriminant embedding matrix. The determination of the linear discriminant embedding matrix can likewise be constrained to avoid overfitting to training data. | 04-09-2009 |
20090252413 | IMAGE CLASSIFICATION - Images are classified as photos (e.g., natural photographs) or graphics (e.g., cartoons, synthetically generated images), such that when searched (online) with a filter, an image database returns images corresponding to the filter criteria (e.g., either photos or graphics will be returned). A set of image statistics pertaining to various visual cues (e.g., color, texture, shape) are identified in classifying the images. These image statistics, combined with pre-tagged image metadata defining an image as either a graphic or a photo, may be used to train a boosting decision tree. The trained boosting decision tree may be used to classify additional images as graphics or photos based on image statistics determined for the additional images. | 10-08-2009 |
20100246969 | COMPUTATIONALLY EFFICIENT LOCAL IMAGE DESCRIPTORS - Described is a technology in which an image (or image patch) is processed into a highly discriminative and computationally efficient image descriptor that has a low storage footprint. Feature vectors are generated from an image (or image patch), and further processed via a polar Gaussian pooling approach (a DAISY configuration) into a descriptor. The descriptor is normalized, and processed with a dimension reduction component and a quantization component (based upon dynamic range reduction) into a finalized descriptor, which may be further compressed. The resulting descriptors have significantly reduced error rates and significantly smaller sizes than other image descriptors (such as SIFT-based descriptors). | 09-30-2010 |
20100284577 | POSE-VARIANT FACE RECOGNITION USING MULTISCALE LOCAL DESCRIPTORS - Representing a face by jointly quantizing features and spatial location to perform implicit elastic matching between features. A plurality of the features are extracted from a face image and expanded with a corresponding spatial location in the face image. Each of the expanded features is quantized based on one or more randomized decision trees. A histogram of the quantized features is calculated to represent the face image. The histogram is compared to histograms of other face images to identify a match, or to calculate a distance metric representative of a difference between faces. | 11-11-2010 |
20100310134 | ASSISTED FACE RECOGNITION TAGGING - The described implementations relate to assisted face recognition tagging of digital images, and specifically to context-driven assisted face recognition tagging. In one case, context-driven assisted face recognition tagging (CDAFRT) tools can access face images associated with a photo gallery. The CDAFRT tools can perform context-driven face recognition to identify individual face images at a specified probability. In such a configuration, the probability that the individual face images are correctly identified can be higher than attempting to identify individual face images in isolation. | 12-09-2010 |
20120141020 | IMAGE CLASSIFICATION - Images are classified as photos (e.g., natural photographs) or graphics (e.g., cartoons, synthetically generated images), such that when searched (online) with a filter, an image database returns images corresponding to the filter criteria (e.g., either photos or graphics will be returned). A set of image statistics pertaining to various visual cues (e.g., color, texture, shape) are identified in classifying the images. These image statistics, combined with pre-tagged image metadata defining an image as either a graphic or a photo, may be used to train a boosting decision tree. The trained boosting decision tree may be used to classify additional images as graphics or photos based on image statistics determined for the additional images. | 06-07-2012 |
20120159404 | DETECTING VISUAL GESTURAL PATTERNS - A processing device and method are provided for capturing images, via an image-capturing component of a processing device, and determining a motion of the processing device. An adaptive search center technique may be employed to determine a search center with respect to multiple equal-sized regions of an image frame, based on previously estimated motion vectors. One of several fast block matching methods may be used, based on one or more conditions, to match a block of pixels of one image frame with a second block of pixels of a second image. Upon matching blocks of pixels, motion vectors of the multiple equal-sized regions may be estimated. The motion may be determined, based on the estimated motion vectors, and an associated action may be performed. Various embodiments may implement techniques to distinguish motion blur from de-focus blur and to determine a change in lighting condition. | 06-21-2012 |
20150055856 | IMAGE CLASSIFICATION - Images are classified as photos (e.g., natural photographs) or graphics (e.g., cartoons, synthetically generated images), such that when searched (online) with a filter, an image database returns images corresponding to the filter criteria (e.g., either photos or graphics will be returned). A set of image statistics pertaining to various visual cues (e.g., color, texture, shape) are identified in classifying the images. These image statistics, combined with pre-tagged image metadata defining an image as either a graphic or a photo, may be used to train a boosting decision tree. The trained boosting decision tree may be used to classify additional images as graphics or photos based on image statistics determined for the additional images. | 02-26-2015 |
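The block matching step named in the DETECTING VISUAL GESTURAL PATTERNS entries above can be illustrated with a minimal sketch. This is not the patented method: it uses a plain exhaustive search with a sum-of-absolute-differences cost rather than one of the "fast block matching methods" the abstract mentions, and every name here (`sad`, `match_block`, `radius`, `center`) is hypothetical. The `center` parameter only hints at how a previously estimated motion vector could shift the search window.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size pixel blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def match_block(prev, curr, x, y, size, radius, center=(0, 0)):
    """Return the motion vector (dx, dy) with the lowest SAD cost.

    The block at (x, y) in `prev` is compared against every block in `curr`
    whose offset lies within `radius` of `center` (an assumed stand-in for
    a search center derived from earlier motion vectors).
    """
    height, width = len(curr), len(curr[0])
    best_cost, best_vec = None, (0, 0)
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            nx, ny = x + dx, y + dy
            # Skip candidate blocks that fall outside the frame.
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue
            cost = sad(prev, curr, x, y, nx, ny, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec


# Synthetic 8x8 frames: `curr` is `prev` shifted right by 2 and down by 1.
prev = [[x * x + 17 * y for x in range(8)] for y in range(8)]
curr = [[prev[y - 1][x - 2] if x >= 2 and y >= 1 else 0 for x in range(8)]
        for y in range(8)]
print(match_block(prev, curr, x=2, y=2, size=3, radius=2))  # (2, 1)
```

Running the same search once per region of the frame would give one motion vector per region, which is the input the abstract describes for determining the overall motion.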
Patent application number | Description | Published |
20080226174 | Image Organization - A system for organizing images includes an extraction component that extracts visual information (e.g., faces, scenes, etc.) from the images. The extracted visual information is provided to a comparison component which computes similarity confidence data between the extracted visual information. The similarity confidence data is an indication of the likelihood that items of extracted visual information are similar. The comparison component then generates a visual distribution of the extracted visual information based upon the similarity confidence data. The visual distribution can include groupings of the extracted visual information based on computed similarity confidence data. For example, the visual distribution can be a two-dimensional layout of faces organized based on the computed similarity confidence data, with faces in closer proximity computed to have a greater probability of representing the same person. The visual distribution can then be utilized by a user to sort, organize and/or tag images. | 09-18-2008 |
20080279423 | RECOVERING PARAMETERS FROM A SUB-OPTIMAL IMAGE - A subregion-based image parameter recovery system and method for recovering image parameters from a single image containing a face taken under sub-optimal illumination conditions. The recovered image parameters (including albedo, illumination, and face geometry) can be used to generate face images under a new lighting environment. The method includes dividing the face in the image into numerous smaller regions, generating an albedo morphable model for each region, and using a Markov Random Fields (MRF)-based framework to model the spatial dependence between neighboring regions. Different types of regions are defined, including saturated, shadow, regular, and occluded regions. Each pixel in the image is classified and assigned to a region based on intensity, and then weighted based on its classification. The method decouples the texture from the geometry and illumination models, and then generates an objective function that is iteratively solved using an energy minimization technique to recover the image parameters. | 11-13-2008 |
20080310687 | Face Recognition Using Discriminatively Trained Orthogonal Tensor Projections - Systems and methods are described for face recognition using discriminatively trained orthogonal rank one tensor projections. In an exemplary system, images are treated as tensors, rather than as conventional vectors of pixels. During runtime, the system designs visual features—embodied as tensor projections—that minimize intraclass differences between instances of the same face while maximizing interclass differences between the face and faces of different people. Tensor projections are pursued sequentially over a training set of images and take the form of a rank one tensor, i.e., the outer product of a set of vectors. An exemplary technique ensures that the tensor projections are orthogonal to one another, thereby increasing ability to generalize and discriminate image features over conventional techniques. Orthogonality among tensor projections is maintained by iteratively solving an ortho-constrained eigenvalue problem in one dimension of a tensor while solving unconstrained eigenvalue problems in additional dimensions of the tensor. | 12-18-2008 |
20090251594 | VIDEO RETARGETING - Videos are retargeted to a target display for viewing with little to no geometric distortion or video information loss. Salient regions of video frames may be determined using scale-space spatiotemporal information. Video information loss may be a result of spatial loss, due to cropping, and resolution loss, due to resizing. A desired cropping window may be determined using a coarse-to-fine searching strategy. Video frames may be cropped with a window that matches an aspect ratio of the target display, and resized isotropically to match a size of the target display. | 10-08-2009 |
20090316986 | FEATURE SELECTION AND EXTRACTION - Image feature selection and extraction (e.g., for image classifier training) is accomplished in an integrated manner, such that higher-order features are merely developed from first-order features selected for image classification. That is, first-order image features are selected for image classification from an image feature pool, initially populated with pre-extracted first-order image features. The selected first-order classifying features are paired with previously selected first-order classifying features to generate higher-order features. The higher-order features are placed into the image feature pool as they are developed or “on-the-fly” (e.g., for use in image classifier training). | 12-24-2009 |
Patent application number | Description | Published |
20110142298 | FLEXIBLE IMAGE COMPARISON AND FACE MATCHING APPLICATION - Two faces may be compared by calculating distances between different regions of the windows, and choosing one of the distances as the difference between the images. Two images are examined to detect the location of the face in the images. The faces may then be geometrically and photometrically rectified. A sliding window that is smaller than the whole face may be positioned at various locations over the images, and a descriptor is calculated for each window position. The descriptor for a window at one location in one image is compared with descriptors for windows in the neighborhood of that location in the other image. The lowest distance between window descriptors is chosen. The process is repeated for all window positions, resulting in a set of distances. The distances are sorted, and one of the distances is chosen to represent the difference between the two faces. | 06-16-2011 |
20110142299 | RECOGNITION OF FACES USING PRIOR BEHAVIOR - Face recognition may be performed using a combination of visual analysis and social context. In one example, a web site such as a social networking site or photo-sharing site allows users to upload photos, and allows faces that appear in the photo to be tagged with users' names. When user A uploads a new photo, two analyses may be performed. First, a face in the photo is compared with known faces of users to determine similarity. Second, it is determined which other users user A frequently uploads photos of. Two probability distributions are created. One distribution assigns high probabilities to users whose photos are similar to the new photo. The other assigns high probabilities to users who frequently appear in photos uploaded by user A. These probability distributions are combined, and the person in the photo is identified as being the person with the highest probability. | 06-16-2011 |
20110191271 | IMAGE TAGGING BASED UPON CROSS DOMAIN CONTEXT - A method described herein includes receiving a digital image, wherein the digital image includes a first element that corresponds to a first domain and a second element that corresponds to a second domain. The method also includes automatically assigning a label to the first element in the digital image based at least in part upon a computed probability that the label corresponds to the first element, wherein the probability is computed through utilization of a first model that is configured to infer labels for elements in the first domain and a second model that is configured to infer labels for elements in the second domain. The first model receives data that identifies learned relationships between elements in the first domain and elements in the second domain, and the probability is computed by the first model based at least in part upon the learned relationships. | 08-04-2011 |
20140129489 | IMAGE TAGGING BASED UPON CROSS DOMAIN CONTEXT - A method described herein includes receiving a digital image, wherein the digital image includes a first element that corresponds to a first domain and a second element that corresponds to a second domain. The method also includes automatically assigning a label to the first element in the digital image based at least in part upon a computed probability that the label corresponds to the first element, wherein the probability is computed through utilization of a first model that is configured to infer labels for elements in the first domain and a second model that is configured to infer labels for elements in the second domain. The first model receives data that identifies learned relationships between elements in the first domain and elements in the second domain, and the probability is computed by the first model based at least in part upon the learned relationships. | 05-08-2014 |
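The RECOGNITION OF FACES USING PRIOR BEHAVIOR entry above describes combining a visual-similarity distribution with an upload-history distribution and picking the most probable person. The abstract does not say how the two are combined; the sketch below assumes an element-wise product followed by renormalization, which is only one plausible choice. The function names and the example users are hypothetical.

```python
def combine(visual, social):
    """Multiply two distributions over user names and renormalize.

    Assumption: the combination is a simple product; the source abstract
    does not specify the actual rule.
    """
    users = set(visual) | set(social)
    joint = {u: visual.get(u, 0.0) * social.get(u, 0.0) for u in users}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("no user has nonzero probability in both distributions")
    return {u: p / total for u, p in joint.items()}


def identify(visual, social):
    """Return the user with the highest combined probability."""
    combined = combine(visual, social)
    return max(combined, key=combined.get)


# Visual analysis slightly favors alice, but the uploader mostly posts photos of bob.
visual = {"alice": 0.5, "bob": 0.4, "carol": 0.1}
social = {"alice": 0.2, "bob": 0.7, "carol": 0.1}
print(identify(visual, social))  # bob
```

With a product rule, a user who never appears in the uploader's history gets zero combined probability, so a practical system would likely smooth the social distribution first.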
Patent application number | Description | Published |
20120086863 | Method and Apparatus for Determining Motion - An apparatus comprising a processor and a memory that cause the apparatus to perform receiving a video indicating a motion, generating a set of scalar representations of movement based, at least in part, on at least part of the video, and identifying at least one predetermined motion that correlates to the set of scalar representations of movement is disclosed. | 04-12-2012 |
20120086864 | Method and Apparatus for Determining Motion - An apparatus, comprising a processor and memory configured to cause the apparatus to perform at least the following: receiving a video indicating a motion, generating a set of normalized representations of movement based, at least in part, on the video, evaluating a reference set of representations with respect to the set of normalized representations of the movement, and determining that at least one predetermined motion correlates to the set of normalized representations of the movement based, at least in part, on the evaluation is disclosed. | 04-12-2012 |
Patent application number | Description | Published |
20140161360 | Techniques for Spatial Semantic Attribute Matching for Location Identification - Techniques for spatial semantic attribute matching on image regions for location identification based on a reference dataset are provided. In one aspect, a method for matching images from heterogeneous sources is provided. The method includes the steps of: (a) parsing the images into different semantic labeled regions; (b) creating a list of potential matches by matching the images based on two or more of the images having same semantic labeled regions; and (c) pruning the list of potential matches created in step (b) by taking into consideration spatial arrangements of the semantic labeled regions in the images. | 06-12-2014 |
20140161362 | Techniques for Spatial Semantic Attribute Matching for Location Identification - Techniques for spatial semantic attribute matching on image regions for location identification based on a reference dataset are provided. In one aspect, a method for matching images from heterogeneous sources is provided. The method includes the steps of: (a) parsing the images into different semantic labeled regions; (b) creating a list of potential matches by matching the images based on two or more of the images having same semantic labeled regions; and (c) pruning the list of potential matches created in step (b) by taking into consideration spatial arrangements of the semantic labeled regions in the images. | 06-12-2014 |
20140205186 | Techniques for Ground-Level Photo Geolocation Using Digital Elevation - Techniques for generating cross-modality semantic classifiers and using those cross-modality semantic classifiers for ground level photo geo-location using digital elevation are provided. In one aspect, a method for generating cross-modality semantic classifiers is provided. The method includes the steps of: (a) using Geographic Information Service (GIS) data to label satellite images; (b) using the satellite images labeled with the GIS data as training data to generate semantic classifiers for a satellite modality; (c) using the GIS data to label Global Positioning System (GPS) tagged ground level photos; (d) using the GPS tagged ground level photos labeled with the GIS data as training data to generate semantic classifiers for a ground level photo modality, wherein the semantic classifiers for the satellite modality and the ground level photo modality are the cross-modality semantic classifiers. | 07-24-2014 |
20140205189 | Techniques for Ground-Level Photo Geolocation Using Digital Elevation - Techniques for generating cross-modality semantic classifiers and using those cross-modality semantic classifiers for ground level photo geo-location using digital elevation are provided. In one aspect, a method for generating cross-modality semantic classifiers is provided. The method includes the steps of: (a) using Geographic Information Service (GIS) data to label satellite images; (b) using the satellite images labeled with the GIS data as training data to generate semantic classifiers for a satellite modality; (c) using the GIS data to label Global Positioning System (GPS) tagged ground level photos; (d) using the GPS tagged ground level photos labeled with the GIS data as training data to generate semantic classifiers for a ground level photo modality, wherein the semantic classifiers for the satellite modality and the ground level photo modality are the cross-modality semantic classifiers. | 07-24-2014 |
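Steps (b) and (c) of the Spatial Semantic Attribute Matching entries above (building a candidate list from shared semantic labels, then pruning it by spatial arrangement) can be sketched as follows. This is an illustration under loud assumptions, not the method in the applications: step (a) is taken as already done, each image is reduced to a hypothetical mapping from label to region centroid, and "spatial arrangement" is approximated by checking that shared regions keep the same left-to-right and top-to-bottom order.

```python
def shared_labels(query, reference):
    """Semantic labels present in both images."""
    return set(query) & set(reference)


def same_arrangement(query, reference, labels):
    """True if the shared regions have the same ordering along x and along y.

    Assumed pruning rule; the source does not define how arrangements
    are compared.
    """
    for axis in (0, 1):
        order_q = sorted(labels, key=lambda lab: (query[lab][axis], lab))
        order_r = sorted(labels, key=lambda lab: (reference[lab][axis], lab))
        if order_q != order_r:
            return False
    return True


def match(query, references, min_shared=2):
    """Step (b): keep references sharing enough labels; step (c): prune by layout."""
    candidates = [name for name, regions in references.items()
                  if len(shared_labels(query, regions)) >= min_shared]
    return [name for name in candidates
            if same_arrangement(query, references[name],
                                shared_labels(query, references[name]))]


# Each image maps a semantic label to the (x, y) centroid of its region.
query = {"water": (10, 80), "building": (60, 40), "sky": (50, 5)}
references = {
    "a": {"water": (20, 90), "building": (70, 50), "sky": (55, 10)},
    "b": {"water": (80, 90), "building": (10, 50)},  # mirrored layout
    "c": {"road": (1, 1), "sky": (2, 2)},            # only one shared label
}
print(match(query, references))  # ['a']
```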